
 B.2. Agent Check Types

 B.2.1. agent.filesystem

 Summary

The agent.filesystem check exposes file system metrics (free space, used space, etc.).

Table B.22. Attributes
Field Description Validation
target The mount point to check (e.g. '/var' or 'C:')
  • String between 1 and 512 characters long

Table B.23. Metrics
Metric Description Type
avail Available space on the filesystem in kilobytes, not including reserved space. Int64
free Free space available on the filesystem in kilobytes, including reserved space. Int64
used Used space on the filesystem, in kilobytes. Int64
total Total space on the filesystem, in kilobytes. Int64
files Number of inodes on the filesystem. Note: this metric is not available on Windows. Int64
free_files Number of free inodes on the filesystem. Note: this metric is not available on Windows. Int64
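
As an illustration of how these metrics might be consumed, the following minimal sketch derives a percentage from the used and total values. The function name and the sample numbers are hypothetical and are not part of the monitoring system.

```python
def percent_used(used_kb: int, total_kb: int) -> float:
    """Return used space as a percentage of total space.

    Both arguments are in kilobytes, matching the "used" and "total"
    metrics. This helper is hypothetical, not part of the agent.
    """
    if total_kb <= 0:
        raise ValueError("total_kb must be positive")
    return 100.0 * used_kb / total_kb


# Hypothetical sample values, in kilobytes.
sample = {"used": 50, "total": 200}
usage = percent_used(sample["used"], sample["total"])
```

Because "avail" excludes reserved space while "free" includes it, a percentage based on "avail" would describe the space that unprivileged processes can still use.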

 B.2.2. agent.memory

Table B.24. Attributes
Field Description Validation
No fields are present for this particular check type.
Table B.25. Metrics
Metric Description Type
free The amount of memory free, in bytes. Note: system buffers are not considered "free", and for this reason it is generally a better idea to alert on "actual_free" instead. Int64
actual_free The amount of memory free, including system buffers, in bytes. Because most operating systems will make use of otherwise free memory, this is generally a better metric to alert on than "free". Int64
used The amount of memory used, in bytes. Note: system buffers are considered "used", and for this reason it is generally a better idea to alert on "actual_used" instead. Int64
actual_used The amount of memory used, excluding system buffers, in bytes. Because most operating systems will make use of otherwise free memory for buffers, this is generally a better metric to alert on than "used". Int64
total The total amount of memory the system has access to, in bytes. This will be slightly less than the "ram" value as it excludes certain reserved memory and the space used by the kernel binary. Int64
ram The total amount of RAM in the system, in megabytes. Int64
swap_free The amount of swap free, in bytes. Int64
swap_used The amount of swap used, in bytes. Int64
swap_total The total amount of swap the system has, in bytes. Int64
swap_page_in The number of pages swapped in since last boot. Int64
swap_page_out The number of pages swapped out since last boot. Int64

 B.2.3. agent.load_average

 Summary

The agent.load_average check will attempt to measure the Unix-style Load Average on a host.

Table B.26. Attributes
Field Description Validation
No fields are present for this particular check type.
Table B.27. Metrics
Metric Description Type
1m One minute load average. Double
5m Five minute load average. Double
15m Fifteen minute load average. Double

 B.2.4. agent.cpu

 Summary

The agent.cpu check will attempt to measure the usage of the CPU on a host.

Table B.28. Attributes
Field Description Validation
No fields are present for this particular check type.
Table B.29. Metrics
Metric Description Type
max_cpu_usage Recent percentage utilization of the most-utilized CPU. This is useful to detect when some CPUs are "pegged" while others are idle. Double
min_cpu_usage Recent percentage utilization of the least-utilized CPU. This is useful to detect when some CPUs are "pegged" while others are idle. Double
user_percent_average Recent percentage of CPU time utilized by user mode processes. Double
wait_percent_average Recent percentage of CPU time utilized by processes in a "wait" state. Double
sys_percent_average Recent percentage of CPU time utilized by kernel mode processes. Double
irq_percent_average Recent percentage of CPU time spent handling hardware interrupts. Double
stolen_percent_average Recent percentage of CPU time spent waiting for the CPU to service other virtual CPUs. Double
idle_percent_average Recent percentage of CPU time spent idle. Double
usage_average Recent percentage of CPU time utilized by all processes. Double

 B.2.5. agent.disk

 Summary

The agent.disk check exposes disk related metrics (service time, wait time, etc.).

Table B.30. Attributes
Field Description Validation
target The disk to check (e.g. '/dev/xvda1')
  • String between 1 and 512 characters long

Table B.31. Metrics
Metric Description Type
qtime The number of milliseconds that IO operations have spent queued or being serviced since the system booted. Tracking the rate of change of this metric can give an indication of how many operations are typically queued or being serviced during the sampling interval. Int64
queue Approximate current size of the IO operation queue. Double
read_bytes The number of bytes read from the disk. Int64
reads The number of read operations performed. Int64
rtime The amount of time, in milliseconds, spent reading. Int64
service_time Average time, in milliseconds, required to service recent IO operations. Double
time Total number of milliseconds spent servicing IO operations. Int64
write_bytes The number of bytes written. Int64
writes The number of disk writes. Int64
wtime The amount of time, in milliseconds, spent writing. Int64
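
The rate of change mentioned for qtime can be sketched as follows. This is a minimal illustration, assuming two qtime samples taken a known number of milliseconds apart; the function name and the treatment of counter resets are assumptions, not documented agent behavior.

```python
def average_queue_depth(qtime_start_ms, qtime_end_ms, interval_ms):
    """Estimate how many IO operations were queued or in service on average.

    qtime accumulates milliseconds since boot, so the change in qtime
    divided by the elapsed milliseconds approximates the average number
    of operations in flight during the sampling interval.
    Returns None if the counter went backwards (assumed to mean a reboot).
    """
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    delta = qtime_end_ms - qtime_start_ms
    if delta < 0:
        return None  # counter reset, e.g. the system rebooted
    return delta / interval_ms


# Hypothetical samples taken 60 seconds apart.
depth = average_queue_depth(1000, 4000, 60000)
```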

 B.2.6. agent.network

 Summary

The agent.network check will attempt to measure the usage of network devices on a host.

Table B.32. Attributes
Field Description Validation
target The network device to check (e.g. 'eth0')
  • String between 1 and 512 characters long

Table B.33. Metrics
Metric Description Type
rx_bytes The number of bytes received. Int64
rx_dropped The number of packets received and subsequently dropped. Int64
rx_errors The number of receive errors detected. Int64
rx_frame The number of frames received. Int64
rx_overruns The number of overruns received. Int64
rx_packets The number of packets received. Int64
tx_bytes The number of bytes transmitted. Int64
tx_carrier The number of carrier losses detected. Int64
tx_collisions The number of transmit collisions detected. Int64
tx_dropped The number of packets dropped during transmission. Int64
tx_errors The number of transmit errors detected. Int64
tx_overruns The number of transmit overruns detected. Int64
tx_packets The total number of packets transmitted. Int64

 B.2.7. agent.plugin

 Summary

The agent.plugin check will attempt to run a custom plugin on a host.

 Installing Custom Plugins

Custom plugins are simply executable files which report metrics via stdout. Plugins are placed on the server to be monitored at an installation path that depends on the operating system:

Operating System Installation Path
Linux /usr/lib/rackspace-monitoring-agent/plugins/
Windows (32-bit agent installed on a 64-bit system) C:\Program Files (x86)\Rackspace Monitoring\plugins
Windows (64-bit agent installed on a 64-bit system or 32-bit agent installed on a 32-bit system) C:\Program Files\Rackspace Monitoring\plugins

On Windows, the agent location depends on the version of the agent installed and the architecture of the operating system. If you are using PowerShell, prefix the path with "& ".

Once the plugin has been installed to the server, create an agent.plugin check that specifies the name of the executable file, and the plugin will begin reporting metrics to the monitoring system, just like any other check. If the plugin requires any command line arguments, these may be specified using the optional args array.

Table B.34. Attributes
Field Description Validation
file Name of the plugin file
  • String matching the regex //[a-zA-Z0-9\.\-_]+//

args Command-line arguments which are passed to the plugin
  • Optional

  • Array [Non-empty string]

  • Array or object with number of items between 0 and 10

timeout Plugin execution timeout in milliseconds
  • Optional

  • Integer
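
The validation rules in the table above can be checked locally before a check is created. The following is a minimal sketch: the function name is hypothetical, and it assumes the file regex must match the whole name, which the table does not state explicitly.

```python
import re

# Pattern copied from the attributes table; matching the entire
# file name is an assumption made for this sketch.
FILE_PATTERN = re.compile(r"[a-zA-Z0-9\.\-_]+")
MAX_ARGS = 10


def validate_plugin_attributes(file, args=None, timeout=None):
    """Return a list of human-readable validation errors (empty if valid)."""
    errors = []
    if not isinstance(file, str) or not FILE_PATTERN.fullmatch(file):
        errors.append("file must match [a-zA-Z0-9.-_]+")
    if args is not None:
        if not isinstance(args, list) or len(args) > MAX_ARGS:
            errors.append("args must be an array of 0 to 10 items")
        elif any(not isinstance(a, str) or a == "" for a in args):
            errors.append("each arg must be a non-empty string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if timeout is not None and (isinstance(timeout, bool)
                                or not isinstance(timeout, int)):
        errors.append("timeout must be an integer (milliseconds)")
    return errors


# Hypothetical plugin name and arguments.
problems = validate_plugin_attributes("disk_check.sh", ["--warn", "80"], 5000)
```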

Table B.35. Metrics
Metric Description Type
Available metrics are determined by the plugin.

 Community Plugin Repository

A curated repository of plugins created by Cloud Monitoring users is available on GitHub. Contributions are welcome!

Note

The Cloud Monitoring Agent is also capable of executing Cloudkick plugins, so if you are a Cloudkick user you can just drop in any existing plugin and it should just work.

 Creating Custom Plugins

Creating custom plugins is as simple as writing a script that prints a status and up to 10 metrics to standard out. The format of the status line is:

status <status>

The status string should describe whether the check was able to successfully gather metrics. It could be as simple as "success" to indicate that metrics were successfully gathered. When an error occurs that prevents metrics from being gathered, plugins should print a status that describes the error, then should exit non-zero without printing any metric lines.

The status line may be followed by up to 10 metric lines. The format of a metric line is:

metric <name> <type> <value>

where:

name

is the name of the metric. No spaces are allowed. Example: memory_free.

type

is the type of the metric. This must be one of:

int32

Signed 32 bit integer value.

uint32

Unsigned 32 bit integer value.

int64

Signed 64 bit integer value.

uint64

Unsigned 64 bit integer value.

double

Floating point values.

gauge

An integer that should be graphed as the first order derivative. This is useful if you have a metric that always increases, and you want to see the rate of growth.

string

A string value. Note: the monitoring system records string metrics every time they change. String metrics are designed for recording an enumerated state which infrequently changes (for example an HTTP response code which is always 200 during normal operation). You should not store arbitrary, frequently changing values in a string metric.

value

is the value of the metric.
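
The format described above can be checked with a small parser while developing a plugin. This is a hypothetical development aid, not the agent's own parser; it enforces only the rules stated in this section (a leading status line, at most 10 metric lines, and a type from the list above).

```python
VALID_TYPES = {"int32", "uint32", "int64", "uint64", "double", "gauge", "string"}
MAX_METRICS = 10


def parse_plugin_output(text):
    """Parse plugin output into (status, [(name, type, value), ...]).

    Raises ValueError if the output breaks one of the documented rules.
    Values are returned as strings; no numeric range checks are done.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith("status "):
        raise ValueError("first line must be 'status <status>'")
    status = lines[0][len("status "):].strip()
    metrics = []
    for line in lines[1:]:
        # Split into at most four fields so string values may contain spaces.
        parts = line.split(None, 3)
        if len(parts) != 4 or parts[0] != "metric":
            raise ValueError("expected 'metric <name> <type> <value>': " + line)
        _, name, metric_type, value = parts
        if metric_type not in VALID_TYPES:
            raise ValueError("unknown metric type: " + metric_type)
        metrics.append((name, metric_type, value))
        if len(metrics) > MAX_METRICS:
            raise ValueError("at most 10 metric lines are allowed")
    return status, metrics


status, metrics = parse_plugin_output("status success\nmetric memory_free uint64 1024\n")
```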

Putting it all together, the output of a plugin that has successfully executed might look something like:

status Turkey thermometer returned valid response
metric internal_temperature uint32 165
metric ambient_temperature uint32 325

If the plugin failed, it might print the following before exiting non-zero:

status Turkey thermometer not responding
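
A plugin that follows these conventions might be sketched in Python as shown below. The metric it reports (the number of entries in a directory) and all function names are hypothetical and chosen only for illustration. On failure, it prints only a status line and returns a non-zero exit code.

```python
import os
import sys
import tempfile


def build_output(status, metrics=()):
    """Format a status line followed by zero or more metric lines."""
    lines = ["status " + status]
    for name, metric_type, value in metrics:
        lines.append("metric {} {} {}".format(name, metric_type, value))
    return "\n".join(lines)


def run(directory):
    """Return (output_text, exit_code) for the given directory."""
    try:
        entries = os.listdir(directory)
    except OSError as exc:
        reason = exc.strerror or "unknown error"
        # On error: describe the problem, print no metric lines, exit non-zero.
        return build_output("cannot list directory: " + reason), 1
    return build_output("success", [("entry_count", "uint32", len(entries))]), 0


def main(argv):
    """Print the plugin output and return the process exit code."""
    directory = argv[1] if len(argv) > 1 else tempfile.gettempdir()
    output, exit_code = run(directory)
    print(output)
    return exit_code
```

To use this as an installed plugin, the file would end with a call such as sys.exit(main(sys.argv)) so that the exit code reaches the agent. The directory argument corresponds to an entry in the optional args array of the check.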

