Field and Check Types Reference

This reference describes the remote and agent check types supported by the Rackspace Monitoring service, along with their fields and metrics.

Note

Most check types include example metrics to help you write effective alarm criteria.

Remote check types

Rackspace Monitoring supports the following remote check types.

remote.dns

The remote.dns check runs a DNS check against a given target. Use this check to verify that a DNS server is working, for example to ensure that it is publishing the domains you expect it to publish.

Attributes

Field | Description | Validation
query | Specifies the DNS query. | String, valid hostname
record_type | Specifies the DNS record type. | String matching the regex /^(A|AAAA|TXT|MX|SOA|CNAME|PTR|NS|MB|MD|MF|MG|MR)$/
port | Specifies the port number. The default is 53. | Optional, whole number (may be zero padded), must be an integer between 1-65535 inclusive
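
As a rough illustration of the validation rules above, the following Python sketch checks a dictionary of remote.dns attributes before it is submitted. The function name and the dictionary layout are hypothetical and are not part of the Rackspace Monitoring API; only the field names and rules come from the table.

```python
import re

# Record types accepted by the record_type validation rule above.
RECORD_TYPES = r"(A|AAAA|TXT|MX|SOA|CNAME|PTR|NS|MB|MD|MF|MG|MR)"


def validate_dns_details(details):
    """Return a list of problems found in a hypothetical remote.dns details dict."""
    problems = []
    if not details.get("query"):
        problems.append("query is required")
    if not re.fullmatch(RECORD_TYPES, str(details.get("record_type", ""))):
        problems.append("record_type is not a supported DNS record type")
    port = details.get("port", 53)  # port is optional; the default is 53
    if not (isinstance(port, int) and 1 <= port <= 65535):
        problems.append("port must be an integer between 1 and 65535")
    return problems


print(validate_dns_details({"query": "example.com", "record_type": "A"}))
print(validate_dns_details({"query": "example.com", "record_type": "SRV", "port": 0}))
```

The first call reports no problems; the second reports an unsupported record type and an out-of-range port.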

Metrics

Metric | Description | Type
answer | The list of space-separated IP addresses for the specified name resolution. | String
rtt | The roundtrip time to execute a remote.dns check. | Double
ttl | The time to live (TTL) of the DNS response. | Uint32

remote.ftp-banner

The remote.ftp-banner check will attempt to connect to an FTP server and verify that it responds to the connection.

Attributes

Field | Description | Validation
port | Specifies the port number. The default is 21. | This field is optional. Must be a whole number (may be zero padded). This value must be an integer between 1-65535 inclusive.

Metrics

Metric | Description | Type
banner | The string sent from the server on connect. | String
banner_match | The matched string from the banner_match regular expression specified during check creation. | String
body_match | The string representing the body match specified in a remote.ftp-banner check. | String
duration | The time it took to finish executing the check in milliseconds. | Uint32
tt_body | The time to the body measured in milliseconds. | Uint32
tt_connect | The time to connect measured in milliseconds. | Uint32
tt_firstbyte | The time to first byte measured in milliseconds. | Uint32

remote.http

The remote.http check will try to connect to the server and retrieve the specified URL using the specified method, optionally with the password and user for authentication, using SSL, and checking the body with a regex. This can be used to test that a web application running on a server is responding without generating error messages. It can also test if the SSL certificate is valid.

Note: The maximum size of the content returned in a remote.http check is 500k, with overhead and compression taken into account. This limitation helps monitoring remain responsive.

Attributes

Field | Description | Validation
url | Specifies the target URL. | String between 1 and 8096 characters long
auth_password | Optional auth password. | Optional. String between 1 and 255 characters long
auth_user | Optional auth user. | Optional. String between 1 and 255 characters long
body | Body match regular expression used to run against HTTP response content and generate metric body_match (see Metrics table below). Body is limited to 100k and match is truncated to 80 characters. | Optional. String between 1 and 255 characters long
body_matches | A map of key/regular-expression pairs used to run against HTTP response content and generate one metric body_match_<key> for each key/regular-expression pair (see Metrics table below). Body is limited to 100k and match is truncated to 80 characters. | Optional. Hash [String,String between 1 and 50 characters long, String matching the regex /^[-_ a-z0-9]+$/i: String,String between 1 and 255 characters long]. Array or object with number of items between 0 and 4.
follow_redirects | Follow redirects (default: true). | Optional. Boolean.
headers | Arbitrary headers which are sent with the request. | Optional. Hash [String,String between 1 and 50 characters long: String,String between 1 and 50 characters long]. Array or object with number of items between 0 and 10. A value which is not one of: content-length, user-agent, host, connection, keep-alive, transfer-encoding, upgrade.
method | HTTP method. The default is GET. | Optional. String. One of (HEAD, GET, POST, PUT, DELETE, INFO)
payload | Specify a request body (limited to 1024 characters). If a redirect is set, the payload is only sent to the first location. | Optional. String between 1 and 1024 characters long
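
The body_matches rules are easier to see in code. The following Python sketch validates a body_matches map against the limits listed above and returns the body_match_<key> metric names it would produce. The function name is hypothetical; this is an illustration of the documented rules, not the service's own validation code.

```python
import re

# Keys may contain letters, digits, spaces, hyphens, and underscores (case-insensitive).
KEY_PATTERN = re.compile(r"[-_ a-z0-9]+", re.IGNORECASE)


def body_match_metric_names(body_matches):
    """Validate a body_matches map and return the metric names it would generate."""
    if len(body_matches) > 4:
        raise ValueError("body_matches accepts at most 4 items")
    names = []
    for key, pattern in body_matches.items():
        if not (1 <= len(key) <= 50 and KEY_PATTERN.fullmatch(key)):
            raise ValueError(f"invalid key: {key!r}")
        if not 1 <= len(pattern) <= 255:
            raise ValueError(f"regular expression for {key!r} must be 1-255 characters")
        names.append(f"body_match_{key}")
    return names


print(body_match_metric_names({"register": "Register Now!", "contact": "Contact Us"}))
```

For the map shown, the sketch returns body_match_register and body_match_contact, in that order.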

Note

If a check against a website always returns an unknown content-encoding error, the cause is the HTTP body check limit of 100k. This limit is the amount of space available on the monitoring pollers (the systems the site is checked from). If the HTTP(S) check requires more than 100k, only the first 100k can be checked.

If the pages use compression, such as compress or gzip Content-Encoding, the full compressed page must be less than or equal to 100k, because the full page must be downloaded and uncompressed before the check can be verified.

This is also why you can only match against strings within the first 100k of the web page.

Metrics

Metric | Description | Type
body_match | The string matched in the HTTP response content by the regular expression specified in the body attribute of the check. | String
body_match_<key> | This metric is generated for each key specified in the body_matches check attribute. For example, a body_matches value of {"register":"Register Now!", "contact":"Contact Us"} will generate two metrics: body_match_register and body_match_contact. | String
bytes | The number of bytes returned from a response payload. | Int32
cert_end | The absolute timestamp in seconds for the certificate expiration. This is only available when performing a check on an HTTPS server. | Uint32
cert_end_in | The relative timestamp in seconds until certificate expiration. This is only available when performing a check on an HTTPS server. | Int32
cert_error | A string describing a certificate error in our validation. This is only available when performing a check on an HTTPS server. | String
cert_issuer | The issuer string for the certificate. This is only available when performing a check on an HTTPS server. | String
cert_start | The absolute timestamp of the issue of the certificate. This is only available when performing a check on an HTTPS server. | Uint32
cert_subject | The subject of the certificate. This is only available when performing a check on an HTTPS server. | String
cert_subject_alternative_names | The alternative name for the subject of the certificate. This is only available when performing a check on an HTTPS server. (See an example alarm following this table.) | String
code | The status code returned. | String
duration | The time it took to finish executing the check in milliseconds. | Uint32
truncated | The number of bytes that the result was truncated by. | Uint32
tt_connect | The time to connect measured in milliseconds. | Uint32
tt_firstbyte | The time to first byte measured in milliseconds. | Uint32

Note

The following is an example alarm for cert_subject_alternative_names, where you would replace example.com with an expected host name on the certificate’s SAN list:

if (metric['cert_subject_alternative_names'] nregex '.*example.com.*') {
return new AlarmStatus(CRITICAL, 'Missing expected SAN');
}
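
The cert_end_in metric lends itself to similar threshold logic. The following Python sketch classifies a cert_end_in value (seconds until certificate expiration) into a status. The function name and the 30-day and 7-day thresholds are assumptions chosen for illustration; it is not alarm language syntax and not a recommendation from the service.

```python
SECONDS_PER_DAY = 86400


def cert_expiry_state(cert_end_in, warning_days=30, critical_days=7):
    """Classify a cert_end_in value, given in seconds until certificate expiration."""
    if cert_end_in <= critical_days * SECONDS_PER_DAY:
        return "CRITICAL"
    if cert_end_in <= warning_days * SECONDS_PER_DAY:
        return "WARNING"
    return "OK"


for days in (3, 20, 90):
    print(days, cert_expiry_state(days * SECONDS_PER_DAY))
```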

remote.imap-banner

The remote.imap-banner check will attempt to connect to an IMAP server and verify that it responds to the connection.

Attributes

Field | Description | Validation
port | Port number (default: 143) | Optional. Whole number (may be zero padded). Integer between 1-65535 inclusive.
ssl | Enable SSL | Optional. Boolean.

remote.mssql-banner

The remote.mssql-banner check will attempt to connect to a Microsoft SQL database server and verify that it is accepting connections.

Attributes

Field | Description | Validation
port | Port number (default: 1433) | Optional. Whole number (may be zero padded). Integer between 1-65535 inclusive.
ssl | Enable SSL | Optional. Boolean.

remote.mysql-banner

The remote.mysql-banner check will attempt to connect to a MySQL database server and verify that it is accepting connections.

Attributes

Field | Description | Validation
port | Port number (default: 3306) | Optional. Whole number (may be zero padded). Integer between 1-65535 inclusive.
ssl | Enable SSL | Optional. Boolean.

remote.ping

The remote.ping check will attempt to ping a server.

Attributes

Field | Description | Validation
count | Number of pings to send within a single check. | This field is optional. Must be a whole number (may be zero padded). This value must be an integer between 1-15 inclusive.

Metrics

Metric | Description | Type
available | The whole number representing the percent of pings that returned back for a remote.ping check. | Double
average | The average response time in milliseconds for all ping packets sent out and later retrieved. | Double
count | The number of pings (ICMP packets) sent. | Int32
maximum | The maximum roundtrip time in milliseconds of an ICMP packet. | Double
minimum | The minimum roundtrip time in milliseconds of an ICMP packet. | Double
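
To make the relationship between these metrics concrete, the following Python sketch derives them from a list of round-trip times for the replies that came back. The function name and input shape are hypothetical; the service computes these values itself, and the sketch only illustrates how the metrics relate to one another.

```python
def summarize_pings(rtts_ms, sent):
    """Summarize ping results.

    rtts_ms: round-trip times in milliseconds for the replies received.
    sent: the number of pings sent (the count attribute).
    """
    received = len(rtts_ms)
    summary = {"count": sent, "available": 100.0 * received / sent}
    if received:
        # Timing metrics are only meaningful when at least one reply arrived.
        summary["average"] = sum(rtts_ms) / received
        summary["minimum"] = min(rtts_ms)
        summary["maximum"] = max(rtts_ms)
    return summary


print(summarize_pings([10.0, 20.0, 30.0], sent=5))
```

With three replies out of five pings, available is 60.0 and average is 20.0.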

remote.pop3-banner

The remote.pop3-banner check will attempt to connect to a POP3 mailbox server and verify that it responds to the connection.

Attributes

Field | Description | Validation
port | Port number (default: 110) | Optional. Whole number (may be zero padded). Integer between 1-65535 inclusive.
ssl | Enable SSL | Optional. Boolean.

remote.postgresql-banner

The remote.postgresql-banner check will attempt to connect to a PostgreSQL database server and verify that it is accepting connections.

Attributes

Field | Description | Validation
port | Port number (default: 5432) | Optional. Whole number (may be zero padded). Integer between 1-65535 inclusive.
ssl | Enable SSL | Optional. Boolean.

remote.smtp-banner

The remote.smtp-banner check will attempt to connect to an SMTP mail server and verify that a HELO/EHLO is received.

Attributes

Field | Description | Validation
port | Port number (default: 25) | Optional. Whole number (may be zero padded). Integer between 1-65535 inclusive.
ssl | Enable SSL | Optional. Boolean.

Metrics

Metric | Description | Type
banner | The string sent from the server on connect. | String
banner_match | The matched string from the banner_match regular expression specified during check creation. | String
bytes | The number of bytes returned from a response payload. | Int32
cert_end | The absolute timestamp in seconds for the certificate expiration. This is only available when performing a check on an HTTPS server. | Uint32
cert_end_in | The relative timestamp in seconds until certification expiration. This is only available when performing a check on an HTTPS server. | Int32
cert_error | A string describing a certificate error in our validation. This is only available when performing a check on an HTTPS server. | String
cert_issuer | The issue string for the certificate. This is only available when performing a check on an HTTPS server. | String
cert_start | The absolute timestamp of the issue of the certificate. This is only available when performing a check on an HTTPS server. | Uint32
cert_subject | The subject of the certificate. This is only available when performing a check on an HTTPS server. | String
cert_subject_alternative_names | The alternative name for the subject of the certificate. This is only available when performing a check on an HTTPS server. (See an example alarm following this table.) | String
duration | The time it took to finish executing the check in milliseconds. | Uint32
tt_connect | The time to connect measured in milliseconds. | Uint32
tt_firstbyte | The time to first byte measured in milliseconds. | Uint32

Note

The following is an example alarm for cert_subject_alternative_names, where you would replace example.com with an expected host name on the certificate’s SAN list:

if (metric['cert_subject_alternative_names'] nregex '.*example.com.*') {
return new AlarmStatus(CRITICAL, 'Missing expected SAN');
}

remote.smtp

The remote.smtp check will attempt to connect to an SMTP mail server and send an email from the address in the ‘from’ parameter to the address in the ‘to’ parameter, with the payload specified by the ‘payload’ parameter, setting the EHLO host to the value in ‘ehlo’.

Attributes

Field | Description | Validation
ehlo | Specifies the EHLO parameter. | Optional. String between 1 and 255 characters long.
from | Specifies the From parameter. | Optional. String between 1 and 255 characters long.
payload | Specifies the payload. | Optional. String between 1 and 1024 characters long.
port | Specifies the port number. | Optional. Whole number (may be zero padded). Integer between 1-65535 inclusive.
starttls | Specifies whether the connection should be upgraded to TLS/SSL. | Optional. Boolean.
to | Specifies the To parameter. If this field is blank, a “quit” is issued before sending a to line, and the connection is terminated. | Optional. String between 1 and 255 characters long.

remote.ssh

The remote.ssh check will attempt to SSH to a target.

Attributes

Field | Description | Validation
port | Specifies the port number. The default is 22. | This field is optional. Must be a whole number (may be zero padded). This value must be an integer between 1-65535 inclusive.

Metrics

Metric | Description | Type
duration | Specifies the time it took to finish executing the check in milliseconds. | Uint32
fingerprint | Specifies the SSH fingerprint used to verify identity. | String

remote.tcp

The remote.tcp check will attempt to connect to a host and port, and optionally issue a banner match to ensure that the service is responding as specified. This can be used to test services that are not covered by the existing HTTP, SMTP, SSH, MySQL, etc. checks.

Attributes

Field | Description | Validation
port | Specifies the port number. | Whole number (may be zero padded). Integer between 1-65535 inclusive.
banner_match | Specifies the banner match regex. | Optional. String between 1 and 255 characters long.
body_match | Specifies the body match regex. Key/Values are captured when matches are specified within the regex. Note: Maximum body size is 1024 bytes. | Optional. String between 1 and 255 characters long.
send_body | Send a body. If a banner is provided the body is sent after the banner is verified. | Optional. String between 1 and 1024 characters long.
ssl | Specifies whether SSL is enabled. | Optional. Boolean.

Metrics

Metric | Description | Type
banner | Specifies the string that is sent from the server on connect. | String
banner_match | Specifies the matched string from the banner_match regular expression specified during check creation. | String
duration | Specifies the time it took to finish executing the check in milliseconds. | Uint32
tt_connect | Specifies the time to connect measured in milliseconds. | Uint32
tt_firstbyte | Specifies the time to first byte measured in milliseconds. | Uint32

remote.telnet-banner

The remote.telnet-banner check will attempt to connect to a Telnet (or similar protocol) server and verify that an appropriate banner is received.

Attributes

Field | Description | Validation
port | Specifies the port number. The default is 23. | Optional. Whole number (may be zero padded). Integer between 1-65535 inclusive.
banner_match | Specifies the banner match check. | Optional. String between 1 and 255 characters long.
ssl | Specifies whether SSL is enabled. | Optional. Boolean.

Agent check types

Rackspace Monitoring supports the following agent check types.

agent.apache

The agent.apache check retrieves Apache HTTP server metrics.

Attributes

Field | Description | Validation
timeout | Specifies the plugin execution timeout in milliseconds. | Optional. Integer.
url | Specifies the URL. Defaults to http://127.0.0.1/server-status. | Optional. URL.

Metrics

Metric | Description | Type
busy_workers | The number of workers serving requests. | Int64
bytes_per_request | The average number of bytes served per request. | Int64
bytes_per_second | The average number of bytes served per second. | Int64
closing | The number of workers closing the connection. | Int64
cpu_load | Total percentage of CPU used by workers. | Double
dns | The number of workers performing DNS lookup. | Int64
gracefully_fishing | The number of workers gracefully finishing. | Int64
idle | The number of idle cleanup workers. | Int64
idle_workers | The number of idle workers. | Int64
keepalive | The number of workers kept alive (reading). | Int64
logging | The number of workers logging. | Int64
open | The number of workers with no current process. | Int64
reading | The number of workers reading the request. | Int64
requests_per_second | The number of requests per second. | Int64
sending | The number of workers sending a reply. | Int64
starting | The number of workers starting up. | Int64
total_access | Total number of accesses served. | Int64
total_kbytes | Total kilobytes served. | Int64
uptime | Time since the last start/restart in milliseconds. | Int64
waiting | The number of workers waiting for connection. | Int64
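
One common use of these metrics is to estimate how close the server is to running out of workers. The following Python sketch computes the share of busy workers from busy_workers and idle_workers. The function name is hypothetical, and treating busy plus idle as the total worker pool is an assumption made for illustration.

```python
def worker_utilization(busy_workers, idle_workers):
    """Return the percentage of workers that are busy, or None if no workers are reported."""
    total = busy_workers + idle_workers  # assumption: busy + idle covers the whole pool
    if total == 0:
        return None
    return 100.0 * busy_workers / total


print(worker_utilization(75, 25))
```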

agent.cpu

The agent.cpu check will attempt to measure the usage of the CPU on a host.

Attributes

No fields are present for this particular check type.

Metrics

Metric | Description | Type
idle_percent_average | Recent percentage of CPU time spent idle. | Double
irq_percent_average | Recent percentage of CPU time spent handling hardware interrupts. | Double
max_cpu_usage | Recent percentage utilization of the most-utilized CPU. This is useful to detect when some CPUs are pegged while others are idle. | Double
min_cpu_usage | Recent percentage utilization of the least-utilized CPU. This is useful to detect when some CPUs are pegged while others are idle. | Double
stolen_percent_average | Recent percentage of CPU time spent waiting for the CPU to service other virtual CPUs. | Double
sys_percent_average | Recent percentage of CPU time utilized by kernel mode processes. | Double
usage_average | Recent percentage of CPU time utilized by all processes. | Double
user_percent_average | Recent percentage of CPU time utilized by user mode processes. | Double
wait_percent_average | Recent percentage of CPU time utilized by processes in a “wait” state. | Double

agent.disk

The agent.disk check exposes disk related metrics (service time, wait time, etc.).

Attributes

Field | Description | Validation
target | The disk to check (for example, ‘/dev/xvda1’). | String between 1 and 512 characters long

Metrics

Metric | Description | Type
queue | Measured in seconds, this is the current disk queue length, which is an instantaneous measurement of the I/O queue for the given disk/partition. | Int64
qtime | Measured in milliseconds, this is the weighted number of milliseconds spent doing I/Os. This field is incremented at each I/O start, I/O completion, I/O merge, or read of these stats by the number of I/Os in progress times the number of milliseconds spent doing I/O since the last update of this field. This can provide an easy measure of both I/O completion time and the backlog that might be accumulating. | Int64
read_bytes | The number of physical disk bytes read. The prefix / will change depending on the mount points discovered. | Int64
reads | The number of physical disk reads. The prefix / will change depending on the mount points discovered. | Int64
rtime | The amount of time spent reading. The prefix / will change depending on the mount points discovered. | Int64
write_bytes | The number of physical disk bytes written. The prefix / will change depending on the mount points discovered. | Int64
writes | The number of physical disk writes. The prefix / will change depending on the mount points discovered. | Int64
wtime | The amount of time spent writing. The prefix / will change depending on the mount points discovered. | Int64

agent.filesystem

The agent.filesystem check exposes file system related metrics (free space, used space, etc.).

Attributes

Field | Description | Validation
target | The mount point to check, for example /var or C:\ | String between 1 and 512 characters long.

Metrics

Metric | Description | Type
avail | Available space on the filesystem in kilobytes for the current user, which is root, that is running the agent. | Int64
free | Free space available on the filesystem in kilobytes including reserved space. This is calculated as number of free file blocks x block size. | Int64
options | The options used to mount the device to the filesystem. Includes the rw flag, which indicates the device is in read/write mode. | Int64
total | Total space on the filesystem, in kilobytes, including reserved space. This is calculated as number of total file blocks x block size. | Int64
used | Used space on the filesystem, in kilobytes. This number does not include the reserved space. This is calculated as total - free. | Int64
files | Number of inodes on the filesystem. | Int64
free_files | Number of free inodes on the filesystem. | Int64
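
The "total - free" relationship above can be sketched in a few lines of Python. The function name and the derived used_percent value are hypothetical additions for illustration; only the used = total - free calculation comes from the table.

```python
def filesystem_usage(total_kb, free_kb):
    """Derive used space and a usage percentage from the total and free metrics (kilobytes)."""
    used_kb = total_kb - free_kb  # matches the "total - free" definition of used
    used_percent = 100.0 * used_kb / total_kb if total_kb else 0.0
    return {"used_kb": used_kb, "used_percent": used_percent}


print(filesystem_usage(total_kb=1000, free_kb=250))
```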

Note

The reserved space only applies to Linux systems. It is the space saved for important root processes and possible rescue actions. In some systems the reserved space can be used for fragmentation allocation. For more information about Ext3 and Ext4: https://www.redhat.com/archives/ext3-users/2009-January/msg00026.html.

The files and free_files metrics only apply to Linux systems.

agent.filesystem_state

The agent.filesystem_state check exposes filesystem metrics for read-write/read-only system mounts.

Attributes

No fields are present for this particular check type.

Metrics

MetricDescriptionType
total_ro | Total number of filesystems mounted read-only. | Int64
total_rw | Total number of filesystems mounted read-write. | Int64
devices_ro | Comma delimited list of devices mounted read-only. | String
devices_rw | Comma delimited list of devices mounted read-write. | String

agent.load_average

The agent.load_average check attempts to measure the UNIX style load average on a host.

For more information about the commands used to get the load average, see Check the System Load on Linux.

Attributes

No fields are present for this particular check type.

Metrics

Metric | Description | Type
1m | One minute load average. | Double
5m | Five minute load average. | Double
15m | Fifteen minute load average. | Double

agent.memory

Attributes

No fields are present for this particular check type.

Metrics

The memory available to the system is used in three different ways:

  • Used by the processes running in the system. This value is reported in the “actual_used” metric.
  • Used by the kernel. This value is not returned by the check but can be deduced.
  • Not used by either the running processes or the kernel. This value is reported in the “free” metric.

For convenience, the check returns used and free memory values both including and excluding kernel memory, so you do not have to calculate them yourself.

Metric | Description | Type
actual_free | The amount of free memory, ‘free’ plus kernel memory. | Int64
actual_used | The actual amount of used memory excluding kernel memory. | Int64
free | The amount of free memory not including kernel memory. | Int64
ram | The amount of RAM. | Int64
swap_free | The amount of free SWAP memory. | Int64
swap_page_in | The number of SWAP-in pages. | Int64
swap_page_out | The number of SWAP-out pages. | Int64
swap_total | The total amount of SWAP memory. | Int64
swap_used | The amount of used SWAP memory. | Int64
total | The total amount of memory. | Int64
used | The total amount of used memory, ‘actual_used’ plus kernel memory. | Int64
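
Because kernel memory is not returned directly, it has to be deduced from the other metrics. The following Python sketch shows one way to do that, assuming the relationships stated in the table hold exactly (used = actual_used + kernel, and actual_free = free + kernel). The function name is hypothetical, and the values are in whatever unit the check reports.

```python
def memory_breakdown(used, actual_used, free):
    """Deduce kernel memory from the used and actual_used metrics."""
    kernel = used - actual_used  # used is actual_used plus kernel memory
    return {
        "kernel": kernel,
        "actual_free": free + kernel,  # should agree with the reported actual_free metric
    }


print(memory_breakdown(used=6000, actual_used=4500, free=2000))
```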

agent.mysql

The agent.mysql check retrieves MySQL server metrics.

Note

Except for the replication.slave_running metric, metrics starting with replication do not appear if no slave is running.

If the libmysqlclient-dev package is not already present, you should install it on the host where the agent.mysql plug-in runs.

Attributes

Field | Description | Validation
host | MySQL server hostname (default: 127.0.0.1). | Optional. Valid hostname, IPv4 or IPv6 address
mycnf | Specifies whether my.cnf should be loaded. | Optional. Boolean
password | Specifies the server password. | Optional. String between 1 and 255 characters long
port | Specifies the MySQL server port (default: 3306). | Optional. Integer between 1-65535 inclusive
socket | Specifies the path to the domain socket. | Optional. String between 1 and 255 characters long
timeout | Specifies the plugin execution timeout in milliseconds. | Optional. Integer
username | Specifies the username. | Optional. String between 1 and 16 characters long

Metrics

Metric | Description | Type
bytes_received | The number of bytes received from all clients. (statvar_Bytes_received) | Cumulative
bytes_sent | The number of bytes sent to all clients. (statvar_Bytes_sent) | Cumulative
core.aborted_clients | The number of connections that were aborted because the client died without closing the connection properly. (statvar_Aborted_clients) | Instantaneous
core.connections | The number of connection attempts (successful or not) to the MySQL server. (statvar_Connections) | Cumulative
core.queries | The number of statements executed by the server. (statvar_Queries) | Cumulative
core.uptime | The number of seconds that the server has been up. (statvar_Uptime) | Instantaneous
handler.commit | The number of internal COMMIT statements. (statvar_Handler_commit) | Cumulative
handler.delete | The number of times that rows have been deleted from tables. (statvar_Handler_delete) | Cumulative
handler.read_first | The number of times the first entry in an index was read. (statvar_Handler_read_first) | Cumulative
handler.read_key | The number of requests to read a row based on a key. If this value is high, it is a good indication that your tables are properly indexed for your queries. (statvar_Handler_read_key) | Cumulative
handler.read_next | The number of requests to read the next row in key order. This value is incremented if you are querying an index column with a range constraint or if you are doing an index scan. (statvar_Handler_read_next) | Cumulative
handler.read_prev | The number of requests to read the previous row in key order. This read method is mainly used to optimize ORDER BY … DESC. (statvar_Handler_read_prev) | Cumulative
handler.read_rnd | The number of requests to read a row based on a fixed position. This value is high if you are doing a lot of queries that require sorting of the result. You probably have a lot of queries that require MySQL to scan entire tables or you have joins that do not use keys properly. (statvar_Handler_read_rnd) | Cumulative
handler.rollback | The number of requests for a storage engine to perform a rollback operation. (statvar_Handler_rollback) | Instantaneous
handler.savepoint | The number of requests for a storage engine to place a savepoint. (statvar_Handler_savepoint) | Instantaneous
handler.savepoint_rollback | The number of requests for a storage engine to roll back to a savepoint. (statvar_Handler_savepoint_rollback) | Instantaneous
handler.update | The number of requests to update a row in a table. (statvar_Handler_update) | Cumulative
handler.write | The number of requests to insert a row in a table. (statvar_Handler_write) | Cumulative
innodb.buffer_pool_pages_data | The number of pages containing data (dirty or clean). (statvar_Innodb_buffer_pool_pages_data) | Instantaneous
innodb.buffer_pool_pages_dirty | The number of pages currently dirty. (statvar_Innodb_buffer_pool_pages_dirty) | Instantaneous
innodb.buffer_pool_pages_flushed | The number of buffer pool page-flush requests. (statvar_Innodb_buffer_pool_pages_flushed) | Instantaneous
innodb.buffer_pool_pages_free | The number of free pages. (statvar_Innodb_buffer_pool_pages_free) | Instantaneous
innodb.buffer_pool_pages_total | The total size of the buffer pool, in pages. (statvar_Innodb_buffer_pool_pages_total) | Instantaneous
innodb.buffer_pool_read_requests | The number of logical read requests. (statvar_Innodb_buffer_pool_read_requests) | Cumulative
innodb.buffer_pool_reads | The number of logical reads that InnoDB could not satisfy from the buffer pool, and had to read directly from the disk. (statvar_Innodb_buffer_pool_reads) | Cumulative
innodb.buffer_pool_size | The size in bytes of the memory buffer InnoDB uses to cache data and indexes of its tables. (sysvar_innodb_buffer_pool_size) | Instantaneous
innodb.data_pending_fsyncs | The current number of pending fsync() operations. (statvar_Innodb_data_pending_fsyncs) | Instantaneous
innodb.data_pending_reads | The current number of pending reads. (statvar_Innodb_data_pending_reads) | Instantaneous
innodb.data_pending_writes | The current number of pending writes. (statvar_Innodb_data_pending_writes) | Instantaneous
innodb.pages_created | The number of pages created. (statvar_Innodb_pages_created) | Cumulative
innodb.pages_read | The number of pages read. (statvar_Innodb_pages_read) | Cumulative
innodb.pages_written | The number of pages written. (statvar_Innodb_pages_written) | Cumulative
innodb.row_lock_time | The total time spent in acquiring row locks, in milliseconds. (statvar_Innodb_row_lock_time) | Cumulative
innodb.row_lock_time_avg | The average time to acquire a row lock, in milliseconds. (statvar_Innodb_row_lock_time_avg) | Instantaneous
innodb.row_lock_time_max | The maximum time to acquire a row lock, in milliseconds. (statvar_Innodb_row_lock_time_max) | Instantaneous
innodb.row_lock_waits | The number of times a row lock had to be waited for. (statvar_Innodb_row_lock_waits) | Cumulative
innodb.rows_deleted | The number of rows deleted from InnoDB tables. (statvar_Innodb_rows_deleted) | Cumulative
innodb.rows_inserted | The number of rows inserted into InnoDB tables. (statvar_Innodb_rows_inserted) | Cumulative
innodb.rows_read | The number of rows read from InnoDB tables. (statvar_Innodb_rows_read) | Cumulative
innodb.rows_updated | The number of rows updated in InnoDB tables. (statvar_Innodb_rows_updated) | Cumulative
key.buffer_size | Index blocks for MyISAM tables are buffered and are shared by all threads. (sysvar_key_buffer_size) | Instantaneous
max.connections | The maximum permitted number of simultaneous client connections. (sysvar_max_connections) | Instantaneous
qcache.free_blocks | The number of free memory blocks in the query cache. (statvar_Qcache_free_blocks) | Instantaneous
qcache.free_memory | The amount of free memory for the query cache. (statvar_Qcache_free_memory) | Instantaneous
qcache.hits | The number of query cache hits. (statvar_Qcache_hits) | Cumulative
qcache.inserts | The number of queries added to the query cache. (statvar_Qcache_inserts) | Cumulative
qcache.lowmem_prunes | The number of queries that were deleted from the query cache because of low memory. (statvar_Qcache_lowmem_prunes) | Instantaneous
qcache.not_cached | The number of noncached queries (not cacheable, or not cached due to the query_cache_type setting). (statvar_Qcache_not_cached) | Instantaneous
qcache.queries_in_cache | The number of queries registered in the query cache. (statvar_Qcache_queries_in_cache) | Cumulative
qcache.size | The amount of memory allocated for caching query results. (sysvar_query_cache_size) | Instantaneous
qcache.total_blocks | The total number of blocks in the query cache. (statvar_Qcache_total_blocks) | Cumulative
replication.exec_master_log_posThe position in the current master binary log file to which the SQL thread has read and executed, marking the start of the next transaction or event to be processed. (show-slave-status.html).Instantaneous
replication.last_errnoThe error number returned by the most recently executed statement. (show-slave-status.html).Instantaneous
replication.last_io_errorThe error message of the most recent error that caused the I/O thread to stop (show-slave-status.html).String
replication.max_relay_log_sizeIf a write by a replication slave to its relay log causes the current log file size to exceed the value of this variable, the slave rotates the relay logs (closes the current file and opens the next one). (sysvar_max_relay_log_size)Instantaneous
replication.read_master_log_posThe position in the current master binary log file up to which the I/O thread has read. (show-slave-status.html)Instantaneous
replication.relay_log_posThe position in the current relay log file up to which the SQL thread has read and executed. (show-slave-status.html)Instantaneous
replication.seconds_behind_masterIn essence, this field measures the time difference in seconds between the slave SQL thread and the slave I/O thread. (show-slave-status.html)Instantaneous
replication.slave_io_runningWhether the I/O thread is started and has connected successfully to the master. Internally, the state of this thread is represented by one of the following three values: MYSQL_SLAVE_NOT_RUN, MYSQL_SLAVE_RUN_NOT_CONNECT, MYSQL_SLAVE_RUN_CONNECT (show-slave- status.html)Boolean
replication.slave_io_stateA copy of the State field of the SHOW PROCESSLIST output for the slave I/O thread. This tells you what the thread is doing: trying to connect to the master, waiting for events from the master, reconnecting to the master, and so on. (show-slave-status.html).String
replication.slave_open_temp_tablesThe number of temporary tables that the slave SQL thread currently has open. If the value is greater than zero, it is not safe to shut down the slave. (statvar_Slave_open_temp_tables).Instantaneous
replication.slave_retried_transactionsThe total number of times since startup that the replication slave SQL thread has retried transactions. (statvar_Slave_retried_transactions)Instantaneous
replication.slave_runningThis is ON if this server is a replication slave that is connected to a replication master, and both the I/O and SQL threads are running; otherwise, it is OFF. (statvar_Slave_running)String
replication.slave_sql_runningWhether the SQL thread is started. (show-slave-status.html)Boolean
thread.cache_sizeHow many threads the server should cache for reuse. (sysvar_thread_cache_size)Instantaneous
threads.connectedThe number of currently open connections. (statvar_Threads_connected)Instantaneous
threads.createdThe number of threads created to handle connections. (statvar_Threads_created)Cumulative
threads.runningThe number of threads that are not sleeping. (statvar_Threads_running)Instantaneous

agent.network

The agent.network check will attempt to measure the usage of network devices on a host.

Attributes

FieldDescription | Validation
targetThe network device to check (for example, eth0) | String between 1 and 512 characters long

Metrics

MetricDescriptionType
rx_bytesThe number of bytes received over the interface.Int64
rx_droppedThe number of packets received and subsequently dropped over the interface.Int64
rx_errorsThe number of errors received over the interface.Int64
rx_packetsThe number of packets received over the interface.Int64
speedThe speed at which the bytes were transmitted over the interface.Int64
tx_bytesThe number of bytes transmitted over the interface.Int64
tx_droppedThe number of packets attempted transmitting and subsequently dropped over the interface.Int64
tx_errorsThe number of errors while transmitting over the interface.Int64
tx_packetsThe number of packets transmitted over the interface.Int64

agent.mssql_database

The agent.mssql_database check returns metrics for a Microsoft SQL Server database.

Attributes

FieldDescriptionValidation
dbMS SQL Server database nameString between 1 and 255 characters long
hostnameMS SQL Server hostnameOptional. Valid hostname, IPv4 or IPv6 address
passwordMS SQL Server passwordOptional. String between 1 and 255 characters long
serverinstanceMS SQL Server instance to queryOptional. String between 1 and 255 characters long
usernameMS SQL Server usernameOptional. String between 1 and 255 characters long

agent.mssql_buffer_manager

The agent.mssql_buffer_manager check returns metrics for the Microsoft SQL Server buffer manager.

Attributes

FieldDescriptionValidation
computerMS SQL Server computer nameOptional. Valid hostname, IPv4 or IPv6 address
serverinstanceMS SQL Server instance to queryOptional. String between 1 and 255 characters long

agent.mssql_sql_statistics

The agent.mssql_sql_statistics check returns metrics for the Microsoft SQL Server SQL statistics.

Attributes

FieldDescriptionValidation
computerMS SQL Server computer nameOptional. Valid hostname, IPv4 or IPv6 address
serverinstanceMS SQL Server instance to queryOptional. String between 1 and 255 characters long

agent.mssql_plan_cache

The agent.mssql_plan_cache check returns metrics for the Microsoft SQL Server plan cache.

Attributes

FieldDescriptionValidation
computerMS SQL Server computer nameOptional. Valid hostname, IPv4 or IPv6 address
serverinstanceMS SQL Server instance to queryOptional. String between 1 and 255 characters long

agent.mssql_memory_manager

The agent.mssql_memory_manager check returns metrics for the Microsoft SQL Server memory manager.

Attributes

FieldDescriptionValidation
computerMS SQL Server computer nameOptional. Valid hostname, IPv4 or IPv6 address
serverinstanceMS SQL Server instance to queryOptional. String between 1 and 255 characters long

agent.mssql_version

The agent.mssql_version check returns version information for Microsoft SQL Server.

Attributes

FieldDescriptionValidation
hostnameMS SQL Server hostnameOptional. Valid hostname, IPv4 or IPv6 address
passwordMS SQL Server passwordOptional. String between 1 and 255 characters long
serverinstanceMS SQL Server instance to queryOptional. String between 1 and 255 characters long
usernameMS SQL Server usernameOptional. String between 1 and 255 characters long

agent.plugin

The agent.plugin check will attempt to run a custom plugin on a host.

Custom plugins are simply executable files which report metrics via stdout. Plugins are placed on the server to be monitored at an installation path that depends on the operating system:

Operating SystemInstallation Path
Linux/usr/lib/rackspace-monitoring-agent/plugins/
Windows (32-bit agent installed on a 64-bit system )C:\Program Files (x86)\Rackspace Monitoring\plugins
Windows (64-bit agent installed on a 64-bit system or 32-bit agent installed on a 32-bit system)C:\Program Files\Rackspace Monitoring\plugins

After the plugin has been installed on the server, create an agent.plugin check that specifies the name of the executable file so that the plugin can begin reporting metrics to the monitoring system, like any other check. If the plugin requires any command line arguments, you can specify them using the optional args array.

Attributes

FieldDescriptionValidation
fileName of the plugin fileString matching the regex //[a-zA-Z0-9.- _]+//
argsCommand-line arguments which are passed to the pluginOptional. Array [Non-empty string]. Array or object with number of items between 0 and 10
timeoutPlugin execution timeout in millisecondsOptional. Integer
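
As a rough illustration of the validation rules in the table above, the following Python sketch assembles the attributes for an agent.plugin check. The function name build_plugin_details, the dictionary layout, and the sample file name are assumptions made for this example and are not part of the monitoring API. Only the field names file, args, and timeout and their limits come from the table.

```python
def build_plugin_details(file, args=None, timeout=None):
    """Return a dict of agent.plugin attributes after basic validation.

    Hypothetical helper: the checks mirror the attribute table above.
    """
    if not isinstance(file, str) or not file:
        raise ValueError("file must be a non-empty string")
    details = {"file": file}

    if args is not None:
        args = list(args)
        # The table allows between 0 and 10 items.
        if len(args) > 10:
            raise ValueError("args accepts at most 10 items")
        if any(not isinstance(a, str) or a == "" for a in args):
            raise ValueError("each argument must be a non-empty string")
        details["args"] = args

    if timeout is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(timeout, bool) or not isinstance(timeout, int):
            raise ValueError("timeout must be an integer (milliseconds)")
        details["timeout"] = timeout

    return details


if __name__ == "__main__":
    # "turkey.py" is a made-up plugin file name.
    print(build_plugin_details("turkey.py", ["--probe", "breast"], 30000))
```

Validating on the client side like this only catches obvious mistakes early; the monitoring service remains the authority on what it accepts.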

Metrics

The metrics returned are defined in the plugin script. A plugin can send up to fifty unique metrics at a time.

Community Plugin Repository

A curated repository of plugins created by Rackspace Monitoring users is available on GitHub. Contributions are welcome!

Note

The Rackspace Monitoring Agent is also capable of executing Cloudkick plugins, so if you are a Cloudkick user you can just drop in any existing plugin and it should just work.

Creating Custom Plugins

Creating custom plugins is as simple as writing a script that prints a status and up to fifty metrics to standard out. The format of the status line is:

status <status_string>

The status string should describe whether the check was able to successfully gather metrics. It could be as simple as “success” to indicate that metrics were successfully gathered. When an error occurs that prevents metrics from being gathered, the plugin should print a status that describes the error, then exit non-zero without printing any metric lines.

The status line can be followed by up to fifty metric lines. Each line is output in the following format:

metric <name> <type> <value>

The following descriptions provide information about parameter values.

ParameterDescription
nameThe name of the metric. Spaces are not supported. The format is alpha numeric with colon (:), underscore (_) and dot (.) allowed. Example: memory_free.
typeThe metric can be any of the following types:

  • int32: Signed 32-bit integer value.
  • uint32: Unsigned 32-bit integer value.
  • int64: Signed 64-bit integer value.
  • uint64: Unsigned 64-bit integer value.
  • double: Floating-point value.
  • string: A string value.

Note: The monitoring system records string metrics every time they change. String metrics are designed for recording an enumerated state that changes infrequently (for example, an HTTP response code that is always 200 during normal operation). Do not store arbitrary, frequently changing values in a string metric.
valueThe value assigned to the metric.

Putting it all together, the output of a plugin that has successfully executed might look something like:

status Turkey thermometer returned valid response
metric internal_temperature uint32 165
metric ambient_temperature uint32 325

If the plugin failed, it might print the following before exiting non-zero:

status Turkey thermometer not responding
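
The output format described above can be produced by a very small script. The following Python sketch is one possible shape for such a plugin, not an official template: the read_thermometer function and its readings are hypothetical stand-ins for a real measurement, and the helper names are chosen for this example. The name pattern and type list are taken from the parameter descriptions above.

```python
import re
import sys

# Metric types listed in the parameter descriptions above.
METRIC_TYPES = {"int32", "uint32", "int64", "uint64", "double", "string"}
# Alphanumeric names; colon, underscore, and dot are also allowed.
NAME_PATTERN = re.compile(r"^[A-Za-z0-9:_.]+$")


def format_metric(name, metric_type, value):
    """Build one metric line: metric <name> <type> <value>."""
    if not NAME_PATTERN.match(name):
        raise ValueError(f"invalid metric name: {name!r}")
    if metric_type not in METRIC_TYPES:
        raise ValueError(f"unknown metric type: {metric_type!r}")
    return f"metric {name} {metric_type} {value}"


def read_thermometer():
    """Hypothetical data source; replace with a real measurement."""
    return {"internal_temperature": 165, "ambient_temperature": 325}


def render(readings):
    """Return the full plugin output: a status line, then metric lines."""
    if len(readings) > 50:
        raise ValueError("a plugin can report at most fifty metrics")
    lines = ["status Turkey thermometer returned valid response"]
    for name, value in readings.items():
        lines.append(format_metric(name, "uint32", value))
    return lines


def main():
    try:
        # Build every line first so that a failure never leaves
        # partial metric output on stdout.
        lines = render(read_thermometer())
    except Exception as exc:
        # On failure, print only a status line and report a non-zero code.
        print(f"status Turkey thermometer not responding: {exc}")
        return 1
    print("\n".join(lines))
    return 0


if __name__ == "__main__":
    exit_code = main()
    # Exit non-zero only on failure; a clean run ends normally with code 0.
    if exit_code:
        sys.exit(exit_code)
```

Building the complete output before printing anything keeps the failure path consistent with the rule that a failing plugin prints a status line and no metric lines.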

agent.redis

The agent.redis check retrieves Redis server metrics.

Attributes

FieldDescriptionValidation
hostnameRedis server hostnameValid hostname, IPv4 or IPv6 address
passwordOptional Redis server passwordOptional. String between 1 and 255 characters long
portRedis server portInteger between 1-65535 inclusive
timeoutConnection timeout in millisecondsOptional. Integer

Metrics

MetricDescriptionType
bgrewriteaof_in_progress(Redis 2.4.16 only) Flag indicating an AOF rewrite operation is on-goingInt32
bgsave_in_progress(Redis 2.4.16 only) Flag indicating a RDB save is on-goingInt32
blocked_clientsNumber of clients pending on a blocking call (BLPOP, BRPOP, BRPOPLPUSH)Int32
changes_since_last_save(Redis 2.4.16 only) Number of changes since the last dumpInt32
connected_clientsNumber of client connections (excluding connections from slaves)Int32
evicted_keysNumber of evicted keys due to maxmemory limitInt32
pubsub_patternsGlobal number of pub/sub pattern with client subscriptionsInt32
total_commands_processedTotal number of commands processed by the serverGauge
total_connections_receivedTotal number of connections accepted by the serverGauge
uptime_in_secondsNumber of seconds since Redis server startInt32
used_memoryTotal number of bytes allocated by Redis using its allocator (either standard libc, jemalloc, or an alternative allocator such as tcmalloc).Int32
versionVersion of the serverString

agent.windows_perfos

The agent.windows_perfos check returns metrics regarding Windows performance data. This check is only available on Windows platforms.

Attributes

No fields are present for this particular check type.

Metrics

MetricDescriptionType
AlignmentFixupsPersecShows the rate, in incidents per second, at which alignment faults were fixed by the system.Uint32
ContextSwitchesPersecShows the combined rate, in incidents per second, at which all processors on the computer were switched from one thread to another. It is the sum of the values of Thread Context Switches/sec for each thread running on all processors on the computer, and is measured in numbers of switches. Context switches occur when a running thread voluntarily relinquishes the processor, or is preempted by a higher priority, ready thread.Uint32
ExceptionDispatchesPersecShows the rate, in incidents per second, at which exceptions were dispatched by the system.Uint64
FileControlBytesPersecShows the overall rate, in incidents per second, at which bytes were transferred for all file system operations that were neither read nor write operations, such as file system control requests and requests for information about device characteristics or status.Uint32
FileControlOperationsPersecShows the combined rate, in incidents per second, of file system operations that were neither read nor write operations, such as file system control requests and requests for information about device characteristics or status. This is the inverse of FileDataOperationsPersec.Int32
FileReadBytesPersecShows the overall rate, in incidents per second, at which bytes were read to satisfy file system read requests to all devices on the computer, including read operations from the file system cache.Uint64
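FileReadOperationsPersecShows the combined rate, in incidents per second, of file system read requests to all devices on the computer, including requests to read from the file system cache.Uint32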
FileWriteBytesPersecShows the overall rate, in incidents per second, at which bytes were written to satisfy file system write requests to all devices on the computer, including write operations to the file system cache.Uint64
FloatingEmulationsPersecShows the rate, in incidents per second, of floating emulations performed by the system.Uint32
PercentRegistryQuotaInUsePercentage of the total registry quota allowed that is currently being used by the system. This property displays the current percentage value only; it is not an average.Uint32
ProcessesShows the number of processes in the computer at the time of data collection. This is an instantaneous count, not an average over the time interval. Each process represents a program that is running.Uint32
ProcessorQueueLengthShows the number of threads in the processor queue. Unlike the disk counters, this counter shows ready threads only, not threads that are running. There is a single queue for processor time, even on computers with multiple processors. Therefore, if a computer has multiple processors, you need to divide this value by the number of processors servicing the workload. A sustained processor queue of greater than two threads generally indicates processor congestion.Uint32
SystemCallsPersecShows the combined rate, in incidents per second, of calls to operating system service routines by all processes running on the computer. These routines perform all of the basic scheduling and synchronization of activities on the computer, and provide access to non-graphic devices, memory management, and name space management.Uint32
SystemUpTimeShows the total time, in seconds, that the computer has been operational since it was last started.Uint64
ThreadsShows the number of threads in the computer at the time of data collection. This is an instantaneous count, not an average over the time interval. A thread is the basic executable entity that can execute instructions in a processor.Uint32
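
The ProcessorQueueLength guidance in the table above amounts to a small calculation: divide the reported queue length by the number of processors, then compare the result with the threshold of two threads. The following Python sketch is illustrative only. The function names, the sample values, and the decision to apply the threshold per processor are assumptions for this example.

```python
def queue_per_processor(queue_length, processor_count):
    """Divide the single system-wide queue by the number of processors."""
    if processor_count < 1:
        raise ValueError("processor_count must be at least 1")
    return queue_length / processor_count


def looks_congested(samples, processor_count, threshold=2.0):
    """Return True when every sample exceeds the threshold per processor.

    Requiring all samples to exceed the threshold is a simple stand-in
    for "sustained"; a single spike does not count.
    """
    if not samples:
        return False
    return all(
        queue_per_processor(s, processor_count) > threshold for s in samples
    )


if __name__ == "__main__":
    # A queue of 12 ready threads on a 4-processor host.
    print(queue_per_processor(12, 4))  # 3.0
    print(looks_congested([12, 10, 9], 4))  # True
```

On a single-processor host the division changes nothing, so the same helper can be used regardless of processor count.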

Hostinfo checks

Hostinfo checks are a special class of checks that run on demand.

Unlike remote and agent checks, which run on a regular schedule and can have alarms or alerts attached, Hostinfo checks cannot be scheduled, and you cannot create alarms or alerts for them.

Create Hostinfo checks to perform tasks like the following:

  • Fetch data on demand. For example, you can use a Hostinfo check to pipe data about the host to other services or applications.
  • Run occasional checks to troubleshoot an issue.
  • Periodically fetch data from large clusters of servers with the granularity of fetching from an individual computer. For example, use a Hostinfo check to retrieve information from a dashboard built on Kibana.
  • Use in conjunction with service helper software to generate suggestions that are based on the status of a system or on pieces of information that are required by support technicians.

The following table provides a list of the Hostinfo checks supported by the monitoring service.

Hostinfo checks supported by Rackspace Monitoring

Hostinfo typeDescription
connectionsRuns the arp -an and netstat -naten commands and retrieves information about any open listening ports and any connections to them.
iptablesRuns the iptables -S command to retrieve data about IPv4 policies.
ip6tablesRuns the ip6tables -S command to retrieve data about IPv6 policies.
autoupdatesChecks if automatic updates are enabled on a Linux distribution.
passwdReads /etc/passwd and then runs the passwd -S command for every user. Obtains password-related
pamReads /etc/pam.d and retrieves data about pluggable authentication modules.
cronReads files in the crontabs directory and retrieves information about scheduled Cron jobs.
kernel_modulesReads the /proc/modules virtual directory and retrieves data about the modules that are loaded into the kernel.
cpuRetrieves information about the host’s CPU.
diskRetrieves information about the host’s hard disks.
filesystemRetrieves information about the host’s filesystem.
filesystem_stateRetrieves information about the read-only/read-write filesystems.
loginReads /etc/login.defs and retrieves data about the login shell. This check does not retrieve any password information or any other sensitive data.
memoryRetrieves information about the host’s memory.
networkRetrieves information about the host’s network interface.
nilReturns no information. This Hostinfo check is mainly used within the monitoring agent code itself.
packagesRuns either the dpkg-query or rpm -qa command and retrieves a list of package names and versions.
procsRetrieves information about the processes that are running on the host.
systemRetrieves information about the host’s operating system.
whoRetrieves information about the user, device, time and host.
dateRetrieves the date and time on the host.
sysctlRuns the sysctl -A command and retrieves all possible key-value pairs of the kernel parameters that can be set at runtime.
sshdRuns the sshd -T command and retrieves the configuration parameters for the open SSH daemon.
fstabReads /etc/fstab and retrieves information about the file system configuration.
filepermsReads a pre-specified list of files and checks and retrieves their permissions.
servicesReads a few folders and files and generates a list of startup services.
deleted_libsGreps through the output of lsof -nnP to find deleted or missing libraries for running processes.
cveRetrieves a unique sorted list of common vulnerabilities and exposures that have been patched on the host system.
last_loginsRuns last to get information about previous and currently logged-in users, bootups, and when last started logging.
remote_servicesRuns the netstat -tlpen command to obtain a list of active internet connections to servers and underlying programs that are using them.
ip4routesRuns the netstat -nr4 command and retrieves information about the kernel’s IPv4 routing tables.
ip6routesRuns the netstat -nr6 command and retrieves information about the kernel’s IPv6 routing tables.
apache2Retrieves information about the host’s apache2 instance and installation if it exists.
fail2banRetrieves information about the host’s fail2ban instance and installation.
lsyncdChecks the status of the live syncing daemon or lsyncd.
nginx_configReturns vhosts, version, includes, status (0 if everything is ok when nginx -t is run), configuration path, prefix and configure arguments for local nginx.
wordpressReturns the path, version and edition of local WordPress instances found via the apache2 and nginx configurations.
magentoReturns the path, version and edition of local Magento instances found via the apache2 and nginx configurations.
phpReturns information such as version, type (HHVM/PHP), and errors related to PHP. Uses the CLI and log files to extract this information.
postfixChecks the status of the postfix mail server.

You can use the Rackspace Monitoring API to run Hostinfo checks. To run a Hostinfo check, issue the following cURL request:

curl -H "X-Auth-Token: $token" \
  'https://monitoring.api.rackspacecloud.com/v1.0/<tenant_id>/agents/<agent_id>/host_info/<hostinfo_type>'
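
The same request can be issued from a script. The following Python sketch uses only the standard library and is meant as a starting point under stated assumptions, not a reference client. The function names are invented for this example, the tenant_id argument reflects the account-scoped URL layout that Rackspace APIs generally use, and the response is assumed to be JSON.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://monitoring.api.rackspacecloud.com/v1.0"


def build_hostinfo_request(token, tenant_id, agent_id, hostinfo_type):
    """Build (but do not send) a GET request for one Hostinfo type."""
    # Percent-encode each path segment so unusual IDs cannot break the URL.
    tenant, agent, kind = (
        urllib.parse.quote(part, safe="")
        for part in (tenant_id, agent_id, hostinfo_type)
    )
    url = f"{BASE_URL}/{tenant}/agents/{agent}/host_info/{kind}"
    return urllib.request.Request(url, headers={"X-Auth-Token": token})


def fetch_hostinfo(token, tenant_id, agent_id, hostinfo_type, timeout=30):
    """Send the request and decode the body, assuming a JSON response."""
    request = build_hostinfo_request(token, tenant_id, agent_id, hostinfo_type)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder values; substitute a real token and identifiers.
    req = build_hostinfo_request("my-token", "123456", "my-agent", "memory")
    print(req.full_url)
```

Separating request construction from sending makes it possible to inspect the URL and headers without network access or valid credentials.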

For more information about how to work with checks using the Rackspace Monitoring API, see the Checks section in the Rackspace Monitoring Developer Guide. For more information about working with Hostinfo checks, see Agent host information.

Check status codes

This section lists status codes and messages that various check types can return.

The following table describes each status code or message and provides a resolution for the issue.

Status code or messageDescriptionResolution
prevented by ACL 'global'The HTTP remote check is attempting to access an IP address that is a private address (for example, 127.x.x.x or 192.168.x.x).Private IP addresses are not supported. Specify a public IP address instead.