Check types reference

The Check type and fields reference provides details about the remote and agent check types supported by the Rackspace Monitoring service.

Note

Most check types include example metrics to help you write effective alarm criteria.

Remote check types

Rackspace Monitoring supports the following remote check types.

remote.dns

The remote.dns check runs a DNS check against a given target. Use this check to verify that a DNS server is functioning, for example, to ensure that it is publishing the domains you expect it to publish.

Attributes

Field

Description

Validation

query

Specifies the DNS query.

String, valid hostname

record_type

Specifies the DNS record type.

String matching the regex /^(A|AAAA|TXT|MX|SOA|CNAME|PTR|NS|MB|MD|MF|MG|MR)$/

port

Specifies the port number. The default is 53.

Optional, whole number (may be zero padded), must be an integer between 1-65535 inclusive
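The validation rules in the table above can be sketched as a small client-side check to run before submitting a remote.dns check. This is a minimal illustration, not the service's own validator: the function name validate_dns_details and the idea of passing the fields as a dictionary are assumptions, and the hostname test is deliberately simplified to a non-empty string.

```python
import re

# Record types accepted by the documented record_type validation regex.
RECORD_TYPE_RE = re.compile(r"^(A|AAAA|TXT|MX|SOA|CNAME|PTR|NS|MB|MD|MF|MG|MR)$")


def validate_dns_details(details):
    """Return a list of problems found in a remote.dns field mapping.

    `details` is a hypothetical dict holding query, record_type and port.
    """
    problems = []

    query = details.get("query")
    # Simplified: the service requires a valid hostname; only emptiness is checked here.
    if not isinstance(query, str) or not query:
        problems.append("query must be a non-empty hostname string")

    record_type = details.get("record_type")
    if not isinstance(record_type, str) or not RECORD_TYPE_RE.fullmatch(record_type):
        problems.append("record_type is not a supported DNS record type")

    # port is optional; 53 is the documented default.
    port = details.get("port", 53)
    if isinstance(port, bool) or not isinstance(port, int) or not 1 <= port <= 65535:
        problems.append("port must be an integer between 1 and 65535")

    return problems
```

An empty list means the fields pass these local checks; the service may still reject values for reasons not modeled here.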

Metrics

Metric

Description

Type

answer

The list of space-separated IP addresses for the specified name resolution.

String

rtt

The roundtrip time to execute a remote.dns check.

Double

ttl

The time to live (TTL), in seconds, of the DNS response.

Uint32

remote.ftp-banner

The remote.ftp-banner check will attempt to connect to an FTP server and verify that it responds to the connection.

Attributes

Field

Description

Validation

port

Specifies the port number. The default is 21.

This field is optional. Must be a whole number (may be zero padded). This value must be an integer between 1-65535 inclusive

Metrics

Metric

Description

Type

banner

The string sent from the server on connect.

String

banner_match

The matched string from the banner_match regular expression specified during check creation.

String

body_match

The string representing the body match specified in a remote.ftp-banner check.

String

duration

The time it took to finish executing the check in milliseconds.

Uint32

tt_body

The time to the body measured in milliseconds.

Uint32

tt_connect

The time to connect measured in milliseconds.

Uint32

tt_firstbyte

The time to first byte measured in milliseconds.

Uint32

remote.http

The remote.http check will try to connect to the server and retrieve the specified URL using the specified method, optionally with the password and user for authentication, using SSL, and checking the body with a regex. This can be used to test that a web application running on a server is responding without generating error messages. It can also test if the SSL certificate is valid.

Note

The maximum size of the content returned in a remote.http check is 32k, with overhead and compression taken into account. This limitation helps monitoring remain responsive.

Attributes

Field

Description

Validation

url

Specifies the target URL.

String between 1 and 8096 characters long

auth_password

Optional auth password

Optional. String between 1 and 255 characters long

auth_user

Optional auth user

Optional. String between 1 and 255 characters long

body

Body match regular expression used to run against HTTP response content and generate metric body_match (see Metrics table below). Body is limited to 100k and match is truncated to 80 characters.

Optional. String between 1 and 255 characters long

body_matches

A map of key/regular-expression pairs used to run against HTTP response content and generate one metric body_match_<key> for each key/regular-expression pair (see Metrics table below). Body is limited to 100k and match is truncated to 80 characters.

Optional. Hash [String,String between 1 and 50 characters long, String matching the regex /^[-_ a-z0-9]+$/i: String,String between 1 and 255 characters long]. Array or object with number of items between 0 and 4.

follow_redirects

Follow redirects (default:true)

Optional. Boolean.

headers

Arbitrary headers which are sent with the request.

Optional. Hash [String,String between 1 and 50 characters long: String,String between 1 and 50 characters long]. Array or object with number of items between 0 and 10. A value which is not one of: content-length, user-agent, host, connection, keep-alive, transfer-encoding, upgrade.

method

HTTP method. The default is GET.

Optional. String. One of (HEAD, GET, POST, PUT, DELETE, INFO)

payload

Specify a request body (limited to 1024 characters). If a redirect is set, the payload is only sent to the first location.

Optional. String between 1 and 1024 characters long

Note

If a check on a website always returns unknown content-encoding, the cause is the HTTP body check limit of 100k. This limit is the amount of space available on the Monitoring Pollers (where the site is checked from). If the amount of space required to do the HTTP(S) check is greater than 100k, then only the first 100k can be checked.

If the pages use compression, such as compress or gzip Content-Encoding, then the full compressed page must be less than or equal to 100k, because the full page must be downloaded and uncompressed before the check can verify it.

This is also the reason why you can only check against strings within the first 100k of the web page.
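The interaction between the body limit, the 80-character match truncation, and the body_match_<key> naming can be sketched as follows. This is an illustration under stated assumptions, not the poller's implementation: the function name body_match_metrics is hypothetical, "100k" is read as 100 * 1024 characters, and an unmatched pattern is assumed to produce an empty string.

```python
import re

BODY_LIMIT = 100 * 1024  # assumption: "100k" interpreted as 102,400 characters
MATCH_LIMIT = 80         # matches are truncated to 80 characters
KEY_RE = re.compile(r"^[-_ a-z0-9]+$", re.IGNORECASE)  # documented key validation


def body_match_metrics(body, body_matches):
    """Emulate how body_matches entries map to body_match_<key> metrics."""
    if len(body_matches) > 4:
        raise ValueError("body_matches accepts at most 4 entries")

    checked = body[:BODY_LIMIT]  # only the start of the body is examined
    metrics = {}
    for key, pattern in body_matches.items():
        if not KEY_RE.fullmatch(key):
            raise ValueError(f"invalid body_matches key: {key!r}")
        found = re.search(pattern, checked)
        # Assumption: a pattern that does not match yields an empty string.
        metrics[f"body_match_{key}"] = found.group(0)[:MATCH_LIMIT] if found else ""
    return metrics
```

With this model, a string that appears only after the first 100k of the page never matches, which is the behavior the note above describes.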

Metrics

Metric

Description

Type

body_match

The string matched in the HTTP response content by the regular expression specified in the body attribute of the check.

String

body_match_<key>

The metric is generated for each key specified in the body_matches check attribute. For example, a body_matches value of {"register": "Register Now!", "contact": "Contact Us"} will generate two metrics: body_match_register and body_match_contact.

String

bytes

The number of bytes returned from a response payload.

Int32

cert_end

The absolute timestamp in seconds for the certificate expiration. This is only available when performing a check on an HTTPS server.

Uint32

cert_end_in

The relative time in seconds until certificate expiration. This is only available when performing a check on an HTTPS server.

Int32

cert_error

A string describing a certificate error in our validation. This is only available when performing a check on an HTTPS server.

String

cert_issuer

The issuer string for the certificate. This is only available when performing a check on an HTTPS server.

String

cert_start

The absolute timestamp of the issue of the certificate. This is only available when performing a check on an HTTPS server.

Uint32

cert_subject

The subject of the certificate. This is only available when performing a check on an HTTPS server.

String

cert_subject_alternative_names

The alternative name for the subject of the certificate. This is only available when performing a check on an HTTPS server. (See an example alarm following this table.)

String

code

The status code returned.

String

duration

The time it took to finish executing the check in milliseconds.

Uint32

truncated

The number of bytes that the result was truncated by.

Uint32

tt_connect

The time to connect measured in milliseconds.

Uint32

tt_firstbyte

The time to first byte measured in milliseconds.

Uint32

Note

The following is an example alarm for cert_subject_alternative_names, where you would replace example.com with an expected host name on the certificate’s SAN list:

if (metric['cert_subject_alternative_names'] nregex '.*example.com.*') {
  return new AlarmStatus(CRITICAL, 'Missing expected SAN');
}
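To experiment with a pattern before putting it in an alarm, the same test can be mirrored in Python. This is a rough sketch: the function name san_alarm is hypothetical, the format of the metric string (a comma-separated list here) is an assumption, and re.search is assumed to behave like the alarm language's regex match.

```python
import re


def san_alarm(san_names, pattern=r".*example.com.*"):
    """Return "CRITICAL" when the SAN string does not match, else None."""
    # nregex in the alarm means "does not match the regular expression".
    if re.search(pattern, san_names) is None:
        return "CRITICAL"
    return None
```

Note that the unescaped dot in example.com matches any character, so a value such as examplexcom would also pass. Escaping the dot (example\.com) tightens the match.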

remote.imap-banner

The remote.imap-banner check will attempt to connect to an IMAP server and verify that it responds to the connection.

Attributes

Field

Description

Validation

port

Port number (default: 143)

Optional. Whole number (may be zero padded). Integer between 1-65535 inclusive.

ssl

Enable SSL

Optional. Boolean.

remote.mssql-banner

The remote.mssql-banner check will attempt to connect to a Microsoft SQL database server and verify that it is accepting connections.

Attributes

Field

Description

Validation

port

Port number (default: 1433)

Optional. Whole number (may be zero padded). Integer between 1-65535 inclusive.

ssl

Enable SSL

Optional. Boolean.

remote.mysql-banner

The remote.mysql-banner check will attempt to connect to a MySQL database server and verify that it is accepting connections.

Attributes

Field

Description

Validation

port

Port number (default: 3306)

Optional. Whole number (may be zero padded). Integer between 1-65535 inclusive.

ssl

Enable SSL

Optional. Boolean.

remote.ping

The remote.ping check will attempt to ping a server.

Attributes

Field

Description

Validation

count

Number of pings to send within a single check.

This field is optional. Must be a whole number (may be zero padded). This value must be an integer between 1-15 inclusive

Metrics

Metric

Description

Type

available

The whole number representing the percentage of pings that returned for a remote.ping check.

Double

average

The average response time in milliseconds for all ping packets sent out and later retrieved.

Double

count

The number of pings (ICMP packets) sent.

Int32

maximum

The maximum roundtrip time in milliseconds of an ICMP packet.

Double

minimum

The minimum roundtrip time in milliseconds of an ICMP packet.

Double
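As a rough illustration of how the metrics above relate to one another, the following sketch derives them from the round-trip times of the replies received in a single run. The function name ping_metrics and its inputs are hypothetical, and whether the service rounds the available value is not stated here.

```python
def ping_metrics(rtts_ms, sent):
    """Summarize one remote.ping run.

    rtts_ms: round-trip times, in milliseconds, of the replies received.
    sent:    number of pings sent (the check's count attribute).
    """
    if not 1 <= sent <= 15:
        raise ValueError("count must be between 1 and 15")

    received = len(rtts_ms)
    metrics = {
        "count": sent,
        "available": 100.0 * received / sent,  # percent of pings that returned
    }
    # Timing metrics only make sense when at least one reply arrived.
    if received:
        metrics["average"] = sum(rtts_ms) / received
        metrics["minimum"] = min(rtts_ms)
        metrics["maximum"] = max(rtts_ms)
    return metrics
```

For example, three replies out of four pings sent gives an available value of 75.0, which an alarm could compare against a threshold.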

remote.pop3-banner

The remote.pop3-banner check will attempt to connect to a POP3 mailbox server and verify that it responds to the connection.

Attributes

Field

Description

Validation

port

Port number (default: 110)

Optional. Whole number (may be zero padded). Integer between 1-65535 inclusive.

ssl

Enable SSL

Optional. Boolean.

remote.postgresql-banner

The remote.postgresql-banner check will attempt to connect to a PostgreSQL database server and verify that it is accepting connections.

Attributes

Field

Description

Validation

port

Port number (default: 5432)

Optional. Whole number (may be zero padded). Integer between 1-65535 inclusive.

ssl

Enable SSL

Optional. Boolean.

remote.smtp-banner

The remote.smtp-banner check will attempt to connect to an SMTP mail server and verify that a HELO/EHLO is received.

Attributes

Field

Description

Validation

port

Port number (default: 25)

Optional. Whole number (may be zero padded). Integer between 1-65535 inclusive.

ssl

Enable SSL

Optional. Boolean.

Metrics

Metric

Description

Type

banner

The string sent from the server on connect.

String

banner_match

The matched string from the banner_match regular expression specified during check creation.

String

bytes

The number of bytes returned from a response payload.

Int32

cert_end

The absolute timestamp in seconds for the certificate expiration. This is only available when performing a check on an HTTPS server.

Uint32

cert_end_in

The relative time in seconds until certificate expiration. This is only available when performing a check on an HTTPS server.

Int32

cert_error

A string describing a certificate error in our validation. This is only available when performing a check on an HTTPS server.

String

cert_issuer

The issuer string for the certificate. This is only available when performing a check on an HTTPS server.

String

cert_start

The absolute timestamp of the issue of the certificate. This is only available when performing a check on an HTTPS server.

Uint32

cert_subject

The subject of the certificate. This is only available when performing a check on an HTTPS server.

String

cert_subject_alternative_names

The alternative name for the subject of the certificate. This is only available when performing a check on an HTTPS server. (See an example alarm following this table.)

String

duration

The time it took to finish executing the check in milliseconds.

Uint32

tt_connect

The time to connect measured in milliseconds.

Uint32

tt_firstbyte

The time to first byte measured in milliseconds.

Uint32

Note

The following is an example alarm for cert_subject_alternative_names, where you would replace example.com with an expected host name on the certificate’s SAN list:

if (metric['cert_subject_alternative_names'] nregex '.*example.com.*') {
  return new AlarmStatus(CRITICAL, 'Missing expected SAN');
}

remote.smtp

The remote.smtp check will attempt to connect to an SMTP mail server and send an email from the ‘from’ parameter to the ‘to’ parameter, with a payload specified by the ‘payload’ parameter, setting the EHLO from host to the value in ‘ehlo’.

Attributes

Field

Description

Validation

ehlo

Specifies the EHLO parameter.

Optional. String between 1 and 255 characters long.

from

Specifies the From parameter.

Optional. String between 1 and 255 characters long.

payload

Specifies the payload.

Optional. String between 1 and 1024 characters long.

port

Specifies the port number.

Optional. Whole number (may be zero padded). Integer between 1-65535 inclusive.

starttls

Specifies whether the connection should be upgraded to TLS/SSL.

Optional. Boolean.

to

Specifies the To parameter. If this field is blank, a “quit” is issued before sending a to line, and the connection is terminated.

Optional. String between 1 and 255 characters long.

remote.ssh

The remote.ssh check will attempt to SSH to a target.

Attributes

Field

Description

Validation

port

Specifies the port number. The default is 22.

This field is optional. Must be a whole number (may be zero padded). This value must be an integer between 1-65535 inclusive

Metrics

Metric

Description

Type

duration

Specifies the time it took to finish executing the check in milliseconds.

Uint32

fingerprint

Specifies the ssh fingerprint used to verify identity.

String

remote.tcp

The remote.tcp check will attempt to connect to a host and port, and optionally issue a banner match to ensure that the service is responding as specified. This can be used to test services that are not covered by the existing HTTP, SMTP, SSH, MySQL, etc. checks.

Attributes

Field

Description

Validation

port

Specifies the port number.

Whole number (may be zero padded). Integer between 1-65535 inclusive.

banner_match

Specifies the banner match regex.

Optional. String between 1 and 255 characters long.

body_match

Specifies the body match regex. Key/Values are captured when matches are specified within the regex. Note: Maximum body size is 1024 bytes.

Optional. String between 1 and 255 characters long.

send_body

Send a body. If a banner is provided the body is sent after the banner is verified.

Optional. String between 1 and 1024 characters long.

ssl

Specifies whether SSL is enabled.

Optional. Boolean.

Metrics

Metric

Description

Type

banner

Specifies the string that is sent from the server on connect.

String

banner_match

Specifies the matched string from the banner_match regular expression specified during check creation.

String

duration

Specifies the time it took to finish executing the check in milliseconds.

Uint32

tt_connect

Specifies the time to connect measured in milliseconds.

Uint32

tt_firstbyte

Specifies the time to first byte measured in milliseconds.

Uint32

remote.telnet-banner

The remote.telnet-banner check will attempt to connect to a Telnet (or similar protocol) server and verify that an appropriate banner is received.

Attributes

Field

Description

Validation

port

Specifies the port number. The default is 23.

Optional. Whole number (may be zero padded). Integer between 1-65535 inclusive.

banner_match

Specifies the banner match check.

Optional. String between 1 and 255 characters long.

ssl

Specifies whether SSL is enabled.

Optional. Boolean.

Agent check types

Rackspace Monitoring supports the following agent check types.

agent.apache

The agent.apache check will retrieve Apache HTTP server metrics.

Attributes

Field

Description

Validation

timeout

Specifies the plugin execution timeout in milliseconds.

Optional. Integer.

url

Specifies the URL. Defaults to http://127.0.0.1/server-status.

Optional. URL.

Metrics

Metric

Description

Type

busy_workers

Specifies the number of workers serving requests

Int64

bytes_per_request

The average number of bytes served per request

Int64

bytes_per_second

The average number of bytes served per second

Int64

closing

The number of workers closing the connection

Int64

cpu_load

Total percentage of CPU used by workers

Double

dns

The number of workers performing DNS lookup

Int64

gracefully_fishing

The number of workers gracefully finishing

Int64

idle

The number of idle cleanup workers

Int64

idle_workers

The number of idle workers

Int64

keepalive

The number of workers kept alive (reading)

Int64

logging

The number of workers logging

Int64

open

The number of workers with no current process

Int64

reading

The number of workers reading the request

Int64

requests_per_second

The number of requests per second

Int64

sending

The number of workers sending a reply

Int64

starting

The number of workers starting up

Int64

total_access

Total number of accesses served

Int64

total_kbytes

Total kilobytes served

Int64

uptime

Time since the last start/restart in milliseconds

Int64

waiting

The number of workers waiting for connection

Int64

agent.cpu

The agent.cpu check will attempt to measure the usage of the CPU on a host.

Attributes

No fields are present for this particular check type.

Metrics

Metric

Description

Type

idle_percent_average

Recent percentage of CPU time spent idle.

Double

irq_percent_average

Recent percentage of CPU time spent handling hardware interrupts.

Double

max_cpu_usage

Recent percentage utilization of the most-utilized CPU. This is useful to detect when some CPUs are pegged while others are idle.

Double

min_cpu_usage

Recent percentage utilization of the least-utilized CPU. This is useful to detect when some CPUs are pegged while others are idle.

Double

stolen_percent_average

Recent percentage of CPU time spent waiting for the CPU to service other virtual CPUs.

Double

sys_percent_average

Recent percentage of CPU time utilized by kernel mode processes.

Double

usage_average

Recent percentage of CPU time utilized by all processes.

Double

user_percent_average

Recent percentage of CPU time utilized by user mode processes.

Double

wait_percent_average

Recent percentage of CPU time utilized by processes in a “wait” state.

Double

agent.disk

The agent.disk check exposes disk related metrics (service time, wait time, etc.).

Attributes

Field

Description

Validation

target

The disk to check (for example, ‘/dev/xvda1’)

String between 1 and 512 characters long

Metrics

Metric

Description

Type

queue

Measured in seconds, this is the current disk queue length, which is an instantaneous measurement of the I/O queue for the given disk/partition.

Int64

qtime

Measured in milliseconds, this is the weighted number of milliseconds spent doing I/Os. This field is incremented at each I/O start, I/O completion, I/O merge, or read of these stats by the number of I/Os in progress times the number of milliseconds spent doing I/O since the last update of this field. This can provide an easy measure of both I/O completion time and the backlog that might be accumulating.

Int64

read_bytes

The number of physical disk bytes read, the prefix / will change depending on the mount points discovered.

Int64

reads

The number of physical disk reads, the prefix / will change depending on the mount points discovered.

Int64

rtime

The amount of time spent reading, the prefix / will change depending on the mount points discovered.

Int64

write_bytes

The number of physical disk bytes written, the prefix / will change depending on the mount points discovered.

Int64

writes

The number of physical disk writes, the prefix / will change depending on the mount points discovered.

Int64

wtime

The amount of time spent writing, the prefix / will change depending on the mount points discovered.

Int64

agent.filesystem

The agent.filesystem check exposes file system related metrics (free space, used space, etc.).

Attributes

Field

Description

Validation

target

The mount point to check, for example, /var or C:\

String between 1 and 512 characters long.

Metrics

Metric

Description

Type

avail

Available space on the filesystem in kilobytes for the current user, which is root, that is running the agent.

Int64

free

Free space available on the filesystem in kilobytes including reserved space. This is calculated as number of free file blocks x block size

Int64

options

The options used to mount the device to the filesystem. Includes the rw flag, which indicates that the device is in read/write mode.

Int64

total

Total space on the filesystem, in kilobytes, including reserved space. This is calculated as number of total file blocks x block size

Int64

used

Used space on the filesystem, in kilobytes. This number does not include the reserved space. This is calculated as total - free

Int64

files

Number of inodes on the filesystem.

Int64

free_files

Number of free inodes on the filesystem.

Int64

Note

The reserved space only applies to Linux systems. It is the space saved for important root processes and possible rescue actions. In some systems the reserved space can be used for fragmentation allocation. For more information about Ext3 and Ext4: https://www.redhat.com/archives/ext3-users/2009-January/msg00026.html.

The files and free_files metrics only apply to Linux systems.
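The relationships stated in the metrics table (free and total as block counts times block size, used as total minus free) can be sketched as below. This is only an illustration of the arithmetic: the function name filesystem_metrics and its block-count parameters are hypothetical, and how the agent actually gathers these values is not specified here.

```python
def filesystem_metrics(block_size, total_blocks, free_blocks, avail_blocks):
    """Derive agent.filesystem-style values, in kilobytes, from block counts."""
    total = total_blocks * block_size // 1024  # includes reserved space
    free = free_blocks * block_size // 1024    # includes reserved space
    avail = avail_blocks * block_size // 1024  # space available to the user running the agent
    return {
        "total": total,
        "free": free,
        "avail": avail,
        "used": total - free,  # documented as total - free
    }
```

On a POSIX host, comparable block counts can be read with os.statvfs(path) (f_frsize, f_blocks, f_bfree, f_bavail), which is one way to cross-check the reported values by hand.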

agent.filesystem_state

The agent.filesystem_state check exposes filesystem metrics for read-write/read-only system mounts.

Attributes

No fields are present for this particular check type.

Metrics

Metric

Description

Type

total_ro

Total number of filesystems mounted read-only.

Int64

total_rw

Total number of filesystems mounted read-write

Int64

devices_ro

Comma delimited list of devices mounted read-only.

String

devices_rw

Comma delimited list of devices mounted read-write.

String

agent.load_average

The agent.load_average check attempts to measure the UNIX style load average on a host.

For more information about the commands used to get the load average, see Check the System Load on Linux.

Attributes

No fields are present for this particular check type.

Metrics

Metric

Description

Type

1m

One minute load average.

Double

5m

Five minute load average.

Double

15m

Fifteen minute load average.

Double

agent.memory

Attributes

No fields are present for this particular check type.

Metrics

The memory available to the system is used in three different ways:

  • Used by the processes running in the system. This value is under the “actual_used” metric.

  • Used by the kernel. This value is not returned from the check but can be deduced.

  • Not used by either the running processes or the kernel. This value is under the “free” metric.

For convenience, the system returns the value of used/free memory for the case of including kernel and excluding kernel so that you don’t have to do the calculation in your head.

Metric

Description

Type

actual_free

The amount of free memory, ‘free’ plus kernel memory.

Int64

actual_used

The actual amount of used memory excluding kernel memory.

Int64

free

The amount of free memory not including kernel memory.

Int64

ram

The amount of RAM.

Int64

swap_free

The amount of free SWAP memory.

Int64

swap_page_in

The number of SWAP-in pages.

Int64

swap_page_out

The number of SWAP-out pages.

Int64

swap_total

The total amount of SWAP memory.

Int64

swap_used

The amount of used SWAP memory.

Int64

total

The total amount of memory.

Int64

used

The total amount of used memory, ‘actual_used’ plus kernel memory

Int64
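Because used is described as actual_used plus kernel memory, and actual_free as free plus kernel memory, the kernel's share can be deduced from either pair. The sketch below shows both subtractions; the function name kernel_memory and the sample values are hypothetical, and the result is in whatever unit the check reports.

```python
def kernel_memory(metrics):
    """Deduce kernel memory from agent.memory metrics in two ways.

    Returns (used - actual_used, actual_free - free); per the metric
    descriptions, the two values should agree.
    """
    from_used = metrics["used"] - metrics["actual_used"]
    from_free = metrics["actual_free"] - metrics["free"]
    return from_used, from_free


# Hypothetical sample values, for illustration only.
sample = {"total": 8000, "used": 6000, "actual_used": 4500,
          "free": 2000, "actual_free": 3500}
kernel_from_used, kernel_from_free = kernel_memory(sample)
```

If the two results differ noticeably on a real host, that suggests the simple model described above does not capture everything the operating system counts.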

agent.mysql

The agent.mysql check retrieves MySQL server metrics.

Note

Except for the replication.slave_running metric, all metrics starting with replication do not show up if there is no slave running.

If the libmysqlclient-dev package is not already present, you should install it on the host where the agent.mysql plug-in runs.

Attributes

Field

Description

Validation

host

MySQL server hostname (default: 127.0.0.1).

Optional. Valid hostname, IPv4 or IPv6 address

mycnf

Specifies whether my.cnf should be loaded.

Optional. Boolean

password

Specifies the server password.

Optional. String between 1 and 255 characters long

port

Specifies the MySQL server port (default: 3306).

Optional. Integer between 1-65535 inclusive

socket

Specifies the path to the domain socket.

Optional. String between 1 and 255 characters long

timeout

Specifies the plugin execution timeout in milliseconds

Optional. Integer

username

Specifies the username.

Optional. String between 1 and 16 characters long

Metrics

Metric

Description

Type

bytes_received

The number of bytes received from all clients. (statvar_Bytes_received)

Cumulative

bytes_sent

The number of bytes sent to all clients. (statvar_Bytes_sent)

Cumulative

core.aborted_clients

The number of connections that were aborted because the client died without closing the connection properly. (statvar_Aborted_clients)

Instantaneous

core.connections

The number of connection attempts (successful or not) to the MySQL server. (statvar_Connections)

Cumulative

core.queries

The number of statements executed by the server. (statvar_Queries)

Cumulative

core.uptime

The number of seconds that the server has been up. (statvar_Uptime)

Instantaneous

handler.commit

The number of internal COMMIT statements. (statvar_Handler_commit)

Cumulative

handler.delete

The number of times that rows have been deleted from tables. (statvar_Handler_delete)

Cumulative

handler.read_first

The number of times the first entry in an index was read. (statvar_Handler_read_first)

Cumulative

handler.read_key

The number of requests to read a row based on a key. If this value is high, it is a good indication that your tables are properly indexed for your queries. (statvar_Handler_read_key)

Cumulative

handler.read_next

The number of requests to read the next row in key order. This value is incremented if you are querying an index column with a range constraint or if you are doing an index scan. (statvar_Handler_read_next)

Cumulative

handler.read_prev

The number of requests to read the previous row in key order. This read method is mainly used to optimize ORDER BY … DESC. (statvar_Handler_read_prev)

Cumulative

handler.read_rnd

The number of requests to read a row based on a fixed position. This value is high if you are doing a lot of queries that require sorting of the result. You probably have a lot of queries that require MySQL to scan entire tables or you have joins that do not use keys properly. (statvar_Handler_read_rnd)

Cumulative

handler.rollback

The number of requests for a storage engine to perform a rollback operation. (statvar_Handler_rollback)

Instantaneous

handler.savepoint

The number of requests for a storage engine to place a savepoint. (statvar_Handler_savepoint)

Instantaneous

handler.savepoint_rollback

The number of requests for a storage engine to roll back to a savepoint. (statvar_Handler_savepoint_rollback)

Instantaneous

handler.update

The number of requests to update a row in a table. (statvar_Handler_update)

Cumulative

handler.write

The number of requests to insert a row in a table. (statvar_Handler_write)

Cumulative

innodb.buffer_pool_pages_data

The number of pages containing data (dirty or clean). (statvar_Innodb_buffer_pool_pages_data)

Instantaneous

innodb.buffer_pool_pages_dirty

The number of pages currently dirty. (statvar_Innodb_buffer_pool_pages_dirty)

Instantaneous

innodb.buffer_pool_pages_flushed

The number of buffer pool page-flush requests. (statvar_Innodb_buffer_pool_pages_flushed)

Instantaneous

innodb.buffer_pool_pages_free

The number of free pages. (statvar_Innodb_buffer_pool_pages_free)

Instantaneous

innodb.buffer_pool_pages_total

The total size of the buffer pool, in pages. (statvar_Innodb_buffer_pool_pages_total)

Instantaneous

innodb.buffer_pool_read_requests

The number of logical read requests. (statvar_Innodb_buffer_pool_read_requests)

Cumulative

innodb.buffer_pool_reads

The number of logical reads that InnoDB could not satisfy from the buffer pool, and had to read directly from the disk. (statvar_Innodb_buffer_pool_reads)

Cumulative

innodb.buffer_pool_size

The size in bytes of the memory buffer InnoDB uses to cache data and indexes of its tables. (sysvar_innodb_buffer_pool_size)

Instantaneous

innodb.data_pending_fsyncs

The current number of pending fsync() operations. (statvar_Innodb_data_pending_fsyncs)

Instantaneous

innodb.data_pending_reads

The current number of pending reads. (statvar_Innodb_data_pending_reads)

Instantaneous

innodb.data_pending_writes

The current number of pending writes. (statvar_Innodb_data_pending_writes)

Instantaneous

innodb.pages_created

The number of pages created. (statvar_Innodb_pages_created)

Cumulative

innodb.pages_read

The number of pages read. (statvar_Innodb_pages_read)

Cumulative

innodb.pages_written

The number of pages written. (statvar_Innodb_pages_written)

Cumulative

innodb.row_lock_time

The total time spent in acquiring row locks, in milliseconds. (statvar_Innodb_row_lock_time)

Cumulative

innodb.row_lock_time_avg

The average time to acquire a row lock, in milliseconds. (statvar_Innodb_row_lock_time_avg)

Instantaneous

innodb.row_lock_time_max

The maximum time to acquire a row lock, in milliseconds. (statvar_Innodb_row_lock_time_max)

Instantaneous

innodb.row_lock_waits

The number of times a row lock had to be waited for. (statvar_Innodb_row_lock_waits)

Cumulative

innodb.rows_deleted

The number of rows deleted from InnoDB tables. (statvar_Innodb_rows_deleted)

Cumulative

innodb.rows_inserted

The number of rows inserted into InnoDB tables. (statvar_Innodb_rows_inserted)

Cumulative

innodb.rows_read

The number of rows read from InnoDB tables. (statvar_Innodb_rows_read)

Cumulative

innodb.rows_updated

The number of rows updated in InnoDB tables. (statvar_Innodb_rows_updated)

Cumulative

key.buffer_size

Index blocks for MyISAM tables are buffered and are shared by all threads. (sysvar_key_buffer_size)

Instantaneous

max.connections

The maximum permitted number of simultaneous client connections. (sysvar_max_connections)

Instantaneous

qcache.free_blocks

The number of free memory blocks in the query cache. (statvar_Qcache_free_blocks)

Instantaneous

qcache.free_memory

The amount of free memory for the query cache. (statvar_Qcache_free_memory)

Instantaneous

qcache.hits

The number of query cache hits. (statvar_Qcache_hits)

Cumulative

qcache.inserts

The number of queries added to the query cache. (statvar_Qcache_inserts)

Cumulative

qcache.lowmem_prunes

The number of queries that were deleted from the query cache because of low memory. (statvar_Qcache_lowmem_prunes)

Instantaneous

qcache.not_cached

The number of noncached queries (not cacheable, or not cached due to the query_cache_type setting). (statvar_Qcache_not_cached)

Instantaneous

qcache.queries_in_cache

The number of queries registered in the query cache. (statvar_Qcache_queries_in_cache)

Cumulative

qcache.size

The amount of memory allocated for caching query results. (sysvar_query_cache_size)

Instantaneous

qcache.total_blocks

The total number of blocks in the query cache. (statvar_Qcache_total_blocks)

Cumulative

replication.exec_master_log_pos

The position in the current master binary log file to which the SQL thread has read and executed, marking the start of the next transaction or event to be processed. (show-slave-status.html).

Instantaneous

replication.last_errno

The error number returned by the most recently executed statement. (show-slave-status.html).

Instantaneous

replication.last_io_error

The error message of the most recent error that caused the I/O thread to stop (show-slave-status.html).

String

replication.max_relay_log_size

If a write by a replication slave to its relay log causes the current log file size to exceed the value of this variable, the slave rotates the relay logs (closes the current file and opens the next one). (sysvar_max_relay_log_size)

Instantaneous

replication.read_master_log_pos

The position in the current master binary log file up to which the I/O thread has read. (show-slave-status.html)

Instantaneous

replication.relay_log_pos

The position in the current relay log file up to which the SQL thread has read and executed. (show-slave-status.html)

Instantaneous

replication.seconds_behind_master

In essence, this field measures the time difference in seconds between the slave SQL thread and the slave I/O thread. (show-slave-status.html)

Instantaneous

replication.slave_io_running

Whether the I/O thread is started and has connected successfully to the master. Internally, the state of this thread is represented by one of the following three values: MYSQL_SLAVE_NOT_RUN, MYSQL_SLAVE_RUN_NOT_CONNECT, MYSQL_SLAVE_RUN_CONNECT (show-slave-status.html)

Boolean

replication.slave_io_state

A copy of the State field of the SHOW PROCESSLIST output for the slave I/O thread. This tells you what the thread is doing: trying to connect to the master, waiting for events from the master, reconnecting to the master, and so on. (show-slave-status.html).

String

replication.slave_open_temp_tables

The number of temporary tables that the slave SQL thread currently has open. If the value is greater than zero, it is not safe to shut down the slave. (statvar_Slave_open_temp_tables).

Instantaneous

replication.slave_retried_transactions

The total number of times since startup that the replication slave SQL thread has retried transactions. (statvar_Slave_retried_transactions)

Instantaneous

replication.slave_running

This is ON if this server is a replication slave that is connected to a replication master, and both the I/O and SQL threads are running; otherwise, it is OFF. (statvar_Slave_running)

String

replication.slave_sql_running

Whether the SQL thread is started. (show-slave-status.html)

Boolean

thread.cache_size

How many threads the server should cache for reuse. (sysvar_thread_cache_size)

Instantaneous

threads.connected

The number of currently open connections. (statvar_Threads_connected)

Instantaneous

threads.created

The number of threads created to handle connections. (statvar_Threads_created)

Cumulative

threads.running

The number of threads that are not sleeping. (statvar_Threads_running)

Instantaneous
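Metrics marked Cumulative, such as qcache.hits or threads.created, are counters that only grow, so a per-second rate is often easier to reason about than the raw value. The following Python sketch derives such a rate from two samples. The function name and the sample values are illustrative, and treating a decrease as a counter reset is an assumption, not documented agent behavior.

```python
def per_second_rate(prev_value, curr_value, interval_seconds):
    """Return the average per-second increase of a cumulative counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr_value - prev_value
    if delta < 0:
        # Assumption: a smaller value means the counter was reset (for
        # example, after a server restart), so count from zero.
        delta = curr_value
    return delta / interval_seconds


# Hypothetical qcache.hits samples taken 60 seconds apart.
print(per_second_rate(1200, 1500, 60))  # 5.0
```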

agent.network

The agent.network check will attempt to measure the usage of network devices on a host.

Attributes

Field

Description

Validation

target

The network device to check (for example, eth0)

String between 1 and 512 characters long

Metrics

Metric

Description

Type

rx_bytes

The number of bytes received over the interface.

Int64

rx_dropped

The number of packets received and subsequently dropped over the interface.

Int64

rx_errors

The number of errors received over the interface.

Int64

rx_packets

The number of packets received over the interface.

Int64

speed

The speed at which the bytes were transmitted over the interface.

Int64

tx_bytes

The number of bytes transmitted over the interface.

Int64

tx_dropped

The number of packets attempted transmitting and subsequently dropped over the interface.

Int64

tx_errors

The number of errors while transmitting over the interface.

Int64

tx_packets

The number of packets transmitted over the interface.

Int64

agent.mssql_database

The agent.mssql_database check returns metrics for a Microsoft SQL Server database.

Attributes

Field

Description

Validation

db

MS SQL Server database name

String between 1 and 255 characters long

hostname

MS SQL Server hostname

Optional. Valid hostname, IPv4 or IPv6 address

password

MS SQL Server password

Optional. String between 1 and 255 characters long

serverinstance

MS SQL Server instance to query

Optional. String between 1 and 255 characters long

username

MS SQL Server username

Optional. String between 1 and 255 characters long

agent.mssql_buffer_manager

The agent.mssql_buffer_manager check returns metrics for the Microsoft SQL Server buffer manager.

Attributes

Field

Description

Validation

computer

MS SQL Server computer name

Optional. Valid hostname, IPv4 or IPv6 address

serverinstance

MS SQL Server instance to query

Optional. String between 1 and 255 characters long

agent.mssql_sql_statistics

The agent.mssql_sql_statistics check returns metrics for the Microsoft SQL Server SQL statistics.

Attributes

Field

Description

Validation

computer

MS SQL Server computer name

Optional. Valid hostname, IPv4 or IPv6 address

serverinstance

MS SQL Server instance to query

Optional. String between 1 and 255 characters long

agent.mssql_plan_cache

The agent.mssql_plan_cache check returns metrics for the Microsoft SQL Server plan cache.

Attributes

Field

Description

Validation

computer

MS SQL Server computer name

Optional. Valid hostname, IPv4 or IPv6 address

serverinstance

MS SQL Server instance to query

Optional. String between 1 and 255 characters long

agent.mssql_memory_manager

The agent.mssql_memory_manager check returns metrics for the Microsoft SQL Server memory manager.

Attributes

Field

Description

Validation

computer

MS SQL Server computer name

Optional. Valid hostname, IPv4 or IPv6 address

serverinstance

MS SQL Server instance to query

Optional. String between 1 and 255 characters long

agent.mssql_version

The agent.mssql_version check returns version information for Microsoft SQL Server.

Attributes

Field

Description

Validation

hostname

MS SQL Server hostname

Optional. Valid hostname, IPv4 or IPv6 address

password

MS SQL Server password

Optional. String between 1 and 255 characters long

serverinstance

MS SQL Server instance to query

Optional. String between 1 and 255 characters long

username

MS SQL Server username

Optional. String between 1 and 255 characters long

agent.plugin

The agent.plugin check will attempt to run a custom plugin on a host.

Custom plugins are simply executable files which report metrics via stdout. Plugins are placed on the server to be monitored at an installation path that depends on the operating system:

Operating System

Installation Path

Linux

/usr/lib/rackspace-monitoring-agent/plugins/

Windows (32-bit agent installed on a 64-bit system)

C:\Program Files (x86)\Rackspace Monitoring\plugins

Windows (64-bit agent installed on a 64-bit system or 32-bit agent installed on a 32-bit system)

C:\Program Files\Rackspace Monitoring\plugins

After the plugin has been installed on the server, create an agent.plugin check that specifies the name of the executable file so that the plugin can begin reporting metrics to the monitoring system, like any other check. If the plugin requires any command line arguments, you can specify them using the optional args array.

Attributes

Field

Description

Validation

file

Name of the plugin file

String matching the regex //[a-zA-Z0-9.- _]+//

args

Command-line arguments which are passed to the plugin

Optional. Array [Non-empty string]. Array or object with number of items between 0 and 10

timeout

Plugin execution timeout in milliseconds

Optional. Integer
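As a rough illustration of how these attributes fit together, the following Python sketch assembles a check body and enforces the args limits from the table. Only file, args, and timeout come from this reference. The nesting under a "details" key, the helper name, and the plugin file name are assumptions made for the example, so consult the API documentation for the exact request shape.

```python
import json


def build_plugin_check(file, args=None, timeout=None):
    """Build an illustrative agent.plugin check body."""
    args = list(args or [])
    if len(args) > 10:
        raise ValueError("args accepts at most 10 items")
    if any(not isinstance(a, str) or a == "" for a in args):
        raise ValueError("each argument must be a non-empty string")
    details = {"file": file}
    if args:
        details["args"] = args
    if timeout is not None:
        details["timeout"] = int(timeout)  # milliseconds
    # Assumption: check attributes are nested under "details" next to "type".
    return {"type": "agent.plugin", "details": details}


# Hypothetical plugin file name and arguments.
check = build_plugin_check("turkey_thermometer.py", ["--unit", "F"], 5000)
print(json.dumps(check, indent=2))
```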

Metrics

The metrics returned are defined in the plugin script. A plugin can send up to fifty unique metrics at a time.

Community Plugin Repository

A curated repository of plugins created by Rackspace Monitoring users is available on GitHub. Contributions are welcome!

Note

The Rackspace Monitoring Agent can also execute Cloudkick plugins, so if you are a Cloudkick user, you can drop in any existing plugin and it should work.

Creating Custom Plugins

Creating custom plugins is as simple as writing a script that prints a status and up to fifty metrics to standard out. The format of the status line is:

status <status>

The status string should describe whether the check was able to successfully gather metrics. It could be as simple as “success” to indicate that metrics were successfully gathered. When an error occurs that prevents metrics from being gathered, the plugin should print a status that describes the error and then exit non-zero without printing any metric lines.

The status line can be followed by up to fifty metric lines. Each line is output in the following format:

metric <name> <type> <value>

The following descriptions provide information about parameter values.

Metric line parameters

Parameter

Description

name

The name of the metric. Spaces are not supported. The format is alphanumeric, with colon (:), underscore (_), and dot (.) also allowed. Example: memory_free.

type

The metric can be any of the following types:

int32 Signed 32 bit integer value.

uint32 Unsigned 32 bit integer value.

int64 Signed 64 bit integer value.

uint64 Unsigned 64 bit integer value.

double Floating point values.

string A string value.

Note: the monitoring system records string metrics every time they change. String metrics are designed for recording an enumerated state which infrequently changes (for example an HTTP response code which is always 200 during normal operation). You should not store arbitrary, frequently changing values in a string metric.

value

The value assigned to the metric.
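The parameter rules above can be checked before a plugin is deployed. The following Python sketch parses one metric line and validates the name and type against the table. The function name and the regular expression are written for this example and are not part of the agent.

```python
import re

# Name rule from the table: alphanumeric plus colon, underscore, and dot.
NAME_RE = re.compile(r"^[A-Za-z0-9:_.]+$")
METRIC_TYPES = {"int32", "uint32", "int64", "uint64", "double", "string"}


def parse_metric_line(line):
    """Split 'metric <name> <type> <value>' and validate name and type."""
    parts = line.strip().split(None, 3)
    if len(parts) != 4 or parts[0] != "metric":
        raise ValueError("expected: metric <name> <type> <value>")
    _, name, mtype, value = parts
    if not NAME_RE.match(name):
        raise ValueError("invalid metric name: " + name)
    if mtype not in METRIC_TYPES:
        raise ValueError("unknown metric type: " + mtype)
    return name, mtype, value


print(parse_metric_line("metric memory_free uint64 1048576"))
```

The value is returned as text because its interpretation depends on the declared type.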

Putting it all together, the output of a plugin that has successfully executed might look something like:

status Turkey thermometer returned valid response
metric internal_temperature uint32 165
metric ambient_temperature uint32 325

If the plugin failed, it might print the following before exiting non-zero:

status Turkey thermometer not responding
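A plugin that produces this kind of output could be sketched in Python as follows. This is a minimal example, not a reference implementation: read_thermometer is a hypothetical stand-in for real measurement code, and the metric names simply mirror the sample output above.

```python
import sys


def format_output(status, metrics=()):
    """Return plugin output: one status line, then up to fifty metric lines."""
    if len(metrics) > 50:
        raise ValueError("a plugin can send at most fifty metrics")
    lines = ["status " + status]
    for name, mtype, value in metrics:
        lines.append("metric {} {} {}".format(name, mtype, value))
    return "\n".join(lines)


def read_thermometer():
    """Hypothetical data source; replace with real measurement code."""
    return {"internal_temperature": 165, "ambient_temperature": 325}


def main():
    try:
        reading = read_thermometer()
    except Exception as exc:
        # On failure: describe the error, print no metric lines.
        print(format_output("Turkey thermometer error: {}".format(exc)))
        return 1
    metrics = [(name, "uint32", value) for name, value in reading.items()]
    print(format_output("Turkey thermometer returned valid response", metrics))
    return 0


if __name__ == "__main__":
    exit_code = main()
    if exit_code != 0:
        sys.exit(exit_code)  # exit non-zero so the failure is reported
```

Keeping the formatting in its own function makes the fifty-metric limit easy to enforce in one place.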

agent.redis

The agent.redis check retrieves Redis server metrics.

Attributes

Field

Description

Validation

hostname

Redis server hostname

Valid hostname, IPv4 or IPv6 address

password

Optional Redis server password

Optional. String between 1 and 255 characters long

port

Redis server port

Integer between 1-65535 inclusive

timeout

Connection timeout in milliseconds

Optional. Integer

Metrics

Metric

Description

Type

bgrewriteaof_in_progress

(Redis 2.4.16 only) Flag indicating an AOF rewrite operation is ongoing

Int32

bgsave_in_progress

(Redis 2.4.16 only) Flag indicating an RDB save is ongoing

Int32

blocked_clients

Number of clients pending on a blocking call (BLPOP, BRPOP, BRPOPLPUSH)

Int32

changes_since_last_save

(Redis 2.4.16 only) Number of changes since the last dump

Int32

connected_clients

Number of client connections (excluding connections from slaves)

Int32

evicted_keys

Number of evicted keys due to maxmemory limit

Int32

pubsub_patterns

Global number of pub/sub patterns with client subscriptions

Int32

total_commands_processed

Total number of commands processed by the server

Gauge

total_connections_received

Total number of connections accepted by the server

Gauge

uptime_in_seconds

Number of seconds since Redis server start

Int32

used_memory

Total number of bytes allocated by Redis using its allocator (either standard libc, jemalloc, or an alternative allocator such as tcmalloc).

Int32

version

Version of the server

String

agent.windows_perfos

The agent.windows_perfos check returns metrics regarding windows performance data. This check is only available on Windows platforms.

Attributes

No fields are present for this particular check type.

Metrics

Metric

Description

Type

AlignmentFixupsPersec

Shows the rate, in incidents per second, at which alignment faults were fixed by the system.

Uint32

ContextSwitchesPersec

Shows the combined rate, in incidents per second, at which all processors on the computer were switched from one thread to another. It is the sum of the values of Thread Context Switches/sec for each thread running on all processors on the computer, and is measured in numbers of switches. Context switches occur when a running thread voluntarily relinquishes the processor, or is preempted by a higher priority, ready thread.

Uint32

ExceptionDispatchesPersec

Shows the rate, in incidents per second, at which exceptions were dispatched by the system.

Uint64

FileControlBytesPersec

Shows the overall rate, in bytes per second, at which bytes were transferred for all file system operations that were neither read nor write operations, such as file system control requests and requests for information about device characteristics or status.

Uint32

FileControlOperationsPersec

Shows the combined rate, in incidents per second, of file system operations that were neither read nor write operations, such as file system control requests and requests for information about device characteristics or status. This is the inverse of FileDataOperationsPersec.

Int32

FileReadBytesPersec

Shows the overall rate, in bytes per second, at which bytes were read to satisfy file system read requests to all devices on the computer, including read operations from the file system cache.

Uint64

FileReadOperationsPersec

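Shows the combined rate, in incidents per second, of file system read requests to all devices on the computer, including requests to read from the file system cache.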

Uint32

FileWriteBytesPersec

Shows the overall rate, in bytes per second, at which bytes were written to satisfy file system write requests to all devices on the computer, including write operations to the file system cache.

Uint64

FloatingEmulationsPersec

Shows the rate, in incidents per second, of floating emulations performed by the system.

Uint32

PercentRegistryQuotaInUse

Percentage of the total registry quota allowed that is currently being used by the system. This property displays the current percentage value only; it is not an average.

Uint32

Processes

Shows the number of processes in the computer at the time of data collection. This is an instantaneous count, not an average over the time interval. Each process represents a program that is running.

Uint32

ProcessorQueueLength

Shows the number of threads in the processor queue. Unlike the disk counters, this counter shows ready threads only, not threads that are running. There is a single queue for processor time, even on computers with multiple processors. Therefore, if a computer has multiple processors, you need to divide this value by the number of processors servicing the workload. A sustained processor queue of greater than two threads generally indicates processor congestion.

Uint32

SystemCallsPersec

Shows the combined rate, in incidents per second, of calls to operating system service routines by all processes running on the computer. These routines perform all of the basic scheduling and synchronization of activities on the computer, and provide access to non-graphic devices, memory management, and name space management.

Uint32

SystemUpTime

Shows the total time, in seconds, that the computer has been operational since it was last started.

Uint64

Threads

Shows the number of threads in the computer at the time of data collection. This is an instantaneous count, not an average over the time interval. A thread is the basic executable entity that can execute instructions in a processor.

Uint32
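The ProcessorQueueLength guidance in the table above (divide by the number of processors, then compare against a sustained queue of two threads) can be expressed as a small calculation. The following Python sketch is one possible reading: the function name is illustrative, and treating "sustained" as "every sample in the window exceeds the threshold" is an assumption.

```python
def is_congested(samples, processor_count, threshold=2.0):
    """Return True when every sample exceeds the per-processor threshold."""
    if processor_count < 1:
        raise ValueError("processor_count must be at least 1")
    if not samples:
        return False
    # Normalize each queue length by the number of processors.
    return all(s / processor_count > threshold for s in samples)


# Hypothetical ProcessorQueueLength samples from a four-processor host.
print(is_congested([12, 10, 9], 4))  # True: 3.0, 2.5, and 2.25 per processor
print(is_congested([12, 6, 9], 4))   # False: the second sample is 1.5
```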

Hostinfo checks

Hostinfo checks are a special class of checks that run on demand.

Remote and agent checks run on a regular schedule, and you can create alarms or alerts for them. In contrast, you cannot schedule Hostinfo checks or create alarms or alerts for them.

Create Hostinfo checks to perform tasks like the following:

  • Fetch data on demand. For example, you can use a Hostinfo check to pipe data about the host to other services or applications.

  • Run occasional checks to troubleshoot an issue.

  • Periodically fetch data from large clusters of servers with the granularity of fetching from an individual computer. For example, use a Hostinfo check to retrieve information from a dashboard built on Kibana.

  • Use in conjunction with service helper software to generate suggestions that are based on the status of a system or a piece of information that is required by support technicians.

The following table provides a list of the Hostinfo checks supported by the monitoring service.

Hostinfo checks supported by Rackspace Monitoring

Hostinfo type

Description

connections

Runs the arp -an and netstat -naten commands and retrieves information about any open listening ports and any connections to them.

iptables

Runs the iptables -S command to retrieve data about IPv4 policies.

ip6tables

Runs the ip6tables -S command to retrieve data about IPv6 policies.

autoupdates

Checks if automatic updates are enabled on a Linux distribution.

passwd

Reads /etc/passwd and then runs the passwd -S command for every user. Obtains password-related

pam

Reads /etc/pam.d and retrieves data about pluggable authentication modules.

cron

Reads files in the crontabs directory and retrieves information about scheduled Cron jobs.

kernel_modules

Reads the /proc/modules virtual file and retrieves data about the modules that are loaded into the kernel.

cpu

Retrieves information about the host’s CPU.

disk

Retrieves information about the host’s hard disks.

filesystem

Retrieves information about the host’s filesystem.

filesystem_state

Retrieves information about the read-only/read-write filesystems.

login

Reads /etc/login.defs and retrieves data about the login shell. This check does not retrieve any password information or any other sensitive data.

memory

Retrieves information about the host’s memory.

network

Retrieves information about the host’s network interface.

nil

Returns no information. This Hostinfo check is mainly used within the monitoring agent code itself.

packages

Runs either the dpkg-query or rpm -qa command and retrieves a list of package names and versions.

procs

Retrieves information about the processes that are running on the host.

system

Retrieves information about the host’s operating system.

who

Retrieves information about the user, device, time and host.

date

Retrieves the date and time on the host.

sysctl

Runs the sysctl -A command and retrieves all possible key-value pairs of the kernel parameters that can be set at runtime.

sshd

Runs the sshd -T command and retrieves the configuration parameters for the OpenSSH daemon.

fstab

Reads /etc/fstab and retrieves information about the file system configuration.

fileperms

Reads a pre-specified list of files and checks and retrieves their permissions.

services

Reads a few folders and files and generates a list of startup services.

deleted_libs

Greps through the output of lsof -nnP to find deleted or missing libraries for running processes.

cve

Retrieves a unique sorted list of common vulnerabilities and exposures that have been patched on the host system.

last_logins

Runs last to get information about previous and currently logged-in users, bootups, and when last started logging.

remote_services

Runs the netstat -tlpen command to obtain a list of active internet connections to servers and underlying programs that are using them.

ip4routes

Runs the netstat -nr4 command and retrieves information about the kernel’s IPv4 routing tables.

ip6routes

Runs the netstat -nr6 command and retrieves information about the kernel’s IPv6 routing tables.

apache2

Retrieves information about the host’s apache2 instance and installation if it exists.

fail2ban

Retrieves information about the host’s fail2ban instance and installation.

lsyncd

Checks the status of the live syncing daemon or lsyncd.

nginx_config

Returns vhosts, version, includes, status (0 if everything is ok when nginx -t is run), configuration path, prefix and configure arguments for local nginx.

wordpress

Returns the path, version and edition of local WordPress instances found via the apache2 and nginx configurations.

magento

Returns the path, version and edition of local Magento instances found via the apache2 and nginx configurations.

php

Returns information such as version, type (HHVM/PHP), and errors related to PHP. Uses the CLI and log files to extract this information.

postfix

Checks the status of the postfix mail server.

You can use the Rackspace Monitoring API to run Hostinfo checks. To run a Hostinfo check, issue the following cURL request:

curl -H "X-Auth-Token: $token" \
     "https://monitoring.api.rackspacecloud.com/v1.0/<tenantID>/agents/<agent_id>/host_info/<hostinfo_type>"
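The same request can be prepared from a script. The following Python sketch uses only the standard library and builds the request without sending it. The function name and the sample token, tenant ID, and agent ID are placeholders invented for the example; the URL layout and header follow the cURL request above.

```python
import urllib.request

BASE_URL = "https://monitoring.api.rackspacecloud.com/v1.0"


def build_hostinfo_request(token, tenant_id, agent_id, hostinfo_type):
    """Prepare (but do not send) a GET request for one Hostinfo type."""
    url = "{}/{}/agents/{}/host_info/{}".format(
        BASE_URL, tenant_id, agent_id, hostinfo_type
    )
    return urllib.request.Request(url, headers={"X-Auth-Token": token})


# Placeholder values; substitute your own token and identifiers.
req = build_hostinfo_request("my-token", "123456", "agent-01", "memory")
print(req.full_url)
```

To send the request, pass it to urllib.request.urlopen and read the response body.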

For more information about how to work with checks using the Rackspace Monitoring API, see the Checks section in the Rackspace Monitoring Developer Guide. For more information about working with Hostinfo checks, see Agent host information.

Check status codes

This section lists status messages and codes that various check types can return.

The following table describes each status message and provides a resolution for the issue.

Remote http check metrics

Status code or message

Description

Resolution

prevented by ACL 'global'

The HTTP remote check is attempting to access a private or loopback IP address (for example, 127.x.x.x or 192.168.x.x).

Private IP addresses are not supported. Specify a public IP address instead.

unknown content encoding

The content could not be copied fully to the monitoring zone endpoint.

Be sure that the content body of your web page is equal to or below the 100k limit. If you are using compression, be sure that the compressed page is less than or equal to 100k. See remote.http.

zlib: out-of-memory

The size of the webpage is too large (> 500k) for the content check.

Reduce the page content or use a custom ‘healthcheck’ page that is less than 500k in size.

attempt to concatenate local *ameth* (a nil value)

Occurs when the target of the remote check is using an unsupported authentication method. Currently supported authentication methods are basic and digest. NTLM is currently not supported.

Change the authentication method to a supported method.