Last updated on: 2019-01-29
Authored by: Stephanie Fillmon
Rackspace Monitoring is an API-driven cloud service built for infrastructure monitoring. It provides a simple yet powerful feature set that allows flexibility in configuration and execution. Rackspace Monitoring helps keep your applications up and running. Remote monitoring tests connectivity from regional zones deployed throughout our global data centers, and agent-based monitoring gathers information from inside each resource.
Rackspace Monitoring provides you with a set of tools that monitor, analyze and report on the availability and performance of your websites, servers and other cloud resources.
Rackspace Monitoring is released through the cloud without need for software installation on servers or computers. By eliminating the need for installation, Rackspace can upgrade the monitoring service without involving the customer. This process hides the complexity of the upgrade and maintenance processes from the customer, giving them a simple and reliable experience.
There are six monitoring zones:
Using multiple monitoring zones eliminates the need for maintenance and upgrade downtime, and ensures that your monitoring services remain uninterrupted even if a data center failure occurs.
See the Concepts section in the Rackspace Monitoring Getting Started Guide.
You can set up Rackspace Monitoring by configuring one or more checks that monitor the internal performance of your cloud server (agent checks) as well as the availability of your website from different points on the Internet (remote service checks). Use these checks to ensure consistent improvement and optimization of your application’s code and infrastructure as well as the ability to maintain high availability for your customers.
Rackspace Monitoring is an API-based system, so you can start creating monitoring checks by using the following methods:
The service currently supports email, Short Message Service (SMS), PagerDuty, VictorOps®, and webhook notifications. For more information, see the Notification types section in the Rackspace Monitoring API Reference.
As the complexity of your business increases with the number of products, customers, and websites, the possibility that one or more of your resources will fail also increases. Learning about a problem from your customers means that you’ve already lost business, and your customers are already having a negative experience using your website or application. Rackspace Monitoring prevents these types of problems from occurring.
Yes, you can use both US and UK accounts. This is a global system that works with both identities. Use the identity server where your tenant lives and pass that token and tenant ID to the Rackspace Monitoring system.
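As a minimal sketch of passing the token and tenant ID to the monitoring service, the following Python builds a request without sending it. The base URL and the `X-Auth-Token` header name are assumptions here and should be confirmed against the Rackspace Monitoring API Reference. The tenant ID, token, and the `build_monitoring_request` helper are placeholders invented for illustration.

```python
from urllib.request import Request

# Assumed base URL; confirm it against the Rackspace Monitoring API Reference.
MONITORING_BASE_URL = "https://monitoring.api.rackspacecloud.com/v1.0"


def build_monitoring_request(tenant_id: str, token: str, resource: str) -> Request:
    """Build (but do not send) a GET request for a monitoring resource.

    The token comes from whichever identity server (US or UK) holds the tenant.
    """
    if not tenant_id or not token:
        raise ValueError("tenant_id and token are both required")
    url = f"{MONITORING_BASE_URL}/{tenant_id}/{resource.lstrip('/')}"
    # Assumed header name for the identity token.
    return Request(url, headers={"X-Auth-Token": token, "Accept": "application/json"})


# Placeholder credentials for illustration only.
request = build_monitoring_request("123456", "example-token", "entities")
print(request.full_url)
```

The request is only constructed, so the sketch runs without network access. Sending it would require a valid token from the identity service.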
Because we provide monitoring as a service (MaaS) hosted in the cloud, we are able to keep that service up and running without any downtime. Even if we take a region offline for an upgrade, or lose a data center because of a localized disaster or event, Rackspace Monitoring continues to monitor your resources and send you notifications.
File a ticket or contact your account team.
At this time, Rackspace Monitoring does not support Simple Network Management Protocol (SNMP) traps.
Rackspace Monitoring bills by hourly usage based on how many checks were running in that hour, and how many monitoring zones were involved. Adjusting your usage is quick and easy, and this flexibility can help reduce unnecessary costs.
A notification plan defines the actions that are performed when a certain status is returned by the check. You may have multiple notification plans in your cloud account.
Each monitoring check can reference one notification plan. When an alarm for that check triggers the critical state, the notification plan associated with the check is used.
If you do not set up a custom notification plan, then email is sent to all of the technical contacts on your account. If your account lists no technical contacts, then the primary contact is emailed. You can view the list of contacts for your account on the User Management page in the Cloud Control Panel.
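The fallback order described above can be sketched in a few lines of Python. This is an illustration of the logic only, not the service's implementation. The `resolve_recipients` function and its parameters are hypothetical, and a custom plan is modeled simply as a list of notification targets.

```python
from typing import List, Optional


def resolve_recipients(
    custom_plan: Optional[List[str]],
    technical_contacts: List[str],
    primary_contact: str,
) -> List[str]:
    """Return who is notified, following the fallback order described above."""
    if custom_plan:
        # A custom notification plan takes precedence.
        return list(custom_plan)
    if technical_contacts:
        # No custom plan: email every technical contact on the account.
        return list(technical_contacts)
    # No technical contacts either: fall back to the primary contact.
    return [primary_contact]


print(resolve_recipients(None, [], "owner@example.com"))
```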
Rackspace Managed Notifications creates a support ticket within your account. This feature is available only to customers with a Managed Operations service level.
Information about Rackspace Monitoring notification plans is in the Rackspace Monitoring API Reference.
To create a custom service plan that covers your monitoring needs and fits your budget, contact our sales department.
Yes, but you need a Cloud account. You configure your own notifications, so alerts might go only to you (the user). Rackers can’t respond to your alarms unless they are included in the notifications.
Rackspace Monitoring is a global product supported in both the US and the UK. Our UK data center can process alerts on its own if the link between the US and UK goes offline, and other data centers act as safety nets in case of localized data center failure. No matter what, your monitoring service remains functional.
A monitoring zone is the launch point of a check. You can launch checks from multiple monitoring zones.
An alarm is a set of rules that determine what status is returned based on the result of the check.
At this time, Rackspace doesn’t use synthetic transactions (a simulated set of actions). However, we do support checking the HTML of the response. We follow redirects but don’t check content within a frame or iframe.
A notification defines how the customer wants to be contacted in the case of a system failure.
The Rackspace Monitoring system is extremely secure. Before the product was released, an independent firm assessed the security of the Rackspace Monitoring systems and API, and all reported issues have been addressed.
Remote service checks monitor the availability of your website from different points on the Internet.
The following list briefly describes the available remote service checks:
HTTP Check (Website): This check monitors the availability of your website either by URL or by IP address and alerts you if the site becomes unavailable for more than 30 seconds.
TCP Check (Port): This check monitors the response from a specific port on your server to determine if the process that is bound to that port is running.
Ping Check (Server): Ping is a network utility that checks the availability of a computer (node) on a network. If the node responds, the Ping utility also measures how long it takes for a small packet of information to make a round trip from your computer to that remote system. This check monitors the general responsiveness of your server on the network and alerts you if it fails to respond.
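To make the idea behind the TCP check concrete, here is a rough local equivalent in Python: it attempts a connection to a port and reports whether anything accepted it. This is a sketch of the concept only. It is not how the Rackspace collectors are implemented, and the `tcp_port_is_open` function name and 5-second default timeout are assumptions.

```python
import socket


def tcp_port_is_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # A successful connect means some process is bound to and accepting on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `tcp_port_is_open("203.0.113.10", 443)` would test an HTTPS listener on a placeholder address. A remote service check performs a similar test from outside your network, which also exercises the network path to your server.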
You need to install a monitoring agent on your cloud server to use agent checks. If you have a cloud account with a managed service level agreement, the build process installs the monitoring agent for you. If you have an infrastructure account, you need to install the agent manually.
After the agent is installed, you can see current and historical performance information about a cloud server from its Details screen. Agent checks enable you to set specific thresholds that trigger notifications.
The following list briefly describes the available Agent Checks:
Memory Check: Your server has a finite amount of memory. Running low on memory negatively affects the performance of your entire server and might cause it to be unresponsive. This check alerts you when your cloud server’s memory utilization surpasses 80%, but you can change that value to meet your needs.
CPU Check: By default, this check returns a warning when 90% of the CPU is used and a critical warning when 95% of the CPU is used. You can configure these thresholds to suit your needs.
Load Average (Linux® Only): Unique to UNIX® systems, a server’s load average represents the average amount of system work (CPU, disk, memory, and so on) that a computer has performed over a period of time. This alarm triggers when your server becomes heavily loaded. By default, it returns a warning when load average exceeds 1 times the number of vCPUs and a critical warning when it exceeds 1.5 times the number of vCPUs.
Filesystem: Your server needs a certain amount of free disk space to operate. This check monitors your server’s disk utilization and alerts you when used space reaches a set threshold on the default mount point. By default, it returns a warning when the server reaches 80% of capacity and a critical warning when the server reaches 90% of capacity.
Network: Even if your server is operating properly, it does little good if it cannot communicate over the network. This check monitors the rate at which your server is sending and receiving data. It sends a warning or alert if either rate drops below a value that you configure.
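The default thresholds listed above for the CPU, load average, and filesystem checks can be expressed as a small Python sketch. The function names are hypothetical, and the comparison is assumed to be "strictly greater than"; the service's own alarm criteria may treat the boundary values differently.

```python
OK, WARNING, CRITICAL = "OK", "WARNING", "CRITICAL"


def classify(value: float, warning: float, critical: float) -> str:
    """Map a metric to a state; assumes a value must exceed a threshold to trip it."""
    if value > critical:
        return CRITICAL
    if value > warning:
        return WARNING
    return OK


def cpu_state(percent_used: float) -> str:
    # Defaults from the list above: warning at 90%, critical at 95%.
    return classify(percent_used, 90.0, 95.0)


def load_state(load_average: float, vcpus: int) -> str:
    # Defaults scale with vCPU count: warning at 1x, critical at 1.5x.
    return classify(load_average, 1.0 * vcpus, 1.5 * vcpus)


def filesystem_state(percent_used: float) -> str:
    # Defaults from the list above: warning at 80%, critical at 90%.
    return classify(percent_used, 80.0, 90.0)


print(cpu_state(92.0))        # WARNING
print(load_state(3.5, 2))     # CRITICAL, because 3.5 > 1.5 * 2
print(filesystem_state(75.0)) # OK
```

Because the load average thresholds scale with the vCPU count, a load of 3.5 is critical on a 2-vCPU server but would still be within the warning threshold on a 4-vCPU server.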
An entity is the resource (for example, website or server) that you want to monitor.
The service provides an on-demand simulation feature that you can use to test the functionality of the monitoring system by simulating a normal operating situation.
Scalability has been a priority from the beginning. Even if a customer adds thousands of cloud servers in minutes, Rackspace Monitoring can instantly begin monitoring all of them.
Rackspace Monitoring is intended to replace these types of tools. Although Rackspace Monitoring doesn’t offer all the features of Nagios®, it is hosted as a service, API driven, and built for the cloud. Rackspace Monitoring also provides geographically redundant checks, which is generally difficult to get with any solution. Customers can leverage our large data center footprint, incredible scalability, and our continuous release feature. Future improvements to the service are released as they become functional, so there is no need to wait for an upgrade package, and no need for downtime.
Not at this time.
Email notifications are sent through Mailgun, which helps ensure that email is delivered properly and not placed into spam folders.
To build a state-of-the-art monitoring platform, data collection should be separate from the thresholds. On the CLI and API level, this is a more complex user experience, but it provides the most flexibility. The UI simplifies the process for those users who don’t want to work with a CLI.
Yes. Check out the Raxmon project. It uses the Apache® Libcloud framework for building a reliable API implementation that functions well.
To avoid repeating the Raxmon installation on each new cloud server, install it on your workstation and not on your server.
Note: Raxmon requires Python 2.5, 2.6, or 2.7.
Yes. Visit our Rackspace Monitoring Getting Started Guide, which walks you through the steps of creating your Rackspace Monitoring setup from scratch.
Monitoring from multiple monitoring zones allows you to monitor the experience of customers from many locations, which is important for companies that do business in more than one region. Consider that your website might be working fine in the western United States, but users from the eastern half of the country are experiencing high load times and unresponsive web pages. Monitoring from just a western data center would report an OK status, but if you also monitor from an eastern data center, you are alerted to this problem before it affects your customers.
If you’re only monitoring from a single zone, and for some reason there is an error with a check, you might get false notifications that your site has a problem, when in reality it is working fine. Using multiple monitoring zones helps prevent false alarms by verifying a system failure from multiple sources before alerting you. Or, you can have it send you a message if even just one check returns a failure status.
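The trade-off between alerting on any single failure and waiting for several zones to agree can be sketched as follows. The policy names `ONE`, `QUORUM`, and `ALL`, the `should_alert` function, and the zone names are illustrative assumptions in this sketch; check the API reference for the options the service actually exposes.

```python
from typing import Dict


def should_alert(zone_failed: Dict[str, bool], policy: str = "QUORUM") -> bool:
    """Decide whether to alert, given a per-zone failure flag for one check."""
    total = len(zone_failed)
    if total == 0:
        return False
    failures = sum(zone_failed.values())
    if policy == "ONE":
        # Alert if even a single zone reports a failure.
        return failures >= 1
    if policy == "QUORUM":
        # Alert only when a strict majority of zones agree.
        return failures > total / 2
    if policy == "ALL":
        # Alert only when every zone reports a failure.
        return failures == total
    raise ValueError(f"unknown policy: {policy}")


# Hypothetical zone names; one of three zones sees a failure.
results = {"zone-a": True, "zone-b": False, "zone-c": False}
print(should_alert(results, "ONE"))     # True
print(should_alert(results, "QUORUM"))  # False
```

With a majority rule, the single failing zone in this example is treated as a likely false alarm, while the `ONE` policy reports it immediately.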
Through the API by using a check or through the Raxmon CLI. For more information, see the Checks section of the API reference.
As with any commands you submit to your cloud resources through the API, you must first authenticate through the API for the commands to be correctly processed.
The Rackspace Monitoring Developer Guide provides the detailed configuration options available with this service offering and the necessary components to build functioning monitoring checks.
Listing monitoring zones gives you the CIDRs of the set of collectors in that zone.
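One possible use for these CIDRs is recognizing traffic that originates from the collectors, for example when reading access logs or building a firewall allowlist. The sketch below uses Python's standard `ipaddress` module. The CIDR values are reserved documentation ranges, not real collector addresses, and `ip_in_zone` is a hypothetical helper.

```python
import ipaddress
from typing import Iterable


def ip_in_zone(ip: str, zone_cidrs: Iterable[str]) -> bool:
    """Return True if ip falls inside any of the zone's collector CIDRs."""
    address = ipaddress.ip_address(ip)
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so mixed lists are safe to scan.
    return any(address in ipaddress.ip_network(cidr) for cidr in zone_cidrs)


# Placeholder CIDRs (documentation ranges); use the values returned by the API.
example_zone_cidrs = ["203.0.113.0/24", "2001:db8::/32"]
print(ip_in_zone("203.0.113.42", example_zone_cidrs))  # True
print(ip_in_zone("198.51.100.7", example_zone_cidrs))  # False
```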
©2020 Rackspace US, Inc.
Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License