Rackspace VM Management for Private Cloud


Introduction

VM Management is an add-on service that uses a set of automation processes and tools to provide traditional hosting-style services, namely (1) OS administration, (2) monitoring, (3) patching, and (4) antivirus services, for the operating systems of customer-selected guest VMs that have been created in or migrated to the private cloud. The purpose of this handbook is to provide our customers with the information they need to extract value from the platform and achieve the desired outcomes.


Getting Started

Pricing

Pricing for each service can be found at the following location. [Pricing]

Prerequisites

Before proceeding, verify that you have the necessary resources to complete integration between your cloud and Rackspace.

Review the following list of prerequisites:

  • You have administrator access to your Rackspace Customer Portal.
  • You have any necessary permissions at your company.
  • You have administrator access at your cloud provider.
  • Your VM operating system is on the compatibility list. Windows | Linux
  • The following software is required on each server where the SSM Agent will be installed:
    o Linux - curl and/or wget, python3 or python
    o Windows - PowerShell v3 or greater
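The software prerequisites above can be checked before an agent install is attempted. The following is a minimal sketch, not a Rackspace-provided tool: the function name `missing_prerequisites` is hypothetical, and the Windows branch only confirms that a `powershell` executable is on the PATH without verifying that it is v3 or greater.

```python
import platform
import shutil


def missing_prerequisites(system, which=shutil.which):
    """Return a list of prerequisite gaps for the given OS name.

    `system` is a value such as platform.system() returns ("Linux", "Windows").
    `which` is injectable so the check can be exercised without touching PATH.
    """
    missing = []
    if system == "Linux":
        # The handbook asks for curl and/or wget: at least one must be present.
        if not (which("curl") or which("wget")):
            missing.append("curl or wget")
        # Either python3 or python satisfies the interpreter requirement.
        if not (which("python3") or which("python")):
            missing.append("python3 or python")
    elif system == "Windows":
        # Assumption: only presence is checked here, not the PowerShell version.
        if not which("powershell"):
            missing.append("PowerShell")
    return missing


if __name__ == "__main__":
    gaps = missing_prerequisites(platform.system())
    print("All prerequisites found" if not gaps else "Missing: " + ", ".join(gaps))
```

An empty list means the listed software was found; anything else names what still needs to be installed on that server.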

Network Connectivity Requirements (Egress):

Endpoint | Description
https://ssm.REGION.amazonaws.com | Access to the core Systems Manager API endpoints
https://ssmmessages.REGION.amazonaws.com | Access to API operations for AWS Session Manager
https://ec2messages.REGION.amazonaws.com | Access to API operations for Amazon Message Delivery Service
https://s3.amazonaws.com | Access to S3 for installation binaries
https://s3.REGION.amazonaws.com | Access to S3 APIs used to log Systems Manager operations
https://s3-REGION.amazonaws.com | Access to S3 APIs used to log Systems Manager operations
https://*.s3.REGION.amazonaws.com | Access to S3 APIs used to log Systems Manager operations
https://amazon-ssm-REGION.s3.amazonaws.com | Hosts the Systems Manager Agent installer
https://add-ons.api.manage.rackspace.com | Manages the deployment of agents to supported devices and triggers enrollment into various services (e.g., Passport)
https://add-ons.manage.rackspace.com | Hosts automation scripts that are used during the device enrollment process
https://logs.REGION.amazonaws.com | Stores SSM agent logs for commands run on a server
https://kms.REGION.amazonaws.com | Enables KMS encryption for AWS Session Manager

To determine the AWS region to substitute for REGION in the endpoints above, use the following table:

Rackspace DC/VCD* | AWS Region
ORD | us-east-2
IAD | us-east-1
DFW | us-west-2
SYD | ap-southeast-2
HKG | ap-southeast-1
LON | eu-west-2
STO | eu-north-1
us1.rsvc.rackspace.com | us-east-1
us2.rsvc.rackspace.com | us-east-1
gb1.rsvc.rackspace.com | eu-west-2
gb2.rsvc.rackspace.com | eu-west-2
de1.rsvc.rackspace.com | eu-central-1

* Virtual Datacenter for SDDC Flex
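To turn the two tables above into a concrete allow-list for a firewall change, the REGION placeholder can be substituted programmatically. The sketch below is illustrative only: the names `egress_endpoints` and `can_connect` are hypothetical, the wildcard entry (https://*.s3.REGION.amazonaws.com) is left out because it is a pattern rather than a single host, and the connectivity probe is a plain TCP check on port 443 that will not detect proxy or TLS-inspection problems.

```python
import socket
from urllib.parse import urlsplit

# Location-to-region mapping copied from the table above.
AWS_REGION_BY_LOCATION = {
    "ORD": "us-east-2",
    "IAD": "us-east-1",
    "DFW": "us-west-2",
    "SYD": "ap-southeast-2",
    "HKG": "ap-southeast-1",
    "LON": "eu-west-2",
    "STO": "eu-north-1",
    "us1.rsvc.rackspace.com": "us-east-1",
    "us2.rsvc.rackspace.com": "us-east-1",
    "gb1.rsvc.rackspace.com": "eu-west-2",
    "gb2.rsvc.rackspace.com": "eu-west-2",
    "de1.rsvc.rackspace.com": "eu-central-1",
}

# Endpoints that contain the REGION placeholder (wildcard entry omitted).
REGIONAL_TEMPLATES = [
    "https://ssm.REGION.amazonaws.com",
    "https://ssmmessages.REGION.amazonaws.com",
    "https://ec2messages.REGION.amazonaws.com",
    "https://s3.REGION.amazonaws.com",
    "https://s3-REGION.amazonaws.com",
    "https://amazon-ssm-REGION.s3.amazonaws.com",
    "https://logs.REGION.amazonaws.com",
    "https://kms.REGION.amazonaws.com",
]

# Endpoints that are the same for every location.
FIXED_ENDPOINTS = [
    "https://s3.amazonaws.com",
    "https://add-ons.api.manage.rackspace.com",
    "https://add-ons.manage.rackspace.com",
]


def egress_endpoints(location):
    """Return the concrete endpoint URLs for a Rackspace DC or VCD name."""
    region = AWS_REGION_BY_LOCATION[location]
    return [t.replace("REGION", region) for t in REGIONAL_TEMPLATES] + FIXED_ENDPOINTS


def can_connect(url, timeout=5.0):
    """Return True if a TCP connection to the URL's host on port 443 succeeds."""
    host = urlsplit(url).hostname
    try:
        with socket.create_connection((host, 443), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for endpoint in egress_endpoints("LON"):
        print(endpoint, "reachable" if can_connect(endpoint) else "BLOCKED")
```

Running the script from a VM in the target environment gives a quick first indication of which egress rules are still missing.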


Service Enrollment

SDDC Portfolio

The SDDC portfolio of products comes ready for device enrollment. This can be achieved by logging into the customer portal and following the add-on enrollment section in this guide.

Public Cloud

If you are a public cloud customer looking to use VM Management on devices within that environment, please contact your sales executive for more information.

VMC on AWS

If you are a VMC on AWS customer looking to use VM Management on devices within that environment, please contact your sales executive for more information.


Agent Installation

Rackspace Provided Images

Rackspace-provided images come with the agent preinstalled. If you still see the ‘Agent Installation’ warning within the Resource UI then run the scripts found in the next section, titled ‘Customer Provided Images on VMware based platforms.’

Customer Provided Images on VMware based platforms

To use VMM on a VM built from a non-Rackspace image, or on any VM where the agent is not installed, install the Rackspace Agent on that VM.

The following install scripts should be run on each VM that has been created with the custom image that was imported.

Script (Linux): https://add-ons.manage.rackspace.com/bootstrap/vmware/ssm_install.sh

Script (Windows): https://add-ons.manage.rackspace.com/bootstrap/vmware/ssm_install.ps1
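When many VMs need the agent, choosing and fetching the right script can be automated. The sketch below is an assumption-laden illustration rather than an official procedure: the function names are hypothetical, it only downloads the script so that it can be reviewed, and it deliberately does not run it because this handbook does not document any script arguments.

```python
import platform
import urllib.request

# URLs copied from the two script links above.
INSTALL_SCRIPTS = {
    "Linux": "https://add-ons.manage.rackspace.com/bootstrap/vmware/ssm_install.sh",
    "Windows": "https://add-ons.manage.rackspace.com/bootstrap/vmware/ssm_install.ps1",
}


def install_script_url(system):
    """Return the install script URL for an OS name as reported by platform.system()."""
    try:
        return INSTALL_SCRIPTS[system]
    except KeyError:
        raise ValueError(f"No install script listed for {system!r}") from None


def download_install_script(system, destination):
    """Download the script to `destination` for review; running it is left to the operator."""
    url = install_script_url(system)
    with urllib.request.urlopen(url, timeout=30) as response, open(destination, "wb") as out:
        out.write(response.read())
    return destination


if __name__ == "__main__":
    print(install_script_url(platform.system()))
```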


Add-On Enrollment

  • From the My Accounts page, click the “My Accounts” drop-down on the left side of the page.

  • Select “My Resources” to open the Resource UI.

  • The Resource UI will show all the virtual machines currently located in your environment.

📘

Note: If the VM does not have the option to “Enroll” in a feature, the most likely cause is a missing agent, which is indicated by the ‘Management Agent’ column showing ‘Agent Installation’ as seen below. The instructions to resolve this issue can be found in the Agent Installation section of this handbook.

  • To enroll a VM in service, click the enroll button in the corresponding VMM Add-On column.

  • This will bring up the enrollment confirmation screen.

From there, click the Enroll Server button.

  • This process is the same for OS Admin, Monitoring, and Anti-Virus.
  • For patching, click “Enroll Server” and then select a patching group that corresponds with your required patch window and OS.

📘

Note: The field titled: ‘next run’ is given in local server time.


Patching UI

If you are enrolled in patching, you also have access to the ‘Patching UI’. To open it, go to the top left of the customer portal and select ‘Patching’ as seen below:

Within this portal you will have several different views to choose from, allowing you to take different actions and export reporting data from that section. The sections are:

  • Patch Groups
  • Instances
  • Maintenance Windows

Within the ‘Patch Groups’ you’re able to see the way your instances are grouped and then assigned to various maintenance windows. Here you will also see instance status, pending updates, baselines etc. all at the group level.

Within the ‘Instances’ section you can select individual instances to drill down into updates at that level. The information here ranges from missing KBs to updates about the last run. This view can be seen below:

In the ‘Maintenance Windows’ section you’re able to see the existing maintenance groups, their schedules, next run and more. That view is previewed below:

The portal experience is primarily self-service, so many options can be toggled in all of the above views. For any activity that you do not want to perform through self-service, you can open a ticket and Rackspace engineers will assist.


Unenrollment

  • To unenroll, open the resource UI.

  • Click on the icon next to the VM you want to unenroll.

  • Click “Unenroll from” followed by the name of the service you would like to remove, for example “Unenroll from OS Admin”.


Features and Functionality

This section provides information about each of the following areas of the VM Management experience.

  • Managed OS Administration

  • Managed OS Patching

  • Managed OS Monitoring

  • Antivirus Licensing


Managed OS Administration

When a VM is enrolled in OS Administration, Rackspace Technology creates a configuration management database (CMDB) record of the VM and securely stores Customer-provided OS login credentials so that our OS system administrators can log in to the OS and perform the desired OS Services upon request.

This service enables a customer to initiate a request that would trigger a Rackspace administrator to log in to the guest OS of a virtual machine in the private cloud. Rackspace support engineers will utilize secure, time-limited, and audited access to the environment to provide troubleshooting services for supported systems.

Spheres of Support

OS Administration spheres of support can be found here:


Managed OS Patching

Rackspace Technology provides a managed OS patching service for supported operating systems. The Customer sets the patching schedule, and Rackspace Technology configures the guest OS to use Rackspace Technology-provided patching sources so that only approved patches are delivered and installed on customer machines.


Managed OS Monitoring

CloudWatch Monitoring

Rackspace Technology installs, configures, and responds to monitoring alerts from an installed OS agent for OS and application alerts and conditions on VMs. It enables monitoring of guest OS service availability on a network, internal OS system resources, OS services operational status, and error conditions.

The default monitoring thresholds are as follows:

Windows

Windows metrics are located in the System/Windows CW metric namespace

Linux

Linux metrics are located in the System/Linux CW metric namespace

Notes:

  • CPU, Memory, and Disk Percent alarms are configured to trigger when the given metric exceeds the threshold for 6 consecutive 5-minute averages. They are configured to clear when any subsequent 5-minute average of the metric is below the threshold.

  • The Disk Free Space alarm is configured to trigger when the given metric falls below the threshold for 5 consecutive 1-minute averages. It is configured to clear when any subsequent 1-minute average of the metric is above the threshold.
  • Disk alarms are NOT created for the following volume file system types:
    • devtmpfs
    • tmpfs
    • devfs
    • rootfs
    • squashfs
    • overlay
  • Disk alarms are NOT created for Kubernetes container volumes: any volume path starting with “/var/lib/kubelet/”.
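The trigger, clear, and exclusion rules in the notes above can be expressed as a small state machine. This is a simplified sketch of the described behavior, not the CloudWatch implementation: the function names are hypothetical, and the threshold values in the usage example are made up because the default thresholds are not listed in this text.

```python
EXCLUDED_FS_TYPES = {"devtmpfs", "tmpfs", "devfs", "rootfs", "squashfs", "overlay"}
KUBELET_PREFIX = "/var/lib/kubelet/"


def disk_alarm_applies(fs_type, mount_path):
    """Return True if a disk alarm would be created for this volume."""
    return fs_type not in EXCLUDED_FS_TYPES and not mount_path.startswith(KUBELET_PREFIX)


def alarm_states(averages, threshold, periods, breach_when_above=True):
    """Return the alarm state ("OK" or "ALARM") after each period average.

    The alarm triggers after `periods` consecutive breaching averages and
    clears on the first later average on the other side of the threshold.
    """
    states = []
    in_alarm = False
    streak = 0
    for value in averages:
        breaching = value > threshold if breach_when_above else value < threshold
        clearing = value < threshold if breach_when_above else value > threshold
        streak = streak + 1 if breaching else 0
        if in_alarm:
            if clearing:
                in_alarm = False
        elif streak >= periods:
            in_alarm = True
        states.append("ALARM" if in_alarm else "OK")
    return states


if __name__ == "__main__":
    # Hypothetical CPU threshold of 90%, six 5-minute averages required to trigger.
    cpu = [95] * 5 + [80] + [95] * 6 + [85]
    print(alarm_states(cpu, threshold=90, periods=6))
    # Hypothetical free-space threshold, five 1-minute averages required to trigger.
    free_mb = [500] * 5 + [1500]
    print(alarm_states(free_mb, threshold=1000, periods=5, breach_when_above=False))
```

In the CPU example, the dip to 80 after five breaching averages resets the count, so the alarm only triggers once six breaching averages occur in a row.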

Advanced Monitoring - Observability

This section will provide insights into Advanced Monitoring - Observability and dive deep into the context of Rackspace Services for Datadog. This partnership enables Rackspace Technology to leverage the Datadog platform to provide a suite of observability services to diagnose problems in customers' environments and minimize the impact of business-disrupting events on customers’ operations.

What is Observability

About Datadog

Datadog is a monitoring and security platform for infrastructure and applications. The SaaS platform integrates and automates infrastructure monitoring, application performance monitoring, and log management to provide unified, real-time observability of our customers’ entire technology stack. Datadog is used by organizations of all sizes and across a wide range of industries to enable digital transformation, drive collaboration among development, operations, security, and business teams, accelerate time to market for applications, and reduce Mean Time to Repair.

Datadog Integrations

An integration, at the highest level, is when you assemble a unified system from units that are usually considered separately. At Datadog, you can use integrations to bring together all of the metrics and logs from your infrastructure and gain insight into the unified system as a whole—you can see pieces individually and also how individual pieces are impacting the whole. Datadog supports more than 700 built-in integrations across different systems, applications, and services.

Additional details about Datadog integrations can be found at Integrations (datadoghq.com).

How is Rackspace using Advanced Monitoring - Observability?

Rackspace is a top-tier GOLD Partner with Datadog globally, in the following roles:

  • Resell Partner
  • Managed Services Provider
  • System Integration and Professional Services

Rackspace is integrating Datadog into our overall VMM offering in terms of Observability. The following attributes are enabled by default for any customer enrolled in Advanced Monitoring - Observability.

  • Default host metrics provided by Datadog
  • Live process monitoring
  • Log collection enabled with system event logs configured by default

What is included in Advanced Observability Monitoring?

Rackspace will leverage Advanced Monitoring - Observability to monitor the following components:

  • CDM (CPU, Disk, and Memory) monitoring

CDM Monitoring Thresholds

Type: CPU
Metric: CPU_Active, composite (combo) monitor
Default threshold:
  • Windows CPU utilization: > 95% based on cpu.idle < 5
  • Windows CPU proc queue length: > 10
  • Linux CPU utilization: > 95% based on cpu.idle < 5
  • Linux CPU load average: > 4 based on system.load.norm.1 (load average normalized per core over 1 minute)

Type: Memory
Metric: Memory_USED, composite (combo) monitor
Default threshold (Windows):
  • Memory utilization: > 95%
    AND
  • Anomaly - Page reads/sec: 3 deviations outside of normal range for 15 minutes
    Query: avg(last_4h):anomalies(avg:Win_mem_page_reads_sec{*} by {host}, 'agile', 3, direction='above', interval=60, alert_window='last_15m', timezone='utc', count_default_zero='true', seasonality='hourly') >= 1
Default threshold (Linux):
  • Memory utilization: > 95%
    AND
  • Anomaly - Swap out: 3 deviations outside of normal range for 15 minutes
    Query: avg(last_4h):anomalies(avg:system.swap.swap_out{*} by {host}, 'agile', 3, direction='above', interval=60, alert_window='last_15m', timezone='utc', seasonality='hourly', count_default_zero='true') >= 1

Type: Disk
Metrics: disk_percent, disk_free_space
Default threshold:
  • Windows: Root Disk < 1024 MB
  • Linux: Root Disk < 1024 MB
  • Non-Root Disk: > 99%
    Query: avg(last_5m):avg:system.disk.utilized{!device:/,!device:c:} by {host, device} > 99
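To make the default thresholds above easier to reason about, the conditions can be written as simple predicates. This is only an illustrative sketch with hypothetical function names; it mirrors the documented numbers and does not reproduce how Datadog evaluates composite or anomaly monitors internally. The anomaly result is passed in as a boolean because the anomaly algorithm itself is outside the scope of this sketch.

```python
def cpu_utilization(cpu_idle_pct):
    """CPU utilization is everything that is not idle."""
    return 100.0 - cpu_idle_pct


def cpu_utilization_breached(cpu_idle_pct):
    """Utilization > 95% is checked as cpu.idle < 5."""
    return cpu_idle_pct < 5


def memory_alert(memory_used_pct, anomaly_triggered):
    """Composite condition: utilization > 95% AND the paging or swap anomaly is firing."""
    return memory_used_pct > 95 and anomaly_triggered


def root_disk_alert(free_mb):
    """Root disk alerts when free space drops below 1024 MB."""
    return free_mb < 1024


def non_root_disk_alert(used_pct):
    """Non-root disks alert when utilization exceeds 99%."""
    return used_pct > 99


if __name__ == "__main__":
    print(cpu_utilization(4.0), cpu_utilization_breached(4.0))
    print(memory_alert(97.0, anomaly_triggered=False))
    print(root_disk_alert(800), non_root_disk_alert(99.5))
```

The memory example shows why the composite monitor is quieter than a plain threshold: high utilization alone does not alert unless the paging or swap anomaly is also present.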
Datadog Portal Login

This section describes how to federate from Rackspace Fabric to the Datadog portal and locate a specific host.

  • To access the Datadog portal for the account, go to https://manage.rackspace.com, search for the account name, open My Resources, click the three dots next to a server, and select "Launch Datadog Portal."
  • You will be sent to the Datadog "Dashboards" section. This is where you can create dashboards consisting of information from the default configuration and the additional integrations configured on the servers.
    Find more information on configuring dashboards at Getting Started with Dashboards (datadoghq.com).

Datadog Dashboard Overview

Dashboards provide real-time insights into the performance, historical insights and health of systems and applications within an organization. With dashboards, teams can identify anomalies, prioritize issues, proactively detect problems, diagnose root causes, and ensure that reliability goals are met.

The Datadog portal also gives you an instant view of the processes running on a VM and the resources those processes consume, which allows you to analyze data from across your entire system in a single pane of glass.

To get a list of all devices configured with Datadog monitoring for the account, click Infrastructure after you federate into the Datadog portal.

  • You will see a list of devices and the "Apps" configured for the servers.
  • Clicking on a server will pull up more information on the server, including:
    • The agent configuration
    • The server metrics
    • Running processes
    • Server logs
  • To see more detailed information on a device, click the "Open Host Dashboard" on the top right of the pop out screen.
  • This will take you to a more detailed dashboard for the individual server.
  • Alerting is configured in the "Monitors" section. A default list of monitors will be configured for all hosts within Datadog.
  • A list of enabled Integrations on the account can be found in the "Integrations" section.

Infrastructure

Infrastructure monitoring includes core Datadog features that visualize, monitor, and measure the performance of your hosts, containers, and processes.

Host Map

The host map can be found under the infrastructure menu. It offers the ability to:

  • Quickly visualize your environment
  • Identify outliers
  • Detect usage patterns
  • Optimize resources

Hosts

The following information is displayed in the infrastructure list for your hosts:

Hostname

The preferred hostname alias (use the Options menu to view Cloud Name or Instance ID).

Cloud Name

A hostname alias.

Instance ID

A hostname alias.

Status

Displays ACTIVE when the expected metrics are received and displays INACTIVE if no metrics are received.

CPU

The percent of CPU used (everything but idle).

IOWait

The percent of CPU spent waiting on the IO (not reported for all platforms).

Load 15

The system load over the last 15 minutes.

Apps

The Datadog integrations reporting metrics for the host.

Operating System

The tracked operating system.

Cloud Platform

Cloud platform the host is running on (for example, AWS, Google Cloud, or Azure).

Datadog Agent

Agent version that is collecting data on the host.

Monitors

With Datadog alerting, you have the ability to create monitors that actively check metrics, integration availability, network endpoints, and more. Use monitors to draw attention to the systems that require observation, inspection, and intervention. A metric monitor provides alerts and notifications if a specific metric is above or below a certain threshold. For example, a metric monitor can alert you when disk space is low.

Create a Monitor

To create a monitor, navigate to Monitors > New Monitor > Metric. Then work through the following choices:

  • Detection method: How are you measuring what will be alerted on? Are you concerned about a metric value crossing a threshold, a change in a value crossing a threshold, an anomalous value, or something else?
  • Define the metric: What value are you monitoring to alert? The disk space in your system? The number of errors encountered for logins?
  • Alert conditions: When does an engineer need to be woken up?
  • Notification: What information needs to be in the alert?
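The four choices above map closely onto the fields of a monitor definition. As a hedged illustration, the sketch below builds such a definition as JSON and prepares, but does not send, a request to the Datadog monitor API. The helper names are hypothetical, the host shown is Datadog's US1 site (other sites use a different host), and whether monitors should be created through the API rather than the portal under this service is not stated in this handbook. The query is the non-root disk query from the CDM thresholds.

```python
import json
import urllib.request

# Assumption: US1 site; accounts on other Datadog sites use a different host.
DATADOG_MONITOR_URL = "https://api.datadoghq.com/api/v1/monitor"


def build_monitor(name, query, message, critical):
    """Assemble a monitor definition from the four choices listed above."""
    return {
        "name": name,
        "type": "metric alert",                            # detection method: threshold on a metric
        "query": query,                                    # the metric being evaluated
        "options": {"thresholds": {"critical": critical}}, # alert condition
        "message": message,                                # notification text
    }


def build_request(monitor, api_key, app_key):
    """Prepare (but do not send) the HTTP request that would create the monitor."""
    return urllib.request.Request(
        DATADOG_MONITOR_URL,
        data=json.dumps(monitor).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "DD-API-KEY": api_key,
            "DD-APPLICATION-KEY": app_key,
        },
        method="POST",
    )


if __name__ == "__main__":
    monitor = build_monitor(
        name="Non-root disk nearly full",
        query="avg(last_5m):avg:system.disk.utilized{!device:/,!device:c:} by {host, device} > 99",
        message="Non-root disk usage is above 99% on {{host.name}}.",
        critical=99,
    )
    print(json.dumps(monitor, indent=2))
```

Keeping the definition as data makes it easy to review the alert condition and notification text before anything is created in the account.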
Tagging

Using tags enables you to observe aggregate performance across several hosts and narrow the set further based on specific elements. In summary, tagging is a method to observe aggregate data points.

For further reading on how to assign and use tags, see the Datadog tagging documentation.
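As a rough picture of what aggregating by tag means, the sketch below groups hosts by the value of one tag key and averages a metric within each group. It does not use the Datadog API; the host records, tag values, and the function name `average_by_tag` are all hypothetical and exist only to show the idea.

```python
from collections import defaultdict
from statistics import mean


def average_by_tag(hosts, tag_key, metric):
    """Average `metric` across hosts, grouped by the value of `tag_key`.

    Each host is a dict with a "tags" list of "key:value" strings and
    one numeric entry per metric name.
    """
    groups = defaultdict(list)
    for host in hosts:
        for tag in host["tags"]:
            key, _, value = tag.partition(":")
            if key == tag_key:
                groups[value].append(host[metric])
    return {value: mean(samples) for value, samples in groups.items()}


if __name__ == "__main__":
    # Hypothetical sample data.
    hosts = [
        {"tags": ["env:prod", "role:web"], "cpu": 40.0},
        {"tags": ["env:prod", "role:db"], "cpu": 60.0},
        {"tags": ["env:dev", "role:web"], "cpu": 10.0},
    ]
    print(average_by_tag(hosts, "env", "cpu"))
    print(average_by_tag(hosts, "role", "cpu"))
```

The same set of hosts yields different aggregates depending on the tag key chosen, which is what makes tags useful for narrowing a view.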

Datadog customer API Key

API keys are unique to your organization. An API key is required by the Datadog Agent to submit metrics and events to Datadog.

  • Should you need to locate the API Key for any reason you can do so by hovering over your name on the bottom left and selecting "Organization Settings"
  • Go to the "API Keys" Section, hover over the key and click the Copy Key button on the far right when it appears.


Antivirus Licensing

Rackspace Technology installs an OS antivirus agent on the selected VMs to provide Customers with antivirus services. Rackspace Technology makes no guarantees as to the effectiveness of the antivirus service. This service enables the scanning of guest OS files by a system within the private cloud that maintains up-to-date signatures of known malicious code.

It is connected to a centralized management service maintained by Rackspace that enables visibility into the function of the service and allows tickets to be triggered if any failure with the scanning system occurs or in the event of malicious code being discovered.


Security and Compliance

VM Management utilizes role-based access control (RBAC) to create granular control over permissions. When it comes to Rackspace employees, there are zero standing permissions granted. Rackspace employees are granted temporary access when performing a support task required by the customer. All remote access requests are logged and retained by Rackspace for security purposes.


Service Level Agreements

For the most up-to-date version of the SLAs (service level agreements) please review the terms and conditions page.


Billing and Payments

VM Management is billed per VM, based on the number of hours used, and the charges appear on your monthly bill. The rate varies depending on the add-on, so please ask your seller for the current rates.

The bill for VM Management is available for viewing by customers within the Rackspace portal.


Support and Troubleshooting

RACI Diagram:

For issues using the platform, open a ticket in your Rackspace portal asking for assistance with VM Management. For questions about the shared responsibility model for this product, review the RACI below.

Task | Customer | Rackspace

General
Add-On Enrollment | Responsible | Optional Add-On
Agent Installation on Rackspace provided images | Inform | Responsible
Add-On Unenrollment | Responsible | Optional Add-On
Agent Installation on Customer provided images | Responsible | Optional Add-On
Troubleshooting | Consult | Responsible

Patching
Create Patching Groups | Consult | Responsible
Change Patch Baseline | Consult | Responsible
Change Patch Group | Consult | Responsible
Change Maintenance Window | Consult | Responsible

Monitoring
Configure Monitoring Agent | Inform | Responsible
Configure Thresholds | Inform | Responsible
Respond to Events | Inform | Responsible

Anti-Virus
Agent Installation | Inform | Responsible
Apply Updates | Inform | Responsible
Respond to Incidents | Inform | Responsible

OS Administration
Initiate Request | Responsible | Inform
Troubleshoot | Inform | Responsible

📘

Note: For fields marked ‘Optional Add-On’ in the above RACI, contact your sales team and request more information about our Elastic Engineering or Professional Service offerings.


Terms of Service

VM Management terms and conditions can be found here:

https://www.rackspace.com/information/legal/guestosservices


Privacy Policy

Rackspace privacy policy can be found here:

https://www.rackspace.com/information/legal/privacystatement


Feedback and Suggestions

For all service requests, please place a ticket in the ticketing portal. If you have feedback or suggestions for the design teams, you can email us at:

[email protected]

📘

Important: Use the subject line “VM Management Customer Feedback” and include your account number in the body, along with any relevant detail to support your feedback.