Create alerts for long-running backups on Azure

by Rackspace Technology Staff

Introduction

This post explores how to monitor Microsoft® Azure® server backups and, if the backups run long, use Log Analytics to trigger an alert on the servers.

Teaser: The post unveils new features of Azure **Backup Report** and Azure Monitor and offers some fancy Kusto Query Language (KQL) operations.

Why should you set an alert for long-running server backups?

DevOps and Azure admin engineers have many virtual machines to manage, routine tasks to perform, and troubleshooting efforts to undertake. You can't manually check every function on every VM, so you might miss unexpected behavior such as a long-running Azure backup, which can be serious. Automated alerts call problems to your attention.

To make this topic easy to understand, I am dividing the process into the following three parts:

  1. Enable Backup Report.
  2. Write a KQL query to check long-running server backups.
  3. Configure an alert for long-running server backups.

Enable Backup Report

Note: This post assumes you already use Recovery Services vaults to back up your Azure infrastructure.

Use the following steps to enable an Azure Backup report:
 

1. Log in to the portal 

Log in to the Azure portal, click the All Services blade, and search for Recovery Services vaults.

Azure pic 1

2. Select a vault

Select the Recovery Services vault on which you want to enable Backup Report and monitor the backup duration.

azure pic 2

3. Choose Backup Report

In the vault blade, click Backup Reports and then click Diagnostics Settings.

Note: The preview Backup Reports feature might not be available in some regions.
 

azure pic 3

4. Configure Backup Report

To configure the Backup Report, click Add diagnostic setting.

 

azure pic 4

5. Enable log streaming 

You can stream backup logs to Azure Event Hubs, a storage account, or Log Analytics. In this example, I chose Log Analytics.

To enable backup log streaming, fill in the name of the report, check Send to Log Analytics, check AzureBackupReport in the Log section, and click Save.

azure pic 5

The portal displays your report, as shown in the following image:

azure pic 6

KQL query to monitor server backups

This section introduces the following KQL query for long-running Azure server backups:

  AzureDiagnostics
    | where TimeGenerated > ago(1d)
    | where Category == "AzureBackupReport"
    | where OperationName == "Job"
    | where todouble(DataTransferredInMB_s) > 1
    | extend Report_Running_Time_UTC = TimeGenerated
    | extend Backup_Job_Start_Time = JobStartDateTime_s
    // To keep only rows whose start time is in AM/PM format (easier to read),
    // uncomment the following line by removing the leading //
    //| where (Backup_Job_Start_Time contains "AM" or Backup_Job_Start_Time contains "PM")
    | extend DataTransferedGB = todouble(DataTransferredInMB_s) / 1024
    | extend JobDurationHour = todouble(JobDurationInSecs_s) / 3600
    | where JobDurationHour > 3
    | extend Vault_Name = split(ResourceId, '/')[-1]
    | extend Server_Name = split(BackupItemUniqueId_s, ';')[-1]
    | project Report_Running_Time_UTC, Backup_Job_Start_Time, SubscriptionId, JobOperation_s, JobStatus_s, DataTransferedGB, JobDurationHour, ResourceGroup, Server_Name, Vault_Name, Level

In case you're not familiar with KQL, let me explain the important elements of this query.

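The query examines the last day of Azure backup data (`where TimeGenerated > ago(1d)`), skips jobs that transferred 1 MB of data or less (`where todouble(DataTransferredInMB_s) > 1`), and returns the jobs that took longer than three hours to complete (`where JobDurationHour > 3`). It derives the vault name and the server name by taking the last element of `ResourceId` (split on `/`) and of `BackupItemUniqueId_s` (split on `;`).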

You can change the number of hours, which is three in this case, to whatever number you consider to be too long.
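
If you want to sanity-check the filter logic outside the portal, the following is a minimal Python sketch of the same conditions and derived columns. It is not part of the Azure tooling: the function name `long_running_jobs`, its parameters, and every value in `sample` are invented for illustration, and only the column names come from the query.

```python
# Offline sketch of the query's numeric filters and derived columns.
# All sample values are made up; real AzureDiagnostics rows have many more fields.

def long_running_jobs(records, max_hours=3.0, min_mb=1.0):
    """Return one summary dict per backup job that ran longer than max_hours."""
    results = []
    for rec in records:
        transferred_mb = float(rec["DataTransferredInMB_s"])
        duration_hours = float(rec["JobDurationInSecs_s"]) / 3600
        # Mirrors `where todouble(DataTransferredInMB_s) > 1`
        # and `where JobDurationHour > 3`.
        if transferred_mb <= min_mb or duration_hours <= max_hours:
            continue
        results.append({
            # Last element after splitting, like split(...)[-1] in the query.
            "Server_Name": rec["BackupItemUniqueId_s"].split(";")[-1],
            "Vault_Name": rec["ResourceId"].split("/")[-1],
            "DataTransferedGB": transferred_mb / 1024,
            "JobDurationHour": duration_hours,
        })
    return results


sample = [
    {   # 4 hours, 2048 MB: reported
        "DataTransferredInMB_s": "2048",
        "JobDurationInSecs_s": "14400",
        "BackupItemUniqueId_s": "example;container;rg01;server01",
        "ResourceId": "/SUBSCRIPTIONS/EXAMPLE/RESOURCEGROUPS/RG01/VAULTS/VAULT01",
    },
    {   # 30 minutes: too short to report
        "DataTransferredInMB_s": "500",
        "JobDurationInSecs_s": "1800",
        "BackupItemUniqueId_s": "example;container;rg01;server02",
        "ResourceId": "/SUBSCRIPTIONS/EXAMPLE/RESOURCEGROUPS/RG01/VAULTS/VAULT01",
    },
    {   # 5 hours but under 1 MB transferred: filtered out
        "DataTransferredInMB_s": "0.5",
        "JobDurationInSecs_s": "18000",
        "BackupItemUniqueId_s": "example;container;rg01;server03",
        "ResourceId": "/SUBSCRIPTIONS/EXAMPLE/RESOURCEGROUPS/RG01/VAULTS/VAULT01",
    },
]

for job in long_running_jobs(sample):
    print(job)
```

Passing a different `max_hours` value plays the same role as changing the `3` in `where JobDurationHour > 3`. Unlike `todouble()` in KQL, `float()` raises an error on empty values, so this sketch assumes every record has both numeric fields filled in.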

Configure an alert

Glad you made it this far...read on for the best part!

Use the following steps to set an alert for long-running backups:

1. Open Log Search

Open the Log Search section of the Log Analytics workspace that you selected during the report configuration.

2. Paste the KQL query

Copy the preceding KQL query, paste it into the Query tab, and click Run.

Azure Pic 7

The results list the servers whose backups took longer than three hours to complete, as shown in the following output:

azure pic 8

3. Create a new alert 

To set an alert for these servers, click New Alert Rule.

azure pic 9

4. Configure the new alert

Under Condition, enter the threshold value for the number of results. I chose 1 because I want an alert for every server that shows up in the results.

Note: You can also change the Period and Frequency of the alert, as needed.
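
To get a feel for how Period and Frequency interact, here is a rough Python sketch. It assumes that the rule runs once every Frequency minutes and that each run looks back over the last Period minutes of data; the function names, dates, and numbers are hypothetical and are not taken from the Azure portal or any Azure API.

```python
from datetime import datetime, timedelta


def evaluation_windows(first_run, frequency_min, period_min, runs):
    """Return the (start, end) look-back window for each scheduled run."""
    windows = []
    for i in range(runs):
        end = first_run + timedelta(minutes=i * frequency_min)
        windows.append((end - timedelta(minutes=period_min), end))
    return windows


def runs_that_see(event_time, windows):
    """Count how many runs include a log record written at event_time."""
    return sum(1 for start, end in windows if start < event_time <= end)


first_run = datetime(2020, 1, 2, 0, 0)
job_logged_at = datetime(2020, 1, 2, 0, 30)

# Hourly runs that each look back 24 hours: the windows overlap heavily.
hourly = evaluation_windows(first_run, frequency_min=60, period_min=1440, runs=48)
print(runs_that_see(job_logged_at, hourly))

# Daily runs that each look back 24 hours: the windows do not overlap.
daily = evaluation_windows(first_run, frequency_min=1440, period_min=1440, runs=2)
print(runs_that_see(job_logged_at, daily))
```

Under these assumptions, a Period much longer than the Frequency means the same long-running job can match on many consecutive runs, while matching the two values reports it once. The query's own `ago(1d)` filter is a separate setting and is not modelled here.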

azure pic 10

Configure other alert requirements, such as alert name, action group (email, recipient), webhooks, and other conditions. 

Conclusion

That’s all there is to it. This post showed you how to configure an alert on Azure VM backups that take longer than three hours. The alert sends a notification email with all the details. With that information, you can take the appropriate action.

 
