Frequently asked questions about Azure Monitor metric alerts

This article discusses common questions about Azure Monitor metric alerts and how to troubleshoot them.

Azure Monitor alerts proactively notify you when important conditions are found in your monitoring data. They allow you to identify and address issues before the users of your system notice them. For more information on alerting, see Overview of alerts in Microsoft Azure.

Metric alert should have fired but didn't

If you believe a metric alert should have fired but didn't, and you can't find it in the Azure portal, try the following steps:

  1. Configuration: Review the metric alert rule configuration to make sure it's properly configured:

    • Check that Aggregation type and Aggregation granularity (Period) are configured as expected. Aggregation type determines how metric values are aggregated. To learn more, see Azure Monitor Metrics aggregation and display explained. Aggregation granularity (Period) controls how far back the evaluation aggregates the metric values each time the alert rule runs.

    • Check that Threshold value or Sensitivity is configured as expected.

    • For an alert rule that uses Dynamic Thresholds, check if advanced settings are configured. Number of violations might filter alerts, and Ignore data before can affect how the thresholds are calculated.

      Note

      Dynamic thresholds require at least 3 days and 30 metric samples before they become active.

  2. Fired but no notification: Review the fired alerts list to see if you can locate the fired alert. If you can see the alert in the list but have an issue with some of its actions or notifications, see Troubleshooting problems in Azure Monitor alerts.

  3. Already active: Check if there's already a fired alert on the metric time series for which you expected to get an alert. Metric alerts are stateful, which means that once an alert is fired on a specific metric time series, more alerts on that time series won't be fired until the issue is no longer observed. This design choice reduces noise. The alert is resolved automatically when the alert condition isn't met for three consecutive evaluations.

  4. Dimensions used: If you've selected some dimension values for a metric, the alert rule monitors each individual metric time series (as defined by the combination of dimension values) for a threshold breach. To also monitor the aggregate metric time series, without any dimensions selected, configure another alert rule on the metric without selecting dimensions.

  5. Aggregation and time granularity: If you're visualizing the metric by using metrics charts, ensure that:

    • The selected Aggregation in the metric chart is the same as Aggregation type in your alert rule.
    • The selected Time granularity is the same as Aggregation granularity (Period) in your alert rule, and isn't set to Automatic.
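
The stateful behavior described in step 3 can be sketched as a small state machine. This is a minimal illustration, not Azure Monitor's implementation: the `TimeSeriesState` class and the `evaluate` and `run` functions are hypothetical names, and the only rule taken from this article is that an alert resolves after three consecutive evaluations in which the condition isn't met.

```python
from dataclasses import dataclass

# From the article: an alert resolves after three consecutive
# evaluations in which the alert condition isn't met.
HEALTHY_EVALUATIONS_TO_RESOLVE = 3


@dataclass
class TimeSeriesState:
    """Hypothetical per-time-series state; not an Azure Monitor type."""
    fired: bool = False
    healthy_streak: int = 0


def evaluate(state: TimeSeriesState, condition_met: bool) -> str:
    """Return 'fired', 'resolved', or 'none' for a single evaluation."""
    if condition_met:
        state.healthy_streak = 0
        if state.fired:
            return "none"  # already active: no additional alert
        state.fired = True
        return "fired"
    if state.fired:
        state.healthy_streak += 1
        if state.healthy_streak >= HEALTHY_EVALUATIONS_TO_RESOLVE:
            state.fired = False
            state.healthy_streak = 0
            return "resolved"
    return "none"


def run(breaches: list[bool]) -> list[str]:
    """Evaluate a sequence of condition results for one time series."""
    state = TimeSeriesState()
    return [evaluate(state, breached) for breached in breaches]
```

In this sketch, a second breach while the alert is still active returns `none`, which matches the "Already active" case: the expected alert doesn't appear because an earlier one on the same time series hasn't been resolved yet.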

Metric alert fired when it shouldn't have

If you believe your metric alert shouldn't have fired but it did, the following steps might help resolve the issue.

  1. Review the fired alerts list to locate the fired alert. Select the alert to view its details. Review the information provided under Why did this alert fire? to see the metric chart, Metric value, and Threshold value at the time when the alert was triggered.

    Note

    If you're using a Dynamic Thresholds condition type and think that the thresholds used weren't correct, provide feedback by using the frown icon. This feedback affects the machine learning algorithmic research and will help improve future detections.

  2. If you've selected multiple dimension values for a metric, the alert is triggered when any of the metric time series (as defined by the combination of dimension values) breaches the threshold. For more information about using dimensions in metric alerts, see this website.

  3. Review the alert rule configuration to make sure it's properly configured:

    • Check that Aggregation type, Aggregation granularity (Period), and Threshold value or Sensitivity are configured as expected.
    • For an alert rule that uses dynamic thresholds, check if advanced settings are configured, as Number of violations might filter alerts and Ignore data before can affect how the thresholds are calculated.

    Note

    Dynamic thresholds require at least 3 days and 30 metric samples before becoming active.

  4. If you're visualizing the metric by using Metrics chart, ensure that:

    • The selected Aggregation in the metric chart is the same as the Aggregation type in your alert rule.
    • The selected Time granularity is the same as the Aggregation granularity (Period) in your alert rule, and that it isn't set to Automatic.

  5. If the alert fired while earlier fired alerts on the same criteria were still unresolved, check whether the alert rule is configured not to automatically resolve alerts. This configuration makes the alert rule stateless: it doesn't auto-resolve fired alerts, and it doesn't require a fired alert to be resolved before firing again on the same time series. To check if the alert rule is configured not to auto-resolve:

    • Edit the alert rule in the Azure portal. See if the Automatically resolve alerts checkbox under the Alert rule details section is cleared.
    • Review the script used to deploy the alert rule or retrieve the alert rule definition. Check if the autoMitigate property is set to false.
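
Step 2 in the preceding list can be illustrated with a short sketch of how selected dimension values expand into individual time series, any one of which can breach the threshold. The function names, the `Region` dimension, and the sample values are hypothetical and used only for illustration. The sketch also assumes a simple "greater than" static threshold.

```python
from itertools import product


def time_series_keys(selected: dict[str, list[str]]) -> list[tuple[str, ...]]:
    """Return one key per combination of the selected dimension values.

    Dimension names are sorted so the position of each value in the
    key is predictable.
    """
    names = sorted(selected)
    return list(product(*(selected[name] for name in names)))


def breaching_series(
    values: dict[tuple[str, ...], float], threshold: float
) -> list[tuple[str, ...]]:
    """Return the time series whose aggregated value exceeds the threshold."""
    return sorted(key for key, value in values.items() if value > threshold)
```

For example, selecting two values for `ApiName` and one value for a hypothetical `Region` dimension yields two time series. If only the `GetBlob` series exceeds the threshold, the sketch still reports a breach, even though the other series and the overall aggregate might look healthy in a chart.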

Can't find the metric to alert on: Virtual machines guest metrics

To alert on guest operating system metrics of virtual machines, such as memory and disk space, ensure you've installed the required agent to collect this data to Azure Monitor Metrics.

For more information about collecting data from the guest operating system of a virtual machine, see this website.

Note

If you configured guest metrics to be sent to a Log Analytics workspace, the metrics appear under the Log Analytics workspace resource and start showing data only after you create an alert rule that monitors them. To do so, follow the steps to configure a metric alert for logs.

Currently, monitoring a guest metric for multiple virtual machines with a single alert rule isn't supported by metric alerts. But you can use a log alert rule. To do so, make sure the guest metrics are collected to a Log Analytics workspace and create a log alert rule on the workspace.

Can't find the metric to alert on

If you want to alert on a specific metric but you can't see it when you create an alert rule, check to determine:

Can't find the metric dimension to alert on

If you want to alert on specific dimension values of a metric but you can't find these values:

  • It might take a few minutes for the dimension values to appear under the Dimension values list.
  • The displayed dimension values are based on metric data collected in the last day.
  • If the dimension value isn't yet emitted or isn't shown, you can use the Add custom value option to add a custom dimension value.
  • If you want to alert on all possible values of a dimension and even include future values, choose the Select all current and future values option.
  • Custom metrics dimensions of Application Insights resources are turned off by default. To turn on the collection of dimensions for these custom metrics, see Log-based and pre-aggregated metrics in Application Insights.

Metric alert rules still defined on a deleted resource

When you delete an Azure resource, associated metric alert rules aren't deleted automatically. To delete alert rules associated with a resource that's been deleted:

  1. Open the resource group in which the deleted resource was defined.
  2. In the list that displays the resources, select the Show hidden types checkbox.
  3. Filter the list by Type == microsoft.insights/metricalerts.
  4. Select the relevant alert rules and select Delete.

Make metric alerts occur every time my condition is met

Metric alerts are stateful by default, so other alerts aren't fired if there's already a fired alert on a specific time series. To make a specific metric alert rule stateless and get alerted on every evaluation¹ in which the alert condition is met, use one of these options:

  • If you create the alert rule programmatically, for example, via Azure Resource Manager, PowerShell, REST, or the Azure CLI, set the autoMitigate property to false.
  • If you create the alert rule via the Azure portal, clear the Automatically resolve alerts option under the Alert rule details section.

¹ For stateless metric alert rules, an alert triggers once every 10 minutes at a minimum, even if the frequency of evaluation is equal to or less than 5 minutes and the condition is still being met.

Note

Making a metric alert rule stateless prevents fired alerts from becoming resolved. So, even after the condition is no longer met, the fired alerts remain in a fired state until the 30-day retention period ends.

Define an alert rule on a custom metric that isn't emitted yet

When you create a metric alert rule, the metric name is validated against the Metric Definitions API to make sure it exists. In some cases, you want to create an alert rule on a custom metric even before it's emitted. An example is when you use a Resource Manager template to create an Application Insights resource that will emit a custom metric, along with an alert rule that monitors that metric.

To avoid a deployment failure when you try to validate the custom metric's definitions, use the skipMetricValidation parameter in the criteria section of the alert rule. This parameter will cause the metric validation to be skipped. See the following example for how to use this parameter in a Resource Manager template. For more information, see the complete Resource Manager template samples for creating metric alert rules.

"criteria": {
    "odata.type": "Microsoft.Azure.Monitor.SingleResourceMultipleMetricCriteria",
        "allOf": [
            {
                "name": "condition1",
                "metricName": "myCustomMetric",
                "metricNamespace": "myCustomMetricNamespace",
                "dimensions": [],
                "operator": "GreaterThan",
                "threshold": 10,
                "timeAggregation": "Average",
                "skipMetricValidation": true
            }
        ]
    }

Note

Using the skipMetricValidation parameter might also be required when you define an alert rule on an existing custom metric that hasn't been emitted in several days.

Process data for a metric alert rule in a specific region

You can make sure that an alert rule is processed in a specified region if your metric alert rule is defined with a scope of that region and if it monitors a custom metric.

The following regions are currently supported for regional processing of metric alert rules:

  • North Europe
  • West Europe
  • Sweden Central
  • Germany West Central

To enable regional data processing in one of these regions, select the specified region in the Details section of the Create an alert rule wizard.

Note

We're continually adding more regions for regional data processing.

Export the Resource Manager template of a metric alert rule via the Azure portal

You can export the Resource Manager template of a metric alert rule to help you understand its JSON syntax and properties. Then you can use the template to automate future deployments.

  1. In the Azure portal, open the alert rule to view its details.
  2. Select Properties.
  3. Under Automation, select Export template.

Metric alert rules quota too small

The allowed number of metric alert rules per subscription is subject to quota limits.

If you've reached the quota limit, the following steps might help resolve the issue:

  1. Try deleting or disabling metric alert rules that aren't used anymore.

  2. Switch to using metric alert rules that monitor multiple resources. With this capability, a single alert rule can monitor multiple resources while counting only once against the quota. For more information about this capability and the supported resource types, see this website.

  3. If you need the quota limit to be increased, open a support request and provide the:

    • Subscription IDs for which the quota limit needs to be increased.
    • Resource type for the quota increase. Select Metric alerts or Metric alerts (Classic).
    • Requested quota limit.

Check the total number of metric alert rules

To check the current usage of metric alert rules, follow these steps.

From the Azure portal

  1. Open the Alerts screen and select Manage alert rules.
  2. Filter to the relevant subscription by using the Subscription dropdown box.
  3. Make sure not to filter to a specific resource group, resource type, or resource.
  4. In the Signal type dropdown box, select Metrics.
  5. Verify that the Status dropdown box is set to Enabled.
  6. The total number of metric alert rules is displayed above the alert rules list.

From API

Manage alert rules by using Resource Manager templates, REST API, PowerShell, or the Azure CLI

You might run into an issue when you create, update, retrieve, or delete metric alerts by using Resource Manager templates, REST API, PowerShell, or the Azure CLI. The following steps might help resolve the issue.

Resource Manager templates

REST API

Review the REST API guide to verify you're passing all the parameters correctly.

PowerShell

Make sure that you're using the right PowerShell cmdlets for metric alerts:

Azure CLI

Make sure you're using the right CLI commands for metric alerts:

General

  • If you receive a Metric not found error:
  • If you're creating metric alerts on logs, ensure appropriate dependencies are included. For a sample template, see Create Metric Alerts for Logs in Azure Monitor.
  • If you're creating an alert rule that contains multiple criteria, note the following constraints:
    • You can only select one value per dimension within each criterion.
    • You can't use an asterisk (*) as a dimension value.
    • When metrics that are configured in different criteria support the same dimension, a configured dimension value must be explicitly set in the same way for all those metrics. For a Resource Manager template example, see Create a metric alert with a Resource Manager template.

No permissions to create metric alert rules

To create a metric alert rule, you must have the following permissions:

  • Read permission on the target resource of the alert rule.
  • Write permission on the resource group in which the alert rule is created. If you're creating the alert rule from the Azure portal, the alert rule is created by default in the same resource group in which the target resource resides.
  • Read permission on any action group associated with the alert rule, if applicable.

Subscription registration to the Microsoft.Insights resource provider

Metric alerts can only access resources in subscriptions registered to the Microsoft.Insights resource provider. To create a metric alert rule, all involved subscriptions must be registered to this resource provider:

  • The subscription that contains the alert rule's target resource (scope).
  • The subscription that contains the action groups associated with the alert rule, if defined.
  • The subscription in which the alert rule is saved.

Learn more about registering resource providers.

Naming restrictions for metric alert rules

Consider the following restrictions for metric alert rule names:

  • Metric alert rule names can't be changed (renamed) after they're created.
  • Metric alert rule names must be unique within a resource group.
  • Metric alert rule names can't contain the following characters: * # & + : < > ? @ % { } \ /
  • Metric alert rule names can't end with a space or a period.
  • The combined resource group name and alert rule name can't exceed 252 characters.

Note

If the alert rule name contains characters that aren't alphabetic or numeric, for example, spaces, punctuation marks, or symbols, these characters might be URL-encoded when retrieved by certain clients.
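
The restrictions that can be checked locally are easy to validate before you deploy. The following sketch is a hypothetical helper, not part of any Azure SDK. It assumes that the 252-character limit applies to the sum of the two name lengths, and it doesn't check uniqueness within the resource group, which requires a call to Azure.

```python
# Characters listed in this article as not allowed in rule names.
FORBIDDEN_CHARS = set("*#&+:<>?@%{}\\/")
MAX_COMBINED_LENGTH = 252


def name_problems(resource_group: str, rule_name: str) -> list[str]:
    """Return a list of naming problems; an empty list means none were found."""
    problems = []
    bad = sorted(set(rule_name) & FORBIDDEN_CHARS)
    if bad:
        problems.append("forbidden characters: " + " ".join(bad))
    if rule_name.endswith((" ", ".")):
        problems.append("ends with a space or a period")
    # Assumption: the limit is the sum of both name lengths.
    if len(resource_group) + len(rule_name) > MAX_COMBINED_LENGTH:
        problems.append("combined length exceeds 252 characters")
    return problems
```

Because a rule can't be renamed after it's created, running a check like this in a deployment pipeline can catch a bad name before the rule exists.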

Restrictions when you use dimensions in a metric alert rule with multiple conditions

Metric alerts support alerting on multi-dimensional metrics and support defining multiple conditions, up to five conditions per alert rule.

Consider the following constraints when you use dimensions in an alert rule that contains multiple conditions:

  • You can only select one value per dimension within each condition.
  • You can't use the Select all current and future values option (the asterisk, *).
  • When metrics that are configured in different conditions support the same dimension, a configured dimension value must be explicitly set in the same way for all those metrics in the relevant conditions. For example:
    • Consider a metric alert rule that's defined on a storage account and monitors two conditions:
      • Total Transactions > 5
      • Average SuccessE2ELatency > 250 ms
    • You want to update the first condition and only monitor transactions where the ApiName dimension equals "GetBlob".
    • Because both the Transactions and SuccessE2ELatency metrics support an ApiName dimension, you'll need to update both conditions and have them specify the ApiName dimension with a "GetBlob" value.
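
These constraints can be expressed as a small check over the conditions of a rule. The following sketch is illustrative only: the `dimension_problems` function and the shape of its inputs are hypothetical, and the set of dimensions that each metric supports must be supplied by the caller.

```python
def dimension_problems(
    conditions: list[dict], supported: dict[str, set[str]]
) -> list[str]:
    """Check the multi-condition dimension constraints from this article.

    conditions: [{"metric": str, "dimensions": {dimension_name: [values]}}]
    supported:  {metric_name: set of dimension names the metric supports}
    """
    problems = []
    # One value per dimension, and no asterisk (*).
    for cond in conditions:
        for dim, values in cond["dimensions"].items():
            if len(values) != 1:
                problems.append(f"{cond['metric']}: select one value for {dim}")
            elif values[0] == "*":
                problems.append(f"{cond['metric']}: '*' isn't allowed for {dim}")
    # A dimension shared by several metrics must be set the same way in each.
    for cond in conditions:
        for dim, values in cond["dimensions"].items():
            for other in conditions:
                if other is cond:
                    continue
                shared = dim in supported.get(other["metric"], set())
                if shared and other["dimensions"].get(dim) != values:
                    problems.append(
                        f"{other['metric']}: set {dim}={', '.join(values)} "
                        f"to match {cond['metric']}"
                    )
    return problems
```

Applied to the storage account example, setting `ApiName` to `GetBlob` only on the `Transactions` condition reports that `SuccessE2ELatency` needs the same value. Setting it on both conditions returns no problems.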

Set the alert rule's period and frequency

Choose an Aggregation granularity (Period) that's larger than the Frequency of evaluation to reduce the likelihood of missing the first evaluation of added time series in the following cases:

  • Metric alert rule that monitors multiple dimensions: When a new dimension value combination is added.
  • Metric alert rule that monitors multiple resources: When a new resource is added to the scope.
  • Metric alert rule that monitors a metric that isn't emitted continuously (sparse metric): When the metric is emitted after a period longer than 24 hours in which it wasn't emitted.
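
One way to see why a larger period helps is to count how many evaluations look back far enough to include a given data point. The following sketch is a simplified model, not a description of how Azure Monitor schedules evaluations. It assumes evaluations run at fixed multiples of the frequency and that each one looks back exactly one period. The function name and the minute-based units are hypothetical.

```python
def evaluations_covering(
    sample_minute: int, period: int, frequency: int, horizon: int
) -> list[int]:
    """Return the evaluation times (in minutes) whose look-back window
    (eval_time - period, eval_time] contains the sample."""
    return [
        t
        for t in range(frequency, horizon + 1, frequency)
        if t - period < sample_minute <= t
    ]
```

In this model, with a 5-minute period and a 5-minute frequency, a data point at minute 7 is seen by one evaluation only. With a 15-minute period and the same frequency, three consecutive evaluations include it, so a newly added time series gets more than one chance to be evaluated.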

The Dynamic Thresholds borders don't seem to fit the data

If the behavior of a metric changed recently, the changes won't necessarily be reflected in the Dynamic Threshold borders (upper and lower bounds) immediately. The borders are calculated based on metric data from the last 10 days. When you view the Dynamic Threshold borders for a given metric, look at the metric trend in the last week and not only for recent hours or days.

Why is weekly seasonality not detected by Dynamic Thresholds?

To identify weekly seasonality, the Dynamic Thresholds model requires at least three weeks of historical data. When enough historical data is available, any weekly seasonality that exists in the metric data is identified and the model is adjusted accordingly.

Dynamic Thresholds shows a negative lower bound for a metric even though the metric always has positive values

When a metric exhibits large fluctuation, Dynamic Thresholds builds a wider model around the metric values. This action can result in the lower border being below zero. Specifically, this scenario can happen when:

  • The sensitivity is set to low.
  • The median values are close to zero.
  • The metric exhibits an irregular behavior with high variance, which appears as spikes or dips in the data.

When the lower bound has a negative value, it's plausible for the metric to reach a zero value given the metric's irregular behavior. Consider choosing a higher sensitivity, which tightens the bounds around the metric, or a larger Aggregation granularity (Period), which makes the model less sensitive to short-lived spikes and dips. Or, use the Ignore data before option to exclude a recent irregularity from the historical data used to build the model.

The Dynamic Thresholds alert rule is too noisy (fires too much)

To reduce the sensitivity of your Dynamic Thresholds alert rule, use one of the following options:

  • Threshold sensitivity: Set the sensitivity to Low to be more tolerant of deviations.
  • Number of violations (under Advanced settings): Configure the alert rule to trigger only if several deviations occur within a certain period of time. This setting makes the rule less susceptible to transient deviations.
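
The effect of the Number of violations setting can be sketched as a check over the most recent evaluations. This is a minimal illustration with hypothetical names, not the service's implementation. It treats the setting as "at least `min_violations` deviations within the last `window` evaluations" and reports, for each evaluation, whether the condition would count as met.

```python
from collections import deque


def condition_met_per_evaluation(
    violations: list[bool], window: int, min_violations: int
) -> list[bool]:
    """For each evaluation, report whether at least `min_violations` of
    the last `window` evaluations deviated from the thresholds."""
    recent: deque[bool] = deque(maxlen=window)  # keeps only the last `window` results
    results = []
    for violated in violations:
        recent.append(violated)
        results.append(sum(recent) >= min_violations)
    return results
```

With a window of four evaluations and a minimum of two violations, a single isolated deviation never meets the condition in this sketch, which is how the setting filters out transient deviations.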

The Dynamic Thresholds alert rule is too insensitive (doesn't fire)

Sometimes an alert rule won't trigger, even when a high sensitivity is configured. This scenario usually happens when the metric's distribution is highly irregular. Consider one of the following options:

  • Move to monitoring a complementary metric that's suitable for your scenario, if applicable. For example, check for changes in success rate rather than failure rate.
  • Try selecting a different value for Aggregation granularity (Period).
  • Check if there was a drastic change in the metric behavior in the last 10 days, for example, an outage. An abrupt change can affect the upper and lower thresholds calculated for the metric and make them broader. Wait a few days until the outage is no longer taken into account in the threshold calculation. Or use the Ignore data before option under Advanced settings.
  • If your data has weekly seasonality, but not enough history is available for the metric, the calculated thresholds can have broad upper and lower bounds. For example, the calculation can treat weekdays and weekends in the same way and build wide borders that don't always fit the data. This issue should resolve itself after enough metric history is available. Then, the correct seasonality will be detected and the calculated thresholds will update accordingly.

When I configure an alert rule's condition, why is Dynamic Thresholds disabled?

Dynamic thresholds are supported for most metrics, but some metrics can't use dynamic thresholds.

The following table lists the metrics that aren't supported by Dynamic Thresholds.

Resource type Metric name
Microsoft.ClassicStorage/storageAccounts UsedCapacity
Microsoft.ClassicStorage/storageAccounts/blobServices BlobCapacity
Microsoft.ClassicStorage/storageAccounts/blobServices BlobCount
Microsoft.ClassicStorage/storageAccounts/blobServices IndexCapacity
Microsoft.ClassicStorage/storageAccounts/fileServices FileCapacity
Microsoft.ClassicStorage/storageAccounts/fileServices FileCount
Microsoft.ClassicStorage/storageAccounts/fileServices FileShareCount
Microsoft.ClassicStorage/storageAccounts/fileServices FileShareSnapshotCount
Microsoft.ClassicStorage/storageAccounts/fileServices FileShareSnapshotSize
Microsoft.ClassicStorage/storageAccounts/fileServices FileShareQuota
Microsoft.Compute/disks Composite Disk Read Bytes/sec
Microsoft.Compute/disks Composite Disk Read Operations/sec
Microsoft.Compute/disks Composite Disk Write Bytes/sec
Microsoft.Compute/disks Composite Disk Write Operations/sec
Microsoft.ContainerService/managedClusters NodesCount
Microsoft.ContainerService/managedClusters PodCount
Microsoft.ContainerService/managedClusters CompletedJobsCount
Microsoft.ContainerService/managedClusters RestartingContainerCount
Microsoft.ContainerService/managedClusters OomKilledContainerCount
Microsoft.Devices/IotHubs TotalDeviceCount
Microsoft.Devices/IotHubs ConnectedDeviceCount
Microsoft.DocumentDB/databaseAccounts CassandraConnectionClosures
Microsoft.EventHub/clusters Size
Microsoft.EventHub/namespaces Size
Microsoft.IoTCentral/IoTApps connectedDeviceCount
Microsoft.IoTCentral/IoTApps provisionedDeviceCount
Microsoft.Kubernetes/connectedClusters NodesCount
Microsoft.Kubernetes/connectedClusters PodCount
Microsoft.Kubernetes/connectedClusters CompletedJobsCount
Microsoft.Kubernetes/connectedClusters RestartingContainerCount
Microsoft.Kubernetes/connectedClusters OomKilledContainerCount
Microsoft.MachineLearningServices/workspaces/onlineEndpoints RequestsPerMinute
Microsoft.MachineLearningServices/workspaces/onlineEndpoints/deployments DeploymentCapacity
Microsoft.Maps/accounts CreatorUsage
Microsoft.Media/mediaservices/streamingEndpoints EgressBandwidth
Microsoft.Network/applicationGateways Throughput
Microsoft.Network/azureFirewalls Throughput
Microsoft.Network/expressRouteGateways ExpressRouteGatewayPacketsPerSecond
Microsoft.Network/expressRouteGateways ExpressRouteGatewayNumberOfVmInVnet
Microsoft.Network/expressRouteGateways ExpressRouteGatewayFrequencyOfRoutesChanged
Microsoft.Network/virtualNetworkGateways ExpressRouteGatewayBitsPerSecond
Microsoft.Network/virtualNetworkGateways ExpressRouteGatewayPacketsPerSecond
Microsoft.Network/virtualNetworkGateways ExpressRouteGatewayNumberOfVmInVnet
Microsoft.Network/virtualNetworkGateways ExpressRouteGatewayFrequencyOfRoutesChanged
Microsoft.ServiceBus/namespaces Size
Microsoft.ServiceBus/namespaces Messages
Microsoft.ServiceBus/namespaces ActiveMessages
Microsoft.ServiceBus/namespaces DeadletteredMessages
Microsoft.ServiceBus/namespaces ScheduledMessages
Microsoft.ServiceFabricMesh/applications AllocatedCpu
Microsoft.ServiceFabricMesh/applications AllocatedMemory
Microsoft.ServiceFabricMesh/applications ActualCpu
Microsoft.ServiceFabricMesh/applications ActualMemory
Microsoft.ServiceFabricMesh/applications ApplicationStatus
Microsoft.ServiceFabricMesh/applications ServiceStatus
Microsoft.ServiceFabricMesh/applications ServiceReplicaStatus
Microsoft.ServiceFabricMesh/applications ContainerStatus
Microsoft.ServiceFabricMesh/applications RestartCount
Microsoft.Storage/storageAccounts UsedCapacity
Microsoft.Storage/storageAccounts/blobServices BlobCapacity
Microsoft.Storage/storageAccounts/blobServices BlobCount
Microsoft.Storage/storageAccounts/blobServices BlobProvisionedSize
Microsoft.Storage/storageAccounts/blobServices IndexCapacity
Microsoft.Storage/storageAccounts/fileServices FileCapacity
Microsoft.Storage/storageAccounts/fileServices FileCount
Microsoft.Storage/storageAccounts/fileServices FileShareCount
Microsoft.Storage/storageAccounts/fileServices FileShareSnapshotCount
Microsoft.Storage/storageAccounts/fileServices FileShareSnapshotSize
Microsoft.Storage/storageAccounts/fileServices FileShareCapacityQuota
Microsoft.Storage/storageAccounts/fileServices FileShareProvisionedIOPS

Next steps

For general troubleshooting information about alerts and notifications, see Troubleshooting problems in Azure Monitor alerts.