Types of Azure Monitor alerts

This article describes the kinds of Azure Monitor alerts you can create and helps you understand when to use each type of alert.

Azure Monitor has four types of alerts:

  • Metric alerts
  • Log alerts
  • Activity log alerts
  • Smart detection alerts

Choose the right alert type

This table can help you decide when to use each type of alert. For more information about pricing, see the pricing page.

Alert type | When to use | Pricing information
Metric alert | Metric alerts are useful when you want to be alerted about data that requires little or no manipulation. Metric data is stored in the system already pre-computed, so metric alerts are less expensive than log alerts. If the data you want to monitor is available in metric data, we recommend that you use metric alerts. | Each metric alert rule is charged based on the number of time series that are monitored.
Log alert | Log alerts allow you to perform advanced logic operations on your data. If the data you want to monitor is available in logs, or requires advanced logic, you can use the robust features of KQL for data manipulation by using log alerts. Log alerts are more expensive than metric alerts. | Each log alert rule is billed based on the interval at which the log query is evaluated. More frequent query evaluation results in a higher cost. For log alerts configured for at-scale monitoring, the cost also depends on the number of time series created by the dimensions resulting from your query.
Activity log alert | Activity logs provide auditing of all actions that occurred on resources. Use activity log alerts to receive an alert when a resource experiences a specific event. Examples are a restart, a shutdown, or the creation or deletion of a resource. | For more information, see the pricing page.

Metric alerts

A metric alert rule monitors a resource by evaluating conditions on the resource metrics at regular intervals. If the conditions are met, an alert is fired. A metric time series is a series of metric values captured over a period of time.

You can create rules by using these metrics:

Metric alert rules include these features:

The target of the metric alert rule can be:

Multiple conditions

When you create an alert rule for a single resource, you can apply multiple conditions. For example, you could create an alert rule to monitor an Azure virtual machine and alert when both "Percentage CPU is higher than 90%" and "Queue length is over 300 items." When an alert rule has multiple conditions, the alert fires when all the conditions in the alert rule are true. The alert resolves when at least one of the conditions is no longer true for three consecutive checks.
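The fire and resolve behavior described above can be sketched as a small state machine. This is a toy model for illustration only, not Azure Monitor's implementation. The class name, the sample keys `cpu_percent` and `queue_length`, and the evaluation loop are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Sample = Dict[str, float]


@dataclass
class MultiConditionRule:
    """Toy model of an alert rule with several conditions (hypothetical)."""
    conditions: List[Callable[[Sample], bool]]
    checks_to_resolve: int = 3  # "three consecutive checks" from the text
    fired: bool = False
    _healthy_streak: int = 0

    def evaluate(self, sample: Sample) -> bool:
        """Evaluate one check and return whether the alert is currently fired."""
        all_met = all(cond(sample) for cond in self.conditions)
        if all_met:
            # Every condition is true: fire (or stay fired) and reset the streak.
            self.fired = True
            self._healthy_streak = 0
        elif self.fired:
            # At least one condition is no longer true: count consecutive checks.
            self._healthy_streak += 1
            if self._healthy_streak >= self.checks_to_resolve:
                self.fired = False
                self._healthy_streak = 0
        return self.fired


rule = MultiConditionRule(conditions=[
    lambda s: s["cpu_percent"] > 90,
    lambda s: s["queue_length"] > 300,
])
samples = [
    {"cpu_percent": 95, "queue_length": 350},  # both conditions true
    {"cpu_percent": 95, "queue_length": 100},  # first check with a false condition
    {"cpu_percent": 50, "queue_length": 100},  # second
    {"cpu_percent": 50, "queue_length": 100},  # third
]
states = [rule.evaluate(s) for s in samples]
print(states)
```

In this model the alert stays fired through the first two checks in which a condition is false and resolves on the third.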

Narrow the target by using dimensions

Dimensions are name-value pairs that contain more data about the metric value. Using dimensions allows you to filter the metrics and monitor specific time series, instead of monitoring the aggregate of all the dimensional values.

For example, the transactions metric of a storage account can have an API name dimension that contains the name of the API called by each transaction. Examples are GetBlob, DeleteBlob, and PutPage. You can choose to have an alert fired when there's a high number of transactions in any API name, which is the aggregated data. Or you can use dimensions to further break it down to alert only when the number of transactions is high for specific API names.

If you use more than one dimension, the metric alert rule can monitor multiple dimension values from different dimensions of a metric. The alert rule separately monitors all the dimension value combinations. For instructions on how to use dimensions in metric alert rules, see Monitor multiple time series in a single metric alert rule.
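To see why several dimensions multiply the number of monitored time series, the following sketch enumerates every dimension value combination. It only illustrates the counting. The dimension key `ApiName` is an assumed identifier for the API name dimension, and `ResponseType` with its values is a hypothetical second dimension added for the example.

```python
from itertools import product
from typing import Dict, List


def dimension_combinations(dimensions: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Return one dict per combination of the selected dimension values."""
    names = list(dimensions)
    value_lists = [dimensions[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


combos = dimension_combinations({
    "ApiName": ["GetBlob", "DeleteBlob", "PutPage"],
    "ResponseType": ["Success", "ServerTimeoutError"],  # hypothetical dimension
})
for combo in combos:
    print(combo)
print(len(combos), "time series monitored separately")
```

Three API names and two response types give six combinations. Because metric alert rules are charged by the number of monitored time series, this count is also what affects cost.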

Create resource-centric alerts by splitting by dimensions

To monitor for the same condition on multiple Azure resources, you can use the technique of splitting by dimensions. Splitting by dimensions allows you to create resource-centric alerts at scale for a subscription or resource group. Alerts are split into separate alerts by grouping combinations. Splitting on the Azure resource ID column makes the specified resource into the alert target.

You might also decide not to split when you want a condition applied to multiple resources in the scope. For example, you might want to fire an alert if at least five machines in the resource group scope have CPU usage over 80%.
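The difference between splitting and not splitting can be sketched with plain Python over a few sample rows. The row shape, the `resource_id` and `cpu_percent` keys, and both function names are assumptions for illustration. A real rule expresses this logic in its configuration, not in client code.

```python
from typing import Dict, List


def alerts_split_by_resource(rows: List[Dict], threshold: float = 80.0) -> List[str]:
    """Split by resource ID: one alert per resource that exceeds the threshold."""
    return sorted({r["resource_id"] for r in rows if r["cpu_percent"] > threshold})


def alert_without_split(rows: List[Dict], threshold: float = 80.0,
                        min_machines: int = 5) -> bool:
    """No split: a single alert if enough machines in the scope exceed the threshold."""
    hot = {r["resource_id"] for r in rows if r["cpu_percent"] > threshold}
    return len(hot) >= min_machines


rows = [
    {"resource_id": "vm-1", "cpu_percent": 85.0},
    {"resource_id": "vm-2", "cpu_percent": 91.0},
    {"resource_id": "vm-3", "cpu_percent": 60.0},
    {"resource_id": "vm-4", "cpu_percent": 88.0},
    {"resource_id": "vm-5", "cpu_percent": 95.0},
    {"resource_id": "vm-6", "cpu_percent": 82.0},
]
print(alerts_split_by_resource(rows))  # one alert target per hot machine
print(alert_without_split(rows))       # a single scope-level decision
```

With splitting, each of the five machines above 80% becomes its own alert target. Without splitting, the same data produces a single alert for the whole scope.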

Monitor multiple resources

You can monitor at scale by applying the same metric alert rule to multiple resources of the same type that exist in the same Azure region. Individual notifications are sent for each monitored resource.

Platform metrics are supported in the Azure cloud for the following services:

Service | Global Azure | Government | China
Azure Virtual Machines | Yes | Yes | Yes
SQL Server databases | Yes | Yes | Yes
SQL Server elastic pools | Yes | Yes | Yes
Azure NetApp Files capacity pools | Yes | Yes | Yes
Azure NetApp Files volumes | Yes | Yes | Yes
Azure Key Vault | Yes | Yes | Yes
Azure Cache for Redis | Yes | Yes | Yes
Azure Stack Edge devices | Yes | Yes | Yes
Recovery Services vaults | Yes | No | No
Azure Database for PostgreSQL - Flexible servers | Yes | Yes | Yes

Note

Multi-resource metric alerts aren't supported for the following scenarios:

  • Alerting on virtual machines' guest metrics.
  • Alerting on virtual machines' network metrics. These metrics include Network In Total, Network Out Total, Inbound Flows, Outbound Flows, Inbound Flows Maximum Creation Rate, and Outbound Flows Maximum Creation Rate.

You can specify the scope of monitoring with a single metric alert rule in one of three ways. For example, with virtual machines you can specify the scope as:

  • A list of virtual machines in one Azure region within a subscription.
  • All virtual machines in one Azure region in one or more resource groups in a subscription.
  • All virtual machines in one Azure region in a subscription.
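The three scope options map naturally onto Azure resource IDs at different levels of the hierarchy. The sketch below only builds the ID strings. The helper names, the placeholder subscription ID, and the resource group and VM names are hypothetical, and a real rule would also need the target resource type and region, which are not shown here.

```python
def subscription_scope(subscription_id: str) -> str:
    """Resource ID of a subscription."""
    return f"/subscriptions/{subscription_id}"


def resource_group_scope(subscription_id: str, resource_group: str) -> str:
    """Resource ID of a resource group within a subscription."""
    return f"{subscription_scope(subscription_id)}/resourceGroups/{resource_group}"


def vm_scope(subscription_id: str, resource_group: str, vm_name: str) -> str:
    """Resource ID of a single virtual machine."""
    return (f"{resource_group_scope(subscription_id, resource_group)}"
            f"/providers/Microsoft.Compute/virtualMachines/{vm_name}")


SUB = "00000000-0000-0000-0000-000000000000"  # placeholder subscription ID

# 1. A list of virtual machines in one region within a subscription.
scope_vm_list = [vm_scope(SUB, "rg-web", "vm-1"), vm_scope(SUB, "rg-web", "vm-2")]
# 2. All virtual machines in one region in one or more resource groups.
scope_resource_groups = [resource_group_scope(SUB, "rg-web"),
                         resource_group_scope(SUB, "rg-api")]
# 3. All virtual machines in one region in a subscription.
scope_subscription = [subscription_scope(SUB)]

for scope in (scope_vm_list, scope_resource_groups, scope_subscription):
    print(scope)
```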

Dynamic thresholds

Dynamic thresholds use advanced machine learning to:

  • Learn the historical behavior of metrics.
  • Identify patterns and adapt to metric changes over time, such as hourly, daily, or weekly patterns.
  • Recognize anomalies that indicate possible service issues.
  • Calculate the most appropriate threshold for the metric.

Machine learning continuously uses new data to learn more and make the threshold more accurate. The system adapts to the metrics' behavior over time and alerts based on deviations from its pattern. For this reason, you don't have to know the "right" threshold for each metric.
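To make the idea of a learned threshold concrete, here is a deliberately simple stand-in: a band of mean plus or minus a multiple of the standard deviation, recomputed from recent history. This is not the algorithm that dynamic thresholds use, which this article doesn't specify. The function names and the `sensitivity` parameter are assumptions for the example.

```python
from statistics import mean, stdev
from typing import List, Tuple


def learned_band(history: List[float], sensitivity: float = 3.0) -> Tuple[float, float]:
    """Derive lower and upper bounds from historical values (illustrative only)."""
    center = mean(history)
    spread = stdev(history)  # sample standard deviation
    return center - sensitivity * spread, center + sensitivity * spread


def is_anomalous(history: List[float], value: float, sensitivity: float = 3.0) -> bool:
    """Flag a value that falls outside the band learned from history."""
    low, high = learned_band(history, sensitivity)
    return value < low or value > high


history = [50.0, 52.0, 48.0, 51.0, 49.0, 50.0, 53.0, 47.0]
print(learned_band(history))
print(is_anomalous(history, 55.0))
print(is_anomalous(history, 90.0))
```

Because the band is derived from the data, you never supply a fixed threshold. Appending new values to `history` shifts the band, which loosely mirrors how the threshold adapts over time.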

Dynamic thresholds help you:

  • Create scalable alerts for hundreds of metric series with one alert rule. If you have fewer alert rules, you spend less time creating and managing alert rules.
  • Create rules without having to know what threshold to configure.
  • Configure metric alerts by using high-level concepts without extensive domain knowledge about the metric.
  • Prevent noisy (low precision) or wide (low recall) thresholds that don’t have an expected pattern.
  • Handle noisy metrics (such as machine CPU or memory) and metrics with low dispersion (such as availability and error rate).

For instructions on how to use dynamic thresholds in metric alert rules, see Dynamic thresholds in metric alerts.

Log alerts

A log alert rule monitors a resource by using a Log Analytics query to evaluate resource logs at a set frequency. If the conditions are met, an alert is fired. Because you can use Log Analytics queries, you can perform advanced logic operations on your data and use the robust KQL features to manipulate log data.

The target of the log alert rule can be:

  • A single resource, such as a VM.
  • Multiple resources of the same type in the same Azure region, such as a resource group. This capability is currently available for selected resource types.
  • Multiple resources using cross-resource query.

Log alerts can measure two different things, which can be used for different monitoring scenarios:

  • Table rows: The number of rows returned can be used to work with events such as Windows event logs, Syslog, and application exceptions.
  • Calculation of a numeric column: Calculations based on any numeric column can be used to include any number of resources. An example is CPU percentage.
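The two measurement styles can be sketched over in-memory rows. In a real rule, the rows come from a Log Analytics query. The row shapes, column names, and helper functions below are hypothetical and stand in for query results.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def count_matching_rows(rows: List[Dict], predicate: Callable[[Dict], bool]) -> int:
    """Table rows measure: how many returned rows match the condition."""
    return sum(1 for row in rows if predicate(row))


def average_by(rows: List[Dict], group_column: str, value_column: str) -> Dict[str, float]:
    """Numeric column measure: average of a column, computed per group."""
    totals = defaultdict(lambda: [0.0, 0])  # group -> [running sum, row count]
    for row in rows:
        bucket = totals[row[group_column]]
        bucket[0] += row[value_column]
        bucket[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


event_rows = [
    {"computer": "web-1", "level": "Error"},
    {"computer": "web-1", "level": "Information"},
    {"computer": "web-2", "level": "Error"},
]
perf_rows = [
    {"computer": "web-1", "cpu_percent": 70.0},
    {"computer": "web-1", "cpu_percent": 90.0},
    {"computer": "web-2", "cpu_percent": 40.0},
]

error_count = count_matching_rows(event_rows, lambda r: r["level"] == "Error")
avg_cpu = average_by(perf_rows, "computer", "cpu_percent")
print(error_count)
print(avg_cpu)
```

Counting rows suits discrete events such as errors or exceptions. The per-group average suits continuous measurements such as CPU percentage across any number of resources.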

You can configure whether log alerts are stateful or stateless. This capability is currently in preview.

Note

Log alerts work best when you're trying to detect specific data in the logs, as opposed to when you're trying to detect a lack of data in the logs. Because logs are semi-structured data, they inherently have higher latency than metric data for information like a VM heartbeat. To avoid misfires when you're trying to detect a lack of data in the logs, consider using metric alerts. You can send data to the metric store from logs by using metric alerts for logs.

Dimensions in log alert rules

You can use dimensions when you create log alert rules to monitor the values of multiple instances of a resource with one rule. For example, you can monitor CPU usage on multiple instances running your website or app. Each instance is monitored individually and notifications are sent for each instance.

Splitting by dimensions in log alert rules

To monitor for the same condition on multiple Azure resources, you can use the technique of splitting by dimensions. Splitting by dimensions allows you to create resource-centric alerts at scale for a subscription or resource group. Alerts are split into separate alerts by grouping combinations using numerical or string columns. Splitting on the Azure resource ID column makes the specified resource into the alert target.

You might also decide not to split when you want a condition applied to multiple resources in the scope. For example, you might want to fire an alert if at least five machines in the resource group scope have CPU usage over 80%.

Use the API

Manage new rules in your workspaces by using the ScheduledQueryRules API.

Note

Log alerts for Log Analytics were previously managed by using the legacy Log Analytics Alert API. Learn more about switching to the current scheduledQueryRules API.

Log alerts on your Azure bill

Log alerts are listed under the resource provider microsoft.insights/scheduledqueryrules with:

  • Log alerts on Application Insights shown with the exact resource name along with resource group and alert properties.
  • Log alerts on Log Analytics shown with the exact resource name along with resource group and alert properties when they're created by using the scheduledQueryRules API.
  • Log alerts created from the legacy Log Analytics API aren't tracked in Azure Resources and don't have enforced unique resource names. These alerts are still created on microsoft.insights/scheduledqueryrules as hidden resources. They have the resource naming structure <WorkspaceName>|<savedSearchId>|<scheduleId>|<ActionId>. Log alerts on the legacy API are shown with the preceding hidden resource name along with resource group and alert properties.

Note

Unsupported resource characters such as <, >, %, &, \, ?, and / are replaced with an underscore character (_) in the hidden resource names. This change also appears in the billing information.
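If you need to match billing entries back to legacy rules, it can help to reproduce the hidden resource name yourself. The sketch below follows the naming structure and the underscore replacement described above. The function name is hypothetical, and the character set is only an illustrative subset of the characters named in the note, so treat it as incomplete.

```python
# Illustrative subset of the unsupported characters named in the note above.
UNSUPPORTED_CHARS = set("<>%&?/")


def hidden_resource_name(workspace_name: str, saved_search_id: str,
                         schedule_id: str, action_id: str) -> str:
    """Build <WorkspaceName>|<savedSearchId>|<scheduleId>|<ActionId> with replacements."""
    raw = "|".join([workspace_name, saved_search_id, schedule_id, action_id])
    # Replace each unsupported character with an underscore, as the note describes.
    return "".join("_" if ch in UNSUPPORTED_CHARS else ch for ch in raw)


name = hidden_resource_name("my/workspace", "search&1", "sched-1", "action?1")
print(name)
```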

Activity log alerts

An activity log alert monitors a resource by checking the activity logs for a new activity log event that matches the defined conditions.

You might want to use activity log alerts for these types of scenarios:

  • When a specific operation occurs on resources in a specific resource group or subscription. For example, you might want to be notified when:
    • Any virtual machine in a production resource group is deleted.
    • Any new roles are assigned to a user in your subscription.
  • When a service health event occurs. Service health events include notifications of incidents and maintenance events that apply to resources in your subscription.

You can create an activity log alert on:

  • Any of the activity log event categories, other than on alert events.
  • Any activity log event in a top-level property in the JSON object.
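Conceptually, an activity log alert compares top-level properties of each new event against the rule's conditions. The sketch below models that comparison with dictionaries. The event shape, the property names, and the `matches` helper are simplified assumptions, not the service's matching logic.

```python
from typing import Dict


def matches(event: Dict[str, str], conditions: Dict[str, str]) -> bool:
    """Return True when every condition equals the event's top-level property."""
    return all(event.get(prop) == expected for prop, expected in conditions.items())


# Hypothetical rule: alert when any VM in the production resource group is deleted.
vm_deleted_in_prod = {
    "category": "Administrative",
    "operationName": "Microsoft.Compute/virtualMachines/delete",
    "resourceGroup": "production",
}

events = [
    {"category": "Administrative",
     "operationName": "Microsoft.Compute/virtualMachines/delete",
     "resourceGroup": "production", "status": "Succeeded"},
    {"category": "Administrative",
     "operationName": "Microsoft.Compute/virtualMachines/restart/action",
     "resourceGroup": "production", "status": "Succeeded"},
    {"category": "ServiceHealth", "resourceGroup": "production"},
]

fired = [matches(event, vm_deleted_in_prod) for event in events]
print(fired)
```

Only the first event satisfies all three conditions. The restart and the service health event are ignored by this rule, although a separate rule could target either of them.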

Activity log alert rules are Azure resources, so you can use an Azure Resource Manager template to create them. You can also create, update, or delete activity log alert rules in the Azure portal.

An activity log alert only monitors events in the subscription in which the alert is created.

Smart detection alerts

After you set up Application Insights for your project, your app begins to generate data. Based on this data, Smart Detection takes 24 hours to learn the normal behavior of your app. Your app's performance has a typical pattern of behavior. Some requests or dependency calls will be more prone to failure than others. The overall failure rate might go up as load increases.

Smart Detection uses machine learning to find anomalies, that is, deviations from this typical pattern. Smart Detection monitors the data received from your app, and especially the failure rates. Application Insights automatically alerts you in near real time if your web app experiences an abnormal rise in the rate of failed requests.

As data comes into Application Insights from your web app, Smart Detection compares the current behavior with the patterns seen over the past few days. If there's an abnormal rise in failure rate compared to previous performance, an analysis is triggered. To help you triage and diagnose the problem, an analysis of the characteristics of the failures and related application data is provided in the alert details. There are also links to the Application Insights portal for further diagnosis. The feature doesn't need setup or configuration because it uses machine learning algorithms to predict the normal failure rate.

Metric alerts tell you there might be a problem, but Smart Detection starts the diagnostic work for you. It performs much of the analysis you would otherwise have to do yourself. You get the results neatly packaged, which helps you get to the root of the problem quickly.

Smart Detection works for web apps hosted in the cloud or on your own servers that generate application requests or dependency data.
