Monitoring Azure Application Gateway

When you have critical applications and business processes relying on Azure resources, you want to monitor those resources for their availability, performance, and operation.

This article describes the monitoring data generated by Azure Application Gateway. Azure Application Gateway uses Azure Monitor. If you're unfamiliar with the features of Azure Monitor common to all Azure services that use it, read Monitoring Azure resources with Azure Monitor.

Monitoring overview page in Azure portal

The Overview page in the Azure portal for each Application Gateway includes the following metrics:

  • Sum Total Requests
  • Sum Failed Requests
  • Sum Response Status by HttpStatus
  • Sum Throughput
  • Sum CurrentConnections
  • Avg Healthy Host Count By BackendPool HttpSettings
  • Avg Unhealthy Host Count By BackendPool HttpSettings

This list is just a subset of the metrics available for Application Gateway. For more information, see Monitoring Azure Application Gateway data reference.

Azure Monitor Network Insights

Some services in Azure have a special focused prebuilt monitoring dashboard in the Azure portal that provides a starting point for monitoring your service. These special dashboards are called "insights".

Azure Monitor Network Insights provides a comprehensive view of health and metrics for all deployed network resources (including Application Gateway), without requiring any configuration. For more information, see Azure Monitor Network Insights.

Monitoring data

Azure Application Gateway collects the same kinds of monitoring data as other Azure resources that are described in Monitoring data from Azure resources.

See Monitoring Azure Application Gateway data reference for detailed information on the metrics and logs metrics created by Azure Application Gateway.

Collection and routing

Platform metrics and the Activity log are collected and stored automatically, but can be routed to other locations by using a diagnostic setting.

Resource Logs aren't collected and stored until you create a diagnostic setting and route them to one or more locations.

See Create diagnostic setting to collect platform logs and metrics in Azure for the detailed process for creating a diagnostic setting using the Azure portal, CLI, or PowerShell. When you create a diagnostic setting, you specify which categories of logs to collect. The categories for Azure Application Gateway are listed in Azure Application Gateway monitoring data reference.

The metrics and logs you can collect are discussed in the following sections.

Analyzing metrics

You can analyze metrics for Azure Application Gateway with metrics from other Azure services using metrics explorer by opening Metrics from the Azure Monitor menu. See Analyze metrics with Azure Monitor metrics explorer for details on using this tool.

For a list of the platform metrics collected for Azure Application Gateway, see Monitoring Application Gateway data reference.

For reference, you can see a list of all resource metrics supported in Azure Monitor.

Analyzing logs

Data in Azure Monitor Logs is stored in tables where each table has its own set of unique properties.

All resource logs in Azure Monitor have the same fields followed by service-specific fields. The common schema is outlined in Common and service-specific schema for Azure Resource Logs.

The Activity log is a platform login Azure that provides insight into subscription-level events. You can view it independently or route it to Azure Monitor Logs, where you can do much more complex queries using Log Analytics.

For a list of the types of resource logs collected for Azure Application Gateway, see Monitoring Azure Application Gateway data reference.

For a list of the tables used by Azure Monitor Logs and queryable by Log Analytics, see Monitoring Azure Application Gateway data reference.

Sample Kusto queries

Important

When you select Logs from the Application Gateway menu, Log Analytics is opened with the query scope set to the current Application Gateway. This means that log queries only include data from that resource. If you want to run a query that includes data from other Application Gateways or data from other Azure services, select Logs from the Azure Monitor menu. See Log query scope and time range in Azure Monitor Log Analytics for details.

You can use the following queries to help you monitor your Application Gateway resource.

// Requests per hour 
// Count of the incoming requests on the Application Gateway. 
// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceType == "APPLICATIONGATEWAYS" and OperationName == "ApplicationGatewayAccess"
| summarize AggregatedValue = count() by bin(TimeGenerated, 1h), _ResourceId
| render timechart 
// Failed requests per hour 
// Count of requests to which Application Gateway responded with an error. 
// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceType == "APPLICATIONGATEWAYS" and OperationName == "ApplicationGatewayAccess" and httpStatus_d > 399
| summarize AggregatedValue = count() by bin(TimeGenerated, 1h), _ResourceId
| render timechart
// Top 10 Client IPs 
// Count of requests per client IP. 
AzureDiagnostics
| where ResourceType == "APPLICATIONGATEWAYS" and OperationName == "ApplicationGatewayAccess"
| summarize AggregatedValue = count() by clientIP_s
| top 10 by AggregatedValue
// Errors by user agent 
// Number of errors by user agent. 
// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceType == "APPLICATIONGATEWAYS" and OperationName == "ApplicationGatewayAccess" and httpStatus_d > 399
| summarize AggregatedValue = count() by userAgent_s, _ResourceId
| sort by AggregatedValue desc

Alerts

Azure Monitor alerts proactively notify you when important conditions are found in your monitoring data. They allow you to identify and address issues in your system before your customers notice them. You can set alerts on metrics, logs, and the activity log. Different types of alerts have benefits and drawbacks

If you're creating or running an application that uses Application Gateway, Azure Monitor Application Insights can offer additional types of alerts.

The following tables list common and recommended alert rules for Application Gateway.

Application Gateway v1

Alert type Condition Description
Metric CPU utilization crosses 80% Under normal conditions, CPU usage shouldn't regularly exceed 90%. This can cause latency in the websites hosted behind the Application Gateway and disrupt the client experience.
Metric Unhealthy host count crosses threshold Indicates the number of backend servers that Application Gateway is unable to probe successfully. This catches issues where the Application Gateway instances are unable to connect to the backend. Alert if this number goes above 20% of backend capacity.
Metric Response status (4xx, 5xx) crosses threshold When Application Gateway response status is 4xx or 5xx. There could be occasional 4xx or 5xx response seen due to transient issues. You should observe the gateway in production to determine static threshold or use dynamic threshold for the alert.
Metric Failed requests crosses threshold When failed requests metric crosses a threshold. You should observe the gateway in production to determine static threshold or use dynamic threshold for the alert.

Application Gateway v2

Alert type Condition Description
Metric Compute Unit utilization crosses 75% of average usage Compute unit is the measure of compute utilization of your Application Gateway. Check your average compute unit usage in the last one month and set alert if it crosses 75% of it.
Metric Capacity Unit utilization crosses 75% of peak usage Capacity units represent overall gateway utilization in terms of throughput, compute, and connection count. Check your maximum capacity unit usage in the last one month and set alert if it crosses 75% of it.
Metric Unhealthy host count crosses threshold Indicates number of backend servers that application gateway is unable to probe successfully. This catches issues where Application gateway instances are unable to connect to the backend. Alert if this number goes above 20% of backend capacity.
Metric Response status (4xx, 5xx) crosses threshold When Application Gateway response status is 4xx or 5xx. There could be occasional 4xx or 5xx response seen due to transient issues. You should observe the gateway in production to determine static threshold or use dynamic threshold for the alert.
Metric Failed requests crosses threshold When Failed requests metric crosses threshold. You should observe the gateway in production to determine static threshold or use dynamic threshold for the alert.
Metric Backend last byte response time crosses threshold Indicates the time interval between start of establishing a connection to backend server and receiving the last byte of the response body. Create an alert if the backend response latency is more that certain threshold from usual.
Metric Application Gateway total time crosses threshold This is the interval from the time when Application Gateway receives the first byte of the HTTP request to the time when the last response byte has been sent to the client. Should create an alert if the backend response latency is more that certain threshold from usual.

Next steps