Analyze infrastructure metrics and logs

Performance issues can occur because of interactions between the application and other services in the architecture. For example, inefficient database queries, connectivity problems between services, and under-provisioned resources are all common causes of inefficiency.

The practice of continuous monitoring must include analysis of platform metrics and logs to get visibility into the health and performance of services that are part of the architecture.

Key points

  • View platform metrics to get visibility into the health and performance of Azure services.
  • Use log data to get visibility into the operations and events of the management plane.
  • Track events from internal dependencies.
  • Check the health of external dependencies, such as an API service.

Platform metrics

Metrics are numerical values that are collected at regular intervals and describe some aspect of a system at a particular time. View the platform metrics that are generated by the services used in the architecture. Each Azure service has a set of metrics that's unique to the functionality of the resource. These metrics give you visibility into the health and performance of those services. Platform metrics are collected for Azure resources without any added configuration. You can also define custom metrics for an Azure service by using the custom metrics API.

Azure Monitor Metrics is a feature of Azure Monitor that collects numeric data from monitored resources into a time series database. To learn more about Azure Monitor Metrics, see What can you do with Azure Monitor Metrics?.

If your application is running in Azure Virtual Machines, configure the Azure Diagnostics extension to send guest OS performance metrics to Azure Monitor. Guest OS metrics include performance counters that track guest CPU percentage and memory usage, both of which are frequently used for autoscaling or alerting.
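As an illustration of how a guest OS metric such as CPU percentage can drive alerting, the following is a minimal sketch of a threshold rule evaluated over a recent window of samples. The function name `should_alert`, the 80 percent threshold, the window size, and the sample values are all assumptions made for this example. This is not how Azure Monitor alert rules are implemented.

```python
from statistics import mean


def should_alert(samples, threshold=80.0, window=5):
    """Return True when the mean of the last `window` samples exceeds `threshold`.

    `samples` is a list of CPU percentages ordered oldest to newest.
    The threshold and window defaults are illustrative, not recommendations.
    """
    if len(samples) < window:
        return False  # not enough data to evaluate the rule yet
    return mean(samples[-window:]) > threshold


# Hypothetical guest CPU percentages collected at regular intervals.
cpu_percent = [42.0, 55.5, 81.0, 88.5, 92.0, 95.0, 90.5]
print(should_alert(cpu_percent))
```

Averaging over a window rather than reacting to a single sample helps avoid alerts or scale actions triggered by short spikes.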

For more information, see Supported metrics with Azure Monitor.

Also, use technology-specific tools for the services used in the architecture. For example, use network traffic capturing tools, such as Azure Network Watcher.

One of the challenges with metric data is that it often has limited information to provide context for collected values. Azure Monitor addresses this challenge with multi-dimensional metrics. The dimensions of a metric are name-value pairs that carry more data to describe the metric value. To learn about multi-dimensional metrics and an example for network throughput, see multi-dimensional metrics.
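To make the idea of dimensions concrete, here is a small sketch that models a metric sample as a value plus name-value pairs and then totals the values by one dimension. The `MetricSample` class, the `total_by_dimension` helper, the metric name, the dimension names `IP` and `Direction`, and every value are hypothetical. They illustrate the concept only and are not the Azure Monitor data model or API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class MetricSample:
    name: str          # metric name
    value: float       # measured value (hypothetical units: Kbps)
    dimensions: dict   # name-value pairs that give the value context


def total_by_dimension(samples, metric_name, dimension):
    """Sum the values of `metric_name`, grouped by the given dimension."""
    totals = defaultdict(float)
    for sample in samples:
        if sample.name != metric_name:
            continue
        if dimension in sample.dimensions:
            totals[sample.dimensions[dimension]] += sample.value
    return dict(totals)


samples = [
    MetricSample("NetworkThroughput", 100.0, {"IP": "192.168.5.2", "Direction": "Send"}),
    MetricSample("NetworkThroughput", 250.0, {"IP": "192.168.5.2", "Direction": "Receive"}),
    MetricSample("NetworkThroughput", 50.0, {"IP": "10.24.2.15", "Direction": "Send"}),
    MetricSample("NetworkThroughput", 75.0, {"IP": "10.24.2.15", "Direction": "Receive"}),
]

print(total_by_dimension(samples, "NetworkThroughput", "Direction"))
print(total_by_dimension(samples, "NetworkThroughput", "IP"))
```

Without dimensions, only the overall throughput would be available. With them, the same samples can answer questions such as which address or which direction accounts for most of the traffic.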

Platform logs

Azure provides various operational logs from the platform and the resources. These logs provide insight into what events occurred, what changes were made to the resource, and more. These logs are useful in tracking operations. For example, you can track scaling events to check if autoscaling is working as expected.

Azure Monitor Logs can store a variety of data types, each with its own structure. You can also perform complex analysis on log data by using log queries, which can't be used for analysis of metrics data. Azure Monitor Logs can support near real-time scenarios, which makes it useful for alerting and fast detection of issues. To learn more about Azure Monitor Logs, see What can you do with Azure Monitor Logs?.

Are you collecting Azure activity logs within the log aggregation tool?

Azure activity logs provide audit information about when an Azure resource is modified, such as when a virtual machine is started or stopped. This information is useful for the interpretation and troubleshooting of issues. It provides transparency around configuration changes that can be mapped to adverse performance events.
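The mapping of configuration changes to adverse performance events can be sketched as a simple time-window lookup: given the time of a performance event, list the changes recorded shortly before it. The record fields (`time`, `operation`, `resource`), the resource names, the 30-minute lookback, and the function `changes_before` are assumptions for illustration. Real activity log entries have a different and richer schema.

```python
from datetime import datetime, timedelta


def changes_before(event_time, activity_log, lookback=timedelta(minutes=30)):
    """Return log entries recorded within `lookback` before `event_time`."""
    start = event_time - lookback
    return [entry for entry in activity_log if start <= entry["time"] <= event_time]


# Hypothetical, simplified activity log entries.
activity_log = [
    {"time": datetime(2024, 1, 10, 9, 0), "operation": "Start virtual machine", "resource": "vm-web-01"},
    {"time": datetime(2024, 1, 10, 13, 45), "operation": "Update scale settings", "resource": "vmss-api"},
    {"time": datetime(2024, 1, 10, 15, 30), "operation": "Stop virtual machine", "resource": "vm-batch-02"},
]

# Hypothetical time at which a latency spike was observed.
latency_spike_at = datetime(2024, 1, 10, 14, 0)

suspects = changes_before(latency_spike_at, activity_log)
for entry in suspects:
    print(entry["time"].isoformat(), entry["operation"], entry["resource"])
```

A change that falls inside the window is only a candidate cause. It narrows where to look first and does not prove that the change caused the event.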

Are logs available for critical internal dependencies?

To build a robust application health model, ensure there's visibility into the operational state of critical internal dependencies.

In Azure Monitor, enable Azure resource logs so that you have visibility into operations that were done within an Azure resource. Similar to platform metrics, resource logs vary by the Azure service and resource type.

For example, for services such as a shared network virtual appliance (NVA) or an ExpressRoute connection, monitor the network performance. Azure Monitor can also help diagnose networking-related issues. You can trigger a packet capture, diagnose routing issues, analyze network security group flow logs, and gain visibility and control over your Azure network.

Also, data from network traffic capturing tools, such as Network Watcher, can be helpful.

Are critical external dependencies monitored?

Monitor critical external dependencies, such as an API service, to ensure operational visibility of performance. For example, a probe can be used to measure the latency of an external API.
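A probe of this kind can be sketched with the Python standard library: issue one request, time it, and record whether it succeeded. The function `probe_latency`, its parameters, and the URL in the usage comment are hypothetical. The `opener` and `clock` parameters exist only so that the probe can be exercised without a real network call. A production probe would typically run on a schedule and send its results to the monitoring system.

```python
import time
import urllib.request


def probe_latency(url, timeout=5.0, opener=urllib.request.urlopen,
                  clock=time.perf_counter):
    """Send one GET request to `url` and return (ok, elapsed_seconds).

    `ok` is True only for a 2xx response. Connection errors, HTTP error
    responses raised by urllib, and timeouts are reported as ok=False.
    """
    start = clock()
    try:
        with opener(url, timeout=timeout) as response:
            response.read()  # include body transfer in the measurement
            ok = 200 <= response.status < 300
    except OSError:  # URLError, HTTPError, and socket timeouts derive from OSError
        ok = False
    return ok, clock() - start


# Example usage (hypothetical endpoint, not executed here):
# ok, seconds = probe_latency("https://api.example.com/health")
# print(f"ok={ok} latency={seconds * 1000:.0f} ms")
```

Recording failures together with the elapsed time keeps timeouts visible in the latency data instead of silently dropping them.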

Cost optimization for monitoring

The Azure Monitor billing model is based on consumption. Azure creates metered instances that track usage to calculate your bill. Pricing depends on your use of metrics, alerts, notifications, Log Analytics, and Application Insights.

For information about usage and estimated costs, see Monitoring usage and estimated costs in Azure Monitor.

You can also use the pricing calculator to determine your pricing. The pricing calculator helps you estimate your likely costs based on your expected use.

