What is monitoring?
Once an application is deployed to production, monitoring provides information about the application's performance and usage patterns so you can identify, mitigate, or resolve issues.
Goals of monitoring
One goal of monitoring is to achieve high availability by minimizing three key time-based metrics:
- Time to detect (TTD): When performance or other issues arise, rich diagnostic data about the issues are fed back to development teams via automated monitoring.
- Time to mitigate (TTM): DevOps teams act on the information to mitigate issues as quickly as possible so that users are no longer affected.
- Time to remediate (TTR): Resolution times are measured, and teams work to improve over time. After mitigation, teams work on how to remediate problems at root cause so that they don't recur.
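The three metrics above can be computed from incident timestamps. The following is a minimal sketch, not a standard implementation: the `Incident` record, its field names, and the choice to measure every metric from the moment the issue began are all assumptions made for illustration. Teams often define the starting point differently, for example measuring TTM from detection.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    # Hypothetical record; the field names are assumptions for this sketch.
    started: datetime      # when the issue began affecting users
    detected: datetime     # when monitoring surfaced the issue
    mitigated: datetime    # when users were no longer affected
    remediated: datetime   # when the root cause was fixed


def time_metrics(incident: Incident) -> dict[str, timedelta]:
    """Return TTD, TTM, and TTR, each measured from the start of the incident."""
    return {
        "TTD": incident.detected - incident.started,
        "TTM": incident.mitigated - incident.started,
        "TTR": incident.remediated - incident.started,
    }
```

Tracking these values per incident gives teams a baseline to improve against over time.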
A second goal of monitoring is to enable validated learning by tracking usage. The core concept of validated learning is that every deployment is an opportunity to track experimental results that support or diminish the hypotheses that led to the deployment. Tracking usage and differences between versions allows teams to measure the impact of change and drive business decisions. If a hypothesis is diminished, the team can fail fast or pivot. If the hypothesis is supported, then the team can double down or persevere. These data-informed decisions lead to new hypotheses and prioritization of the backlog.
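One way to judge whether usage data supports or diminishes a hypothesis is to compare a usage metric between two versions. The sketch below uses a two-proportion z-test on conversion counts. This technique is an illustrative choice, not something the text above prescribes, and the function name and parameters are hypothetical. It assumes both versions have users and that the pooled conversion rate is strictly between 0 and 1.

```python
import math


def compare_versions(conv_a: int, users_a: int, conv_b: int, users_b: int) -> dict[str, float]:
    """Two-proportion z-test comparing version B's conversion rate against version A's."""
    rate_a = conv_a / users_a
    rate_b = conv_b / users_b
    # Pooled rate under the null hypothesis that both versions convert equally.
    pooled = (conv_a + conv_b) / (users_a + users_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return {"rate_a": rate_a, "rate_b": rate_b, "z": z, "p_value": p_value}
```

A small p-value with a higher rate for the new version lends support to the hypothesis behind the deployment. A large p-value means the data can't distinguish the versions yet, which is itself useful input when deciding whether to pivot or persevere.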
Key concepts
Telemetry is the mechanism for collecting monitoring data. Telemetry can use agents that are installed in deployment environments, an SDK that relies on markers inserted into source code, server logging, or a combination of these. Typically, telemetry distinguishes between two data pipelines: one optimized for real-time alerting and dashboards, and one that carries the higher-volume data needed for troubleshooting or usage analytics.
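The split between a low-latency alerting path and a higher-volume store can be sketched as a simple router. Everything here is hypothetical: the `TelemetryRouter` class, the in-memory lists standing in for real backends, and the rule that only events whose `kind` is `metric` or `error` reach the alerting path.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class TelemetryRouter:
    # Hypothetical in-memory sinks standing in for real backends.
    alerting: list[dict[str, Any]] = field(default_factory=list)
    analytics: list[dict[str, Any]] = field(default_factory=list)

    def emit(self, event: dict[str, Any]) -> None:
        # Every event is kept in the high-volume store for
        # troubleshooting and usage analytics.
        self.analytics.append(event)
        # Only a small set of health signals goes to the path that
        # feeds real-time alerts and dashboards.
        if event.get("kind") in {"metric", "error"}:
            self.alerting.append(event)
```

Keeping the alerting path small is the design point: it can stay fast and cheap to query, while the larger store absorbs detailed traces and usage events.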
Synthetic monitoring uses a consistent set of transactions to assess performance and availability. Because synthetic transactions are predictable tests, their results can be compared directly from release to release. Real user monitoring (RUM), on the other hand, measures experience from the user's browser, mobile device, or desktop. It accounts for last-mile conditions such as cellular networks, internet routing, and caching. Unlike synthetics, RUM typically doesn't provide repeatable measurement over time.
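A synthetic check can be sketched as a fixed set of named transactions that are run the same way every time, recording success and latency for each. In this sketch the transactions are plain callables so the example stays self-contained. In practice they would issue real requests against the deployed application. The names `CheckResult` and `run_synthetic_checks` are hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class CheckResult:
    ok: bool            # whether the transaction succeeded
    latency_ms: float   # wall-clock duration of the transaction


def run_synthetic_checks(transactions: dict[str, Callable[[], bool]]) -> dict[str, CheckResult]:
    """Run each scripted transaction once and record availability and latency."""
    results: dict[str, CheckResult] = {}
    for name, transaction in transactions.items():
        start = time.perf_counter()
        try:
            ok = bool(transaction())
        except Exception:
            # An exception counts as a failed check rather than aborting the run.
            ok = False
        latency_ms = (time.perf_counter() - start) * 1000
        results[name] = CheckResult(ok=ok, latency_ms=latency_ms)
    return results
```

Because the same transactions run against every release, the recorded results can be compared from one release to the next.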
Monitoring is often used to test in production. A well-monitored deployment streams data about its health and performance so that you can spot production incidents as soon as they occur. Combined with a continuous deployment release pipeline, monitoring will detect new anomalies and allow for prompt mitigation. Together, monitoring and continuous deployment allow discovery of the unknown unknowns in application behavior that can't be foreseen in pre-production environments.
Effective monitoring is essential to allow DevOps teams to deliver at speed, get feedback from production, and increase customer satisfaction, acquisition, and retention.
Next steps
Read more about the monitoring capabilities of Azure Monitor.
Learn how to set up and use Application Insights for monitoring.