Develop a Site Reliability Engineering (SRE) strategy

Learn how Site Reliability Engineering enables you to sustainably achieve the appropriate level of reliability in your systems, services, and products.



Modules in this learning path

Organizations of every size have come to realize how crucial system and application reliability is to their business, and how difficult it is to maintain that reliability while iterating at the speed the marketplace demands. Site Reliability Engineering (SRE) is a proven approach to this challenge.

Respond to incidents and activities in your infrastructure through alerting capabilities in Azure Monitor.

Use application logs in Azure Web Apps to help debug web app code.

Learn how to manage site reliability.

Discover what cloud elasticity means and different ways to scale your cloud resources.

Learn how developers write programs that run on the cloud, including how to deploy them, make them fault-tolerant, balance load, scale, and deal with latency.

Review the load balancer's multidimensional metrics in Azure Monitor Metrics, and check the status of its health probes.

Evaluate monitoring options for an Azure virtual machine (VM). Enable diagnostics to get data about your VM. View VM metrics in Azure Metrics Explorer. Create a metric alert to monitor performance.