Reliability guides by service

This article provides links to reliability guidance for many Azure services. Most reliability guides contain the following information:

  • Production deployment recommendations provide guidance on how to deploy the service to meet your reliability requirements in production environments.

  • Resilience to transient faults describes how the service handles day-to-day transient faults that can occur in the cloud. It also describes how to handle these faults in your application, including information about retry policies, timeouts, and other best practices.

  • Reliability architecture overview is a synopsis of how the service supports reliability. It includes information about which components Microsoft manages and which components you manage, built-in redundancy features, and how to provision and manage multiple resources, if applicable.

  • Resilience to availability zone failures describes how the service supports availability zones, requirements you need to meet to use availability zones, how traffic is routed and data is replicated between zones, what happens when a zone experiences an outage, zone recovery, and how to configure your resources for availability zone support.

  • Resilience to region-wide failures outlines whether the service provides multi-region capabilities, requirements to use those capabilities, how traffic is routed and data is replicated between regions, the region-down experience, failover and failback support, and how to deploy custom multi-region solutions.

  • Resilience to service maintenance describes how the service handles planned maintenance events, including how to minimize downtime and data loss during these events. It also shows you how to configure the service to improve resilience during maintenance times.

  • Service-level agreement (SLA) describes the service's expected uptime and how that expectation changes based on the configuration that you use.

  • Backup and recovery describes, for services that support backups, who controls and manages the backups, where they're stored and replicated, how to recover them, and whether they're accessible only within a region or across regions.
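The retry policies mentioned under resilience to transient faults can be illustrated with a minimal sketch. It assumes a bounded retry loop with exponential backoff and jitter, which is one common approach and not the policy of any particular Azure service or SDK. The names `TransientError` and `call_with_retries`, and all default values, are hypothetical and chosen only for illustration.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for a fault that is worth retrying."""


def call_with_retries(operation, max_attempts=4, base_delay=0.5,
                      max_delay=8.0, sleep=time.sleep):
    """Call operation(), retrying on TransientError with backoff and jitter.

    All parameter names and defaults are illustrative assumptions.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the fault to the caller
            # The delay ceiling doubles on each attempt, bounded by max_delay.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter: wait a random time in [0, cap] so that many clients
            # don't retry in lockstep.
            sleep(random.uniform(0, cap))
```

In practice, prefer the retry configuration that the service's client library already provides, and follow the retry and timeout guidance in the individual service's reliability guide.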

Reliability guides by service

The following table provides links to reliability guidance for Azure services. Each guide contains information about how the service supports reliability features.

Note

Documentation for some services doesn't follow the single reliability guide format. For these services, the table might list more than one article that references reliability guidance.

| Service | Reliability guide | Other reliability documentation |
| --- | --- | --- |
| Azure AI Search | Reliability in AI Search | |
| Azure API Center | Reliability in Azure API Center | |
| Azure API Management | Reliability in API Management | |
| Azure App Configuration | Reliability in Azure App Configuration | |
| Azure App Service | Reliability in App Service | |
| Azure App Service - App Service Environment | Reliability in App Service Environment | |
| Azure Application Gateway for Containers | Reliability in Application Gateway for Containers | |
| Azure Application Gateway v2 | Reliability in Azure Application Gateway | |
| Azure Backup | Reliability in Backup | |
| Azure Bastion | Reliability in Azure Bastion | |
| Azure Batch | Reliability in Batch | |
| Azure Blob Storage | Reliability in Blob Storage | |
| Azure Bot Service | Reliability in Bot Service | |
| Azure Chaos Studio | Reliability in Chaos Studio | |
| Azure Container Apps | Reliability in Container Apps | |
| Azure Container Instances | Reliability in Container Instances | |
| Azure Container Registry | Reliability in Container Registry | |
| Azure Cosmos DB | Reliability in Azure Cosmos DB | |
| Azure Data Box | | Recover data if an entire region fails |
| Azure Data Explorer | Reliability in Azure Data Explorer | |
| Azure Data Factory | Reliability in Azure Data Factory | |
| Azure Data Manager for Energy | Reliability in Azure Data Manager for Energy | |
| Azure Data Share | | Disaster recovery for Data Share |
| Azure Database for MySQL | Reliability in Azure Database for MySQL | |
| Azure Database for PostgreSQL | Reliability in Azure Database for PostgreSQL | |
| Azure Databricks | Reliability in Azure Databricks | |
| Azure DDoS Protection | Reliability in DDoS Protection | |
| Azure Device Registry | Reliability in Device Registry | |
| Azure DevOps | | Data protection overview |
| Azure Disk Storage | Reliability in Azure Disk Storage | |
| Azure DNS | Reliability in Azure DNS | |
| Azure DocumentDB | Reliability in Azure DocumentDB | |
| Azure Elastic SAN | Reliability in Elastic SAN | |
| Azure Event Grid | Reliability in Event Grid | |
| Azure Event Hubs | Reliability in Azure Event Hubs | |
| Azure ExpressRoute | Reliability in Azure ExpressRoute | |
| Azure Files | Reliability in Azure Files | |
| Azure Firewall | Reliability in Azure Firewall | |
| Azure Functions | Reliability in Azure Functions | |
| Azure Health Data Services | | Disaster recovery for Health Data Services |
| Azure Health Data Services: De-identification service | Reliability in the Health Data Services de-identification service | |
| Azure Health Data Services: Workspace services (FHIR®, DICOM®, medtech) | | Business continuity and disaster recovery considerations |
| Azure HDInsight | Reliability in HDInsight | |
| Azure IoT Hub | Reliability in IoT Hub | |
| Azure Key Vault | Reliability in Key Vault | |
| Azure Key Vault Managed HSM | Reliability in Azure Key Vault Managed HSM | |
| Azure Kubernetes Service (AKS) | Reliability in AKS | |
| Azure Load Balancer | Reliability in Load Balancer | |
| Azure Logic Apps | Reliability in Logic Apps | |
| Azure Managed Grafana | Reliability in Azure Managed Grafana | |
| Azure Machine Learning | | Failover for business continuity and disaster recovery |
| Azure Managed Redis | Reliability in Azure Managed Redis | |
| Azure Migrate | | Azure Migrate and backup and disaster recovery |
| Azure Monitor Logs | Reliability in Azure Monitor Logs | |
| Azure NAT Gateway | Reliability in Azure NAT Gateway | |
| Azure NetApp Files | Reliability in Azure NetApp Files | |
| Azure Network Watcher | | Network Watcher service availability and redundancy |
| Azure Notification Hubs | Reliability in Notification Hubs | |
| Azure Private Link service | Reliability in Azure Private Link service | |
| Azure public IP addresses | | Azure public IP addresses availability zone |
| Azure Queue Storage | Reliability in Queue Storage | |
| Azure Route Server | | Route Server frequently asked questions (FAQs) |
| Azure Service Bus | Reliability in Service Bus | |
| Azure Service Fabric | | Deploy a Service Fabric cluster across availability zones; Disaster recovery in Service Fabric |
| Azure SignalR Service | Reliability in Azure SignalR Service | |
| Azure Site Recovery | Reliability in Azure Site Recovery | |
| Azure SQL Database | Reliability in Azure SQL Database | |
| Azure SQL Managed Instance | Reliability in Azure SQL Managed Instance | |
| Azure Storage Actions | Reliability in Storage Actions | |
| Azure Storage Discovery | Reliability in Storage Discovery | |
| Azure Storage Mover | Reliability in Storage Mover | |
| Azure Stream Analytics | Reliability in Azure Stream Analytics | |
| Azure Table Storage | Reliability in Table Storage | |
| Azure Traffic Manager | Reliability in Traffic Manager | |
| Azure Virtual Machines | Reliability in Virtual Machines | |
| Azure VM Image Builder | Reliability in VM Image Builder | |
| Azure Virtual Machine Scale Sets | Reliability in Virtual Machine Scale Sets | |
| Azure Virtual Network | Reliability in Virtual Network | |
| Azure Virtual WAN | | Availability zones and resiliency in Virtual WAN; Disaster recovery design |
| Azure VMware Solution | Reliability in Azure VMware Solution | |
| Azure VPN Gateway | Reliability in VPN Gateway | |
| Azure Web PubSub Service | Reliability in Azure Web PubSub Service | |
| Microsoft Fabric | Reliability in Microsoft Fabric | |