Auto-failover groups overview & best practices (Azure SQL Managed Instance)

Applies to: Azure SQL Managed Instance

The auto-failover groups feature allows you to manage the replication and failover of all user databases in a managed instance to another Azure region. This article focuses on using the Auto-failover group feature with Azure SQL Managed Instance and some best practices.

To get started, review Configure auto-failover group. For an end-to-end experience, see the Auto-failover group tutorial.

Note

This article covers auto-failover groups for Azure SQL Managed Instance. For Azure SQL Database, see Auto-failover groups in SQL Database.

Overview

The auto-failover groups feature allows you to manage the replication and failover of user databases in a managed instance to a managed instance in another Azure region. Auto-failover groups are designed to simplify deployment and management of geo-replicated databases at scale.

Automatic failover

You can initiate a geo-failover manually or you can delegate it to the Azure service based on a user-defined policy. The latter option allows you to automatically recover multiple related databases in a secondary region after a catastrophic failure or other unplanned event that results in full or partial loss of the SQL Database or SQL Managed Instance availability in the primary region. Typically, these are outages that cannot be automatically mitigated by the built-in high availability infrastructure. Examples of geo-failover triggers include natural disasters or incidents caused by a tenant or control ring being down due to an OS kernel memory leak on compute nodes.

Offload read-only workloads

To reduce traffic to your primary databases, you can also use the secondary databases in a failover group to offload read-only workloads. Use the read-only listener to direct read-only traffic to a readable secondary database.

Endpoint redirection

Auto-failover groups provide read-write and read-only listener end-points that remain unchanged during geo-failovers. You do not have to change the connection string for your application after a geo-failover, because connections are automatically routed to the current primary. Whether you use manual or automatic failover activation, a geo-failover switches all secondary databases in the group to the primary role. After the geo-failover is completed, the DNS record is automatically updated to redirect the endpoints to the new region. For geo-failover RPO and RTO, see Overview of Business Continuity.

Recovering an application

To achieve full business continuity, adding regional database redundancy is only part of the solution. Recovering an application (service) end-to-end after a catastrophic failure requires recovery of all components that constitute the service and any dependent services. Examples of these components include the client software (for example, a browser with a custom JavaScript), web front ends, storage, and DNS. It is critical that all components are resilient to the same failures and become available within the recovery time objective (RTO) of your application. Therefore, you need to identify all dependent services and understand the guarantees and capabilities they provide. Then, you must take adequate steps to ensure that your service functions during the failover of the services on which it depends.

For more information, see Azure SQL Managed Instance high availability.

Terminology and capabilities

  • Failover group (FOG)

    A failover group allows for all user databases within a managed instance to fail over as a unit to another Azure region in case the primary managed instance becomes unavailable due to a primary region outage. Since failover groups for SQL Managed Instance contain all user databases within the instance, only one failover group can be configured on an instance.

    Important

    The name of the failover group must be globally unique within the .database.windows.net domain.

  • Primary

    The managed instance that hosts the primary databases in the failover group.

  • Secondary

    The managed instance that hosts the secondary databases in the failover group. The secondary cannot be in the same Azure region as the primary.

  • DNS zone

    A unique ID that is automatically generated when a new SQL Managed Instance is created. A multi-domain (SAN) certificate for this instance is provisioned to authenticate the client connections to any instance in the same DNS zone. The two managed instances in the same failover group must share the DNS zone.

  • Failover group read-write listener

    A DNS CNAME record that points to the current primary. It's created automatically when the failover group is created and allows the read-write workload to transparently reconnect to the primary when the primary changes after failover. When the failover group is created on a SQL Managed Instance, the DNS CNAME record for the listener URL is formed as <fog-name>.<zone_id>.database.windows.net.

  • Failover group read-only listener

    A DNS CNAME record that points to the current secondary. It's created automatically when the failover group is created and allows the read-only SQL workload to transparently connect to the secondary when the secondary changes after failover. When the failover group is created on a SQL Managed Instance, the DNS CNAME record for the listener URL is formed as <fog-name>.secondary.<zone_id>.database.windows.net.

  • Automatic failover policy

    By default, a failover group is configured with an automatic failover policy. The system triggers a geo-failover after the failure is detected and the grace period has expired. The system must verify that the outage cannot be mitigated by the built-in high availability infrastructure, for example due to the scale of the impact. If you want to control the geo-failover workflow from the application or manually, you can turn off automatic failover policy.

    Note

    Because verification of the scale of the outage and how quickly it can be mitigated involves human actions, the grace period cannot be set below one hour, and the exact time of the failover may vary slightly from the grace period that was set. This limitation applies to all databases in the failover group regardless of their data synchronization state.

  • Read-only failover policy

    By default, the failover of the read-only listener is disabled. It ensures that the performance of the primary is not impacted when the secondary is offline. However, it also means the read-only sessions will not be able to connect until the secondary is recovered. If you cannot tolerate downtime for the read-only sessions and can use the primary for both read-only and read-write traffic at the expense of the potential performance degradation of the primary, you can enable failover for the read-only listener by configuring the AllowReadOnlyFailoverToPrimary property. In that case, the read-only traffic will be automatically redirected to the primary if the secondary is not available.

    Note

    The AllowReadOnlyFailoverToPrimary property only has effect if automatic failover policy is enabled and an automatic geo-failover has been triggered. In that case, if the property is set to True, the new primary will serve both read-write and read-only sessions.

  • Planned failover

    Planned failover performs full data synchronization between primary and secondary databases before the secondary switches to the primary role. This guarantees no data loss. Planned failover is used in the following scenarios:

    • Perform disaster recovery (DR) drills in production when data loss is not acceptable
    • Relocate the databases to a different region
    • Return the databases to the primary region after the outage has been mitigated (failback)

    Note

    If a database contains in-memory OLTP objects, the primary databases and the target secondary geo-replica databases should have matching service tiers, as in-memory OLTP objects are always hydrated in memory. A lower service tier on the target geo-replica database may result in out-of-memory issues. If this happens, the affected geo-secondary database replica may be put into a limited read-only mode called in-memory OLTP checkpoint-only mode. Read-only table queries are allowed, but read-only in-memory OLTP table queries are disallowed on the affected geo-secondary database replica. Planned failover is blocked if all replicas in the geo-secondary database are in checkpoint only mode. Unplanned failover may fail due to out-of-memory issues. To avoid this, upgrade the service tier of the secondary database to match the primary database during the planned failover, or drill. Service tier upgrades can be size-of-data operations, and may take a while to finish.

  • Unplanned failover

    Unplanned or forced failover immediately switches the secondary to the primary role without waiting for recent changes to propagate from the primary. This operation may result in data loss. Unplanned failover is used as a recovery method during outages when the primary is not accessible. When the outage is mitigated, the old primary will automatically reconnect and become a new secondary. A planned failover may be executed to fail back, returning the replicas to their original primary and secondary roles.

  • Manual failover

    You can initiate a geo-failover manually at any time regardless of the automatic failover configuration. You can initiate a forced (unplanned) or friendly (planned) failover. A friendly failover is only possible when the old primary is accessible, and can be used to relocate the primary to the secondary region without data loss. During an outage that impacts the primary, if automatic failover policy is not configured, a manual failover is required to promote the secondary to the primary role. When a failover is completed, the DNS records are automatically updated to ensure connectivity to the new primary.

  • Grace period with data loss

    Because the data is replicated to the secondary database using asynchronous replication, an automatic geo-failover may result in data loss. You can customize the automatic failover policy to reflect your application's tolerance to data loss. By configuring GracePeriodWithDataLossHours, you can control how long the system waits before initiating a forced failover, which may result in data loss.

Failover group architecture

The auto-failover group must be configured on the primary instance and will connect it to the secondary instance in a different Azure region. All user databases in the instance will be replicated to the secondary instance. System databases like master and msdb will not be replicated.

The following diagram illustrates a typical configuration of a geo-redundant cloud application using managed instance and auto-failover group:

Auto-failover group diagram for SQL MI

If your application uses SQL Managed Instance as the data tier, follow the general guidelines and best practices outlined in this article when designing for business continuity.

Create the geo-secondary instance

To ensure non-interrupted connectivity to the primary SQL Managed Instance after failover, both the primary and secondary instances must be in the same DNS zone. It will guarantee that the same multi-domain (SAN) certificate can be used to authenticate client connections to either of the two instances in the failover group. When your application is ready for production deployment, create a secondary SQL Managed Instance in a different region, and make sure it shares the DNS zone with the primary SQL Managed Instance. You can do it by specifying an optional parameter during creation. If you're using PowerShell or the REST API, the name of the optional parameter is DNSZonePartner. The name of the corresponding optional field in the Azure portal is Primary Managed Instance.

Important

The first managed instance created in the subnet determines DNS zone for all subsequent instances in the same subnet. This means that two instances from the same subnet cannot belong to different DNS zones.

For more information about creating the secondary SQL Managed Instance in the same DNS zone as the primary instance, see Create a secondary managed instance.

Use paired regions

Deploy both managed instances to paired regions for performance reasons. SQL Managed Instance failover groups in paired regions have better performance compared to unpaired regions.

Azure SQL Managed Instance follows a safe deployment practice where Azure paired regions are generally not deployed to at the same time. However, it is not possible to predict which region will be upgraded first, so the order of deployment is not guaranteed. Sometimes, your primary instance will be upgraded first, and sometimes it would be secondary.

In situations where Azure SQL Managed Instance is part of an auto-failover group, and the instances in the group are not in Azure paired regions, select different maintenance window schedules for your primary and secondary database. For example, select a Weekday maintenance window for your geo-secondary database and a Weekend maintenance window for your geo-primary database.

Enable and optimize geo-replication traffic flow between the instances

Connectivity between the virtual network subnets hosting primary and secondary instance must be established and maintained for uninterrupted geo-replication traffic flow. There are multiple ways to provide connectivity between the instances that you can choose among based on your network topology and policies:

Important

Global virtual network peering (VNet peering) is the recommended way for establishing connectivity between two instances in a failover group. It provides a low-latency, high-bandwidth private connection between the peered virtual networks using the Microsoft backbone infrastructure. No public Internet, gateways, or additional encryption is required in the communication between the peered virtual networks. Global virtual network peering is supported for instances hosted in subnets created since September 22, 2020. To be able to use global virtual network peering for SQL managed instances hosted in subnets created before September 22, 2020, consider configuring non-default maintenance window on the instance, as it will move the instance into a new virtual cluster that supports global virtual network peering.

Regardless of the connectivity mechanism, there are requirements that must be fulfilled for geo-replication traffic to flow:

  • Route table and network security groups assigned to managed instance subnets are not shared across the two peered virtual networks.
  • The Network Security Group (NSG) rules on the subnet hosting primary instance allow:
    • Inbound traffic on port 5022 and port range 11000-11999 from the subnet hosting the secondary instance.
    • Outbound traffic on port 5022 and port range 11000-11999 to the subnet hosting the secondary instance.
  • The Network Security Group (NSG) rules on the subnet hosting secondary instance allow:
    • Inbound traffic on port 5022 and port range 11000-11999 from the subnet hosting the primary instance.
    • Outbound traffic on port 5022 and port range 11000-11999 to the subnet hosting the primary instance.
  • IP address ranges of VNets hosting primary and secondary instance must not overlap.
  • There's no indirect overlap of IP address ranges between the VNets hosting the primary and secondary instance, or other VNets they are peered with via local virtual network peering or other means.

Additionally, if you're using other mechanisms for providing connectivity between the instances than the recommended global virtual network peering, you need to ensure the following:

  • Any networking device used, like firewalls or network virtual appliances (NVAs), do not block the traffic described above.
  • Routing is properly configured, and asymmetric routing is avoided.
  • If you deploy auto-failover groups in a hub-and-spoke network topology cross-region, replication traffic should go directly between the two managed instance subnets rather than directed through the hub networks. It will help you avoid connectivity and replication speed issues.

Important

Alternative ways of providing connectivity between the instances involving additional networking devices may make troubleshooting process in case of connectivity or replication speed issues very difficult and require active involvement of network administrators and significantly prolong the resolution time.

Initial seeding

When establishing a failover group between managed instances, there's an initial seeding phase before data replication starts. The initial seeding phase is the longest and most expensive part of the operation. Once initial seeding completes data is synchronized, and only subsequent data changes are replicated. The time it takes for the initial seeding to complete depends on the size of data, number of replicated databases, workload intensity on the primary databases, and the speed of the link between the virtual networks hosting primary and secondary instance that mostly depends on the way connectivity is established. Under normal circumstances, and when connectivity is established using recommended global virtual network peering, seeding speed is up to 360 GB an hour for SQL Managed Instance. Seeding is performed for a batch of user databases in parallel - not necessarily for all databases at the same time. Multiple batches may be needed if there are many databases hosted on the instance.

If the speed of the link between the two instances is slower than what is necessary, the time to seed is likely to be noticeably impacted. You can use the stated seeding speed, number of databases, total size of data, and the link speed to estimate how long the initial seeding phase will take before data replication starts. For example, for a single 100-GB database, the initial seed phase would take about 1.2 hours if the link is capable of pushing 84 GB per hour, and if there are no other databases being seeded. If the link can only transfer 10 GB per hour, then seeding a 100-GB database will take about 10 hours. If there are multiple databases to replicate, seeding will be executed in parallel, and, when combined with a slow link speed, the initial seeding phase may take considerably longer, especially if the parallel seeding of data from all databases exceeds the available link bandwidth.

Important

In case of an extremely low-speed or busy link causing the initial seeding phase to take days the creation of a failover group can time out. The creation process will be automatically canceled after 6 days.

Manage geo-failover to a geo-secondary instance

The failover group will manage geo-failover of all databases on the primary managed instance. When a group is created, each database in the instance will be automatically geo-replicated to the geo-secondary instance. You can't use failover groups to initiate a partial failover of a subset of databases.

Important

If a database is dropped on the primary managed instance, it will also be dropped automatically on the geo-secondary managed instance.

Use the read-write listener (primary MI)

For read-write workloads, use <fog-name>.zone_id.database.windows.net as the server name. Connections will be automatically directed to the primary. This name doesn't change after failover. The geo-failover involves updating the DNS record, so the new client connections are routed to the new primary only after the client DNS cache is refreshed. Because the secondary instance shares the DNS zone with the primary, the client application will be able to reconnect to it using the same server-side SAN certificate. The existing client connections need to be terminated and then recreated to be routed to the new primary. The read-write listener and read-only listener cannot be reached via the public endpoint for managed instance.

Use the read-only listener (secondary MI)

If you have logically isolated read-only workloads that are tolerant to data latency, you can run them on the geo-secondary. To connect directly to the geo-secondary, use <fog-name>.secondary.<zone_id>.database.windows.net as the server name.

In the Business Critical tier, SQL Managed Instance supports the use of read-only replicas to offload read-only query workloads, using the ApplicationIntent=ReadOnly parameter in the connection string. When you have configured a geo-replicated secondary, you can use this capability to connect to either a read-only replica in the primary location or in the geo-replicated location:

  • To connect to a read-only replica in the primary location, use ApplicationIntent=ReadOnly and <fog-name>.<zone_id>.database.windows.net.
  • To connect to a read-only replica in the secondary location, use ApplicationIntent=ReadOnly and <fog-name>.secondary.<zone_id>.database.windows.net.

The read-write listener and read-only listener can't be reached via public endpoint for managed instance.

Potential performance degradation after failover

A typical Azure application uses multiple Azure services and consists of multiple components. The automatic geo-failover of the failover group is triggered based on the state of the Azure SQL components alone. Other Azure services in the primary region may not be affected by the outage and their components may still be available in that region. Once the primary databases switch to the secondary region, the latency between the dependent components may increase. Ensure the redundancy of all the application's components in the secondary region and fail over application components together with the database so that application's performance is not affected by higher cross-region latency.

Potential data loss after failover

If an outage occurs in the primary region, recent transactions may not be able to replicate to the geo-secondary. Failover is deferred for the period you specify using GracePeriodWithDataLossHours. If you configured the automatic failover policy, be prepared for data loss. In general, during outages, Azure favors availability. Setting GracePeriodWithDataLossHours to a larger number, such as 24 hours, or disabling automatic geo-failover lets you reduce the likelihood of data loss at the expense of database availability.

DNS update

The DNS update of the read-write listener will happen immediately after the failover is initiated. This operation won't result in data loss. However, the process of switching database roles can take up to 5 minutes under normal conditions. Until it's completed, some databases in the new primary instance will still be read-only. If a failover is initiated using PowerShell, the operation to switch the primary replica role is synchronous. If it's initiated using the Azure portal, the UI will indicate completion status. If it's initiated using the REST API, use standard Azure Resource Manager's polling mechanism to monitor for completion.

Important

Use manual planned failover to move the primary back to the original location once the outage that caused the geo-failover is mitigated.

Save costs with a license-free DR replica

You can save on SQL Server license costs by configuring your secondary managed instance to be used for disaster recovery (DR) only. To set this up, see Configure free DR replica.

As long as the secondary instance is not used for read-workloads, Microsoft provides you with a free number of vCores to match the primary instance. You're still charged for compute and storage used by the secondary instance. Auto-failover groups support only one replica - the replica must either be a readable replica, or designated as a DR-only replica.

Enable scenarios dependent on objects from the system databases

System databases are not replicated to the secondary instance in a failover group. To enable scenarios that depend on objects from the system databases, make sure to create the same objects on the secondary instance and keep them synchronized with the primary instance.

For example, if you plan to use the same logins on the secondary instance, make sure to create them with the identical SID.

-- Code to create login on the secondary instance
CREATE LOGIN foo WITH PASSWORD = '<enterStrongPasswordHere>', SID = <login_sid>;

To learn more, see Replication of logins and agent jobs.

Synchronize instance properties and retention policies instances

Instances in a failover group remain separate Azure resources, and no changes made to the configuration of the primary instance will be automatically replicated to the secondary instance. Make sure to perform all relevant changes both on primary and secondary instance. For example, if you change backup storage redundancy or long-term backup retention policy on primary instance, make sure to change it on secondary instance as well.

Scale instances

You can scale up or scale down the primary and secondary instance to a different compute size within the same service tier or to a different service tier. When scaling up within the same service tier, we recommend that you scale up the geo-secondary first, and then scale up the primary. When scaling down within the same service tier, reverse the order: scale down the primary first, and then scale down the secondary. When you scale instance to a different service tier, this recommendation is enforced.

The sequence is recommended specifically to avoid the problem where the geo-secondary at a lower SKU gets overloaded and must be reseeded during an upgrade or downgrade process.

Prevent loss of critical data

Due to the high latency of wide area networks, geo-replication uses an asynchronous replication mechanism. Asynchronous replication makes the possibility of data loss unavoidable if the primary fails. To protect critical transactions from data loss, an application developer can call the sp_wait_for_database_copy_sync stored procedure immediately after committing the transaction. Calling sp_wait_for_database_copy_sync blocks the calling thread until the last committed transaction has been transmitted and hardened in the transaction log of the secondary database. However, it doesn't wait for the transmitted transactions to be replayed (redone) on the secondary. sp_wait_for_database_copy_sync is scoped to a specific geo-replication link. Any user with the connection rights to the primary database can call this procedure.

To prevent data loss during user-initiated, planned geo-failover, replication automatically and temporarily changes to synchronous replication, then performs a failover. Replication then returns to asynchronous mode after the geo-failover is complete.

Note

sp_wait_for_database_copy_sync prevents data loss after geo-failover for specific transactions, but does not guarantee full synchronization for read access. The delay caused by a sp_wait_for_database_copy_sync procedure call can be significant and depends on the size of the not yet transmitted transaction log on the primary at the time of the call.

Failover group status

Auto-failover group reports its status describing the current state of the data replication:

  • Seeding - Initial seeding is taking place after creation of the failover group, until all user databases are initialized on the secondary instance. Failover process cannot be initiated while auto-failover group is in the Seeding status, since user databases aren't copied to secondary instance yet.
  • Synchronizing - the usual status of auto-failover group. It means that data changes on the primary instance are being replicated asynchronously to the secondary instance. This status doesn't guarantee that the data is fully synchronized at every moment. There may be data changes from primary still to be replicated to the secondary due to asynchronous nature of the replication process between instances in the auto-failover group. Both automatic and manual failovers can be initiated while the auto-failover group is in the Synchronizing status.
  • Failover in progress - this status indicates that either automatically or manually initiated failover process is in progress. No changes to the failover group or additional failovers can be initiated while the auto-failover group is in this status.

Permissions

Permissions for a failover group are managed via Azure role-based access control (Azure RBAC).

Azure RBAC write access is necessary to create and manage failover groups. The SQL Managed Instance Contributor has all the necessary permissions to manage failover groups.

For specific permission scopes, review how to configure auto-failover groups in Azure SQL Managed Instance.

Limitations

Be aware of the following limitations:

  • Failover groups can't be created between two instances in the same Azure region.
  • Failover groups can't be renamed. You will need to delete the group and re-create it with a different name.
  • A failover group contains exactly two managed instances. Adding additional instances to the failover group is unsupported.
  • An instance can participate only in one failover group at any moment.
  • A failover group can't be created between two instances belonging to different Azure tenants.
  • A failover group between two instances belonging to different Azure subscriptions can't be created using Azure portal or Azure CLI. Use Azure PowerShell or REST API instead to create such a failover group. Once created, cross-subscription failover group is regularly visible in Azure portal and all subsequent operations including failovers can be initiated from Azure portal or Azure CLI.
  • Database rename isn't supported for databases in failover group. You will need to temporarily delete failover group to be able to rename a database.
  • System databases aren't replicated to the secondary instance in a failover group. Therefore, scenarios that depend on objects from the system databases such as Server Logins and Agent jobs, require objects to be manually created on the secondary instances and also manually kept in sync after any changes made on primary instance. The only exception is Service master Key (SMK) for SQL Managed Instance that is replicated automatically to secondary instance during creation of failover group. Any subsequent changes of SMK on the primary instance however will not be replicated to secondary instance. To learn more, see how to Enable scenarios dependent on objects from the system databases.
  • Failover groups can't be created between instances if any of them are in an instance pool.

Programmatically manage failover groups

Auto-failover groups can also be managed programmatically using Azure PowerShell, Azure CLI, and REST API. The following tables describe the set of commands available. Active geo-replication includes a set of Azure Resource Manager APIs for management, including the Azure SQL Database REST API and Azure PowerShell cmdlets. These APIs require the use of resource groups and support Azure role-based access control (Azure RBAC). For more information on how to implement access roles, see Azure role-based access control (Azure RBAC).

Cmdlet Description
New-AzSqlDatabaseInstanceFailoverGroup This command creates a failover group and registers it on both primary and secondary instances
Set-AzSqlDatabaseInstanceFailoverGroup Modifies configuration of a failover group
Get-AzSqlDatabaseInstanceFailoverGroup Retrieves a failover group's configuration
Switch-AzSqlDatabaseInstanceFailoverGroup Triggers failover of a failover group to the secondary instance
Remove-AzSqlDatabaseInstanceFailoverGroup Removes a failover group

Next steps