Performing disaster recovery drills - Azure SQL Managed Instance

08/13/2024

Applies to: Azure SQL Managed Instance

Periodically test and validate that your applications are ready for a recovery workflow. Verifying application behavior, the implications of data loss, and the disruption that failover involves is good engineering practice. Most industry standards also require it as part of business continuity certification.

Performing a disaster recovery drill consists of:

Simulating a data tier outage
Recovering
Validating application integrity post recovery
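As an illustration only, the three phases can be sketched as an ordered list of steps. The function name run_drill and the phase callables are hypothetical placeholders, not part of any Azure tooling. The sketch records how long each phase takes and lets any failure stop the drill before the next phase starts.

```python
import time
from typing import Callable, Dict, List, Tuple


def run_drill(phases: List[Tuple[str, Callable[[], None]]]) -> Dict[str, float]:
    """Run drill phases in order and return elapsed seconds per phase.

    An exception raised by a phase propagates to the caller, so later
    phases never run after an earlier one has failed.
    """
    timings: Dict[str, float] = {}
    for name, action in phases:
        started = time.monotonic()
        action()  # hypothetical step supplied by the caller
        timings[name] = time.monotonic() - started
    return timings


if __name__ == "__main__":
    # Placeholder actions; a real drill would call your own procedures.
    steps = [
        ("simulate outage", lambda: None),
        ("recover", lambda: None),
        ("validate", lambda: None),
    ]
    for phase, seconds in run_drill(steps).items():
        print(f"{phase}: {seconds:.3f}s")
```

Keeping the per-phase timings gives you a record of how long recovery took in each drill, which can be compared between runs.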

Depending on how you designed your application for business continuity, the workflow to execute the drill can vary. This article describes the best practices for conducting a disaster recovery drill in the context of Azure SQL Managed Instance.

Geo-restore

To prevent potential data loss during a disaster recovery drill, run the drill in a test environment: create a copy of the production environment and use it to verify the application's failover workflow.

Outage simulation

To simulate the outage, you can rename the source database. This name change causes application connectivity failures.
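One way to perform the rename is a T-SQL ALTER DATABASE ... MODIFY NAME statement. The following sketch only builds the statement text; it doesn't connect to anything. The database names are hypothetical, and you should confirm the statement against the T-SQL reference for your instance before running it on the test copy.

```python
def quote_identifier(name: str) -> str:
    """Bracket-quote a T-SQL identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def build_rename_statement(current_name: str, drill_name: str) -> str:
    """Return the T-SQL text that renames a database.

    Assumption: the statement is executed by your own tooling against
    the test copy of the instance, not against production.
    """
    return (
        f"ALTER DATABASE {quote_identifier(current_name)} "
        f"MODIFY NAME = {quote_identifier(drill_name)};"
    )


if __name__ == "__main__":
    # Hypothetical database names, for illustration only.
    print(build_rename_statement("AppDb", "AppDb_drill"))
```

Calling the same function with the arguments swapped produces the statement that restores the original name after the drill.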

Recovery

Perform a geo-restore of the database to a different instance as described in disaster recovery guidance.
Change the application configuration to connect to the recovered instance and follow the Configure a database after recovery guide to complete the recovery.
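The configuration change in the second step often amounts to swapping the server in a connection string. The following is a minimal sketch, assuming a semicolon-delimited key=value connection string that uses a Server or Data Source key. The host names are hypothetical, and values that contain quoted semicolons aren't handled.

```python
def retarget_connection_string(conn_str: str, new_server: str) -> str:
    """Return conn_str with its Server / Data Source value replaced."""
    parts = []
    replaced = False
    for part in conn_str.split(";"):
        if not part.strip():
            continue  # skip the empty piece after a trailing semicolon
        key, sep, _value = part.partition("=")
        if sep and key.strip().lower() in ("server", "data source"):
            parts.append(f"{key}={new_server}")
            replaced = True
        else:
            parts.append(part)
    if not replaced:
        raise ValueError("connection string has no Server or Data Source key")
    return ";".join(parts) + ";"


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    original = "Server=tcp:prod-mi.example.net,1433;Database=AppDb;Encrypt=True;"
    print(retarget_connection_string(original, "tcp:drill-mi.example.net,1433"))
```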

Validation

Complete the drill by verifying the application integrity post recovery (including connection strings, logins, basic functionality testing, or other validations that are part of the standard application sign-off procedures).
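A validation pass like this can be sketched as a set of named checks that all run, so one failure doesn't hide the others. The check names and callables are hypothetical; each one stands in for whatever your sign-off procedure requires.

```python
from typing import Callable, Dict, List


def run_checks(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run every check and return the names of the ones that failed.

    A check fails if it returns a falsy value or raises an exception.
    """
    failed: List[str] = []
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            ok = False
        if not ok:
            failed.append(name)
    return failed


if __name__ == "__main__":
    # Placeholder checks; replace with real probes of your application.
    results = run_checks({
        "connection string points to recovered instance": lambda: True,
        "application login can authenticate": lambda: True,
        "basic read/write smoke test": lambda: True,
    })
    print("failed checks:", results)
```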

Failover groups

For an instance protected by a failover group, the drill involves a planned failover to the secondary instance. A planned failover ensures that the primary and secondary instances in the failover group remain in sync when the roles are switched. Unlike an unplanned failover, this operation doesn't result in data loss, so the drill can be performed in a production environment.

Configure your failover group with the failover policy that suits your business need, and test failover regardless of how your failover policy is configured. For more information, review test failover. A customer-managed failover policy is recommended to give you control over the failover process.

Important

Since system databases aren't replicated between instances in a failover group, manually recreate system objects on the secondary instance and then test environments with system object dependencies to ensure they continue functioning properly after a failover.
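To find system objects that still need to be recreated, one approach is to compare object names collected from each instance. The sketch below assumes you have already gathered the names yourself (for example, login names queried on the primary and on the secondary); it only computes the difference. The case-insensitive comparison is an assumption that depends on your instance collation.

```python
from typing import Iterable, List


def missing_on_secondary(
    primary_names: Iterable[str], secondary_names: Iterable[str]
) -> List[str]:
    """Return names present on the primary but absent on the secondary.

    Assumption: names compare case-insensitively; adjust this if your
    instance uses a case-sensitive collation.
    """
    present = {name.lower() for name in secondary_names}
    return sorted(
        name for name in set(primary_names) if name.lower() not in present
    )


if __name__ == "__main__":
    # Hypothetical login names, for illustration only.
    primary = ["app_login", "report_login", "ops"]
    secondary = ["APP_LOGIN", "ops"]
    print(missing_on_secondary(primary, secondary))
```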

Outage simulation

To simulate the outage, you can disable the web application or virtual machine connected to the database. This outage simulation results in connectivity failures for the web clients.
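To confirm that the simulated outage is actually visible to clients, you can probe the endpoint you disabled. The following is a minimal sketch of a TCP reachability check using only the Python standard library. The host and port are whatever your web application or virtual machine exposes; the values in the usage example are hypothetical.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    # Hypothetical endpoint, for illustration only.
    reachable = can_connect("app.example.net", 443)
    print("endpoint reachable:", reachable)
```

A successful TCP connection shows only that something is listening on the port. It doesn't show that the application behind it is healthy.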

Recovery

Make sure the application configuration in the DR region points to the former secondary, which becomes the fully accessible new primary.
Initiate a planned failover of the failover group from the secondary instance.
Follow the Configure a database after recovery guide to complete the recovery.
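If you script the failover step with the Azure CLI, the call is typically az sql instance-failover-group set-primary, run with the location of the secondary instance. The sketch below only assembles the argument list and doesn't execute it. The resource names are hypothetical, and you should verify the command and its parameters against the Azure CLI reference for your installed version.

```python
from typing import List


def build_planned_failover_command(
    failover_group: str, resource_group: str, secondary_location: str
) -> List[str]:
    """Return the Azure CLI arguments for a planned failover.

    The --allow-data-loss flag is deliberately omitted, because that
    flag requests a forced failover rather than a planned one.
    """
    return [
        "az", "sql", "instance-failover-group", "set-primary",
        "--name", failover_group,
        "--resource-group", resource_group,
        "--location", secondary_location,
    ]


if __name__ == "__main__":
    # Hypothetical resource names, for illustration only.
    command = build_planned_failover_command("my-fog", "my-rg", "westus2")
    print(" ".join(command))
```

Returning a list rather than a single string lets you pass the result directly to subprocess.run without shell quoting concerns.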

Validation

Complete the drill by verifying the application integrity post recovery (including connectivity, basic functionality testing, or other validations required for the drill sign-off).
