DR for Azure Data Platform - Scenario details

Data service topology

At a high level, the data service topology for Contoso’s data platform can be illustrated as:

[Diagram of the high-level Contoso data service topology.]

This logical diagram abstracts the key functions of the Contoso data ecosystem into a simplified, high-level view. This abstracted view supports the sections covering the scenario deployments, in line with the DR strategy selection and the segregation of responsibilities in a recovery process.

DR Impact vs Customer Activity

The following sections present a breakdown of Contoso activity necessary across DR events of varying impacts.

Area: Foundational components

  • Azure Active Directory including role entitlements
    • Contoso SKU selection: Premium P1
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: N/A
      • Azure Regional Failure: N/A
  • Management Groups
    • Contoso SKU selection: N/A
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: N/A
      • Azure Regional Failure: N/A
  • Subscriptions
    • Contoso SKU selection: N/A
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: N/A
      • Azure Regional Failure: N/A
  • Azure Key Vault
    • Contoso SKU selection: Standard
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: N/A
      • Azure Regional Failure: N/A
  • Azure Monitor
    • Contoso SKU selection: N/A
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: N/A
      • Azure Regional Failure: N/A
  • Microsoft Defender for Cloud
    • Contoso SKU selection: N/A
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: N/A
      • Azure Regional Failure: N/A
  • Cost Management
    • Contoso SKU selection: N/A
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: N/A
      • Azure Regional Failure: N/A
  • Azure DNS
    • Contoso SKU selection: N/A
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: N/A
      • Azure Regional Failure: N/A
  • Network Watcher
    • Contoso SKU selection: N/A
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: N/A
      • Azure Regional Failure: N/A
  • Recovery Services Vault
    • Contoso SKU selection: Default (GRS)
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: N/A
      • Azure Regional Failure: N/A
    • Notes
      • Cross Region Restore enables DR drills and lets the customer fail over to the secondary region
  • Virtual Networks, including Subnets, UDR & NSGs
    • Contoso SKU selection: N/A
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: N/A
      • Azure Regional Failure: Contoso would need to redeploy the Foundation and Data platform VNets with their attached UDRs & NSGs into the secondary region
    • Notes
      • Traffic Manager can be used to geo-route traffic between regions that hold replica VNet structures. If they have the same address space, they can't be connected to the on-premises network, as it would cause routing issues
      • At the time of a disaster and loss of a VNet in one region, you can connect the other VNet in the available region, with the matching address space to your on-premises network
  • Resource Groups
    • Contoso SKU selection: N/A
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: N/A
      • Azure Regional Failure: Contoso would need to redeploy the Foundation and Data platform Resource groups into the secondary region
    • Notes
      • This activity would be mitigated by implementing the “Warm Spare” strategy, having the network and resource group topology available in the secondary region
  • Azure Firewall
    • Contoso SKU selection: Standard
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: Contoso would need to validate availability and redeploy if necessary
      • Azure Regional Failure: Contoso would need to redeploy the Foundation Azure Firewalls into the secondary region
    • Notes
      • Azure Firewall can be created with Availability Zones for increased availability
      • A “Warm Spare” strategy would mitigate this activity
  • Azure DDoS
    • Contoso SKU selection: Network Protection
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: N/A
      • Azure Regional Failure: Contoso would need to create a DDoS protection plan for the Foundation’s VNets within the secondary region
  • ExpressRoute – Circuit
    • Contoso SKU selection: Standard
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: N/A
      • Azure Regional Failure: N/A
    • Notes
      • The physical circuit would remain the responsibility of Microsoft and the connectivity partner to recover
  • VPN Gateway
    • Contoso SKU selection: VpnGw1
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: Contoso would need to validate availability and redeploy if necessary
      • Azure Regional Failure: Contoso would need to redeploy the Foundation VPN Gateways into the secondary region
    • Notes
      • VPN Gateways can be created with Availability Zones for increased availability
      • A “Warm Spare” strategy would mitigate this activity
  • Load Balancer
    • Contoso SKU selection: Standard
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: Contoso would need to validate availability and redeploy if necessary
      • Azure Regional Failure: Contoso would need to redeploy the Foundation Load Balancers into the secondary region
  • Azure DevOps
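
The breakdown above can also be held as plain data, so that a runbook can list the Contoso activities for a given DR event. The following is a minimal Python sketch, not part of Contoso's platform: the names DR_IMPACT and activities_for are hypothetical, and only four of the foundational components listed above are included.

```python
from __future__ import annotations

# Hypothetical sketch: a subset of the foundational DR breakdown as plain data.
# Wording is taken from the list above; "N/A" means no Contoso activity is needed.
DATA_CENTER = "Azure Data Center Failure"
ZONE = "Availability Zone Failure"
REGION = "Azure Regional Failure"

VALIDATE = "Validate availability and redeploy if necessary"

DR_IMPACT: dict[str, dict[str, str]] = {
    "Azure Key Vault": {DATA_CENTER: "N/A", ZONE: "N/A", REGION: "N/A"},
    "Resource Groups": {
        DATA_CENTER: "N/A",
        ZONE: "N/A",
        REGION: "Redeploy the Foundation and Data platform resource groups "
                "into the secondary region",
    },
    "Azure Firewall": {
        DATA_CENTER: "N/A",
        ZONE: VALIDATE,
        REGION: "Redeploy the Foundation Azure Firewalls into the secondary region",
    },
    "VPN Gateway": {
        DATA_CENTER: "N/A",
        ZONE: VALIDATE,
        REGION: "Redeploy the Foundation VPN Gateways into the secondary region",
    },
}


def activities_for(event: str) -> dict[str, str]:
    """Return only the components that need Contoso activity for the event."""
    return {
        component: impact[event]
        for component, impact in DR_IMPACT.items()
        if impact[event] != "N/A"
    }


if __name__ == "__main__":
    for component, activity in activities_for(REGION).items():
        print(f"{component}: {activity}")
```

Keeping the breakdown as data rather than prose makes it easy to filter by event type during a DR drill, but the list above remains the reference.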

Area: Data Platform components

  • Storage Account – Azure Data Lake Gen2
    • Contoso SKU selection: LRS
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: Contoso would need to validate availability and redeploy if necessary
      • Azure Regional Failure: Contoso would need to redeploy the Data Platform Storage Accounts and rehydrate them with data in the secondary region
    • Notes
      • Storage Accounts have a broad range of data redundancy options from primary region redundancy up to secondary region redundancy
      • For secondary region redundancy, data is replicated to the secondary region asynchronously. A failure that affects the primary region may result in data loss if the primary region can't be recovered. Azure Storage typically has an RPO of less than 15 minutes
      • In a regional outage, storage accounts that are geo-redundant would be available in the secondary region as LRS. Additional configuration would need to be applied to uplift these components in the secondary region to be geo-redundant
  • Azure Synapse - Pipelines
    • Contoso SKU selection: Compute Optimized Gen2
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: N/A
      • Azure Regional Failure: Contoso would need to deploy and restore the Data Platform Azure Synapse Analytics into the secondary region and redeploy the pipelines
    • Notes
      • Automatic restore points are deleted after seven days
      • User-defined restore points are available. Currently (July 2022), there's a ceiling of 42 user-defined restore points that are automatically deleted after seven days
      • Synapse can also perform a DB restore in the local or remote region, and then immediately PAUSE the instance. This process incurs only storage costs and no compute costs, and it offers a way to keep a “live” DB copy at specific intervals
  • Azure Event Hubs
    • Contoso SKU selection: Standard
    • DR Impact
      • Azure Data Center Failure: Contoso would need to validate availability and redeploy if necessary
      • Availability Zone Failure: Contoso would need to validate availability and redeploy if necessary
      • Azure Regional Failure: Contoso would need to redeploy the Event Hubs instance into the secondary region
  • Azure IoT Hubs
    • Contoso SKU selection: Standard
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: N/A
      • Azure Regional Failure: Contoso would need to redeploy the IoT Hub into the secondary region
  • Azure Stream Analytics
    • Contoso SKU selection: Standard
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: N/A
      • Azure Regional Failure: Contoso would need to redeploy Stream Analytics into the secondary region
    • Notes
      • A key feature of Stream Analytics is its ability to recover from node failure
  • Azure Cognitive Services
    • Contoso SKU selection: Pay As You Go
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: N/A
      • Azure Regional Failure: N/A
  • Azure Machine Learning
    • Contoso SKU selection: General Purpose – D Series instances
    • DR Impact
      • Azure Data Center Failure: Contoso would need to validate availability and redeploy if necessary
      • Availability Zone Failure: Contoso would need to validate availability and redeploy if necessary
      • Azure Regional Failure: Contoso would need to redeploy Machine Learning into the secondary region
  • Azure Synapse – Data Explorer Pools
    • Contoso SKU selection: Compute Optimized Gen2
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: N/A
      • Azure Regional Failure: Contoso would need to redeploy Azure Synapse – Data Explorer Pools and pipelines into the secondary region
  • Azure Synapse – Spark Pools
    • Contoso SKU selection: Compute Optimized Gen2
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: N/A
      • Azure Regional Failure: Contoso would need to redeploy Azure Synapse – Spark Pools and pipelines into the secondary region
  • Azure Synapse – Serverless and Dedicated SQL Pools
    • Contoso SKU selection: Compute Optimized Gen2
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: N/A
      • Azure Regional Failure: Contoso would need to deploy and restore the Data Platform Azure Synapse Analytics into the secondary region
    • Notes
      • Automatic restore points are deleted after seven days
      • User-defined restore points are available. Currently (July 2022), there's a ceiling of 42 user-defined restore points that are automatically deleted after seven days
      • Synapse can also perform a DB restore in the local or remote region, and then immediately PAUSE the instance. This solution incurs only storage costs and no compute costs, and it offers a way to keep a “live” DB copy at specific intervals
  • Power BI
  • Azure Cosmos DB
    • Contoso SKU selection: Single Region Write with Periodic backup
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: N/A
      • Azure Regional Failure: Contoso should monitor, ensuring there are enough provisioned RUs in the remaining regions to support read & write activities
    • Notes
      • Single-region accounts may lose availability following a regional outage. To ensure high availability of your Cosmos DB instance, configure it with a single write region and at least a second (read) region and enable Service-Managed failover
  • Azure Cognitive Search
    • Contoso SKU selection: Standard S1
    • DR Impact
      • Azure Data Center Failure: Contoso would need to validate availability and redeploy if necessary
      • Availability Zone Failure: Contoso would need to validate availability and redeploy if necessary
      • Azure Regional Failure: Contoso would need to redeploy the Cognitive Search into the secondary region
  • Azure Data Share
    • Contoso SKU selection: N/A
    • DR Impact
      • Azure Data Center Failure: Contoso would need to validate availability and redeploy if necessary
      • Availability Zone Failure: Contoso would need to validate availability and redeploy if necessary
      • Azure Regional Failure: Contoso would need to redeploy the Data Share into the secondary region
  • Purview
    • Contoso SKU selection: N/A
    • DR Impact
      • Azure Data Center Failure: N/A
      • Availability Zone Failure: Contoso would need to validate availability and redeploy if necessary
      • Azure Regional Failure: Contoso would need to deploy an instance of Purview into the secondary region
    • Notes
      • This activity would be mitigated by implementing the “Warm Spare” strategy, having a second instance of Azure Purview available in the secondary region
      • A “Warm Spare” approach has the following key callouts:
        • The primary and secondary Azure Purview accounts can't be configured to the same Azure Data Factory, Azure Data Share and Synapse Analytics accounts, if applicable. As a result, the lineage from Azure Data Factory and Azure Data Share can't be seen in the secondary Azure Purview accounts
        • The integration runtimes are specific to an Azure Purview account. Hence, if scans must run in primary and secondary Azure Purview accounts in parallel, multiple self-hosted integration runtimes must be maintained
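
The Synapse restore-point notes above (seven-day deletion and a ceiling of 42 user-defined restore points) can be sketched as a small check. This is a minimal illustration in Python under stated assumptions, not a Synapse API: the function names are hypothetical, the creation timestamps are supplied by the caller, and a restore point exactly seven days old is assumed to be already deleted.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone

# Values taken from the notes above (stated as of July 2022).
RETENTION = timedelta(days=7)   # restore points are deleted after seven days
MAX_USER_DEFINED = 42           # ceiling on user-defined restore points


def live_restore_points(created: list[datetime], now: datetime) -> list[datetime]:
    """Keep the restore points that are still inside the retention window.

    Assumption: a point exactly seven days old is treated as deleted.
    """
    return [t for t in created if now - t < RETENTION]


def can_create_user_defined(created: list[datetime], now: datetime) -> bool:
    """True if another user-defined restore point fits under the ceiling."""
    return len(live_restore_points(created, now)) < MAX_USER_DEFINED


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    points = [now - timedelta(days=d) for d in (1, 6, 8)]
    print(len(live_restore_points(points, now)), "restore points still available")
    print("Can create another:", can_create_user_defined(points, now))
```

A check like this is useful when planning DR drills, because a restore point older than the retention window can't be relied on for the restore into the secondary region.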

Note

This section is intended as general guidance. The vendor’s documentation on disaster recovery, redundancy and backup should be consulted for the correct approach for a new component/service under consideration.

“Azure Data Center Failure” covers the situation where the impacted region doesn't offer Availability Zones.

If new or updated configuration or releases occurred at the point of the disaster event, they should be checked and redeployed (if necessary) as part of the work to bring the platform up to the current date.
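
The distinction between the three DR impact headings can be sketched as a simple lookup. This Python sketch is an interpretation of the note above, not an official classification: the function name dr_heading and the scope labels "datacenter" and "region" are hypothetical, and it assumes that a datacenter-level failure in a region that offers Availability Zones is assessed under “Availability Zone Failure”.

```python
def dr_heading(failure_scope: str, region_has_zones: bool) -> str:
    """Pick the DR impact heading to consult for a failure.

    failure_scope: "datacenter" or "region" (hypothetical labels).
    region_has_zones: whether the impacted region offers Availability Zones.
    """
    if failure_scope == "region":
        return "Azure Regional Failure"
    if failure_scope == "datacenter":
        # Assumption: in a zoned region, a datacenter-level failure is
        # assessed under the Availability Zone heading.
        if region_has_zones:
            return "Availability Zone Failure"
        return "Azure Data Center Failure"
    raise ValueError(f"Unknown failure scope: {failure_scope!r}")


if __name__ == "__main__":
    print(dr_heading("datacenter", region_has_zones=False))
    print(dr_heading("datacenter", region_has_zones=True))
    print(dr_heading("region", region_has_zones=True))
```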

Next steps

Now that you've learned about the scenario details, you can learn about recommendations related to this scenario.