Reliability in Azure Image Builder (AIB)

This article contains specific reliability recommendations for Image Builder and cross-region disaster recovery and business continuity.

Azure Image Builder (AIB) is a regional service: each cluster serves a single region. This regional setup keeps data and resources within the regional boundary. AIB doesn't fail over its cluster or SQL database when a region goes down.

For an architectural overview of reliability in Azure, see Azure reliability.

Note

Azure Image Builder doesn't support availability zones.

Reliability recommendations

This section contains recommendations for achieving resiliency and availability. Each recommendation falls into one of two categories:

  • Health items cover areas such as configuration items and the proper function of the major components that make up your Azure workload, such as Azure resource configuration settings, dependencies on other services, and so on.

  • Risk items cover areas such as availability and recovery requirements, testing, monitoring, deployment, and other items that, if left unresolved, increase the chances of problems in the environment.

Reliability recommendations priority matrix

Each recommendation is marked in accordance with the following priority matrix:

Priority  Description
High      Immediate fix needed.
Medium    Fix within 3-6 months.
Low       Needs to be reviewed.

Reliability recommendations summary

Category            Priority  Recommendation
High Availability             Use generation 2 virtual machine source images
Disaster Recovery             Replicate image templates to a secondary region

High availability

Use generation 2 virtual machine (VM) source images

When building your image templates, use source images that support generation 2 VMs. Generation 2 VMs support key features that aren’t supported in generation 1 VMs such as:

  • Increased memory
  • Support for disks greater than 2 TB
  • UEFI-based boot architecture instead of BIOS-based boot, which can improve boot and installation times
  • Intel Software Guard Extensions (Intel SGX)
  • Virtualized persistent memory (vPMEM)

For more information on generation 2 VM features and capabilities, see Generation 2 VMs: Features and capabilities.

Disaster recovery

Replicate image templates to a secondary region

The Azure Image Builder service that's used to deploy Image Templates doesn’t currently support availability zones. Therefore, when building your image templates, you should replicate them to a secondary region, preferably to your primary region’s paired region. With a secondary region, you can quickly recover from a region failure and continue to deploy virtual machines from your image templates. For more information, see Cross-region disaster recovery and business continuity.

// Azure Resource Graph Query
// List all Image Templates that are not replicated to another region
resources
| where type =~ "microsoft.virtualmachineimages/imagetemplates"
| mv-expand distribution=properties.distribute
| where array_length(parse_json(distribution).replicationRegions) == 1
| project recommendationId = "it-2", name, id, param1=strcat("replicationRegions:",parse_json(distribution).replicationRegions)
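
The same kind of check can be run locally against a template you have exported or kept on file. The following is a minimal Python sketch, assuming the template is a dictionary shaped like the resource the query above inspects (`properties.distribute[].replicationRegions`). The function name `under_replicated_targets` and the sample template are hypothetical. Unlike the query, which matches exactly one replication region, this sketch flags any target that lists fewer than two.

```python
from typing import Any


def under_replicated_targets(
    template: dict[str, Any], minimum_regions: int = 2
) -> list[dict[str, Any]]:
    """Return distribute targets that list fewer than `minimum_regions` regions.

    Targets with no `replicationRegions` property are skipped, as they are
    by the Resource Graph query.
    """
    distribute = template.get("properties", {}).get("distribute", [])
    flagged = []
    for target in distribute:
        regions = target.get("replicationRegions")
        if regions is None:
            continue  # this distribute target has no replication setting
        if len(regions) < minimum_regions:
            flagged.append(target)
    return flagged


# Hypothetical template used only for illustration.
sample_template = {
    "name": "example-template",
    "properties": {
        "distribute": [
            {"type": "SharedImage", "runOutputName": "primary-only",
             "replicationRegions": ["eastus"]},
            {"type": "SharedImage", "runOutputName": "paired",
             "replicationRegions": ["eastus", "westus"]},
            {"type": "ManagedImage", "runOutputName": "managed"},
        ]
    },
}

for target in under_replicated_targets(sample_template):
    print(target["runOutputName"], target["replicationRegions"])
```

Only the `primary-only` target is reported here, because it is the only one that has a `replicationRegions` list with a single entry.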

Cross-region disaster recovery and business continuity

Disaster recovery (DR) is about recovering from high-impact events, such as natural disasters or failed deployments that result in downtime and data loss. Regardless of the cause, the best remedy for a disaster is a well-defined and tested DR plan and an application design that actively supports DR. Before you begin to think about creating your disaster recovery plan, see Recommendations for designing a disaster recovery strategy.

When it comes to DR, Microsoft uses the shared responsibility model. In this model, Microsoft ensures that the baseline infrastructure and platform services are available. However, many Azure services don't automatically replicate data or fail over from a failed region to another enabled region. For those services, you're responsible for setting up a disaster recovery plan that works for your workload. Most services that run on Azure platform as a service (PaaS) offerings provide features and guidance to support DR, and you can use these service-specific features to support fast recovery as you develop your DR plan.

To ensure fast and easy recovery for Azure Image Builder (AIB), it's recommended that you run an image template in region pairs or multiple regions when designing your AIB solution. You should also replicate resources from the start when you're setting up your image templates.

Multi-region geography disaster recovery

When a regional disaster occurs, Microsoft is responsible for outage detection, notifications, and support for AIB. However, you're responsible for setting up disaster recovery for the control (service side) and data planes.

Outage detection, notification, and management

Microsoft sends a notification if there's an outage in the Azure Image Builder (AIB) Service. One common outage symptom is image templates getting 500 errors when attempting to run. You can review Azure Image Builder outage notifications and status updates through support request management.

Set up disaster recovery and outage detection

You're responsible for setting up disaster recovery for your Azure Image Builder (AIB) environment, as there isn't a region failover at the AIB service side. You need to configure both the control plane (service side) and data plane.

It's recommended that you create an AIB resource in another nearby region, into which you can replicate your resources. For more information, see the supported regions and what resources are included in an AIB creation.

Single-region geography disaster recovery

In a disaster that affects a single region, you still need to get an image template resource from that region even when the region isn't available. You can either maintain a local copy of the image template or use Azure Resource Graph from the Azure portal to get the image template resource.
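
One way to keep a local copy is to store the template definition as a JSON file alongside your other deployment assets. The following is a minimal Python sketch, assuming you already have the template as a dictionary (for example, from an export). The function names, the `aib-backups` folder, and the example template are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def save_template_copy(template: dict[str, Any], directory: Path) -> Path:
    """Write the template to `<directory>/<template name>.json` and return the path."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{template['name']}.json"
    path.write_text(json.dumps(template, indent=2, sort_keys=True), encoding="utf-8")
    return path


def load_template_copy(path: Path) -> dict[str, Any]:
    """Read a previously saved template back into a dictionary."""
    return json.loads(path.read_text(encoding="utf-8"))


# Hypothetical template used only for illustration.
example = {
    "name": "example-template",
    "location": "eastus",
    "properties": {"distribute": []},
}

# A temporary directory stands in for wherever you keep backups.
with tempfile.TemporaryDirectory() as scratch:
    saved = save_template_copy(example, Path(scratch) / "aib-backups")
    restored = load_template_copy(saved)
    print(saved.name, restored == example)
```

Keeping the file under version control gives you a copy that doesn't depend on the affected region being reachable.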

To get an image template resource using Resource Graph from the Azure portal:

  1. Go to the search bar in the Azure portal and search for Resource Graph Explorer.

    Screenshot of Azure Resource Graph Explorer in the portal.

  2. Use the search bar on the far left to search for the resource by type and name, and then review the details, which show the properties of the image template. The See details option on the bottom right shows the image template's properties attribute and tags separately. You can use the template name, location, ID, and tenant ID to get the correct image template resource.

    Screenshot of using Azure Resource Graph Explorer search.
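
If you prefer to paste a query into Resource Graph Explorer rather than browse, the lookup in the steps above can be expressed as a short query. The following is a minimal Python sketch that builds such a query string for a given template name. The function name `build_template_lookup_query` is hypothetical, and the name check is a deliberately conservative assumption, not the service's actual naming rule; it exists only to keep quotes out of the query text.

```python
import re

IMAGE_TEMPLATE_TYPE = "microsoft.virtualmachineimages/imagetemplates"

# Assumption: accept only letters, digits, '.', '_' and '-' in the name.
_SAFE_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def build_template_lookup_query(template_name: str) -> str:
    """Return a Resource Graph query that finds an image template by name."""
    if not _SAFE_NAME.fullmatch(template_name):
        raise ValueError(f"unexpected characters in template name: {template_name!r}")
    return "\n".join(
        [
            "resources",
            f'| where type =~ "{IMAGE_TEMPLATE_TYPE}"',
            f'| where name =~ "{template_name}"',
            # Columns that help identify the template, plus its definition.
            "| project name, location, id, tenantId, properties, tags",
        ]
    )


# "example-template" is a placeholder name.
print(build_template_lookup_query("example-template"))
```

The projected columns match the fields the step above calls out: template name, location, ID, and tenant ID, along with the properties and tags.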

Capacity and proactive disaster recovery resiliency

Microsoft and its customers operate under the shared responsibility model. In customer-enabled DR (customer-responsible services), you're responsible for addressing DR for any service you deploy and control. To ensure that recovery is proactive, you should always pre-deploy secondaries. Without pre-deployed secondaries, there's no guarantee of capacity at time of impact.

When planning where to replicate a template, consider:

  • AIB region availability:
  • Azure paired regions:
    • For your geographic area, choose two regions paired together.
    • Recovery efforts prioritize paired regions when prioritization is needed.
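
The two considerations above can be combined into a simple selection rule: prefer the paired region, but only if AIB is available there. The following is a minimal Python sketch of that rule. The `REGION_PAIRS` table is a small illustrative subset, and both the `pick_secondary_region` function and the `aib_regions` input are hypothetical; check the current paired-regions and AIB supported-regions lists before relying on any of these values.

```python
from typing import Optional

# Illustrative subset of Azure region pairs, not a complete list.
REGION_PAIRS = {
    "eastus": "westus",
    "westus": "eastus",
    "eastus2": "centralus",
    "centralus": "eastus2",
    "northeurope": "westeurope",
    "westeurope": "northeurope",
}


def pick_secondary_region(primary: str, aib_regions: set[str]) -> Optional[str]:
    """Return the paired region if AIB is available there, otherwise None."""
    paired = REGION_PAIRS.get(primary.lower())
    if paired is not None and paired in aib_regions:
        return paired
    return None  # no suitable pair known; choose another region manually


# Hypothetical availability set supplied by the caller.
available = {"eastus", "westus", "northeurope"}
print(pick_secondary_region("eastus", available))
print(pick_secondary_region("northeurope", available))
```

Returning `None` instead of guessing keeps the decision with you when the paired region isn't known or isn't in the availability set you supplied.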

Additional guidance

For information about data processing, see the Azure Image Builder data residency details.
