Migrate your cluster to support multiple availability zones (Preview)

Many Azure regions provide availability zones, which are separated groups of datacenters within a region. Availability zones are close enough to have low-latency connections to other availability zones. They're connected by a high-performance network with a round-trip latency of less than 2 ms. However, availability zones are far enough apart to reduce the likelihood that more than one will be affected by local outages or weather. Availability zones have independent power, cooling, and networking infrastructure. They're designed so that if one zone experiences an outage, then regional services, capacity, and high availability are supported by the remaining zones. For more information, see Azure Availability Zones.

Azure Data Explorer clusters can be configured to use availability zones in supported regions. By using availability zones, a cluster can better withstand the failure of a single datacenter in a region to support business continuity scenarios.

You can configure availability zones when creating a cluster in the Azure portal or programmatically using one of the following methods:

  • REST API
  • C# SDK
  • Python SDK
  • PowerShell
  • ARM Template

Important

  • Once a cluster is configured with availability zones, you can't change the cluster to not use availability zones.
  • Availability zones aren't supported in all regions. Clusters located in regions without availability zone support can't be set up to use availability zones.
  • Using availability zones incurs additional costs.

Note

  • Before you proceed, make sure you're familiar with the migration process and considerations.
  • You can also use these steps to change the zones of an existing cluster that uses availability zones.


Prerequisites

  • Make sure your cluster is in a region where migration to multiple availability zones is supported.

  • To migrate a cluster to support availability zones, you need a cluster that was deployed without any availability zones.

  • To change the zones of a cluster, you need a cluster that is already configured with availability zones.

  • For REST API, familiarize yourself with Manage Azure resources by using the REST API.

  • For other programmatic methods, see Prerequisites.

Get the list of availability zones for your cluster's region

You can get the list of availability zones for your cluster as follows:

  1. In the Azure portal, go to your cluster's Overview page.

  2. Under Settings, select Scale up.

  3. In the row for your cluster, the availability zones are listed in the Availability zones column.


Configure your cluster to support availability zones

To add availability zones to an existing cluster, you must update the cluster zones attribute with a list of the target availability zones. Follow the instructions for your preferred method, using the information in the following table:

Parameter          Value
subscriptionId     The subscription ID of the cluster
resourceGroupName  The resource group name of the cluster
clusterName        The name of the cluster
apiVersion         2023-05-02 or later

Important

Changing the availability zones for an existing cluster only changes the availability zones for the compute. The persistent storage is not changed.

To use an ARM template, follow the instructions on how to deploy a template. To use the REST API, follow these steps:

  1. Make the REST API call to the following endpoint where you replace the parameters with your values:

    PUT https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Kusto/clusters/{clusterName}?api-version={apiVersion}
    
  2. Specify your availability zones in the request body. For example, to configure the cluster to use availability zones 1, 2, and 3, set the body as follows:

    { "zones": [ "{zone1}", "{zone2}", "{zone3}" ] }
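
The two steps above can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the function name `build_zone_update_request`, the sample subscription ID, resource group, cluster name, and the `<access-token>` placeholder are all hypothetical, and acquiring a valid Microsoft Entra access token is assumed to happen elsewhere. The sketch only builds the request, so nothing is sent until you call `urllib.request.urlopen` yourself.

```python
import json
import urllib.request

MANAGEMENT_ENDPOINT = "https://management.azure.com"


def build_zone_update_request(subscription_id, resource_group_name, cluster_name,
                              zones, bearer_token, api_version="2023-05-02"):
    """Build (but don't send) the PUT request that sets the cluster's zones.

    Hypothetical helper: the URL and body follow the endpoint and example
    body shown in the steps above.
    """
    url = (
        f"{MANAGEMENT_ENDPOINT}/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group_name}"
        f"/providers/Microsoft.Kusto/clusters/{cluster_name}"
        f"?api-version={api_version}"
    )
    # Zone identifiers are sent as strings, as in the example body.
    body = json.dumps({"zones": [str(zone) for zone in zones]}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values for illustration only.
request = build_zone_update_request(
    "00000000-0000-0000-0000-000000000000",
    "my-resource-group",
    "mycluster",
    zones=["1", "2", "3"],
    bearer_token="<access-token>",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually send the request: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect the URL and body before changing a live cluster.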
    

During the migration, the following message appears in the Azure portal, on the cluster's overview page. The message is removed after the migration completes.

Zonality change for the storage of this cluster is in progress. Update time may vary depending on the amount of data.
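
If you prefer to check the configured zones programmatically, one option is to issue a GET against the same endpoint and compare the returned zones with the zones you requested. The sketch below assumes the cluster resource returned by that call carries a top-level `zones` array. The helper name `zones_match` and the sample response are hypothetical. The check compares only the zones attribute and does not track the storage migration that the portal message refers to.

```python
import json


def zones_match(cluster_resource, target_zones):
    """Return True if the cluster resource lists exactly the target zones.

    Assumption: `cluster_resource` is the parsed JSON of a GET on the
    cluster endpoint and exposes a top-level "zones" array.
    """
    configured = cluster_resource.get("zones") or []
    # Compare as sorted strings so that order and int/str differences don't matter.
    return sorted(map(str, configured)) == sorted(map(str, target_zones))


# Hypothetical sample response, trimmed to the fields used here.
sample_response = json.loads(
    '{"name": "mycluster", "zones": ["1", "2", "3"]}'
)
print(zones_match(sample_response, ["3", "2", "1"]))
print(zones_match({"name": "mycluster"}, ["1"]))
```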

Architecture of clusters with availability zones

When availability zones are configured, a cluster's resources are deployed as follows:

  • Compute layer: Azure Data Explorer is a distributed computing platform that has two or more nodes. If availability zones are configured, compute nodes are distributed across the defined availability zones for maximum intra-region resiliency. A zone failure might degrade cluster performance until the failed compute resources are redeployed in the surviving zones. We recommend configuring the maximum number of available zones in a region.

    Note

    • In some cases, due to compute capacity limitations, only some of the selected availability zones will be available for the compute layer.
    • A cluster's compute layer implements a best-effort approach to evenly spread instances across the selected zones.
  • Persistent storage layer: Clusters use Azure Storage as their durable persistence layer. If availability zones are configured, zone-redundant storage (ZRS) is enabled, placing storage replicas across all three availability zones for maximum intra-region resiliency.

    Note

    • ZRS incurs an additional cost.
    • When availability zones aren't configured, storage resources are deployed with the default setting of locally redundant storage (LRS), placing all three replicas in a single zone.
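
To see why configuring more zones limits the impact of a zone failure on the compute layer, the following sketch spreads nodes across zones round-robin and computes the share of nodes that survives the loss of one zone. It's an illustration only: round-robin placement is an assumption standing in for the service's best-effort spreading, and the function names are hypothetical.

```python
from collections import Counter
from itertools import cycle, islice


def spread_evenly(node_count, zones):
    """Assign nodes to zones round-robin and return the node count per zone.

    Assumption: round-robin is a stand-in for the service's best-effort
    spreading, which might place nodes differently under capacity limits.
    """
    return Counter(islice(cycle(zones), node_count))


def surviving_fraction(node_count, zones, failed_zone):
    """Fraction of compute nodes left if `failed_zone` becomes unavailable."""
    placement = spread_evenly(node_count, zones)
    return (node_count - placement[failed_zone]) / node_count


# Compare a six-node cluster spread over two zones and over three zones.
for zones in (["1", "2"], ["1", "2", "3"]):
    placement = spread_evenly(6, zones)
    print(dict(placement), surviving_fraction(6, zones, "1"))
```

With six nodes, losing one of two zones removes half of the compute, whereas losing one of three zones removes a third.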

Migration process

When an existing cluster that was deployed without any availability zones is configured to support availability zones, the following steps take place as part of the migration process:

  • Compute is distributed in the defined availability zones

    The process of redistributing compute resources involves a preparation stage in which the cache of the new zonal compute resources is warmed. During the preparation stage, the existing cluster's compute resources continue to function, ensuring uninterrupted service. This preparation stage can take up to tens of minutes. The transition to the new compute resources occurs only once they're fully prepared and operational. This parallel approach ensures a relatively seamless experience, with only minimal service disruption during the switchover, typically lasting between one and three minutes. However, query performance might be affected during the SKU migration. The degree of impact can vary depending on specific usage patterns.

  • Historical persistent storage data is migrated to ZRS

    The migration process depends on regional support for the transition from LRS to ZRS storage, as well as on the available storage account capacity in the selected zones. The transfer of historical data can be time-consuming, potentially taking several hours or even weeks.

  • All new data is written to ZRS

    After the request for migration to availability zones is initiated, all new data is replicated and stored in the ZRS configuration.

    Note

    • Following the migration request, there might be a delay of up to several minutes before all new data begins to be written in the ZRS configuration.
    • If a cluster has streaming ingestion, the recycling of new data to be written as ZRS data can take up to 30 days.

Considerations

The request for migration to availability zones might not be successful due to capacity constraints. For a successful migration, there must be sufficient compute and storage capacity to support the migration. If there are capacity limitations, you'll get an error message indicating the issue.