Events
Mar 31, 11 PM - Apr 2, 11 PM
The ultimate Microsoft Fabric, Power BI, SQL, and AI community-led event. March 31 to April 2, 2025.
Register todayThis browser is no longer supported.
Upgrade to Microsoft Edge to take advantage of the latest features, security updates, and technical support.
The Service Bus Geo-Replication feature is one of the options to insulate Azure Service Bus applications against outages and disasters, providing replication of both metadata (entities, configuration, properties) and data (message data and message property / state changes).
Note
This feature is available for the Premium tier of Azure Service Bus.
The Geo-Replication feature ensures that the metadata and data of a namespace are continuously replicated from a primary region to one or more secondary regions.
Note
Currently only a single secondary is supported.
This feature allows promoting any secondary region to primary, at any time. Promoting a secondary repoints the name for the namespace to the selected secondary region, and switches the roles between the primary and secondary region. The promotion is nearly instantaneous once initiated.
Important
Region | Region | Region |
---|---|---|
AustraliaCentral | GermanyNorth | NorwayWest |
AustraliaCentral2 | GermanyWestCentral | PolandCentral |
AustraliaEast | IsraelCentral | SouthAfricaNorth |
AustraliaSoutheast | ItalyNorth | SouthAfricaWest |
BrazilSoutheast | JapanEast | SoutheastAsia |
CanadaCentral | JapanWest | SouthIndia |
CanadaEast | JioIndiaCentral | SpainCentral |
CentralIndia | JioIndiaWest | SwedenCentral |
CentralUS | KoreaCentral | SwitzerlandNorth |
CentralUSEUAP | KoreaSouth | SwitzerlandWest |
EastAsia | MexicoCentral | UAECentral |
EastUS2 | NorthCentralUS | UAENorth |
FranceCentral | NorthEurope | UKSouth |
FranceSouth | NorwayEast | UKWest |
The Geo-replication feature can be used to implement different scenarios, as described here.
Data and metadata are continuously synchronized between the primary and secondary regions. If a region lags or is unavailable, it's possible to promote a secondary region as the primary. This promotion allows for the uninterrupted operation of workloads in the newly promoted region. Such a promotion may be necessitated by degradation of Service Bus or other services within your workload, particularly if you aim to run the various components together. Depending on the severity and impacted services, the promotion could either be planned or forced. In case of planned promotion in-flight messages are replicated before finalizing the promotion, while with forced promotion this is immediately executed.
There are times when you want to migrate your Service Bus workloads to run in a different region. For example, when Azure adds a new region that is geographically closer to your location, users, or other services. Alternatively, you might want to migrate when the regions where most of your workloads run is shifted. The Geo-Replication feature also provides a good solution in these cases. In this case, you would set up Geo-Replication on your existing namespace with the desired new region as secondary region and wait for the synchronization to complete. At this point, you would start a planned promotion, allowing any in-flight messages to be replicated. Once the promotion is completed you can now optionally remove the old region, which is now the secondary region, and continue running your workloads in the desired region.
The Geo-Replication feature implements metadata and data replication in a primary-secondary replication model. At a given time there’s a single primary region, which is serving both producers and consumers. The secondaries act as hot stand-by regions, meaning that it isn't possible to interact with these secondary regions. However, they run in the same configuration as the primary region, allowing for fast promotion, and meaning they your workloads can immediately continue running after promotion has been completed. The Geo-Replication feature is available for the Premium tier.
Some of the key aspects of Geo-Replication feature are:
There are two replication modes, synchronous and asynchronous. It's important to know the differences between the two modes.
Using asynchronous replication, all requests are committed on the primary, after which an acknowledgment is sent to the client. Replication to the secondary regions happens asynchronously. Users can configure the maximum acceptable amount of lag time. The lag time is the service side offset between the latest action on the primary and the secondary regions. If the lag for an active secondary grows beyond user configuration, the primary starts throttling incoming requests.
Using synchronous replication, all requests are replicated to the secondary, which must commit and confirm the operation before committing on the primary. As such, your application publishes at the rate it takes to publish, replicate, acknowledge, and commit. Moreover, it also means that your application is tied to the availability of both regions. If the secondary region lags or is unavailable, messages won't be acknowledged and committed, and the primary will throttle incoming requests.
With synchronous replication:
On the other hand, synchronous replication provides the greatest assurance that your data is safe. If you have synchronous replication, then when we commit it, it commits in all of the regions you configured for Geo-Replication, providing the best data assurance.
With asynchronous replication:
As such, it doesn’t have the absolute guarantee that all regions have the data before we commit it like synchronous replication does, and data loss or duplication may occur. However, as you're no longer immediately impacted when a single region lags or is unavailable, application availability improves, in addition to having a lower latency.
Capability | Synchronous replication | Asynchronous replication |
---|---|---|
Latency | Longer due to distributed commit operations | Minimally impacted |
Availability | Tied to availability of secondary regions | Loss of a secondary region doesn't immediately impact availability |
Data consistency | Data always committed in both regions before acknowledgment | Data committed in primary only before acknowledgment |
RPO (Recovery Point Objective) | RPO 0, no data loss on promotion | RPO > 0, possible data loss on promotion |
The replication mode can be changed after configuring Geo-Replication. You can go from synchronous to asynchronous or from asynchronous to synchronous. If you go from asynchronous to synchronous, your secondary will be configured as synchronous after lag reaches zero. If you're running with a continual lag for whatever reason, then you may need to pause your publishers in order for lag to reach zero and your mode to be able to switch to synchronous. The reasons to have synchronous replication enabled, instead of asynchronous replication, are tied to the importance of the data, specific business needs, or compliance reasons, rather than availability of your application.
Note
In case a secondary region lags or becomes unavailable, the application will no longer be able to replicate to this region and will start throttling once the replication lag is reached. To continue using the namespace in the primary location, the afflicted secondary region can be removed. If no more secondary regions are configured, the namespace will continue without Geo-Replication enabled. It's possible to add additional secondary regions at any time.
To enable the Geo-Replication feature, you need to use primary and secondary regions where the feature is enabled. The Geo-Replication feature depends on being able to replicate published messages from the primary to the secondary regions. If the secondary region is on another continent, this has a major impact on replication lag from the primary to the secondary region. If using Geo-Replication for availability reasons, you're best off with secondary regions being at least on the same continent where possible. To get a better understanding of the latency induced by geographic distance, you can learn more from Azure network round-trip latency statistics.
The Geo-Replication feature enables customers to configure a secondary region towards which to replicate metadata and data. As such, customers can perform the following management tasks:
Note
Currently in the public preview only new namespaces are supported.
The following section is an overview to set up the Geo-Replication feature on a new namespace through the Azure portal.
Note
This experience might change during public preview. We'll update this document accordingly.
To create a namespace with the Geo-Replication feature enabled, add the geoDataReplication properties section.
param serviceBusName string
param primaryLocation string
param secondaryLocation string
param maxReplicationLagInSeconds int
resource sb 'Microsoft.ServiceBus/namespaces@2023-01-01-preview' = {
name: serviceBusName
location: primaryLocation
sku: {
name: 'Premium'
tier: 'Premium'
capacity: 1
}
properties: {
geoDataReplication: {
maxReplicationLagDurationInSeconds: maxReplicationLagInSeconds
locations: [
{
locationName: primaryLocation
roleType: 'Primary'
}
{
locationName: secondaryLocation
roleType: 'Secondary'
}
]
}
}
}
Once you create a namespace with the Geo-Replication feature enabled, you can manage the feature from the Replication (preview) blade.
To switch between replication modes, or update the maximum replication lag, click on the link under Replication consistency, and click the checkbox to enable / disable synchronous replication, or update the value in the textbox to change the asynchronous maximum replication lag.
To remove a secondary region, click on the ...-ellipsis next to the region, and click Delete. To delete the region, follow the instructions in the pop-up blade.
A promotion is triggered manually by the customer (either explicitly through a command, or through client owned business logic that triggers the command) and never by Azure. It gives the customer full ownership and visibility for outage resolution on Azure's backbone. When choosing Planned promotion, the service waits to catch up the replication lag before initiating the promotion. On the other hand, when choosing Forced promotion, the service immediately initiates the promotion. The namespace will be placed in read-only mode from the time that a promotion is requested, until the time that the promotion has completed. It is possible to do a forced promotion at any time after a planned promotion has been initiated. This puts the user in control to expedite the promotion, when a planned failover takes longer than desired.
Important
When using Forced promotion, any data that has not been replicated may be lost.
After the promotion is initiated:
The hostname is updated to point to the secondary region, which can take up to a few minutes.
Note
You can check the current primary region by initiating a ping command: ping your-namespace-fully-qualified-name
Clients automatically reconnect to the secondary region.
You can automate promotion either with monitoring systems, or with custom-built monitoring solutions. However, such automation takes extra planning and work, which is out of the scope of this article.
In the portal, click on the Promote icon, and follow the instructions in the pop-up blade to delete the region.
Execute the Azure CLI command to initiate the promotion. The Force property is optional, and defaults to false.
az rest --method post --url https://management.azure.com/subscriptions/<subscriptionId>/resourceGroups/<resourceGroup>/providers/Microsoft.ServiceBus/namespaces/<namespaceName>/failover?api-version=2023-01-01-preview --body "{'properties': {'PrimaryLocation': '<newPrimaryLocation>', 'api-version':'2023-01-01-preview', 'Force':'false'}}"
Users can monitor the progress of the replication job by monitoring the replication lag metric in Log Analytics.
AzureMetrics
| where TimeGenerated > ago(1h)
| where MetricName == "ReplicationLagDuration"
Publishing applications can publish data to geo replicated namespaces via the namespace hostname of the Geo-Replication enabled namespace. The publishing approach is the same as the non-Geo-Replication case and no changes to data plane SDKs or client applications are required. Publishing may not be available during the following circumstances:
Publisher applications can't directly access any namespaces in the secondary regions.
Consuming applications can consume data using the namespace hostname of a namespace with the Geo-Replication feature enabled. Consumer operations aren't supported from the moment that promotion is initiated until promotion is completed.
Note the following considerations to keep in mind with this release:
The Premium tier for Service Bus is priced per Messaging Unit. With the Geo-Replication feature, secondary regions run on the same number of MUs as the primary region, and the pricing is calculated over the total number of MUs. Additionally, there's a charge for based on the published bandwidth times the number of secondary regions. During the early public preview, this charge is waived.
To learn more about Service Bus messaging, see the following articles:
Events
Mar 31, 11 PM - Apr 2, 11 PM
The ultimate Microsoft Fabric, Power BI, SQL, and AI community-led event. March 31 to April 2, 2025.
Register today