Anycast routing with Azure Route Server

You can deploy your application across Availability Zones in a single Azure region to achieve higher availability, but sometimes you might need to deploy it in multiple regions to achieve higher resiliency, better performance for users across the globe, or better business continuity. There are different approaches to directing users to one of the locations where a multi-region application is deployed: DNS-based approaches such as Azure Traffic Manager, routing-based services such as Azure Front Door, or the Azure cross-region Load Balancer.

These Azure services are recommended for getting users to the best application location over the public internet using public IP addressing, but they don't support private networks and private IP addresses. This article explores a route-based approach (IP anycast) to provide multi-region application deployments over private networks.

IP anycast consists of advertising exactly the same IP address from more than one location, so that packets from the application users are routed to the closest region (in terms of routing). Providing multi-region reachability over anycast offers some advantages over DNS-based approaches: it doesn't rely on clients not caching their DNS answers, and it doesn't require modifying the DNS design for the application.

Topology

In the design of this scenario, the same IP address is advertised from virtual networks in different Azure regions, where network virtual appliances (NVAs) advertise the application's IP address through Azure Route Server. The following diagram depicts two simple hub and spoke topologies, each in a different Azure region. An NVA in each region advertises the same route (a.b.c.d/32 in this example) to its local Azure Route Server (the route prefix must not overlap with Azure and on-premises networks). The routes are further propagated to the on-premises network through ExpressRoute. When application users want to access the application from on-premises, the DNS infrastructure (not covered by this document) resolves the DNS name of the application to the anycast IP address (a.b.c.d), which the on-premises network devices route to one of the two regions.

Diagram shows an example of using IP anycast with Azure Route Server.
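
The requirement that the anycast prefix must not overlap with Azure and on-premises networks can be checked before deployment. The following is a minimal sketch using Python's standard ipaddress module. The function name and all prefixes shown are hypothetical placeholders, not values from this scenario.

```python
import ipaddress


def overlapping_networks(anycast_prefix, existing_prefixes):
    """Return the existing prefixes that overlap with the anycast prefix.

    An empty list means the anycast prefix is safe to advertise with
    respect to the prefixes that were supplied.
    """
    anycast = ipaddress.ip_network(anycast_prefix)
    return [
        prefix
        for prefix in existing_prefixes
        if anycast.overlaps(ipaddress.ip_network(prefix))
    ]


# Hypothetical address plan: hub, spoke, and on-premises ranges.
in_use = ["10.0.0.0/16", "10.1.0.0/16", "172.16.0.0/12"]

# A /32 taken from a range that is not used anywhere else does not overlap.
print(overlapping_networks("192.0.2.10/32", in_use))

# A /32 carved out of the hub range is reported as overlapping.
print(overlapping_networks("10.0.1.4/32", in_use))
```

The list passed to the function should contain every virtual network and on-premises prefix that is reachable in the same routing domain; the check is only as complete as that list.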

Which of the available regions is selected depends entirely on routing attributes. If the routes from both regions are identical, the on-premises network typically uses equal-cost multi-path (ECMP) routing to distribute application flows across the regions, sending each flow to one of them. You can also modify the advertisements generated by each NVA in Azure to make one of the regions preferred, for example by using BGP AS path prepending to establish a deterministic path from on-premises to the Azure workload.
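
The effect of AS path length on region selection can be illustrated with a simplified model. The sketch below assumes that AS path length is the only attribute that differs between the routes (real BGP best-path selection evaluates several other attributes first) and that ECMP picks a path by hashing the flow's 5-tuple (the actual hash is device-specific). The Route type, the function names, and the AS numbers are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    region: str
    as_path: tuple  # sequence of AS numbers as seen on-premises


def best_routes(routes):
    """Keep only the routes with the shortest AS path (simplified BGP)."""
    shortest = min(len(route.as_path) for route in routes)
    return [route for route in routes if len(route.as_path) == shortest]


def pick_region(routes, flow):
    """Pick the region for a flow: hash over the equal-cost candidates."""
    candidates = sorted(best_routes(routes), key=lambda route: route.region)
    digest = hashlib.sha256(repr(flow).encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(candidates)
    return candidates[index].region


# Hypothetical flow: source IP, source port, anycast IP, destination port, protocol.
flow = ("10.50.1.5", 50000, "192.0.2.10", 443, "tcp")

# Identical advertisements: both regions are candidates, and the flow hash decides.
equal = [Route("region-1", (65001,)), Route("region-2", (65002,))]
print(pick_region(equal, flow))

# The region-2 NVA prepends its AS twice, so region-1 is always preferred.
prepended = [Route("region-1", (65001,)), Route("region-2", (65002, 65002, 65002))]
print(pick_region(prepended, flow))
```

In the prepended case, region-2 receives traffic only if the region-1 route is withdrawn, which is what makes the path deterministic.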

Important

The NVAs advertising the routes should include a health check mechanism so that each NVA stops advertising the route when the application isn't available in its region, which avoids blackholing traffic.
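
One possible shape for such a health check is sketched below. It is not tied to any particular NVA product: the class name, the rise and fall thresholds, and the "announce"/"withdraw" return values are assumptions made for illustration, and the actual route change would be carried out by whatever BGP daemon the NVA runs. Requiring several consecutive results before changing state is a common way to avoid route flapping.

```python
import socket


def tcp_probe(host, port, timeout=2.0):
    """Return True if a TCP connection to the application succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class AnycastAdvertiser:
    """Decide when to announce or withdraw the anycast route.

    The route is announced after `rise` consecutive healthy probes and
    withdrawn after `fall` consecutive failed probes.
    """

    def __init__(self, rise=2, fall=3):
        self.rise = rise
        self.fall = fall
        self.advertised = False
        self._healthy_count = 0
        self._failed_count = 0

    def observe(self, healthy):
        """Record one probe result; return "announce", "withdraw", or None."""
        if healthy:
            self._healthy_count += 1
            self._failed_count = 0
            if not self.advertised and self._healthy_count >= self.rise:
                self.advertised = True
                return "announce"
        else:
            self._failed_count += 1
            self._healthy_count = 0
            if self.advertised and self._failed_count >= self.fall:
                self.advertised = False
                return "withdraw"
        return None
```

A control loop on the NVA would call `observe(tcp_probe(app_host, app_port))` at a fixed interval and pass any returned action to the BGP process, where `app_host` and `app_port` stand for the application endpoint in the local region.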

Return traffic

When the application traffic from the on-premises client arrives at one of the NVAs in Azure, the NVA either reverse-proxies the connection or performs Destination Network Address Translation (DNAT). Then, it sends the packets to the actual application, which typically resides in a spoke virtual network peered to the hub virtual network where the NVA is deployed. Return traffic from the application must go back through the NVA, which happens naturally if the NVA reverse-proxies the connection (or performs Source NAT in addition to Destination NAT).

Otherwise, traffic arriving at the application is still sourced from the original on-premises client's IP address. In this case, packets can be routed back to the NVA with user-defined routes (UDRs). Special care must be taken if there is more than one NVA instance in each region, since traffic could be asymmetric (the inbound and outbound traffic going through different NVA instances). Asymmetric traffic is typically not an issue if NVAs are stateless, but it results in errors if NVAs keep track of connection state, as firewalls do.
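
The reason asymmetric traffic breaks stateful NVAs can be shown with a toy model. In this sketch, each NVA instance keeps its own connection table and does not share it with its peers, which is an assumption about the NVA deployment. The class and the addresses are hypothetical.

```python
class StatefulNva:
    """Toy model of an NVA that tracks connections, such as a firewall."""

    def __init__(self, name):
        self.name = name
        self.connections = set()  # (client, application) pairs seen inbound

    def inbound(self, client, application):
        """Record the connection when the client's first packet passes through."""
        self.connections.add((client, application))

    def allows_return(self, application, client):
        """Return traffic is allowed only for connections this instance has seen."""
        return (client, application) in self.connections


nva_1 = StatefulNva("nva-1")
nva_2 = StatefulNva("nva-2")

# The inbound packet from the on-premises client crosses nva-1.
nva_1.inbound("10.50.1.5", "10.1.0.4")

# Symmetric return path through nva-1: the connection is known.
print(nva_1.allows_return("10.1.0.4", "10.50.1.5"))

# Asymmetric return path through nva-2: the connection is unknown and is dropped.
print(nva_2.allows_return("10.1.0.4", "10.50.1.5"))
```

Source NAT on the NVA avoids the problem because the application replies to the address of the specific instance that handled the inbound packet.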