SQL Server 2016 HA Series Part 1 – 3 Auto Failover Targets
This series is intended to provide a short focused write up on some of the key new SQL Server 2016 HA features.
Got questions? Stick them in the comments below and we will do our best to answer them!
Part 1 – 3 Auto Failover Targets
With SQL Server 2016 we are enabling customers even greater scale and resiliency. One of the ways we help you achieve this is by increasing the number of automatic failover targets from 2 to 3 (2 + primary).
This is a cool new enhancement but it does add a new layer of complexity depending on how you initially configure your environment.
One of the questions we often get asked about this new feature is “How can i predict where my AG will end up on failover?”. Well the answer here is that there is no options inside SQL Server to configure this behavior as a “normal” configuration setting. However, there is a “workaround” which can be done but it’s not that pretty….
The behavior here is that failover will occur in replica “build” order. What I mean by this is that it’s the order in which you add your replicas to your Availability Group. This then in turn creates the role preference order in your cluster role.
In this example I add Replica 1 and 3 as synchronous mode in the initial creation of the Availability Group, then use the Add Replica wizard to add Repilca 2 as synchronous mode and Replica 4 as asynchronous mode.
Here is what the preferred owners look like in the cluster role properties in cluster manager
Based on my (made up) companies requirements , I expect the first auto failover to happen within the same data center, so in the below diagram I would expect SQL01 to be the primary and auto failover to happen to SQL02, and in the unlikely event that SQL02 goes down or the entire DC goes down then auto failover to happen to SQL03 on DC02
Trying out my setup then, SQL01 is currently primary and is about to suffer a server failure (I'm only pausing the cluster node here):
Auto failover happens as we expect, but the AG fails to SQL03 – not what we want here!
As I mentioned above, there is no way to alter this behavior without removing replicas and adding them back in again.
Don’t try and change the preferred owner list in cluster manager because although it will change the behavior for the next failover, once the environment back all back up and running again, and your back on your preferred replica (SQL01 in my case) SQL is clever and will change the failover order back again!
The take away here is if you are considering using 3 auto failover targets have a good think about your design and where you want your auto failovers to happen and in what order, then ensure you build your Availability Group in this order.