Exchange 2010 Datacenter switchover
When Exchange 2010 was released and the DAG feature was first introduced, everybody was excited about it and looking forward to deploying it. Performing disaster recovery, however, is a nightmare because it requires multiple steps. I thought that sharing the steps for a site switchover as a scenario-based walkthrough would help anyone who wants to either create a DR document or perform an actual site switchover.
In this blog I will cover only the Mailbox role and simulate a Mailbox server failure in the production site that requires a full site switchover. Activating the other server roles requires additional steps that are not covered here.
We have the organization below, which contains two sites:
CLT site (production)
ADA-CLT-MBX, ADA-CLT-MBX02, ADA-CLT-HC
SEA site (DR)
ADA-SEA-MBX, ADA-SEA-HC
We are going to deploy a DAG across the three Mailbox servers and learn how to do a site switchover.
First we need to create a new domain called Adatum.com.
Place one DC in the CLT site and another in the SEA site.
Install Exchange on all servers in both the CLT and SEA sites.
Follow this blog to implement the DAG.
DAG without DAC mode enabled
To test our DR site switchover, we need to do the following steps:
- Simulate a power failure in the production site by switching off all Exchange Mailbox servers in the main site.
- Stop the Cluster service on each DAG member in the second datacenter by running the following command on each member:
net stop clussvc
- On a DAG member in the second datacenter, force a quorum start of the Cluster service by running the following command:
net start clussvc /forcequorum
- Open the Failover Cluster Management tool and connect to the DAG's underlying cluster. Expand the cluster, and then expand Nodes. Right-click each node in the primary datacenter, select More Actions, and then select Evict.
- Activate the Mailbox servers in the DR site.
The quorum must be modified based on the number of DAG members in the second datacenter.
- If there's an odd number of DAG members, change the DAG quorum model from Node and File Share Majority to Node Majority by running the following command:
cluster <DAGName> /quorum /nodemajority
- If there's an even number of DAG members, reconfigure the witness server and directory by running the following command in the Exchange Management Shell:
Set-DatabaseAvailabilityGroup <DAGName> -WitnessServer <ServerName>
In our scenario the second datacenter has an odd number of DAG members (ADA-SEA-MBX only), so we will use:
cluster loayaldag /quorum /nodemajority
- Start the Cluster service on any remaining DAG members in the second datacenter by running the following command:
net start clussvc
- Perform server switchovers to activate the mailbox databases in the DAG by running the following command:
Get-MailboxDatabase | Move-ActiveMailboxDatabase -ActivateOnServer ADA-SEA-MBX -SkipActiveCopyChecks -SkipHealthChecks -SkipClientExperienceChecks -SkipLagChecks -MountDialOverride:BestEffort
If the databases are still not mounted after the above command, you can run this command:
Get-MailboxDatabase -Server <DAGMemberinSecondSite> | Mount-Database
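To confirm that the switchover took effect, it can help to check where each database is mounted and what state its copies are in. The following is a minimal sketch that uses the server name from this scenario; it assumes it is run from the Exchange Management Shell on a DR-site server, and it only reads state.

```powershell
# Show which server hosts the active copy of each database and whether it is mounted.
Get-MailboxDatabase -Status | Format-Table Name, Server, Mounted -AutoSize

# Show the status and queue lengths of the copies hosted on the DR Mailbox server.
Get-MailboxDatabaseCopyStatus -Server ADA-SEA-MBX |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength -AutoSize
```

If a database still shows as dismounted here, that is the case where Mount-Database is worth trying.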
Now you need to change the OWA URL and the MX records to point to the Hub Transport and Client Access servers in the DR site.
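As a hedged illustration of the OWA part of this step, the external URL on the DR Client Access server can be inspected and set from the Exchange Management Shell. The virtual directory name "owa (Default Web Site)" is the default one, and the URL https://mail.adatum.com/owa is a placeholder assumed for this example. The MX change itself is made in your DNS zone, not in Exchange.

```powershell
# Inspect the current OWA URLs on the DR Client Access server.
Get-OwaVirtualDirectory -Server ADA-SEA-HC |
    Format-List Identity, InternalUrl, ExternalUrl

# Set the external URL (placeholder value; replace it with your own namespace).
Set-OwaVirtualDirectory -Identity "ADA-SEA-HC\owa (Default Web Site)" -ExternalUrl "https://mail.adatum.com/owa"
```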
After power is restored in the primary site, we need to reactivate the service there:
- Start all Mailbox servers.
- Remove the database copies from the primary site.
- Remove the servers from the DAG.
- Add the servers back to the DAG, using either the EMC or Add-DatabaseAvailabilityGroupServer.
- Add the database copies again.
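The failback steps above could be scripted roughly as follows. This is a sketch under assumptions rather than a tested procedure: it assumes the CLT servers were evicted from the cluster during the switchover (which is why -ConfigurationOnly is used when removing them from the DAG), and it re-adds copies only for the databases that had a copy on each server before. The variable names are illustrative.

```powershell
$dag        = "loayaldag"
$cltServers = "ADA-CLT-MBX", "ADA-CLT-MBX02"

foreach ($server in $cltServers) {
    # Remember which databases have a copy on this primary-site server.
    $dbs = @(Get-MailboxDatabase -Server $server)

    # Remove the (now passive) copies hosted on this server.
    foreach ($db in $dbs) {
        Remove-MailboxDatabaseCopy -Identity "$($db.Name)\$server" -Confirm:$false
    }

    # Remove the server from the DAG object in Active Directory, then add it back.
    Remove-DatabaseAvailabilityGroupServer -Identity $dag -MailboxServer $server -ConfigurationOnly
    Add-DatabaseAvailabilityGroupServer -Identity $dag -MailboxServer $server

    # Recreate the copies; each new copy is seeded from the active copy in SEA.
    foreach ($db in $dbs) {
        Add-MailboxDatabaseCopy -Identity $db.Name -MailboxServer $server
    }
}
```

The foreach statements collect results before acting on them, which avoids running one Exchange cmdlet inside another cmdlet's pipeline. If old database files are still on disk, a new copy may fail to seed until they are removed, and an evicted server may need its old cluster configuration cleaned up before it can rejoin the DAG.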
DAG with DAC mode enabled
- First we need to enable DAC mode by running:
Set-DatabaseAvailabilityGroup -Identity loayaldag -DatacenterActivationMode DagOnly
- Simulate the failure by shutting down the Mailbox servers in the primary site.
- The Cluster service must be stopped on each DAG member in the second datacenter. From the Exchange Management Shell:
Stop-Service ClusSvc
or, from a command prompt:
net stop clussvc
- The DAG members in the primary datacenter must be marked as stopped. Stopped is a state of Active Manager that prevents databases from mounting. Active Manager on each server in the failed datacenter is put into this state by using the Stop-DatabaseAvailabilityGroup cmdlet. If a Mailbox server is unavailable but Active Directory is still operating in the primary datacenter, Stop-DatabaseAvailabilityGroup must be run with the ConfigurationOnly parameter against all servers in this state in the primary datacenter.
In our scenario the Mailbox servers are off but Active Directory is still available, so we will use the ConfigurationOnly parameter:
Stop-DatabaseAvailabilityGroup -Identity loayaldag -ActiveDirectorySite CLT -ConfigurationOnly
- The steps to complete activation of the Mailbox servers in the second datacenter are as follows:
The Mailbox servers in the standby datacenter are activated by using the Restore-DatabaseAvailabilityGroup cmdlet. The Active Directory site of the standby datacenter is passed to the cmdlet to identify which servers to use to restore service and to configure the DAG to use an alternate witness server. If the alternate witness server wasn't previously configured, you can configure it by using the AlternateWitnessServer and AlternateWitnessDirectory parameters of the Restore-DatabaseAvailabilityGroup cmdlet:
Restore-DatabaseAvailabilityGroup -Identity loayaldag -ActiveDirectorySite sea -AlternateWitnessServer ada-sea-hc -AlternateWitnessDirectory c:\loayaldag
- The databases can now be activated. Depending on the specific configuration used by the organization, this may not be automatic. If the servers in the standby datacenter have an activation blocked setting, the system won't do an automatic failover from the primary datacenter to the standby datacenter of any database. If no failover restrictions are present for any of the database copies in the standby datacenter, the system will activate copies in the second datacenter assuming they are healthy. If databases are configured with an activation blocked setting that requires explicit manual action, there are two choices for action:
- Clear the setting that blocks activation. This will make the system return to its default behavior, which is to activate any available copy.
- Leave the setting unchanged and use the Move-ActiveMailboxDatabase cmdlet to complete the database activation in the second datacenter. To complete this step using the Move-ActiveMailboxDatabase cmdlet when activation blocked is set, you must explicitly identify the target of the move.
In our scenario there is no activation block, so we will go straight to activation:
Get-MailboxDatabase | Move-ActiveMailboxDatabase -ActivateOnServer ada-sea-mbx -SkipActiveCopyChecks -SkipClientExperienceChecks -SkipHealthChecks -SkipLagChecks -MountDialOverride:BestEffort
- After we fix the issues on the servers in the CLT site, we need to restore service there and add the servers back to the DAG.
- First, reincorporate the DAG members in the primary site by using the Start-DatabaseAvailabilityGroup cmdlet. Then, to make sure that the DAG is using the proper quorum model, run the Set-DatabaseAvailabilityGroup cmdlet against the DAG without specifying any parameters other than its identity.
Start-DatabaseAvailabilityGroup -Identity loayaldag -ActiveDirectorySite clt
Set-DatabaseAvailabilityGroup -Identity loayaldag
- After the Mailbox servers in the primary datacenter have been incorporated into the DAG, they will need some time to synchronize their database copies. Depending on the nature of the failure, the length of the outage, and the actions taken by an administrator during the outage, this may require reseeding the database copies.
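To see whether the primary-site servers have rejoined the DAG and whether a copy needs reseeding, something like the following can be used. It is a sketch only: the database name DB01 is a hypothetical placeholder, and the reseed commands should be run only for copies that fail to resynchronize on their own.

```powershell
# Confirm which DAG members are currently marked as started and as stopped.
Get-DatabaseAvailabilityGroup -Identity loayaldag |
    Format-List Name, StartedMailboxServers, StoppedMailboxServers

# Check copy status and queue lengths on a primary-site server.
Get-MailboxDatabaseCopyStatus -Server ADA-CLT-MBX |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength -AutoSize

# Reseed a copy that does not recover (DB01 is a placeholder database name).
Suspend-MailboxDatabaseCopy -Identity "DB01\ADA-CLT-MBX" -Confirm:$false
Update-MailboxDatabaseCopy -Identity "DB01\ADA-CLT-MBX" -DeleteExistingFiles
```

Shrinking copy and replay queue lengths suggest that a copy is catching up on its own and does not need a reseed.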
I hope the above is clear and straightforward.