Automated Disaster Recovery Testing and Failover with Hyper-V Replica and PowerShell 3.0 for FREE!
Hyper-V Replica is a new built-in feature of both Windows Server 2012 and our FREE Hyper-V Server 2012 products. It lets Hyper-V hosts or clusters replicate running VMs to remote Hyper-V hosts over a standard IP WAN connection, which provides a very cost-effective disaster recovery solution in the event of a primary data center outage. To get this feature in the VMware world, we'd pay gobs of extra money to license "Site Recovery Manager", but in Hyper-V it's just included as part of the core feature set - no extra cost.
To learn more about Hyper-V Replica, Kevin Remde, my friend and colleague, has a great new post over on his "Full of I.T." blog. Be sure to check it out!
How Can I Automate Disaster Recovery and DR Testing with Hyper-V Replica?
There are two ways to manage the Hyper-V Replica feature:
- GUI - We can use the built-in Hyper-V Manager tool to manage Hyper-V Replica settings, enable VMs for replication, and execute planned failovers, unplanned failovers and test failovers ... but, being a GUI tool, it's intended for interacting with Hyper-V Replica on an individual VM-by-VM basis.
- Script - PowerShell 3.0 (of course - what else?!) includes 22 Cmdlets for configuring, enabling, monitoring, and managing Hyper-V Replica on an automated basis.
Since most organizations have many VMs and often want an automated solution for managing Disaster Recovery, the basic components of a scripted solution are what we'll focus on in this article.
Enabling Hyper-V Replica on each Hyper-V Host
The first step to leveraging Hyper-V Replica is enabling it on each Hyper-V Host. With PowerShell 3.0, this can easily be done using the "Set-VMReplicationServer" Cmdlet as follows:
Set-VMReplicationServer -ComputerName HYPERV_HOST -ReplicationEnabled $true -AllowedAuthenticationType Kerberos -ReplicationAllowedFromAnyServer $true -DefaultStorageLocation DRIVE:\FOLDER
If you've got Windows Firewall enabled on each of your Hyper-V hosts, you'll also need to make sure that the Hyper-V Replica inbound port is open on each Hyper-V host. You can do this using the "Enable-NetFirewallRule" PowerShell 3.0 Cmdlet, which replaces the now deprecated "netsh advfirewall" command line tool.
Enable-NetFirewallRule -DisplayName "Hyper-V Replica HTTP Listener (TCP-In)"
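If you have more than a couple of hosts, the two commands above can be wrapped in a simple loop. Here's a minimal sketch, not a definitive implementation: the function name Enable-ReplicaOnHosts is hypothetical, and it assumes the Hyper-V and NetSecurity modules are available where you run it and that your account can reach each host remotely.

```powershell
# Sketch: enable Hyper-V Replica and open the HTTP listener rule on several hosts.
# Enable-ReplicaOnHosts is a hypothetical helper name, not a built-in Cmdlet.
Function Enable-ReplicaOnHosts {
    Param (
        [Parameter(Mandatory = $true)][string[]]$HyperVHosts,
        [Parameter(Mandatory = $true)][string]$StorageLocation
    )
    ForEach ($HostName in $HyperVHosts) {
        # Same settings as the single-host example above
        Set-VMReplicationServer -ComputerName $HostName -ReplicationEnabled $true `
            -AllowedAuthenticationType Kerberos -ReplicationAllowedFromAnyServer $true `
            -DefaultStorageLocation $StorageLocation

        # Enable the inbound firewall rule on the remote host via a CIM session
        Enable-NetFirewallRule -CimSession $HostName `
            -DisplayName "Hyper-V Replica HTTP Listener (TCP-In)"

        Write-Output "Configured Hyper-V Replica on $HostName"
    }
}
```

You would call it with your own host names and storage path, for example: Enable-ReplicaOnHosts -HyperVHosts HOST1, HOST2 -StorageLocation D:\Replica (both values are placeholders).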
Enabling Replication for each Virtual Machine
After Hyper-V Replica is enabled on each Hyper-V host, the next step is to enable replication for each Virtual Machine currently running on your primary Hyper-V hosts. To do this, you can use the "Enable-VMReplication" PowerShell 3.0 Cmdlet as follows:
Enable-VMReplication -VMName VM_NAME -ReplicaServerName DEST_HYPERV_HOST -ReplicaServerPort 80 -AuthenticationType Kerberos -CompressionEnabled $true -RecoveryHistory 0
Note that once VM replication is enabled, you can then use the "Set-VMReplication" PowerShell Cmdlet should you need to modify any of the initial replication settings.
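To cover every VM on a primary host rather than one at a time, something along these lines could work. It's only a sketch under a few assumptions: the function name Enable-ReplicaForAllVMs is hypothetical, it treats a ReplicationState of "Disabled" on the objects returned by Get-VM as "not yet configured", and it uses the "Start-VMInitialReplication" Cmdlet to kick off the first copy over the network after replication is enabled.

```powershell
# Sketch: enable replication for every VM on a host that is not replicating yet.
# Enable-ReplicaForAllVMs is a hypothetical helper name.
Function Enable-ReplicaForAllVMs {
    Param (
        [Parameter(Mandatory = $true)][string]$PrimaryHost,
        [Parameter(Mandatory = $true)][string]$ReplicaHost
    )
    # Select only VMs that do not have replication configured
    $VMs = Get-VM -ComputerName $PrimaryHost |
        Where-Object { $_.ReplicationState -eq 'Disabled' }

    ForEach ($VM in $VMs) {
        # Same settings as the single-VM example above
        Enable-VMReplication -ComputerName $PrimaryHost -VMName $VM.Name `
            -ReplicaServerName $ReplicaHost -ReplicaServerPort 80 `
            -AuthenticationType Kerberos -CompressionEnabled $true -RecoveryHistory 0

        # Start the initial copy of the VM to the replica host
        Start-VMInitialReplication -ComputerName $PrimaryHost -VMName $VM.Name

        # Emit the name of each VM that was processed
        Write-Output $VM.Name
    }
}
```

For example: Enable-ReplicaForAllVMs -PrimaryHost PRIMARY_HYPERV_HOST -ReplicaHost DEST_HYPERV_HOST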
Checking in on Replication
Once a VM is enabled for replication, you can check on the replication status by using the "Measure-VMReplication" PowerShell cmdlet as follows:
Measure-VMReplication -VMName VM_NAME
Name      State       Health LReplTime            PReplSize(M) AvgLatency AvgReplSize(M) SuccReplCount
----      -----       ------ ---------            ------------ ---------- -------------- -------------
Win8Ent01 Replicating Normal 10/5/2012 9:57:31 AM 0.0039       00:08:41   3,691.15       3 of 3
In the Cmdlet output above, we see the state (Replicating), the overall Health and Last Replication Time, as well as some useful insights for WAN capacity planning to support replication: AvgLatency and AvgReplSize (in MB). To use these statistics for planning your Recovery Point Objective (RPO) and bandwidth requirements between hosts, check out this article posted by one of my peer IT Pro Technical Evangelists, Tommy Patterson.
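If you'd rather be told about problems than read the table for each VM, a small sweep like the one below may help. It's a sketch with a hypothetical function name (Get-UnhealthyReplication), and note that it reads the ReplicationState and ReplicationHealth properties on the VM objects returned by Get-VM rather than parsing the Measure-VMReplication output shown above.

```powershell
# Sketch: list replicated VMs on a host whose replication health is not Normal.
# Get-UnhealthyReplication is a hypothetical helper name.
Function Get-UnhealthyReplication {
    Param ([Parameter(Mandatory = $true)][string]$HyperVHost)

    Get-VM -ComputerName $HyperVHost |
        # Skip VMs without replication, then keep only the unhealthy ones
        Where-Object { $_.ReplicationState -ne 'Disabled' -and $_.ReplicationHealth -ne 'Normal' } |
        Select-Object Name, ReplicationState, ReplicationHealth
}
```

Anything this returns is worth a closer look with Measure-VMReplication -VMName VM_NAME.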
Testing VM Failover
Testing Disaster Recovery failover with traditional solutions is usually a painful process. To test failover procedures, you normally have to completely fail over your production workloads to your DR site after hours and then, oftentimes, spend the remainder of your maintenance window trying to fail everything back to production before users come back to work ... Not so with Hyper-V Replica! Hyper-V provides the ability to test failover at any time by using the "Start-VMFailover" cmdlet with the -AsTest parameter. Here's an example:
Start-VMFailover -ComputerName DEST_HYPERV_HOST -VMName VM_NAME -AsTest
When testing failover using the Cmdlet above, Hyper-V will create a copy of the replicated VM that is disconnected from all virtual networks. This allows you to start that VM and verify that all services are running properly, without risk of exposing the replicated VM to your production network and causing potential network conflicts. Cool stuff!
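A full test cycle also needs a cleanup step once you've verified the test copy. The sketch below is one possible shape for that, under several assumptions you should check in your own environment: the function name Invoke-ReplicaTestFailover and its -Verify script block are hypothetical, the test copy is assumed to use the default "VM_NAME - Test" naming, and "Stop-VMFailover" is used to cancel the test failover and discard the test copy.

```powershell
# Sketch: run a test failover, verify the test copy, then always clean up.
# Invoke-ReplicaTestFailover and its -Verify parameter are hypothetical names.
Function Invoke-ReplicaTestFailover {
    Param (
        [Parameter(Mandatory = $true)][string]$ReplicaHost,
        [Parameter(Mandatory = $true)][string]$VMName,
        # Script block that receives the test VM name and returns $true or $false
        [Parameter(Mandatory = $true)][scriptblock]$Verify
    )
    # Assumption: the test copy uses the default "<VM name> - Test" naming
    $TestVMName = "$VMName - Test"
    $Passed = $false

    Start-VMFailover -ComputerName $ReplicaHost -VMName $VMName -AsTest -Confirm:$false
    Try {
        Start-VM -ComputerName $ReplicaHost -Name $TestVMName
        $Passed = [bool](& $Verify $TestVMName)
    }
    Finally {
        # Turn off the test copy and cancel the test failover, even if verification failed
        Stop-VM -ComputerName $ReplicaHost -Name $TestVMName -TurnOff -Force
        Stop-VMFailover -ComputerName $ReplicaHost -VMName $VMName
    }
    Return $Passed
}
```

The -Verify block is where your own checks go. Since the test copy is disconnected from all virtual networks, those checks would need to work from the host side (for example, looking at the VM state or heartbeat) rather than over the network.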
Checking for Data Center Connectivity
Hyper-V Replica is intended as a data center, or site-wide, recovery solution. As such, we'd normally want to execute a failover in situations where the primary data center or Hyper-V replica server is no longer reachable. To test for Hyper-V Replica connectivity between the primary and secondary sites, we can create a PowerShell function that leverages the "Test-VMReplicationConnection" Cmdlet as follows:
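Function PrimarySiteAvailable {
    Param ([string]$HyperVHost)
    # Probe the Hyper-V Replica listener on the given host; errors are suppressed
    $Test = Test-VMReplicationConnection -AuthenticationType Kerberos -ReplicaServerName $HyperVHost -ReplicaServerPort 80 -ErrorAction SilentlyContinue
    # Treat the host as available only if the returned message reports success
    If ($Test -match "was successful") {
        Return $True
    }
    Else {
        Return $False
    }
}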
Now, we have a function we can call from the remote Hyper-V Replica host periodically to test connectivity to the primary Hyper-V Replica host and site as follows:
$IsPrimarySiteUp = PrimarySiteAvailable -HyperVHost PRIMARY_HYPERV_HOST
If ($IsPrimarySiteUp -eq $False) {
    # ... Insert Failover or Notification Commands Here ...
}
Of course, we could get a lot more sophisticated with our code to test connectivity by including additional support for multiple retries, timing logic, running as a scheduled task, etc ... but this example is sufficient to demonstrate the basic capabilities.
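As one illustration of the retry idea, here's a small generic wrapper. It's just a sketch: the function name Test-WithRetry and its default values (3 attempts, 60 seconds apart) are my own assumptions, so tune them to how quickly you want an outage to be declared.

```powershell
# Sketch: run a check several times before declaring failure.
# Test-WithRetry and its defaults are hypothetical, not built-in.
Function Test-WithRetry {
    Param (
        # Script block that returns $true when the check succeeds
        [Parameter(Mandatory = $true)][scriptblock]$Check,
        [int]$MaxAttempts = 3,
        [int]$DelaySeconds = 60
    )
    For ($Attempt = 1; $Attempt -le $MaxAttempts; $Attempt++) {
        # Succeed as soon as any attempt passes
        If (& $Check) { Return $true }
        # Wait between attempts, but not after the last one
        If ($Attempt -lt $MaxAttempts) { Start-Sleep -Seconds $DelaySeconds }
    }
    Return $false
}
```

Combined with the function above, the call would look like this: $IsPrimarySiteUp = Test-WithRetry -Check { PrimarySiteAvailable -HyperVHost PRIMARY_HYPERV_HOST }. A single dropped probe then no longer counts as a site outage.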
CAUTION! True automated datacenter failover can be tricky to implement due to potential "Split Brain" scenarios with some application configurations. Be sure that you test your logic to account for these conditions with the applications that you use in your environment. Many organizations stop just short of fully automating the failover process and instead have a notification sent to the Admin team when a primary site outage is detected. The Admin team can then triage the situation and determine if a failover is warranted. To speed the failover process, the Admin team can then choose to use a script containing the Cmdlets below for each VM requiring failover.
At last! Performing Scripted Failover ...
We're almost there ... we just need to add the actual Cmdlets that perform VM failover into a PowerShell script to automate the failover process when we invoke it. We've already seen the needed Cmdlet above when we were testing failover - our old friend "Start-VMFailover". We'll just remove the -AsTest parameter to perform a real unplanned failover. Here's an example:
$VM = Start-VMFailover -ComputerName DEST_HYPERV_HOST -VMName VM_NAME -PassThru -Confirm:$false
Start-VM -VM $VM
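For the "script for each VM requiring failover" scenario, the two lines above could be wrapped in a loop that keeps going when one VM fails and reports what happened. This is a sketch only: the function name Start-SiteFailover and the shape of the result objects are hypothetical, and it's meant to be run by an admin on purpose, not triggered blindly.

```powershell
# Sketch: fail over a list of VMs in order and report per-VM results.
# Start-SiteFailover and its result properties are hypothetical names.
Function Start-SiteFailover {
    Param (
        [Parameter(Mandatory = $true)][string]$ReplicaHost,
        [Parameter(Mandatory = $true)][string[]]$VMNames
    )
    ForEach ($Name in $VMNames) {
        Try {
            # Same Cmdlets as above; -ErrorAction Stop turns failures into catchable errors
            $VM = Start-VMFailover -ComputerName $ReplicaHost -VMName $Name -PassThru -Confirm:$false -ErrorAction Stop
            Start-VM -VM $VM -ErrorAction Stop
            [pscustomobject]@{ VMName = $Name; Succeeded = $true; ErrorMessage = $null }
        }
        Catch {
            # Record the failure and continue with the next VM
            [pscustomobject]@{ VMName = $Name; Succeeded = $false; ErrorMessage = $_.Exception.Message }
        }
    }
}
```

List the VMs in the order they should come up (for example, domain controllers and databases before the application servers that depend on them), and review the returned objects for anything where Succeeded is False.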
What's Next?
In this article, we walked through examples of the PowerShell 3.0 Cmdlets that you can leverage to easily implement an enterprise disaster recovery solution using Windows Server 2012 and Hyper-V Replica. To continue your learning, check out these resources next ...
- Hyper-V Replica: Learn more about Hyper-V Replica in this post on Kevin's blog!
- PowerShell: Get up to speed on PowerShell 3.0 in this post on Matt's blog!
- Server Management: Interested in learning more about server management in Windows Server 2012? Check out this post on Brian's blog!
- Want to Get Certified on Windows Server 2012? Become an "Early Expert" for FREE at https://EarlyExperts.net!
How Are You Using Hyper-V Replica?
Have you found a unique and interesting use case for Hyper-V Replica and/or PowerShell 3.0? Be sure to share your story in the comments below!
HTH,
Keith
Comments
Anonymous
January 01, 2003
Hi Josh, Thanks for your comment! You are correct that SRM does help to automate the failover and failback process in VMware hypervisor environments. However, there are a couple of points to keep in mind. First, SRM is certainly not free (starting at $5K+ USD for smaller environments). Second, SRM does not automatically initiate failover and failback, but rather orchestrates the failover and failback process of multiple VMs via a defined recovery plan when an administrator initiates the recovery plan manually. If you're instead looking for easy, orchestrated failover and failback, you may be interested in checking out our new Windows Azure Hyper-V Recovery Manager, which provides site-to-site protection of entire Private Clouds. By leveraging Windows Azure, the entire recovery plan is safely located off-site in the cloud, so that you also don't have to be concerned with a physical site disaster preventing the initiation of your disaster plan. You can check out the details on Windows Azure Hyper-V Recovery Manager, which is really the appropriate solution to contrast with SRM, at blogs.technet.com/.../step-by-step-disaster-recovery-for-private-clouds-with-windows-azure-hyper-v-recovery-manager-build-your-private-cloud-in-a-month.aspx. Hope this helps! Keith
Anonymous
January 01, 2003
Great article. It helps me in improving Disaster Recovery for our SharePoint enterprise infrastructure. Many thanks for sharing. -T.s
Anonymous
January 01, 2003
Thanks, PowerComp, for your detailed script example! Fantastic work! :-)
Anonymous
October 09, 2012
Thanks for the article. This is just what I was looking for. I will start to implement it tomorrow. Wish me luck and I may have more questions.
Anonymous
August 13, 2013
Failover Clustering is great, but not if shared storage isn't an option. I took this post and developed a very robust AUTOMATIC FAILOVER script. It first checks if the VM is online; if not, it checks the host's Hyper-V service status (for our needs, this works better because the failover isn't triggered by normal VM reboots). If the host Hyper-V service is down, a countdown is triggered which rechecks the VM and Hyper-V service at specific intervals. (NOTE: the host Hyper-V service can be down while the VM is still running!!! This is checked for.) While the failover isn't instant, the check and wait periods can be adjusted to other needs. Thanks for the original post, hope this helps!
Anonymous
August 13, 2013
Here's the script:

Function PrimarySiteAvailable {
    $Test = Test-VMReplicationConnection -AuthenticationType Kerberos -ReplicaServerName HOST1 -ReplicaServerPort 80 -ErrorAction SilentlyContinue
    If ($Test -match "was successful") { Return $True } Else { Return $False }
}

Start-Sleep -s 5

Do {
    $status = (get-service -Name lanmanserver -ComputerName Webctrl).Status
    $date = Get-Date
    " Monitoring VM "
    " "
    Write-Host "VM is online " $date -ForegroundColor "Green"
    Start-Sleep -s 10
    cls
} While ($status -eq "Running")

cls
Write-Host "VM is offline " $date -Foregroundcolor "Yellow"
" "
"checking HOST1 HYPER-V service..."
" "

Do {
    $IsPrimarySiteUp = PrimarySiteAvailable -HOST1 PRIMARY_HYPERV_HOST
    $date = Get-Date
    " Monitoring HOST1 "
    " "
    Write-Host "HOST1 is online " $date -ForegroundColor "Green"
    Start-Sleep -s 30
    $status = (get-service -Name lanmanserver -ComputerName Webctrl).Status
    If ($status -eq "Running") {c:\scripts\control.bat} ELSE {}
} While ($IsPrimarySiteUp -eq $True)
" "

######## start failover countdown
cls
Write-Host " Primary VM server is not responding, starting AutoFailover in 10 minutes " -Foregroundcolor "Red"
" "
"press CTRL-C to abort"
Start-sleep -s 240
" "
"checking HOST1 HYPER-V service..."
" "
$IsPrimarySiteUp = PrimarySiteAvailable -HOST1 PRIMARY_HYPERV_HOST
If ($IsPrimarySiteUp -eq $True) {c:\scripts\control.bat} ELSE {}

########## 6 minute check
cls
Write-Host " Primary VM server is not responding, starting AutoFailover in 6 minutes " -Foregroundcolor "Red"
" "
"press CTRL-C to abort"
Start-sleep -s 180
" "
"checking HOST1 HYPER-V service..."
" "
$IsPrimarySiteUp = PrimarySiteAvailable -HOST1 PRIMARY_HYPERV_HOST
If ($IsPrimarySiteUp -eq $True) {c:\scripts\control.bat} ELSE {}

########## 3 minute check
cls
Write-Host " Primary VM server is not responding, starting AutoFailover in 3 minutes " -Foregroundcolor "Red"
" "
"press CTRL-C to abort"
Start-sleep -s 180
" "
"checking HOST1 HYPER-V service..."
" "
$IsPrimarySiteUp = PrimarySiteAvailable -HOST1 PRIMARY_HYPERV_HOST
If ($IsPrimarySiteUp -eq $True) {c:\scripts\control.bat} ELSE {}

cls
" starting failover..."
$VM = Start-VMFailover -ComputerName HOST2 -VMName WebCTRL -PassThru -Confirm:$false
Start-VM -VM $VM
Anonymous
August 13, 2013
just a followup... the C:\scripts\control.bat is simply a batch file that calls this ps script. Calling it from within the script basically restarts it from scratch.
Anonymous
August 28, 2013
Or.. you use Vmware and install SRM... which does this automatically.. both for failover AND failback..