CCR or Stretched CCR?
Having spoken with a few customers about whether a local CCR with SCR, or a CCR stretched across 2 data centres, is the better solution, I thought I'd write a post.
There is no right or wrong answer to that question; in typical consulting style, 'it depends'. There are various factors to take into consideration when designing the right solution for your customer:
- Network Infrastructure such as data centre locations, network bandwidth, latency, redundant links (including switches)
- Customer requirements (do they require full site resilience without manual intervention?)
- Cost
- Does the customer currently have the right skills to manage the environment?
- How many copies of the database are required? (2 with a stretched cluster, 3 with CCR and SCR)
- Can a 3rd data centre be used to host the File Share Witness (FSW)?
There are also some factors to think about from the client side, such as DNS refresh. If the customer doesn't have a stretched Virtual LAN (VLAN) between data centres, the cluster will be assigned 1 Network Name resource and 2 IP address resources (since the nodes are in separate IP subnets). When the clustered mailbox server (CMS) fails over, it will be assigned a different IP address, so clients keep resolving the old address until their cached DNS record expires. As part of the cluster configuration in Windows 2008, we recommend changing the default DNS TTL value for the CMS Network Name resource.
By default the cluster service sets this TTL to 20 minutes. Be careful if you change the DNS TTL value through the DNS management console, as it will be overwritten by the cluster settings. So if you want to change the default value from 20 minutes to our recommended setting of 5 minutes, you'll need to make the change on the cluster itself.
To make this change you'll need local administrator rights on each node in the cluster and Full Control permission on the cluster.
From a command prompt, run: cluster.exe res <CMSNetworkNameResource> /priv HostRecordTTL=300 (where 300 is the TTL in seconds, i.e. the recommended 5 minutes mentioned above)
Take the clustered mailbox server offline by running the Stop-ClusteredMailboxServer cmdlet in PowerShell
Bring the clustered mailbox server back online by running the Start-ClusteredMailboxServer cmdlet.
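The steps above can be sketched as one sequence run from the Exchange Management Shell on a cluster node. This is only a sketch: <CMSName> and the stop reason text are placeholders I've assumed for illustration, and <CMSNetworkNameResource> is the same placeholder as in the command above. Stopping the CMS takes the mailboxes offline, so plan for an outage window.

```powershell
# Hypothetical placeholders: substitute the values for your own cluster.
$cmsName         = "<CMSName>"                  # clustered mailbox server name
$netNameResource = "<CMSNetworkNameResource>"   # its Network Name cluster resource

# List the resource's private properties to see the current HostRecordTTL (in seconds).
cluster.exe res $netNameResource /priv

# Set the DNS host record TTL to 300 seconds (5 minutes).
cluster.exe res $netNameResource /priv HostRecordTTL=300

# Take the CMS offline and bring it back online so the change takes effect.
Stop-ClusteredMailboxServer -Identity $cmsName -StopReason "Change HostRecordTTL"
Start-ClusteredMailboxServer -Identity $cmsName

# List the private properties again to confirm the new value was stored.
cluster.exe res $netNameResource /priv
```

Listing the private properties before and after is a cheap way to confirm the setting was written by the cluster rather than relying on what the DNS console shows.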
I’ve listed below a few risks, and how they can be mitigated, if you do decide to go with a stretched CCR over CCR + SCR:
Risk | Mitigation
File Share Witness (FSW) location | Locate the FSW at an alternate location to provide additional resilience to the cluster
Client cache IP refresh interval | This can be configured on the cluster in Windows 2008, or a stretched VLAN can be used
Logical corruption of the databases | SCR would provide protection against this, but take into consideration your Recovery Time Objective (RTO)
Is the network link between physical locations resilient? | Ensure there are alternate routes available
Does the network link between physical locations have low latency (below 50ms)? | Test network latency
Does the network link between physical locations have enough bandwidth? | Test network bandwidth
Can the backup solution back up any node in any physical location? | Ensure your chosen backup solution can back up both locations in the event of a site failure
Manual configuration required to control message routing within a data centre (SubmissionServerOverrideList) | Ensure your operational guides are up to date with how to configure mail routing
Control client access within a data centre | ISA Server
Querying of AD may take place across the data centre interconnect | None
Potential loss of email data in the event of a site failure | Email will be stored in the transport dumpster of the Hub Transport (HT) server in the failed site
Operational management | Ensure the team has an in-depth understanding of cluster technology, plus Windows 2008 and Exchange 2007 experience
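For the latency and client cache items in the list of risks above, a rough first check might look like the following. It is a sketch using the standard Windows command-line tools. The host names are placeholders I've assumed, and ICMP ping only gives an approximate view of link latency, so treat the numbers as a starting point and not as a formal test.

```powershell
# Hypothetical placeholders: substitute your own host names.
$remoteNode = "<NodeInOtherDataCentre>"   # cluster node in the other site
$cmsName    = "<CMSName>"                 # clustered mailbox server name

# Send 20 echo requests across the inter-site link and compare the
# reported average round-trip time against the 50ms figure above.
ping.exe -n 20 $remoteNode

# Query DNS for the CMS record; the debug output includes the record's TTL.
nslookup.exe -debug $cmsName

# On a client machine: show the locally cached DNS records with their
# remaining TTLs, then flush the cache if a stale address is still held
# after a failover.
ipconfig.exe /displaydns
ipconfig.exe /flushdns
```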
Written by Daniel Kenyon-Smith