Planning for Standby Continuous Replication
Microsoft Exchange Server 2007 will reach end of support on April 11, 2017. To stay supported, you will need to upgrade. For more information, see Resources to help you upgrade your Office 2007 servers and clients.
Applies to: Exchange Server 2007 SP1, Exchange Server 2007 SP2, Exchange Server 2007 SP3
Although deploying standby continuous replication (SCR) is similar to deploying local continuous replication (LCR), there are important differences that you must consider. There are general requirements that must be met for SCR.
General Requirements for Standby Continuous Replication
Before enabling any storage groups for SCR, we recommend that you familiarize yourself with the following requirements for SCR sources and targets:
A source can have multiple targets. For example, a source could have one target that exists in the same datacenter as the source, and a second target that exists in a separate datacenter. There is no limit to the number of targets you can have for each source. However, we recommend using no more than four targets per source. The additional impact to the source server needs to be verified and planned for accordingly if more than four targets are configured.
Each target can have multiple source servers.
Both the source and the target system must be running Exchange 2007 SP1. The operating system can be any operating system supported by Exchange 2007 SP1 (for example, Windows Server 2008 or Windows Server 2003); however, all SCR target computers must be running the same operating system as their SCR source computer. Using different operating systems for an SCR source and its targets (for example, where the SCR source is Windows Server 2003 and the SCR target is Windows Server 2008, or vice versa) is not supported.
The SCR target computer must have the Exchange 2007 SP1 Mailbox server role installed. If the SCR target computer is a cluster node, the node must be a passive node (that is, the passive clustered mailbox role is installed), and the cluster cannot contain any clustered mailbox servers.
You must plan your Exchange Server installation paths more carefully when using SCR if you plan on using a standby cluster and the clustered mailbox server recovery feature (Setup /RecoverCMS) as part of the SCR target activation process. To use the server recovery process, the installation path for Exchange Server must be the same for the SCR source computer and the SCR target computer. If Exchange Server is installed into %ProgramFiles%\Microsoft\Exchange Server on the SCR source computer, it must also be installed into %ProgramFiles%\Microsoft\Exchange Server on all computers that will be SCR targets for the SCR source server. If these install paths do not match, Setup /RecoverCMS will fail because the install path in the registry will not match the value for the msExchInstallPath attribute of the Mailbox server object in the Active Directory directory service.
If your activation process includes the recovery of a clustered mailbox server, you must disable SCR for all storage group(s) on the clustered mailbox server before using Setup /RecoverCMS as part of the activation process. If SCR is not disabled for all storage groups, Setup /RecoverCMS will fail.
The storage group and database paths on the SCR source and all targets must not conflict with any other storage group or database paths. You must plan your storage group and database paths more carefully when using SCR because the storage group and database path used by an SCR source will be used for the copy of the storage group and database on all SCR targets for the source.
The SCR source and SCR target computers must be in the same Active Directory domain, but they can be located in the same or in different Active Directory sites.
Each target computer supports a maximum of 50 SCR targets (50 replicated storage groups) when using the Enterprise Edition of Exchange 2007, and a maximum of 5 SCR targets when using the Standard Edition of Exchange 2007.
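The requirements above lend themselves to a simple pre-deployment check. The following is a minimal sketch in Python, not an Exchange tool: the Server record, its field names, and both function names are hypothetical and exist only for illustration. It assumes you have already collected the operating system, domain, install path, and edition for each computer, and that comparing paths and domain names without regard to case is acceptable.

```python
from dataclasses import dataclass

# Per-target limits on replicated storage groups, as quoted in the requirements above.
MAX_REPLICATED_STORAGE_GROUPS = {"Enterprise": 50, "Standard": 5}


@dataclass
class Server:
    """Hypothetical inventory record; the field names are illustrative only."""
    name: str
    os: str
    domain: str
    install_path: str
    edition: str = "Enterprise"


def check_scr_pair(source, target, uses_recover_cms=False):
    """Return a list of requirement violations for one SCR source/target pair."""
    problems = []
    # The source and all of its targets must run the same operating system.
    if source.os != target.os:
        problems.append("operating systems differ")
    # Source and target must be in the same Active Directory domain.
    if source.domain.lower() != target.domain.lower():
        problems.append("servers are in different Active Directory domains")
    # Install paths only need to match when Setup /RecoverCMS is part of activation.
    if uses_recover_cms and source.install_path.lower() != target.install_path.lower():
        problems.append("install paths differ, so Setup /RecoverCMS would fail")
    return problems


def check_target_capacity(target, replicated_storage_groups):
    """Return True if the target stays within the limit for its edition."""
    return replicated_storage_groups <= MAX_REPLICATED_STORAGE_GROUPS[target.edition]
```

An empty list from check_scr_pair means that none of the checked requirements is violated; it does not cover the remaining requirements, such as storage group path conflicts or the passive-node rule for clustered targets.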
Restrictions on SCR Target Computers
When a passive node or a stand-alone Mailbox server is configured as an SCR target, the following capabilities are blocked:
A stand-alone Mailbox server that is designated as an SCR target cannot have LCR enabled for any storage groups. The Microsoft Exchange Replication service has not been designed or modified to handle managing both LCR and replication from another source.
A passive node that is designated as an SCR target must be a member of a failover cluster that does not have any clustered mailbox servers. This is referred to as a standby cluster. For more information about standby clusters, see High Availability.
SCR and Public Folder Databases
SCR and public folder replication are two very different forms of replication built into Exchange. Due to interoperability limitations between continuous replication and public folder replication, if more than one Mailbox server in the Exchange organization has a public folder database, public folder replication is enabled and public folder databases should not be hosted in SCR environments.
Because database portability can be used only with mailbox databases, activation for an SCR target copy of a public folder database can only be performed as part of a server or clustered mailbox server recovery operation (for example, Setup /m:recoverServer, or Setup /recoverCMS).
The following are the recommended configurations for using public folder databases and SCR in your Exchange organization:
If you have a single Mailbox server in your Exchange organization, and that Mailbox server is a stand-alone Mailbox server or a clustered mailbox server in an SCC, the Mailbox server can host a public folder database and the storage group containing the public folder database can be enabled for SCR, provided the storage group is not enabled for LCR. In this configuration, there is a single public folder database in the Exchange organization. Thus, public folder replication is disabled. In this scenario, public folder database redundancy is achieved using SCR; SCR maintains two copies of your public folder database.
If you have multiple Mailbox servers and only one of the Mailbox servers contains a public folder database, and that Mailbox server is a stand-alone Mailbox server or a clustered mailbox server in an SCC, the Mailbox server can host a public folder database and the storage group containing the public folder database can be enabled for SCR, provided the storage group is not enabled for LCR. In this configuration, there is a single public folder database in the Exchange organization. Thus, public folder replication is disabled. In this scenario, public folder database redundancy is also achieved using SCR.
If you are migrating public folder data into a storage group enabled for SCR, you can use public folder replication to move the contents of a public folder database from a stand-alone Mailbox server or a clustered mailbox server in an SCC to the SCR-enabled storage group. When replication has completed successfully, all public folder databases outside of the SCR-enabled storage groups should be removed, and you should not host any other public folder databases in the Exchange organization.
If you are migrating public folder data out of a storage group enabled for SCR, you can use public folder replication to move the contents of a public folder database from the storage group to a stand-alone Mailbox server or a clustered mailbox server in an SCC. When replication has completed successfully, all public folder databases inside of all SCR-enabled storage groups should be removed and all subsequent public folder databases should not be hosted in SCR-enabled storage groups.
During any period where more than one public folder database exists in the Exchange organization and one or more public folder databases are hosted in a storage group enabled for SCR, if a failure of the SCR source storage group occurs and an SCR target public folder database needs to be activated, it can only be made mountable if all logs for the storage group hosting the public folder database are available. If any logs are missing or unavailable as a result of the failure of the SCR source, you will not be able to activate the SCR target copy of the public folder database. In this event, the SCR source must be brought online to ensure no data loss, or the public folder database must be re-created on the SCR source and its content must be recovered using public folder replication from a public folder database other than the SCR target copy.
SCR and Backups
You cannot back up an SCR target copy. LCR and CCR support backups from both the active and passive copy. SCR supports backups of the SCR source only. An SCR target's database headers will be updated and the log files will be truncated when a supported backup is taken against the SCR source storage group. If the SCR source storage group is enabled for CCR or LCR, the SCR target's database headers will be updated and the log files will be truncated when backups are taken against either the active or passive copies of the SCR source storage group.
SCR and Restores
When an SCR source database is replaced with an earlier version of the database, you must suspend and then resume continuous replication for the storage group using Suspend-StorageGroupCopy and Resume-StorageGroupCopy, respectively. This process is needed to update the Microsoft Exchange Replication service with the correct log generation information. If continuous replication is not suspended and resumed, the Replication service will have outdated log generation information and will stop replicating log files.
SCR and Log File Truncation
In Exchange 2007 RTM, rules are enforced so that in a continuous replication environment, a log file is not deleted unless it has been backed up and replayed into the copy of the database. This rule is modified when using SCR. SCR (which introduces the concept of multiple database copies) allows log files to be truncated at the SCR source as soon as they are inspected by all SCR target computers. Log truncation on the SCR source does not wait until all logs have been replayed into all SCR targets because SCR target copies can be configured with large log replay delays. However, log truncation on an SCR source will not occur if one or more SCR targets for a storage group are down. In order for logs to be truncated on an SCR source, the SCR target computer(s) must be online and accessible by the source.
On an SCR target, a background thread runs every three minutes to determine if any log files need to be truncated. If the following three criteria are met, a log file will be truncated on an SCR target:
The log file has been truncated on the SCR source.
The log file generation sequence is below the log file checkpoint for the storage group.
The log file is older than ReplayLagTime + TruncationLagTime. (For a description of these parameters, see "Cmdlet Updates for SCR" in the topic, Standby Continuous Replication.)
In an LCR or CCR environment that is extended with SCR, if the following three criteria are met, a log file will be truncated on the active and passive copies in the LCR or CCR environment:
The log file has been backed up.
The log file generation sequence is below the log file checkpoint for the storage group.
All SCR targets have inspected the log file.
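The two sets of truncation criteria above can be expressed as simple predicates. The following Python sketch is illustrative only: the function and parameter names are hypothetical, and it models the rules as stated in this topic rather than the actual logic of the Microsoft Exchange Replication service.

```python
from datetime import timedelta


def target_log_can_truncate(truncated_on_source, generation, checkpoint_generation,
                            log_age, replay_lag_time, truncation_lag_time):
    """All three criteria for truncating a log file on an SCR target."""
    return bool(
        truncated_on_source                       # already truncated on the SCR source
        and generation < checkpoint_generation    # below the storage group checkpoint
        and log_age > replay_lag_time + truncation_lag_time
    )


def lcr_ccr_log_can_truncate(backed_up, generation, checkpoint_generation,
                             inspected_by_targets):
    """All three criteria for an LCR or CCR environment extended with SCR.

    inspected_by_targets holds one boolean per SCR target.
    """
    return bool(
        backed_up
        and generation < checkpoint_generation
        and all(inspected_by_targets)             # every SCR target has inspected the log
    )


# Example: a log that is 30 hours old, with a 24-hour replay lag and a
# 1-hour truncation lag, satisfies the age criterion on the target.
eligible = target_log_can_truncate(
    True, 100, 120, timedelta(hours=30), timedelta(hours=24), timedelta(hours=1)
)
```

Modeling the third LCR/CCR criterion with all() makes the effect of an unavailable target explicit: a single target that has not inspected the log file is enough to hold back truncation on the active and passive copies.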
Optimizing Windows 2003 Networking for SCR
Although no network optimizations are needed when using SCR on Windows Server 2008, when using SCR on Windows Server 2003, we recommend that you optimize your Windows Server TCP/IP settings for your specific network link's speed and latency. Specifically, you may need to adjust the Transmission Control Protocol (TCP) receive window size and Request for Comments (RFC) 1323 window scaling options on an SCR source computer and its SCR target computers. In addition, you may find it beneficial to configure address resolution protocol (ARP) cache expiration settings and to disable the advanced TCP/IP options for the Windows Server 2003 Scalable Networking Pack (SNP) in the Windows registry.
In addition to these recommendations, if your environment includes the use of the IP Security (IPsec) protocol, we recommend that you configure IPsec consistently throughout your SCR environment. Either the SCR source and all of its SCR targets should use IPsec, or neither the SCR source nor any of its targets should use IPsec. If only one node is configured to use IPsec, the IPsec Security Association process can cause packet delay or packet loss.
TCP Receive Windows and RFC 1323 Scaling Options
The TCP receive window size is the maximum amount of data (in bytes) that can be received at one time on a connection. The sending computer can send only that maximum amount of data before waiting for an acknowledgment and a TCP window update from the receiving computer. It may be beneficial to tune this setting to increase throughput during log shipping.
To optimize TCP throughput, the sending computer should transmit enough packets to fill the pipe between the sender and receiver. The capacity of the network pipe is based on the pipe’s bandwidth and its latency (round-trip time). The higher the latency, the greater the capacity of the pipe, because there is more time to send data between acknowledgements. By increasing the TCP window size, the system can take advantage of the time between acknowledgements by sending more data.
The TCP/IP standard allows for a receive window up to 65,535 octets in size, which is the maximum value that can be specified in the 16-bit TCP window size field. To improve performance on high-bandwidth, high-delay networks, Windows Server TCP/IP supports the ability to advertise receive window sizes larger than 65,535 octets, by using scalable windows as described in RFC 1323, TCP Extensions for High Performance. When using window scaling, hosts in a conversation can negotiate a window size that allows multiple large packets, such as those often used in file transfer protocols, to be pending in the receiver's buffers. RFC 1323 details a method for supporting larger receive window sizes by allowing TCP to negotiate a scaling factor for the window size at connection establishment.
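The relationship between bandwidth, latency, and window size described above can be worked through numerically. The following Python sketch estimates the capacity of the pipe (commonly called the bandwidth-delay product) and the smallest RFC 1323 scale factor that lets the 16-bit window field cover it. The function names are hypothetical, and the result is only a rough starting point; it is not a substitute for the values produced by the Storage Requirements Calculator.

```python
MAX_UNSCALED_WINDOW = 65535   # largest value of the 16-bit TCP window size field
MAX_SCALE_SHIFT = 14          # largest shift count permitted by RFC 1323


def bandwidth_delay_product(bandwidth_bits_per_sec, rtt_ms):
    """Bytes that must be in flight to keep the link full for one round trip."""
    # bits/s * ms / 1000 gives bits per round trip; dividing by 8 gives bytes.
    return bandwidth_bits_per_sec * rtt_ms // 8000


def window_scale_shift(window_bytes):
    """Smallest scale shift whose scaled window covers window_bytes (capped at 14)."""
    shift = 0
    while (MAX_UNSCALED_WINDOW << shift) < window_bytes and shift < MAX_SCALE_SHIFT:
        shift += 1
    return shift


# Example: a 100 Mbps link with a 50 ms round-trip time.
bdp = bandwidth_delay_product(100_000_000, 50)   # 625000 bytes
shift = window_scale_shift(bdp)                  # 4, because 65535 << 4 = 1048560
```

In this example the pipe holds about 610 KB, which is nearly ten times what an unscaled 65,535-byte window allows. Without window scaling, the sender would spend most of each round trip waiting for acknowledgments.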
You can optimize the TCP receive window size and RFC 1323 window scaling options on a computer running Windows Server 2003 by modifying two registry entries: TCPWindowSize and TCP1323Opts. For more information about these features, see Microsoft Knowledge Base article 224829, Description of Windows 2000 and Windows Server 2003 TCP Features.
We recommend that you use version 13 or later of the Exchange 2007 Mailbox Server Role Storage Requirements Calculator to determine the optimal settings for these registry entries based on your network link and network latency. To download the calculator, see Exchange 2007 Mailbox Server Role Storage Requirements Calculator from the Exchange Team Blog. The Storage Calculator also includes step-by-step instructions for entering the registry values on your servers.
ARP Cache Expiration
The ARP cache is an in-memory table that maps IP addresses to media access control (MAC) addresses. Entries in the ARP cache are referenced each time that an outbound packet is sent to the IP address in the entry. By default, Windows Server 2003 adjusts the size of the ARP cache automatically to meet the needs of the system. If an entry is not used by any outgoing datagram for two minutes, the entry is removed from the ARP cache. Entries that are being referenced are removed from the ARP cache after ten minutes. Entries added manually are not removed from the cache automatically.
Testing by the Microsoft internal IT department showed that the default ARP cache expiration settings resulted in packet loss in CCR and SCR environments. When packet loss occurs, the sending server must transmit the lost data again. In a continuous replication environment, it is important for log files to be copied to the passive node as quickly as possible, and transmitting data again due to lost packets can adversely affect log shipping throughput.
You can modify the ArpCacheMinReferencedLife TCP/IP parameter in the Windows registry to control ARP cache expiration. This parameter determines how long referenced entries must remain in the ARP cache table before they can be deleted. Internally, Microsoft found that the optimal setting for the ArpCacheMinReferencedLife registry value was to use the same value being used for ARP cache expiration by the routers on the network, which was 4 hours.
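To make the effect of this setting concrete, the following Python sketch models the expiration rules described above in simplified form. It is not how the Windows TCP/IP stack is implemented, and the function names are hypothetical. It also assumes that the registry value is expressed in seconds; confirm the unit against Appendix A: TCP/IP Configuration Parameters before applying any value.

```python
DEFAULT_UNUSED_LIFE_SECONDS = 2 * 60        # unused entries: two minutes, per the text
DEFAULT_REFERENCED_LIFE_SECONDS = 10 * 60   # referenced entries: ten minutes, per the text


def hours_to_seconds(hours):
    """Convert a router ARP timeout in hours to seconds (assumed registry unit)."""
    return hours * 3600


def arp_entry_expired(age_seconds, idle_seconds,
                      referenced_life=DEFAULT_REFERENCED_LIFE_SECONDS,
                      unused_life=DEFAULT_UNUSED_LIFE_SECONDS):
    """Simplified model: is a dynamic ARP cache entry due for removal?

    age_seconds  -- time since the entry was added to the cache
    idle_seconds -- time since an outgoing datagram last used the entry
    """
    if idle_seconds >= unused_life:
        return True                        # not used recently enough
    return age_seconds >= referenced_life  # in use, but past the referenced lifetime


# Matching the 4-hour router timeout mentioned above.
router_timeout = hours_to_seconds(4)   # 14400

# An entry in constant use by log shipping, 15 minutes after it was added:
with_defaults = arp_entry_expired(900, 0)
with_router_timeout = arp_entry_expired(900, 0, referenced_life=router_timeout)
```

Under the defaults, the busy entry is removed and must be resolved again even though it is in constant use. With the longer referenced lifetime, it stays in the cache.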
Before modifying the value for ArpCacheMinReferencedLife in your own environment, we recommend using Microsoft Network Monitor or a similar capture tool to collect and analyze the network traffic on the network interface being used to copy logs from the active node to the passive node. For detailed steps to modify the ArpCacheMinReferencedLife registry value, see Appendix A: TCP/IP Configuration Parameters.
Scalable Networking Pack Advanced TCP/IP Features
The Windows Server 2003 Scalable Networking Pack (SNP) is a separate update for Windows Server 2003 that contains stateful and stateless offloads to accelerate the Windows network stack. The update includes TCP Chimney offload, Receive Side Scaling (RSS), and Network Direct Memory Access (NetDMA).
TCP Chimney is a stateful offload. TCP Chimney offload enables TCP/IP processing to be offloaded to network adapters that can handle the TCP/IP processing in hardware.
RSS and NetDMA are stateless offloads. Where multiple CPUs reside in a single computer, the Windows networking stack limits "receive" protocol processing to a single CPU. RSS resolves this issue by enabling the packets that are received from a network adapter to be balanced across multiple CPUs. NetDMA allows for a Direct Memory Access (DMA) engine on the Peripheral Component Interconnect (PCI) bus. The TCP/IP stack can use the DMA engine to copy data instead of interrupting the CPU to handle the copy operation. A related component, TCPA, is another offload function where a hardware DMA engine on the PCI bus can be used to assist receive processing.
While these features can provide network performance benefits in some environments, there are some scenarios in which they cannot be used because of the use of other technologies. For example, TCP Chimney offload and NetDMA cannot be used if any of the following technologies are used:
Internet Protocol security (IPsec)
Internet Protocol Network Address Translation (IPNAT)
NDIS 5.1 intermediate drivers
In addition, there are known issues in some environments, including environments with Microsoft Exchange, in which network performance can decrease when using these features. For details on some of these issues, see the Exchange Team blog post, Windows 2003 Scalable Networking pack and its possible effects on Exchange.
We recommend that you disable all of these features in SCR environments that run on the Windows Server 2003 operating system, both in the operating system and on each network interface card (NIC) in the system. You can disable these features as follows:
To disable the TCP Chimney offload feature, open a command prompt and run the following command:
Netsh int ip set chimney DISABLED
To disable the other SNP features, you can set the values for the EnableRSS and EnableTCPA TCP/IP registry parameters to 0 in the Windows registry. For detailed steps to do this, see Knowledge Base article 936594, You may experience network-related problems after you install Windows Server 2003 SP2 or the Scalable Networking Pack on a Windows Server 2003-based computer.
To disable these features on the NIC(s) in the system, refer to the instructions that came with the NIC or consult with your hardware vendor.
For more information about the SNP, see Knowledge Base article 912222, The Microsoft Windows Server 2003 Scalable Networking Pack release, and the Scalable Networking Web site.