Exchange 2010 Tested Solutions: 20000 Mailboxes in Two Sites Running Hyper-V on Dell R910 Servers, EMC CLARiiON Storage, and Brocade Network Solutions
Last modified: 2016-12-14
Rob Simpson, Program Manager, Microsoft Exchange; Boris Voronin, Pr. Solutions Engineer, Exchange Solutions Engineering, EMC; Bryan Hoke, Brocade
February 2011
Summary
In Exchange 2010 Tested Solutions, Microsoft and participating server, storage, and network partners examine common customer scenarios and key design decision points facing customers who plan to deploy Microsoft Exchange Server 2010. Through this series of white papers, we provide examples of well-designed, cost-effective Exchange 2010 solutions deployed on hardware offered by some of our server, storage, and network partners.
You can download this document from the Microsoft Download Center.
Applies To
Microsoft Exchange Server 2010 release to manufacturing (RTM)
Microsoft Exchange Server 2010 with Service Pack 1 (SP1)
Microsoft Windows Server 2008 R2
Microsoft Windows Server 2008 R2 Hyper-V
Table of Contents
Introduction
Solution Summary
Customer Requirements
Mailbox Profile Requirements
Geographic Location Requirements
Server and Data Protection Requirements
Design Assumptions
Server Configuration Assumptions
Storage Configuration Assumptions
Solution Design
Determine High Availability Strategy
Estimate Mailbox Storage Capacity Requirements
Estimate Mailbox I/O Requirements
Determine Storage Type
Choose Storage Solution
Explore EMC Value-Added Solutions for Exchange 2010
Estimate Mailbox Memory Requirements
Estimate Mailbox CPU Requirements
Summarize Mailbox Requirements
Determine Whether Server Virtualization Will Be Used
Determine Server Model for Hyper-V Root Server
Calculate CPU Capacity of Mailbox Server Model
Determine CPU Capacity of Virtual Machines
Determine Number of Mailbox Server Virtual Machines Required
Determine Number of Mailboxes per Mailbox Server
Determine Memory Required Per Mailbox Server
Determine Number of Client Access and Hub Transport Server Combination Virtual Machines Required
Determine Memory Required per Combined Client Access and Hub Transport Virtual Machines
Determine Number of Physical Servers Required
Determine Virtual Machine Distribution
Determine Memory Required per Root Server
Identify Failure Domains Impacting Database Copy Layout
Determine Maximum Database Size
Determine Minimum Number of Databases Required
Design Database Copy Layout
Determine Storage Design
Determine Placement of the File Share Witness
Plan Namespaces
Determine Client Access Server Array and Load Balancing Strategy
Determine Hardware Load Balancing Model
Determine Hardware Load Balancing Device Resiliency Strategy
Determine Hardware Load Balancing Methods
Solution Overview
Logical Solution Diagram
Physical Solution Diagram
Server Hardware Summary
Client Access and Hub Transport Server Configuration
Mailbox Server Configuration
Database Layout
EMC Replication Enabler Exchange 2010
Network Switch Hardware Summary
Load Balancing Hardware Summary
Storage Hardware Summary
Storage Configuration
Fibre Channel Switch Hardware Summary
Solution Validation Methodology
Storage Design Validation Methodology
Server Design Validation
Functional Validation Tests
Database Switchover (In-Site) Validation
Server Switchover (In-Site) Validation
Server Failover Validation
Datacenter Switchover Validation
Primary Datacenter Service Restoration Validation
Test Facility
Solution Validation Results
Functional Validation Results
Storage Design Validation Results
Server Design Validation Results
Conclusion
Additional Information
Introduction
This document provides an example of how to design, test, and validate an Exchange Server 2010 solution for a customer environment with 20,000 mailboxes deployed on Dell servers, EMC storage, and Brocade network solutions. One of the key challenges with designing Exchange 2010 environments is examining the current server and storage options available and making the right hardware choices that provide the best value over the anticipated life of the solution. Following the step-by-step methodology in this document, we walk through the important design decision points that help address these key challenges while ensuring that the customer's core business requirements are met. After we have determined the optimal solution for this customer, the solution undergoes a standard validation process to ensure that it holds up under simulated production workloads for normal operating, maintenance, and failure scenarios.
Solution Summary
The following tables summarize the key Exchange and hardware components of this solution.
Exchange components
| Exchange component | Value or description |
|---|---|
| Target mailbox count | 20000 |
| Initial target mailbox size | 500 megabytes (MB) |
| Target message profile | 150 messages per day |
| Exchange database copy count | 4 |
| EMC Replication Enabler Exchange 2010 database copy count | 2 |
| Volume Shadow Copy Service (VSS) backup | EMC Replication Manager |
| Site resiliency | Yes |
| Virtualization | Hyper-V |
| Exchange server count | 32 virtual machines (VMs) |
| Physical server count | 4 |
Hardware components
| Hardware component | Value or description |
|---|---|
| Server partner | Dell |
| Server model | PowerEdge R910 |
| Server type | Rack |
| Processor | Intel Xeon X7560 |
| Storage partner | EMC |
| Storage type | Storage area network (SAN) |
| Storage model | CLARiiON CX4-480 |
| Disk type | 450 gigabytes (GB) 15,000 rpm Fibre Channel 3.5" |
| Network partner | Brocade |
| Ethernet switch | Brocade FastIron GS Series Layer 2 Switch |
| Load balancer | Brocade ServerIron ADX 1000 |
| Fibre Channel switch | Brocade 300 SAN Switch |
Customer Requirements
One of the most important first steps in Exchange solution design is to accurately summarize the business and technical requirements that are critical to making the correct design decisions. The following sections outline the customer requirements for this solution.
Mailbox Profile Requirements
Determine mailbox profile requirements as accurately as possible because these requirements may impact all other components of the design. If Exchange is new to you, you may have to make some educated guesses. If you have an existing Exchange environment, you can use the Microsoft Exchange Server Profile Analyzer tool to assist with gathering most of this information. The following tables summarize the mailbox profile requirements for this solution.
Mailbox count requirements
| Mailbox count requirements | Value |
|---|---|
| Mailbox count (total number of mailboxes including resource mailboxes) | 20000 |
| Projected growth percent (%) in mailbox count (projected increase in mailbox count over the life of the solution) | 0 |
| Expected mailbox concurrency % (maximum number of active mailboxes at any time) | 70% |
Mailbox size requirements
| Mailbox size requirements | Value |
|---|---|
| Initial average mailbox size in MB | 500 |
| Tiered mailbox size | No |
| Average mailbox archive size in MB | Not applicable |
| Projected growth (%) in mailbox size in MB (projected increase in mailbox size over the life of the solution) | 400% |
| Target average mailbox size in MB | 2048 |
Mailbox profile requirements
| Mailbox profile requirements | Value |
|---|---|
| Target message profile (average total number of messages sent plus received per user per day) | 150 messages per day |
| Tiered message profile | No |
| Target average message size in kilobytes (KB) | 75 |
| % in MAPI cached mode | 100 |
| % in MAPI online mode | 0 |
| % in Outlook Anywhere cached mode | 0 |
| % in Microsoft Office Outlook Web App (Outlook Web Access in Exchange 2007 and previous versions) | 0 |
| % in Exchange ActiveSync | 0 |
Geographic Location Requirements
Understanding the distribution of mailbox users and datacenters is important when making design decisions about high availability and site resiliency.
The following table outlines the geographic distribution of people who will be using the Exchange system.
Geographic distribution of people
| Mailbox user site requirements | Value |
|---|---|
| Number of major sites containing mailbox users | 2 |
| Number of mailbox users in site 1 | 10000 |
| Number of mailbox users in site 2 | 10000 |
The following table outlines the geographic distribution of datacenters that could potentially support the Exchange e-mail infrastructure.
Geographic distribution of datacenters
| Datacenter site requirements | Value or description |
|---|---|
| Total number of datacenters | 2 |
| Number of active mailboxes in proximity to datacenter 1 | 10000 |
| Number of active mailboxes in proximity to datacenter 2 | 10000 |
| Requirement for Exchange to reside in more than one datacenter | Yes |
Server and Data Protection Requirements
It's also important to define server and data protection requirements for the environment because these requirements will support design decisions about high availability and site resiliency.
The following table identifies server protection requirements.
Server protection requirements
| Server protection requirements | Value |
|---|---|
| Number of simultaneous server or VM failures within site | 2 |
| Number of simultaneous server or VM failures during site failure | 0 |
The following table identifies data protection requirements.
Data protection requirements
| Data protection requirement | Value or description |
|---|---|
| Requirement to maintain a backup of the Exchange databases outside of the Exchange environment (for example, third-party backup solution) | Yes |
| Requirement to maintain copies of the Exchange databases within the Exchange environment (for example, Exchange native data protection) | Yes |
| Requirement to maintain multiple copies of mailbox data in the primary datacenter | No |
| Requirement to maintain copies of mailbox data in a secondary datacenter | No |
| Requirement to maintain a lagged copy of any Exchange databases | No |
| Lagged copy period in days | Not applicable |
| Target number of database copies | 4 |
| Deleted Items folder retention window in days | 7 |
| Recovery point objective | 0 |
| Recovery time objective | 30 minutes |
Design Assumptions
This section includes information that isn't typically collected as part of customer requirements, but is critical to both the design and the approach to validating the design.
Server Configuration Assumptions
The following table describes the peak CPU utilization targets for normal operating conditions, and for site server failure or server maintenance conditions.
Server utilization targets
| Target server CPU utilization design assumption | Value |
|---|---|
| Normal operating for Mailbox servers | <70% |
| Normal operating for Client Access servers | <70% |
| Normal operating for Hub Transport servers | <70% |
| Normal operating for multiple server roles (Client Access, Hub Transport, and Mailbox servers) | <70% |
| Normal operating for multiple server roles (Client Access and Hub Transport servers) | <70% |
| Node failure for Mailbox servers | <80% |
| Node failure for Client Access servers | <80% |
| Node failure for Hub Transport servers | <80% |
| Node failure for multiple server roles (Client Access, Hub Transport, and Mailbox servers) | <80% |
| Node failure for multiple server roles (Client Access and Hub Transport servers) | <80% |
| Site failure for Mailbox servers | <80% |
| Site failure for Client Access servers | <80% |
| Site failure for Hub Transport servers | <80% |
| Site failure for multiple server roles (Client Access, Hub Transport, and Mailbox servers) | <80% |
| Site failure for multiple server roles (Client Access and Hub Transport servers) | <80% |
Storage Configuration Assumptions
The following tables summarize some data configuration and input/output (I/O) assumptions made when designing the storage configuration.
Data configuration assumptions
| Data configuration assumption | Value or description |
|---|---|
| Data overhead factor | 20% |
| Mailbox moves per week | 1% |
| Dedicated maintenance or restore logical unit number (LUN) | No |
| LUN free space | 20% |
I/O configuration assumptions
| I/O configuration assumption | Value or description |
|---|---|
| I/O overhead factor | 20% |
| Additional I/O requirements | None |
Solution Design
The following section provides a step-by-step methodology used to design this solution. This methodology takes customer requirements and design assumptions and walks through the key design decision points that need to be made when designing an Exchange 2010 environment.
Determine High Availability Strategy
When designing an Exchange 2010 environment, many design decision points for high availability strategies impact other design components. We recommend that you determine your high availability strategy as the first step in the design process. We also highly recommend that you review the Exchange 2010 planning guidance for high availability and site resilience before starting this step.
Step 1: Determine whether site resiliency is required
If you have more than one datacenter, you must decide whether to deploy Exchange infrastructure in a single datacenter or distribute it across two or more datacenters. The organization's recovery service level agreements (SLAs) should define what level of service is required following a primary datacenter failure. This information should form the basis for this decision.
*Design Decision Point*
In this solution, there are two physical datacenter locations. The SLA states that datacenter resiliency is required for all mission-critical services including e-mail. The Exchange 2010 design will be based on a two site deployment with site resiliency for the messaging service and data.
Step 2: Determine relationship between mailbox user locations and datacenter locations
In this step, we look at whether all mailbox users are located primarily in one site or if they're distributed across many sites and whether those sites are associated with datacenters. If they're distributed across many sites and there are datacenters associated with those sites, you need to determine if there's a requirement to maintain affinity between mailbox users and the datacenter associated with that site.
*Design Decision Point*
In this example, two datacenters are co-located with two offices in the same metropolitan area. Each office contains approximately 50 percent of the active mailbox users. There's a desire to maintain affinity between each mailbox user's location and the location of the primary active copy of that user's mailbox during normal operating conditions.
Step 3: Determine database distribution model
Because the customer has decided to deploy Exchange infrastructure in more than one physical location, the customer needs to determine which database distribution model best meets the needs of the organization. There are three database distribution models:
Active/Passive distribution Active mailbox database copies are deployed in the primary datacenter and only passive database copies are deployed in a secondary datacenter. The secondary datacenter serves as a standby datacenter and no active mailboxes are hosted in the datacenter under normal operating conditions. In the event of an outage impacting the primary datacenter, a manual switchover to the secondary datacenter is performed and active databases are hosted there until the primary datacenter returns online.
Active/Active distribution (single DAG) Active mailbox databases are deployed in the primary and secondary datacenters. A corresponding passive copy is located in the alternate datacenter. All Mailbox servers are members of a single database availability group (DAG). In this model, the wide area network (WAN) connection between two datacenters is potentially a single point of failure. Loss of the WAN connection results in Mailbox servers in one of the datacenters going into a failed state due to loss of quorum.
Active/Active distribution (multiple DAGs) This model leverages multiple DAGs to remove WAN connectivity as a single point of failure. One DAG has active database copies in the first datacenter and its corresponding passive database copies in the second datacenter. The second DAG has active database copies in the second datacenter and its corresponding passive database copies in the first datacenter. In the event of loss of WAN connectivity, the active copies in each site continue to provide database availability to local mailbox users.
*Design Decision Point*
In this example, because active mailbox databases will be deployed in each of the datacenter locations, the database distribution model will be active/active with multiple DAGs. There are some additional design considerations when deploying an active/active database distribution model with multiple DAGs, which will be addressed in a later step.
Step 4: Determine backup and database resiliency strategy
Exchange 2010 includes several new features and core changes that, when deployed and configured correctly, can provide native data protection that eliminates the need to make traditional data backups. Backups are traditionally used for disaster recovery, recovery of accidentally deleted items, long term data storage, and point-in-time database recovery. Exchange 2010 can address all of these scenarios without the need for traditional backups:
Disaster recovery In the event of a hardware or software failure, multiple database copies in a DAG enable high availability with fast failover and no data loss. DAGs can be extended to multiple sites and can provide resilience against datacenter failures.
Recovery of accidentally deleted items With the new Recoverable Items folder in Exchange 2010 and the hold policy that can be applied to it, it's possible to retain all deleted and modified data for a specified period of time, so recovery of these items is easier and faster. For more information, see Messaging Policy and Compliance, Understanding Recoverable Items, and Understanding Retention Tags and Retention Policies.
Long-term data storage Sometimes, backups also serve an archival purpose. Typically, tape is used to preserve point-in-time snapshots of data for extended periods of time as governed by compliance requirements. The new archiving, multiple-mailbox search, and message retention features in Exchange 2010 provide a mechanism to efficiently preserve data in an end-user accessible manner for extended periods of time. For more information, see Understanding Personal Archives, Understanding Multi-Mailbox Search, and Understanding Retention Tags and Retention Policies.
Point-in-time database snapshot If a past point-in-time copy of mailbox data is a requirement for your organization, Exchange provides the ability to create a lagged copy in a DAG environment. This can be useful in the rare event that there's a logical corruption that replicates across the databases in the DAG, resulting in a need to return to a previous point in time. It may also be useful if an administrator accidentally deletes mailboxes or user data.
There are technical reasons and several issues that you should consider before using the features built into Exchange 2010 as a replacement for traditional backups. Prior to making this decision, see Understanding Backup, Restore and Disaster Recovery.
*Design Decision Point*
In this example, the SLA requires a backup of the Exchange messaging data to provide point-in-time recovery. Because lagged copies aren't desired, a VSS-based backup solution will be implemented. Details about this solution are covered in a later step, after a storage vendor is selected.
Step 5: Determine number of database copies required
The next important decision when defining your database resiliency strategy is to determine the number of database copies to deploy. We strongly recommend deploying a minimum of three copies of a mailbox database before eliminating traditional forms of protection for the database, such as Redundant Array of Independent Disks (RAID) or traditional VSS-based backups.
For additional information, see Understanding Mailbox Database Copies.
*Design Decision Point*
To support site resiliency, a minimum of two copies of each database (one in each of the two sites) is required. To support server resiliency in the primary site, two copies of each database in the primary site are required. In the event of a site failure, server resiliency is needed in the secondary site to support server failure and maintenance events. To support site resiliency and server resiliency requirements, four database copies are required. However, deploying four database copies plus a VSS-based backup solution can be an expensive design, depending on the chosen storage solution. Because the budget is limited and may not support a four copy design, this decision will be evaluated further when a storage solution is chosen.
Step 6: Determine database copy type
There are two types of database copies:
High availability database copy This database copy is configured with a replay lag time of zero. As the name implies, high availability database copies are kept up-to-date by the system, can be automatically activated by the system, and are used to provide high availability for mailbox service and data.
Lagged database copy This database copy is configured to delay transaction log replay for a period of time. Lagged database copies are designed to provide point-in-time protection, which can be used to recover from store logical corruptions, administrative errors (for example, deleting or purging a disconnected mailbox), and automation errors (for example, bulk purging of disconnected mailboxes).
*Design Decision Point*
In this example, all mailbox database copies will be deployed as high availability database copies. In a previous step, it was decided to implement a VSS-based backup solution to provide point-in-time recovery.
Note
Even though in this example only a VSS-based backup solution is used to provide point-in-time recovery, there's value in deploying a lagged copy alongside a traditional point-in-time backup because recovering from a lagged copy is quicker than recovering from a backup solution.
Step 7: Determine Mailbox server resiliency strategy
Exchange 2010 has been re-engineered for mailbox resiliency. Automatic failover protection is now provided at the mailbox database level instead of at the server level. You can strategically distribute active and passive database copies to Mailbox servers within a DAG. Determining how many database copies you plan to activate on a per-server basis is a key aspect to Exchange 2010 capacity planning. There are different database distribution models that you can deploy, but generally we recommend one of the following:
Design for all copies activated In this model, the Mailbox server role is sized to accommodate the activation of all database copies on the server. For example, a Mailbox server may host four database copies. During normal operating conditions, the server may have two active database copies and two passive database copies. During a failure or maintenance event, all four database copies would become active on the Mailbox server. This solution is usually deployed in pairs. For example, if deploying four servers, the first pair is servers MBX1 and MBX2, and the second pair is servers MBX3 and MBX4. In addition, when designing for this model, you will size each Mailbox server for no more than 40 percent of available resources during normal operating conditions. In a site resilient deployment with three database copies and six servers, this model can be deployed in sets of three servers, with the third server residing in the secondary datacenter. This model provides a three-server building block for solutions using an active/passive site resiliency model.
This model can be used in the following scenarios:
Active/Passive multisite configuration where failure domains (for example, racks, blade enclosures, and storage arrays) require easy isolation of database copies in the primary datacenter
Active/Passive multisite configuration where anticipated growth may warrant easy addition of logical units of scale
Configurations that aren't required to survive the simultaneous loss of any two Mailbox servers in the DAG
This model requires servers to be deployed in pairs for single site deployments and sets of three for multisite deployments. The following table illustrates a sample database layout for this model.
Design for all copies activated
In the preceding table, the following applies:
C1 = active copy (activation preference value of 1) during normal operations
C2 = passive copy (activation preference value of 2) during normal operations
C3 = passive copy (activation preference value of 3) during site failure event
Design for targeted failure scenarios In this model, the Mailbox server role is designed to accommodate the activation of a subset of the database copies on the server. The number of database copies in the subset will depend on the specific failure scenario that you're designing for. The main goal of this design is to evenly distribute active database load across the remaining Mailbox servers in the DAG.
This model should be used in the following scenarios:
All single site configurations with three or more database copies
Configurations required to survive the simultaneous loss of any two Mailbox servers in the DAG
The DAG design for this model requires between 3 and 16 Mailbox servers. The following table illustrates a sample database layout for this model.
Design for targeted failure scenarios
In the preceding table, the following applies:
C1 = active copy (activation preference value of 1) during normal operations
C2 = passive copy (activation preference value of 2) during normal operations
C3 = passive copy (activation preference value of 3) during normal operations
*Design Decision Point*
In a previous step, it was decided to deploy two DAGs, each with an active/passive database distribution model. Each DAG will have two high availability database copies in the primary datacenter and two high availability copies in the secondary datacenter. Because the two high availability copies in each datacenter are usually deployed in separate hardware failure domains, this model usually results in a Mailbox server resiliency strategy that designs for all copies being activated. Database layout design decisions for this model will be examined in a later step.
Step 8: Determine number of Mailbox servers per DAG
In this step, you need to determine the minimum number of Mailbox servers required to support the DAG design. This number may be different from the number of servers required to support the workload, so the final decision on the number of servers is made in a later step.
*Design Decision Point*
This example uses four high availability database copies. To support four copies, a minimum of four Mailbox servers in the DAG is required. In an active/passive configuration, two of the servers will reside in the primary datacenter, and two servers will reside in the secondary datacenter. In this model, the number of servers in the DAG should be deployed in multiples of four. The following table outlines the possible configurations.
Total Mailbox server count
| Primary datacenter | Secondary datacenter | Total Mailbox server count |
|---|---|---|
| 2 | 2 | 4 |
| 4 | 4 | 8 |
| 8 | 8 | 16 |
Estimate Mailbox Storage Capacity Requirements
Many factors influence the storage capacity requirements for the Mailbox server role. For additional information, we recommend that you review Understanding Mailbox Database and Log Capacity Factors.
The following steps outline how to calculate mailbox capacity requirements. These requirements will then be used to make decisions about which storage solution options meet the capacity requirements. A later section covers additional calculations required to properly design the storage layout on the chosen storage platform.
Microsoft has created a Mailbox Server Role Requirements Calculator that will do most of this work for you. To download the calculator, see E2010 Mailbox Server Role Requirements Calculator. For additional information about using the calculator, see Exchange 2010 Mailbox Server Role Requirements Calculator.
Step 1: Calculate mailbox size on disk
Before attempting to determine what your total storage requirements are, you should know what the mailbox size on disk will be. A full mailbox with a 1-GB quota requires more than 1 GB of disk space because you have to account for the prohibit send/receive limit, the number of messages the user sends or receives per day, the Deleted Items folder retention window (with or without calendar version logging and single item recovery enabled), and the average database daily variations per mailbox. The Mailbox Server Role Requirements Calculator does these calculations for you. You can also use the following information to do the calculations manually.
The following calculations are used to determine the mailbox size on disk for this solution:
Whitespace = 150 messages per day × 75 KB ÷ 1024 = 11 MB
Dumpster = (150 messages per day × 75 KB ÷ 1024 × 7 days) + (500 MB × 0.012) + (500 MB × 0.058) = 112 MB
Mailbox size on disk = mailbox limit + whitespace + dumpster
= 500 + 11 + 112
= 623 MB
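When the calculator isn't handy, the arithmetic above can be restated as a short script. The following Python sketch is illustrative only; the function name and parameter defaults are assumptions, and the 1.2 percent daily variation and 5.8 percent single item recovery factors are taken from the calculation above.

```python
def mailbox_size_on_disk_mb(quota_mb=500, messages_per_day=150, avg_msg_kb=75,
                            retention_days=7, daily_variation=0.012,
                            single_item_recovery=0.058):
    """Approximate mailbox size on disk (MB) using the formulas in this step."""
    whitespace = messages_per_day * avg_msg_kb / 1024           # ~11 MB
    dumpster = (messages_per_day * avg_msg_kb / 1024 * retention_days
                + quota_mb * daily_variation                    # daily database variation
                + quota_mb * single_item_recovery)              # single item recovery overhead
    return quota_mb + whitespace + dumpster

print(round(mailbox_size_on_disk_mb()))  # -> 623
```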
Step 2: Calculate database storage capacity requirements
In this step, the high level storage capacity required for all mailbox databases is determined. The calculated capacity includes database size, catalog index size, and 20 percent free space.
To determine the storage capacity required for all databases, use the following formula:
Database size = (number of mailboxes × mailbox size on disk × database overhead growth factor) × (1 + 20% data overhead)
= (20000 × 623 × 1) × 1.2
= 14952000 MB
= 14602 GB
Database index size = 10% of database size
= 1460 GB
Total database capacity = (database size + index size) ÷ 0.80 to add 20% volume free space
= (14602 + 1460) ÷ 0.8
= 20078 GB
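As a cross-check on these figures, the same capacity math can be scripted. This is a minimal sketch that simply follows the formulas in this step (20 percent data overhead, 10 percent catalog index, 20 percent volume free space); the names are hypothetical.

```python
def database_capacity_gb(mailboxes=20000, mailbox_mb=623, growth_factor=1.0,
                         data_overhead=0.20, index_ratio=0.10, free_space=0.20):
    """High-level mailbox database capacity estimate, per the formulas above."""
    db_gb = mailboxes * mailbox_mb * growth_factor * (1 + data_overhead) / 1024
    index_gb = db_gb * index_ratio                     # content index, 10% of database size
    total_gb = (db_gb + index_gb) / (1 - free_space)   # add 20% volume free space
    return db_gb, index_gb, total_gb

db, idx, total = database_capacity_gb()
print(round(db), round(idx), round(total))  # roughly 14602, 1460, 20077
```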
Step 3: Calculate transaction log storage capacity requirements
To ensure that the Mailbox server doesn't sustain any outages as a result of space allocation issues, the transaction log volumes also need to be sized to accommodate all of the logs that will be generated between backups. Because this architecture leverages the mailbox resiliency and single item recovery features as part of the backup architecture, the log capacity should be sized to accommodate three times the daily log generation rate, in case a failed copy isn't repaired for three days. (Any failed copy prevents log truncation from occurring.) If the server isn't back online within three days, you would want to temporarily remove the copy to allow truncation to occur.
To determine the storage capacity required for all transaction logs, use the following formula:
Log files size = (log file size × number of logs per mailbox per day × log replay time in days × number of mailbox users) + (1% mailbox move overhead)
= (1 MB × 30 × 3 × 20000) + (20000 × 0.01 × 500 MB)
= 1900000 MB
= 1855 GB
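The same transaction log estimate can be scripted as follows. This is a minimal sketch assuming the values above (30 logs per mailbox per day for a 150-message profile, three days of log replay protection, and 1 percent weekly mailbox moves); the function name is hypothetical.

```python
def log_capacity_gb(mailboxes=20000, log_mb=1, logs_per_mailbox_per_day=30,
                    replay_protection_days=3, weekly_move_rate=0.01, mailbox_mb=500):
    """Transaction log capacity estimate (GB), per the formula above."""
    log_total_mb = log_mb * logs_per_mailbox_per_day * replay_protection_days * mailboxes
    move_overhead_mb = mailboxes * weekly_move_rate * mailbox_mb   # 1% mailbox moves per week
    return (log_total_mb + move_overhead_mb) / 1024

print(round(log_capacity_gb()))  # -> 1855
```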
Step 4: Determine total storage capacity requirements
The following table summarizes the high level storage capacity requirements for this solution. In a later step, you will use this information to make decisions about which storage solution to deploy. You will then take a closer look at specific storage requirements in later steps.
Summary of storage capacity requirements
| Disk space requirements | Value |
|---|---|
| Average mailbox size on disk (MB) | 623 |
| Database capacity required (GB) | 14602 |
| Log capacity required (GB) | 1855 |
| Total capacity required (GB) | 16457 |
| Total capacity required for two database copies (GB) | 32913 |
| Total capacity required for three database copies (GB) | 49370 |
| Total capacity required for four database copies (GB) | 65828 |
| Total capacity required for four database copies (terabytes) | 64 |
Estimate Mailbox I/O Requirements
When designing an Exchange environment, you need an understanding of database and log performance factors. We recommend that you review Understanding Database and Log Performance Factors.
Calculate total mailbox I/O requirements
Because it's one of the key transactional I/O metrics needed for adequately sizing storage, you should understand the amount of database I/O per second (IOPS) consumed by each mailbox user. Pure sequential I/O operations aren't factored in the IOPS per Mailbox server calculation because storage subsystems can handle sequential I/O much more efficiently than random I/O. These operations include background database maintenance, log transactional I/O, and log replication I/O. In this step, you calculate the total IOPS required to support all mailbox users, using the following:
Total required IOPS = IOPS per mailbox user × number of mailboxes × I/O overhead factor
= 0.15 × 20000 × 1.2
= 3600
Note
To determine the IOPS profile for a different message profile, see the table "Database cache and estimated IOPS per mailbox based on message activity" in Understanding Database and Log Performance Factors.
The high level storage IOPS requirement is approximately 3,600. When choosing a storage solution, you need to ensure that the solution meets this requirement.
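For reference, the IOPS estimate reduces to a one-line calculation. The sketch below assumes the 0.15 IOPS per mailbox figure for a 150-message profile and the 20 percent I/O overhead factor stated earlier; the function name is hypothetical.

```python
def required_transactional_iops(mailboxes=20000, iops_per_mailbox=0.15, io_overhead=0.20):
    """Total database IOPS required for all mailbox users, per the formula above."""
    return mailboxes * iops_per_mailbox * (1 + io_overhead)

print(round(required_transactional_iops()))  # -> 3600
```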
Determine Storage Type
Exchange 2010 includes improvements in performance, reliability, and high availability that enable organizations to run Exchange on a wide range of storage options.
When examining the storage options available, being able to balance the performance, capacity, manageability, and cost requirements is essential to achieving a successful storage solution for Exchange.
For more information about choosing a storage solution for Exchange 2010, see Mailbox Server Storage Design.
Determine whether you prefer a DAS or SAN storage solution
There is a wide range of storage options available for Exchange 2010. The list of choices can be reduced by determining whether deploying a direct-attached storage (DAS) solution (including using local disk) or a SAN solution is preferred. There are many reasons for choosing one over the other, and you should work with your preferred storage vendor to determine which solution meets your business and total cost of ownership (TCO) requirements.
*Design Decision Point*
In this example, a SAN infrastructure is deployed, and SAN is used for storing all data in the environment. A SAN storage solution will continue to be used, and options for deploying Exchange 2010 will be explored.
Choose Storage Solution
Use the following steps to choose a storage solution.
Step 1: Identify preferred storage vendor
In this example, EMC storage has been used for many years, and an EMC storage solution will be used for the Exchange 2010 deployment. EMC Corporation offers high-performing storage arrays such as CLARiiON and Symmetrix.
Step 2: Review available options from preferred vendor
The EMC CLARiiON family provides multiple tiers of storage, such as enterprise flash drives, Fibre Channel, and Serial ATA (SATA), which reduces costs because multiple tiers can be managed with a single management interface.
CLARiiON Virtual Provisioning provides benefits beyond traditional thin provisioning, including simplified storage management and improved capacity utilization. You can present a large amount of capacity to a host, and then consume space as needed from a shared pool.
CLARiiON CX4 Series provides four models with flexible levels of capacity, functionality, and performance. The features of each model are described in the following table.
| Feature | CX4 model 120 | CX4 model 240 | CX4 model 480 | CX4 model 960 |
|---|---|---|---|---|
| Maximum disks | 120 | 240 | 480 | 960 |
| Storage processors | 2 | 2 | 2 | 2 |
| Physical memory per storage processor | 3 GB | 4 GB | 8 GB | 16 GB |
| Maximum write cache | 600 MB | 1.264 GB | 4.5 GB | 10.764 GB |
| Maximum initiators per system | 256 | 512 | 512 | 1024 |
| High-availability hosts | 128 | 256 | 256 | 512 |
| Minimum form factor size | 6U | 6U | 6U | 9U |
| Maximum standard LUNs | 1024 | 1024 | 4096 | 4096 |
| SnapView snapshots | Yes | Yes | Yes | Yes |
| SnapView clones | Yes | Yes | Yes | Yes |
| SAN copy | Yes | Yes | Yes | Yes |
| MirrorView/S | Yes | Yes | Yes | Yes |
| MirrorView/A | Yes | Yes | Yes | Yes |
| RecoverPoint | Yes | Yes | Yes | Yes |
Step 3: Select a disk type
In this example, 450 GB Fibre Channel 15,000 rpm disks are selected, which provide good I/O performance and capacity to satisfy the initial Exchange user requirements. Because the I/O and capacity requirements could be met with other disk types, you should consult your storage vendor when choosing a disk type.
Step 4: Select an array
In this example, the solution needs to provide 54 terabytes of usable storage and 3,600 IOPS. Any of the options in the preceding table will handle the IOPS requirements, so the decision will be based on capacity requirements. The CLARiiON CX4 model 480 is selected because it provides the necessary capacity and I/O performance to support all Exchange 2010 requirements.
Explore EMC Value-Added Solutions for Exchange 2010
EMC provides value-added solutions for Exchange Server 2010 environments including business continuity for disaster recovery, with low data loss and minimal recovery time options. The following sections explore some of these options.
Step 1: Determine whether EMC Replication Enabler Exchange 2010 will be used
Exchange 2010 includes an application programming interface (API) to enable integration of third-party replication solutions into the DAG framework. When enabled, third-party replication support disables the native network-based log shipping mechanism used by DAGs. Storage-based replication technologies can then be used to protect the Exchange database copies specified within the Exchange 2010 environment.
As an alternative to native Exchange 2010 network-based DAG replication, EMC developed a free software tool called EMC Replication Enabler for Microsoft Exchange Server 2010 (sometimes known as REE). This tool enables block-level, synchronous storage-based replication over existing Fibre Channel SANs.
Replication Enabler Exchange 2010 integrates with the DAG third-party replication API to enable both shared local storage as well as synchronously replicated storage with EMC MirrorView and EMC RecoverPoint as database copies within a DAG. The ability to use shared local storage as well as synchronously replicated remote copies helps to enable high availability and site resiliency functionality similar to the high availability model and geographically dispersed cluster capabilities available with previous versions of Exchange. Using the array-based replication solution eases the strain on the network bandwidth while preserving the scalability, reliability, and performance benefits of the storage array.
Replication Enabler Exchange 2010 provides the ability to deploy the equivalent of four Exchange database copies (two in the primary datacenter and two in the secondary datacenter) while only provisioning two physical copies of the database (one in each datacenter). Server resiliency requirements in both datacenters can be met without the added expense of provisioning four physical copies.
Replication Enabler Exchange 2010 also provides the ability to move log shipping and database seeding traffic from the Ethernet network to the existing Fibre Channel infrastructure to alleviate current bandwidth concerns between datacenters.
*Design Decision Point*
In this example, there isn't sufficient budget to deploy four physical copies of all Exchange databases and a VSS-based backup solution on a SAN infrastructure. Replication Enabler Exchange 2010 will be used so that server resiliency requirements in both datacenters can be met. Sufficient storage to support two physical copies of each database and a VSS-based backup solution will be purchased.
Currently, there are network bandwidth constraints during peak times. Although normal log shipping load is supported, reseeding databases during peak times may exceed available bandwidth. Replication Enabler Exchange 2010 will be used so that the existing Fibre Channel infrastructure can be used to maintain database copies rather than the Ethernet network infrastructure.
Step 2: Determine whether EMC MirrorView or EMC RecoverPoint will be used for replication
EMC Replication Enabler Exchange 2010 can use either EMC MirrorView or EMC RecoverPoint to provide synchronous storage replication. This step explores these two options to help decide which option is best, based on customer requirements.
EMC MirrorView/S
By creating a synchronous mirror between EMC arrays, MirrorView/S maintains an exact byte-for-byte copy of your production data in a second location. You can use the mirrored copy for failover, for online restore from backup, and for running backups against a SnapView snapshot of the remote mirror. MirrorView/S helps minimize exposure to internal issues and external disaster situations, and is designed to provide fast recovery time if a disaster occurs.
MirrorView/S protects data throughout the entire mirroring process. Fracture logs track changes and provide a source for restoring modifications to source data if the source array loses contact with the target array during a failure. When the target array becomes available, MirrorView/S captures the pending writes in the fracture log and writes them to the target array, restoring its consistent state.
MirrorView/S also maintains a write-intent log to protect against the unlikely event of a source array issue. Upon repair of the source array, MirrorView/S uses the write-intent log to apply to the source data any changes that were in process between the two arrays during the failure. A partial synchronization with the target array then takes place to bring the source and target arrays to a consistent state.
MirrorView/S offers a feature called consistency groups, which helps ensure consistent remote copies of data from one or more applications for disaster recovery purposes. Interrelated LUNs remain in sync and are recoverable in the event of an outage at the primary site. All LUNs in the consistency group must reside in the same array.
EMC RecoverPoint
EMC RecoverPoint is an out-of-band, block-level replication product for a heterogeneous server and storage environment. RecoverPoint continuous data protection provides local synchronous replication between LUNs that reside in one or more arrays at the same site. RecoverPoint continuous remote replication provides remote asynchronous replication between two sites for LUNs that reside in one or more arrays. Both feature bidirectional replication and an any-point-in-time recovery capability, which allows the target LUNs to be rolled back to a previous point in time and used for read/write operations without affecting the ongoing replication or data protection. The bidirectional replication and any-point-in-time recovery capability can be enabled simultaneously with RecoverPoint concurrent local and remote data protection.
With EMC RecoverPoint appliances, replication can be performed over the Ethernet or the Fibre Channel SAN. RecoverPoint provides a single point replication solution for all applications that require protection to support customer SLAs.
RecoverPoint is appliance-based, which enables it to better support large amounts of information stored across a heterogeneous server and storage environment. RecoverPoint uses lightweight splitting technology (on the application server, in the fabric, or in the CLARiiON array) to mirror a write to a RecoverPoint appliance that resides outside of the primary data path. Implementing an out-of-band approach enables RecoverPoint to deliver continuous replication without impacting an application's I/O operations. If the CLARiiON-based splitter is used, small computer system interface (SCSI) and Internet SCSI (iSCSI) LUNs hosted by the CLARiiON system can be used by RecoverPoint.
*Design Decision Point*
In this example, EMC MirrorView is selected to provide synchronous storage replication. MirrorView is better suited to metropolitan-based application replication scenarios. RecoverPoint is a better solution for long distance replication scenarios and supports a number of advanced features that aren't required for this Exchange solution. MirrorView is a more cost-effective choice.
Step 3: Determine whether EMC Replication Manager and EMC SnapView will be used
EMC Replication Manager provides support for Exchange 2010 database-level snapshots using VSS, which ensures a consistent copy of active Exchange databases with minimal impact to the production Exchange environment. EMC Replication Manager integrates with EMC SnapView, a storage system-based software application, which can be used to create a copy of a LUN using either clones or snapshots.
Replication Manager with SnapView supports Exchange 2010 in stand-alone or DAG environments, including support for a third-party DAG mode with EMC Replication Enabler Exchange 2010. VSS provides the framework for creating point-in-time transportable snapshots of Exchange 2010 data. Replication Manager provides full error checking functionality using the Exchange Server Database Utilities (Eseutil.exe) tool to ensure consistent, useable snapshots.
*Design Decision Point*
Because an EMC Replication Enabler Exchange 2010 solution is being used, EMC Replication Manager with SnapView will be selected as the VSS-based backup solution. EMC Replication Manager is fully integrated with EMC Replication Enabler Exchange 2010.
Estimate Mailbox Memory Requirements
Sizing memory correctly is an important step in designing a healthy Exchange environment. We recommend that you review Understanding Memory Configurations and Exchange Performance and Understanding the Mailbox Database Cache.
Calculate required database cache
The Extensible Storage Engine (ESE) uses database cache to reduce I/O operations. In general, the more database cache available, the less I/O generated on an Exchange 2010 Mailbox server. However, there's a point where adding additional database cache no longer results in a significant reduction in IOPS. Therefore, adding large amounts of physical memory to your Exchange server without determining the optimal amount of database cache required may result in higher costs with minimal performance benefit.
The IOPS estimates that you completed in a previous step assume a minimum amount of database cache per mailbox. These minimum amounts are summarized in the table "Estimated IOPS per mailbox based on message activity and mailbox database cache" in Understanding the Mailbox Database Cache.
The following table outlines the database cache per user for various message profiles.
Database cache per user
| Messages sent or received per mailbox per day (about 75 KB average message size) | Database cache per user (MB) |
|---|---|
| 50 | 3 |
| 100 | 6 |
| 150 | 9 |
| 200 | 12 |
In this step, you determine high level memory requirements for the entire environment. In a later step, you use this result to determine the amount of physical memory needed for each Mailbox server. Use the following information:
Total database cache = profile specific database cache × number of mailbox users
= 9 MB × 20000
= 180000 MB
= 176 GB
The total database cache requirements for the environment are 176 GB.
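The database cache estimate follows the same pattern as the earlier calculations. The following sketch assumes the 9 MB per-user figure for the 150-message profile from the table above; the function name is hypothetical.

```python
def total_database_cache_gb(mailboxes=20000, cache_per_mailbox_mb=9):
    """Aggregate ESE database cache target for the environment (GB)."""
    return mailboxes * cache_per_mailbox_mb / 1024

print(round(total_database_cache_gb()))  # -> 176
```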
Estimate Mailbox CPU Requirements
Mailbox server capacity planning has changed significantly from previous versions of Exchange due to the new mailbox database resiliency model provided in Exchange 2010. For additional information, see Mailbox Server Processor Capacity Planning.
In the following steps, you calculate the high level megacycle requirements for active and passive database copies. These requirements will be used in a later step to determine the number of Mailbox servers needed to support the workload. Note that the number of Mailbox servers required also depends on the Mailbox server resiliency model and database copy layout.
Using megacycle requirements to determine the number of mailbox users that an Exchange Mailbox server can support isn't an exact science. A number of factors can result in unexpected megacycle results in test and production environments. Megacycles should only be used to approximate the number of mailbox users that an Exchange Mailbox server can support. It's always better to be conservative rather than aggressive during the capacity planning portion of the design process.
The following calculations are based on published megacycle estimates as summarized in the following table.
Megacycle estimates
| Messages sent or received per mailbox per day | Megacycles per mailbox for active mailbox database | Megacycles per mailbox for remote passive mailbox database | Megacycles per mailbox for local passive mailbox |
|---|---|---|---|
| 50 | 1 | 0.1 | 0.15 |
| 100 | 2 | 0.2 | 0.3 |
| 150 | 3 | 0.3 | 0.45 |
| 200 | 4 | 0.4 | 0.6 |
Step 1: Calculate active mailbox CPU requirements
In this step, you calculate the megacycles required to support the active database copies, using the following:
Active mailbox megacycles required = profile specific megacycles × number of mailbox users
= 3 × 20000
= 60000
Step 2: Calculate active mailbox remote database copy CPU requirements
Usually, there is processor overhead associated with shipping logs required to maintain the database copy on the remote servers. This overhead is typically 10 percent of the active mailbox megacycles for each remote copy being serviced. In a solution using EMC Replication Enabler Exchange 2010, the database copy is maintained at the storage level, so there's no additional CPU overhead required to maintain database copies.
Step 3: Calculate local passive mailbox CPU requirements
Usually, there is processor overhead associated with maintaining the local passive copies of each database. In a solution using EMC Replication Enabler Exchange 2010, the database copy is maintained at the storage level, so there's no additional CPU overhead required to maintain database copies.
Step 4: Adjust total CPU requirements for maximum concurrency
In this example, of the 20,000 active mailboxes, 6,000 are call center staff who work the evening shift. These mailboxes are never accessed during the day, so the maximum concurrency never exceeds 14,000 active mailboxes or 70 percent. Calculate the requirements, using the following:
Adjusted megacycles required = total megacycles × maximum concurrency
= 60000 × 0.70
= 42000
The required megacycles to support 70 percent of the mailboxes is 42,000.
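The megacycle math for this solution can be summarized in a few lines. The sketch below is illustrative; it uses the per-mailbox megacycle table above and, per Steps 2 and 3, adds no passive-copy overhead because Replication Enabler Exchange 2010 maintains copies at the storage level.

```python
ACTIVE_MEGACYCLES_PER_MAILBOX = {50: 1, 100: 2, 150: 3, 200: 4}  # from the table above

def required_megacycles(mailboxes=20000, profile=150, concurrency=0.70):
    """Megacycles needed for active mailboxes, adjusted for maximum concurrency."""
    active = ACTIVE_MEGACYCLES_PER_MAILBOX[profile] * mailboxes   # 60,000
    return active * concurrency                                   # 42,000 at 70% concurrency

print(round(required_megacycles()))  # -> 42000
```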
Summarize Mailbox Requirements
The following table summarizes the approximate megacycles and database cache required for this environment. This information will be used in later steps to determine which servers will be deployed in the solution.
| Mailbox requirements | Value |
|---|---|
| Total megacycles required for 100% concurrency | 60000 |
| Total megacycles required for 70% concurrency | 42000 |
| Total database cache required for all mailboxes | 176 GB |
| Total database cache required for 70% concurrency | 124 GB |
Determine Whether Server Virtualization Will Be Used
Several factors are important when considering server virtualization for Exchange. For more information about supported configurations for virtualization, see Exchange 2010 System Requirements.
The main reasons customers use virtualization with Exchange are as follows:
If you expect server capacity to be underutilized, virtualization can improve utilization and may allow you to purchase fewer physical servers.
You may want to use Windows Network Load Balancing (NLB) when deploying the Client Access, Hub Transport, and Mailbox server roles on the same physical server.
If your organization uses virtualization across all of its server infrastructure, you may want to virtualize Exchange as well, to align with corporate standard policy.
*Design Decision Point*
In this solution, all new applications are being virtualized. Exchange will be virtualized to align with the corporate standard.
Determine Server Model for Hyper-V Root Server
You can use the following steps to determine the server model for the Hyper-V root servers.
Step 1: Identify preferred server vendor
In this solution, the preferred server vendor is Dell.
The Dell eleventh generation PowerEdge servers offer industry leading performance and efficiency. Innovations include increased memory capacity and faster I/O rates, which help deliver the performance required by today's most demanding applications.
Step 2: Select a server model
In this example, the Dell R910 server platform is selected as the standard Hyper-V root server. Exchange VMs will be deployed on this platform.
The standard Dell R910 configuration for this solution is summarized in the following table.
Dell PowerEdge R910 server
| Component | Description |
|---|---|
| Processors (x4) | Intel Xeon X7560 2.26 gigahertz (GHz); latest eight-core Intel Xeon 7500 series processors |
| Form factor | 4U rack |
| Memory | 192 GB (16 × 8 GB dual inline memory module (DIMM)); up to 1 terabyte (64 DIMM slots) of DDR3 1066 megahertz (MHz) |
| Drives | 16 × 2.5" SAS or solid-state drive (SSD) hard disk drive option |
| I/O slots | Up to 10 PCIe G2 slots, including one storage x8 slot |
Calculate CPU Capacity of Mailbox Server Model
In previous steps, you calculated the megacycles required to support the number of active mailbox users. In the following steps, you determine how many available megacycles the server model and processor can support, so that the number of active mailboxes each server can support can be determined.
Step 1: Determine benchmark value for server and processor
Because the megacycle requirements are based on a baseline server and processor model, you need to adjust the available megacycles for the server against the baseline. To do this, independent performance benchmarks maintained by Standard Performance Evaluation Corporation (SPEC) are used. SPEC is a non-profit corporation formed to establish, maintain, and endorse a standardized set of relevant benchmarks that can be applied to the newest generation of high-performance computers.
To help simplify the process of obtaining the benchmark value for your server and processor, we recommend you use the Exchange Processor Query tool. This tool automates the manual steps to determine your planned processor's SPECint 2006 rate value. To run this tool, your computer must be connected to the Internet. The tool uses your planned processor model as input, and then runs a query against the Standard Performance Evaluation Corporation Web site, returning all test result data for that specific processor model. The tool also calculates an average SPECint 2006 rate value based on the number of processors planned to be used in each Mailbox server. Use the following calculations:
SPECint_rate2006 value = 759
SPECint_rate2006 value ÷ processor core = 759 ÷ 32
= 23.7
Step 2: Calculate adjusted megacycles
In previous steps, you calculated the required megacycles for the entire environment based on megacycle per mailbox estimates. Those estimates were measured on a baseline system (HP DL380 G5 x5470 3.33 GHz, 8 cores) that has a SPECint_rate2006 value of 150 (for an 8 core server), or 18.75 per core.
In this step, you need to adjust the available megacycles for the chosen server and processor against the baseline processor so that the required megacycles can be used for capacity planning.
To determine the megacycles of the Dell R910 Intel X7560 2.26 GHz platform, use the following formula:
Adjusted megacycles per core = (new platform per core value) × (hertz per core of baseline platform) ÷ (baseline per core value)
= (23.7 × 3330) ÷ 18.75
= 4212
Adjusted megacycles per server = adjusted megacycles per core × number of cores
= 4212 × 32
= 134798
Step 3: Adjust available megacycles for virtualization overhead
When deploying VMs on the root server, you need to account for megacycles required to support the hypervisor and virtualization stack. This overhead varies from server to server and under different workloads. A conservative estimate of 10 percent of available megacycles will be used. Use the following calculation:
Adjusted available megacycles per virtual processor = megacycles per core × 0.90
= 4212 × 0.90
= 3792
Adjusted available megacycles per server = usable megacycles × 0.90
= 134798 × 0.90
= 121319
Each server has a usable capacity for VMs of 121,319 megacycles. The usable capacity per logical processor is 3,792 megacycles.
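The normalization against the baseline platform and the hypervisor reservation can be combined into one helper. This Python sketch is illustrative; the baseline values (SPECint_rate2006 of 150 for an 8-core, 3.33 GHz server) come from the text, the function name is hypothetical, and small differences from the text's figures are due to rounding.

```python
def adjusted_megacycles(spec_rate=759, cores=32, baseline_rate_per_core=150 / 8,
                        baseline_mhz=3330, hyperv_overhead=0.10):
    """Normalize server megacycles against the baseline platform, then reserve
    capacity for the hypervisor and virtualization stack."""
    per_core = (spec_rate / cores) * baseline_mhz / baseline_rate_per_core  # ~4,212
    usable_per_core = per_core * (1 - hyperv_overhead)                      # ~3,791
    usable_per_server = per_core * cores * (1 - hyperv_overhead)            # ~121,319
    return usable_per_core, usable_per_server

core_mc, server_mc = adjusted_megacycles()
print(round(core_mc), round(server_mc))  # roughly 3791 and 121319 (the text rounds to 3,792)
```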
Determine CPU Capacity of Virtual Machines
Now that you know the megacycles of the root server, you can calculate the megacycles of each VM. These values will be used to determine how many VMs are required and how many mailboxes will be hosted by each VM.
Step 1: Calculate available megacycles per virtual machine
In this step, you determine how many megacycles are available for each VM deployed on the root server. With Windows Server 2008 R2 Hyper-V, each VM can be assigned up to four virtual processors. For this step, assume that all Exchange VMs will have four virtual processors. Use the following calculation:
Available megacycles per VM = adjusted available megacycles per virtual processor × number of virtual processors
= 3792 × 4
= 15165
Step 2: Determine target available megacycles per virtual machine
Because the design assumptions state not to exceed 80 percent processor utilization, in this step, you adjust the available megacycles to reflect the 80 percent target. Use the following calculation:
Target available megacycles = available megacycles × target maximum processor utilization
= 15165 × 0.80
= 12132
Determine Number of Mailbox Server Virtual Machines Required
You can use the following steps to determine the number of Mailbox server VMs required.
Step 1: Determine maximum number of mailboxes supported by a single Mailbox server virtual machine
To determine the number of active mailboxes supported by a Mailbox server VM, use the following calculation:
Number of active mailboxes = available megacycles ÷ megacycles per mailbox
= 12132 ÷ 3
= 4044
Each Mailbox server VM can support a maximum of 4,044 active mailboxes.
Step 2: Determine minimum number of Mailbox server virtual machines required per site to support the required workload for each DAG
To determine the minimum number of Mailbox server VMs required in the primary site, use the following calculation:
Active mailboxes in each DAG = 10000 × 70%
= 7000
Number of VMs required = total active mailbox count in each DAG ÷ active mailboxes per VM
= 7000 ÷ 4044
= 1.7
Based on processor capacity, a minimum of two Mailbox server VMs in each DAG are required to support the anticipated peak workload during normal operating conditions.
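To make the chain from megacycles to VM count easier to audit, here is a short Python sketch under the same assumptions (four virtual processors per VM, 80 percent target utilization, 3 megacycles per active mailbox, and 70 percent concurrency); the constants simply restate values from the preceding steps.

```python
import math

ADJUSTED_MEGACYCLES_PER_VCPU = 3792   # from the root server calculation
VIRTUAL_PROCESSORS_PER_VM = 4
TARGET_UTILIZATION = 0.80
MEGACYCLES_PER_ACTIVE_MAILBOX = 3
MAILBOXES_PER_DAG = 10000
CONCURRENCY = 0.70

available_per_vm = ADJUSTED_MEGACYCLES_PER_VCPU * VIRTUAL_PROCESSORS_PER_VM
target_per_vm = available_per_vm * TARGET_UTILIZATION                     # ~12,100 megacycles
active_mailboxes_per_vm = target_per_vm // MEGACYCLES_PER_ACTIVE_MAILBOX  # ~4,000 active mailboxes

active_mailboxes_per_dag = MAILBOXES_PER_DAG * CONCURRENCY                # 7,000
vms_required = math.ceil(active_mailboxes_per_dag / active_mailboxes_per_vm)
print(int(active_mailboxes_per_vm), vms_required)                         # 4044, 2
```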
Determine Number of Mailboxes per Mailbox Server
You can use the following steps to determine the number of mailboxes per Mailbox server.
Step 1: Determine number of mailboxes per server during normal operation
To determine the expected number of mailboxes per server in each DAG during normal operating conditions, which will be used in a later step to determine storage requirements per server, use the following calculation:
Number of mailboxes per server = total mailbox count ÷ server count
= 10000 ÷ 4
= 2500
Step 2: Determine number of active mailboxes per server during normal operation
In a previous step, you determined that the expected maximum concurrency is 70 percent. You also need to calculate the expected number of active mailboxes during normal operating conditions. This value will be used in a later step to determine the normal operating load per server.
To determine the number of active mailboxes per server during normal operation, use the following calculation:
Number of active mailboxes per server = total active mailbox count per DAG ÷ server count
= 7000 ÷ 4
= 1750
Step 3: Determine number of mailboxes per server during worst case failure event
In this step, you determine the expected number of mailboxes per server in each DAG during the worst case failure or maintenance conditions. This value will be used in a later step to determine storage requirements per server.
To determine the number of mailboxes per server during a worst case failure event, use the following calculation:
Number of mailboxes per server = total mailbox count per DAG ÷ server count
= 10000 ÷ 2
= 5000
Step 4: Determine number of active mailboxes per server during worst case failure event
In a previous step, you determined that the expected maximum concurrency is 70 percent. You also need to calculate the expected number of active mailboxes in each DAG during the worst case failure or maintenance conditions. This value will be used in a later step to determine the worst case load per server.
To determine the number of active mailboxes per server during a worst case failure event, use the following calculation:
Number of active mailboxes per server = total active mailbox count per DAG ÷ server count
= 7000 ÷ 2
= 3500
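A minimal sketch of the per-server counts above, assuming 10,000 mailboxes per DAG, 70 percent concurrency, and four Mailbox servers per DAG in the primary site:

```python
MAILBOXES_PER_DAG = 10000
CONCURRENCY = 0.70
SERVERS_PER_DAG_IN_SITE = 4

# Normal operation: all four Mailbox servers in the primary site host active databases.
normal_mailboxes_per_server = MAILBOXES_PER_DAG / SERVERS_PER_DAG_IN_SITE             # 2,500
normal_active_per_server = MAILBOXES_PER_DAG * CONCURRENCY / SERVERS_PER_DAG_IN_SITE  # 1,750

# Worst case failure: two of the four servers are unavailable, so two servers carry the load.
worst_mailboxes_per_server = MAILBOXES_PER_DAG / 2                                    # 5,000
worst_active_per_server = MAILBOXES_PER_DAG * CONCURRENCY / 2                         # 3,500
print(normal_mailboxes_per_server, normal_active_per_server,
      worst_mailboxes_per_server, worst_active_per_server)
```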
Determine Memory Required Per Mailbox Server
You can use the following steps to determine the memory required per Mailbox server.
Step 1: Determine database cache requirements per server for the worst case failure scenario
In a previous step, you determined that the database cache requirement for all mailboxes was 176 GB and that the average cache required per active mailbox was 9 MB.
To design for the worst case failure scenario, you calculate based on active mailboxes residing on two of four Mailbox servers. Use the following calculation:
Memory required for database cache = number of active mailboxes × average cache per mailbox
= 3500 × 9 MB
= 31500 MB
= 30.8 GB
Step 2: Determine total memory requirements per Mailbox server virtual machine for the worst case failure scenario
In this step, reference the following table to determine the recommended memory configuration.
Memory requirements
Server physical memory (RAM) | Database cache size (Mailbox server role only)
---|---
24 GB | 17.6 GB
32 GB | 24.4 GB
48 GB | 39.2 GB
Based on the preceding table, the standard physical memory configuration that supports 30.8 GB of database cache for a Mailbox server role is 48 GB. Interpolating between the table values, approximately 40 GB of memory is sufficient to support 30.8 GB of database cache. Because the Mailbox servers run as VMs, memory doesn't have to align with standard physical memory configurations, so 40 GB can be assigned to each Mailbox server VM.
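The jump from 30.8 GB of database cache to roughly 40 GB of VM memory can be reproduced by interpolating the memory table above; the linear interpolation in this Python sketch is an assumption about how the approximate figure was derived, not a published formula.

```python
# (physical memory GB, database cache GB) pairs from the memory requirements table above.
CACHE_TABLE = [(24, 17.6), (32, 24.4), (48, 39.2)]

ACTIVE_MAILBOXES_WORST_CASE = 3500
CACHE_PER_MAILBOX_MB = 9

required_cache_gb = ACTIVE_MAILBOXES_WORST_CASE * CACHE_PER_MAILBOX_MB / 1024   # ~30.8 GB

def memory_for_cache(cache_gb):
    """Linearly interpolate between table rows to estimate server memory (GB)."""
    for (mem_lo, cache_lo), (mem_hi, cache_hi) in zip(CACHE_TABLE, CACHE_TABLE[1:]):
        if cache_lo <= cache_gb <= cache_hi:
            return mem_lo + (cache_gb - cache_lo) * (mem_hi - mem_lo) / (cache_hi - cache_lo)
    return float(CACHE_TABLE[-1][0])

print(round(required_cache_gb, 1), round(memory_for_cache(required_cache_gb)))  # 30.8, ~39 (assign 40 GB)
```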
Determine Number of Client Access and Hub Transport Server Combination Virtual Machines Required
You can use the following steps to determine the minimum and total number of Client Access and Hub Transport server combination VMs required.
Step 1: Determine minimum number of Client Access and Hub Transport server combination VMs required
In a previous step, you determined that eight Mailbox server VMs in each site (four from one DAG and four from the other DAG) are required. We recommend that you deploy one Client Access and Hub Transport server combination VM for every Mailbox server VM (assuming an identical number of virtual processors are assigned to each VM).
Number of Client Access and Hub Transport server combination VMs required
Server role configuration | Recommended processor core ratio
---|---
Mailbox server role: Client Access and Hub Transport combined server role | 1:1
However, when there are two DAGs represented in the same site, examine the worst case failure scenario before you determine the minimum number of Client Access and Hub Transport server combination VMs required. In this solution, the worst case failure scenario would be to lose two of the four Mailbox servers in the primary DAG and have a simultaneous site failure where two Mailbox servers in the secondary DAG are hosting active mailbox databases. Following the recommended processor core ratio of 1:1, a minimum of four Client Access and Hub Transport server combination VMs are required in each site to support 70 percent concurrency across the 20,000 mailboxes running on four Mailbox servers in each site, as shown in the following figure.
Client Access and Hub Transport server sizing
Step 2: Determine total number of Client Access and Hub Transport server combination VMs required
In the previous step, you determined that a minimum of four Client Access and Hub Transport server combination VMs are required to support the workload in the worst case failure or maintenance scenario. In this step, you need to determine the number of Client Access and Hub Transport server VMs to deploy in each site.
Consider the server resiliency model for Client Access and Hub Transport server combination VMs. To match the server resiliency model for Mailbox servers, the simultaneous failure of half of the Client Access and Hub Transport server combination VMs needs to be accommodated. A total of eight Client Access and Hub Transport server VMs are required in each site to support the server resiliency model, as shown in the following figure.
Client Access and Hub Transport server combination VMs
Determine Memory Required per Combined Client Access and Hub Transport Virtual Machines
To determine the memory configuration for the combined Client Access and Hub Transport server role VM, reference the following table.
Memory configurations for Exchange 2010 servers based on installed server roles
Exchange 2010 server role | Minimum supported | Recommended maximum
---|---|---
Client Access and Hub Transport combined server role (Client Access and Hub Transport server roles running on the same physical server) | 4 GB | 2 GB per core (8 GB minimum)
Because the Client Access and Hub Transport server VM has four virtual processors, allocate 8 GB of memory to each Client Access and Hub Transport server VM.
Determine Number of Physical Servers Required
You can use the following steps to determine the number of physical servers required.
Step 1: Summarize virtual server count
In previous steps, you determined that 32 VMs running Exchange server roles were required, as summarized in the following table.
VM count
 | Combination Client Access and Hub Transport server | Mailbox server | Total
---|---|---|---
Datacenter 1 | 8 | 8 | 16
Datacenter 2 | 8 | 8 | 16
Total | 16 | 16 | 32
Step 2: Summarize virtual processor count
Each VM will have four virtual processors assigned. The following table summarizes the virtual processor requirements for the 32 VMs.
Virtual processor requirements for the 32 VMs
 | Combination Client Access and Hub Transport server | Mailbox server | Total
---|---|---|---
Site A | 32 | 32 | 64
Site B | 32 | 32 | 64
Total | 64 | 64 | 128
Step 3: Summarize memory required
Each combination Client Access and Hub Transport server VM requires 8 GB of memory. Each Mailbox server VM requires 40 GB of memory. The following table summarizes the memory requirements of the 32 VMs.
Memory requirements (GB) for the 32 VMs
 | Combination Client Access and Hub Transport server | Mailbox server | Total
---|---|---|---
Site A | 64 | 320 | 384
Site B | 64 | 320 | 384
Total | 128 | 640 | 768
Step 4: Determine number of Dell R910 servers required
The Dell R910 is a four-socket server. With four eight-core Intel Xeon X7560 2.26 GHz processors, there are 32 logical processors (cores). In each site, a total of 64 virtual processors are required to support 16 VMs. A minimum of two Dell R910 servers are required in each site to support the virtual processor requirements.
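A quick check of the root server count, using the virtual processor totals above:

```python
import math

VIRTUAL_PROCESSORS_REQUIRED_PER_SITE = 64
LOGICAL_PROCESSORS_PER_R910 = 32   # 4 sockets × 8 cores

root_servers_per_site = math.ceil(VIRTUAL_PROCESSORS_REQUIRED_PER_SITE / LOGICAL_PROCESSORS_PER_R910)
print(root_servers_per_site)       # 2
```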
Determine Virtual Machine Distribution
In previous steps, you determined that each site will have two Hyper-V root servers, each hosting eight VMs. In the next steps, you examine how to distribute the VMs running different Exchange server roles across the two root servers.
When deciding which VMs to host on which root server, your main goal should be to eliminate single points of failure. Don't locate all Client Access and Hub Transport server role VMs on the same root server, and don't locate all Mailbox server role VMs on the same root server, as shown in the following figure.
Virtual machine distribution (incorrect)
The proper distribution is to have an even distribution of Client Access and Hub Transport server role VMs across all of the physical root servers and an even distribution of Mailbox server role VMs across all of the physical root servers. In this solution, there will be two Hyper-V root servers in each site, with each server supporting four Client Access and Hub Transport server role VMs and four Mailbox server role VMs, as shown in the following figure.
Virtual machine distribution (correct)
Determine Memory Required per Root Server
In a previous step, you determined that the memory requirement for a Client Access and Hub Transport server role VM was 8 GB and for a Mailbox server role VM was 40 GB. You then determined that each root server will host four Client Access and Hub Transport server role VMs and four Mailbox server role VMs.
To determine the memory required for each root server, use the following calculation:
Root server memory = Client Access and Hub Transport server role VM memory + Mailbox server role VM memory
= (8 GB × 4 servers) + (40 GB × 4 servers)
= 192 GB
The root server will be provisioned with 192 GB of physical memory.
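The root server memory figure is a straightforward sum; a small sketch for completeness:

```python
CAS_HT_VM_MEMORY_GB = 8
MAILBOX_VM_MEMORY_GB = 40
VMS_PER_ROLE_PER_ROOT_SERVER = 4

root_server_memory_gb = (CAS_HT_VM_MEMORY_GB + MAILBOX_VM_MEMORY_GB) * VMS_PER_ROLE_PER_ROOT_SERVER
print(root_server_memory_gb)   # 192
```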
Identify Failure Domains Impacting Database Copy Layout
Use the following steps to identify failure domains impacting database copy layout.
Step 1: Identify failure domains associated with storage
In a previous step, you decided to deploy two CX4 model 480 storage arrays, one in each of the two sites. From an Exchange perspective, there are four copies of each database, two in the primary datacenter and two in the secondary datacenter for each DAG. From a physical storage perspective, there are two copies of each database, one in each datacenter. EMC Replication Enabler Exchange 2010 manages the single physical database copy in the primary datacenter, which allows two Exchange Mailbox servers to access the single physical copy during various failure or maintenance scenarios.
In this scenario, because there are only two physical copies of each database, each database copy residing on one of the two CX4 model 480 storage arrays represents a failure domain for the purposes of designing the database copy layout for the DAG.
Failure domains associated with storage
Step 2: Identify failure domains associated with servers
In the previous step, you identified that from an Exchange perspective, there are four database copies, two in the primary datacenter and two in the secondary datacenter for each DAG. However, there are only two physical copies of each database. Replication Enabler Exchange 2010 allows two Exchange servers in each site to access the same physical database during a mailbox server failure or maintenance event. For the purposes of database copy layout planning, this scenario is the same as if there were four physical database copies. Each Mailbox server that holds a copy of a database represents a failure domain. From a server perspective, there will be four failure domains, two in the primary datacenter and two in the secondary datacenter for each DAG.
Failure domains associated with servers
Determine Maximum Database Size
In a previous step, it was decided to deploy EMC Replication Manager as the backup solution. This solution includes error checking of database snapshots to ensure that the backup is consistent and ready to be used in a recovery scenario. As a best practice, keep databases at a reasonable size; with 2-terabyte databases, for example, the amount of time required to perform error checking would be excessive.
*Design Decision Point*
For this solution, it's decided to keep databases below 500 GB to allow efficient error checking.
Determine Minimum Number of Databases Required
In a previous step, you determined that the database size shouldn't exceed 500 GB. If you enter a maximum database size of 500 GB into the Exchange 2010 Mailbox Server Role Requirements Calculator, the recommended minimum number of databases is 14. The exact number of databases may be adjusted in future steps to accommodate the database copy layout.
Database configuration
Design Database Copy Layout
You can use the following steps to design a database copy layout.
Step 1: Determine number of unique Exchange databases in the DAG
The easiest way to determine the optimal number of Exchange databases to deploy is to use the Exchange 2010 Mailbox Server Role Requirements Calculator. To download the calculator, see E2010 Mailbox Server Role Requirements Calculator. For additional information about using the storage calculator, see Exchange 2010 Mailbox Server Role Requirements Calculator. Enter the appropriate information on the input worksheet and then select Yes for Automatically Calculate Number of Unique Databases / DAG.
Step 2: Determine number of database copies per Mailbox server
In a previous step, it was determined that the minimum number of unique databases that should be deployed per DAG is 14. In an active/passive configuration with four copies distributed across four servers in the primary site and four servers in the secondary site, a minimum of 16 databases is needed to support an optimal database layout.
Step 3: Determine database layout during normal operating conditions
Consider equally distributing the C1 database copies (or the copies with an activation preference value of 1) to the servers in the primary datacenter for DAG1. These are the copies that will be active during normal operating conditions.
Database copy layout during normal operating conditions
DB | MBX1 | MBX2 | MBX3 | MBX4 |
---|---|---|---|---|
DB1 |
C1 |
|||
DB2 |
C1 |
|||
DB3 |
C1 |
|||
DB4 |
C1 |
|||
DB5 |
C1 |
|||
DB6 |
C1 |
|||
DB7 |
C1 |
|||
DB8 |
C1 |
|||
DB9 |
C1 |
|||
DB10 |
C1 |
|||
DB11 |
C1 |
|||
DB12 |
C1 |
|||
DB13 |
C1 |
|||
DB14 |
C1 |
|||
DB15 |
C1 |
|||
DB16 |
C1 |
In the preceding table, the following applies:
- C1 = active copy (activation preference value of 1) during normal operations
Next, distribute the C2 database copies (or the copies with an activation preference value of 2) to the servers in the second failure domain. During the distribution, you distribute the C2 copies across as many servers in the alternate failure domain as possible to ensure that a single server failure has a minimal impact on the servers in the alternate failure domain.
Database copy layout with C2 database copies distributed
DB | MBX1 | MBX2 | MBX3 | MBX4 |
---|---|---|---|---|
DB1 |
C1 |
C2 |
||
DB2 |
C1 |
C2 |
||
DB3 |
C1 |
C2 |
||
DB4 |
C1 |
C2 |
||
DB5 |
C1 |
C2 |
||
DB6 |
C1 |
C2 |
||
DB7 |
C1 |
C2 |
||
DB8 |
C1 |
C2 |
In the preceding table, the following applies:
C1 = active copy (activation preference value of 1) during normal operations
C2 = passive copy (activation preference value of 2) during normal operations
Consider the opposite configuration for the other failure domain. Again, you distribute the C2 copies across as many servers in the alternate failure domain as possible to ensure that a single server failure has a minimal impact on the servers in the alternate failure domain.
Database copy layout with C2 database copies distributed in the opposite configuration
DB | MBX1 | MBX2 | MBX3 | MBX4 |
---|---|---|---|---|
DB9 |
C2 |
C1 |
||
DB10 |
C2 |
C1 |
||
DB11 |
C2 |
C1 |
||
DB12 |
C2 |
C1 |
||
DB13 |
C2 |
C1 |
||
DB14 |
C2 |
C1 |
||
DB15 |
C2 |
C1 |
||
DB16 |
C2 |
C1 |
In the preceding table, the following applies:
C1 = active copy (activation preference value of 1) during normal operations
C2 = passive copy (activation preference value of 2) during normal operations
Step 4: Determine database layout during server failure and maintenance conditions
Before the secondary datacenter is added and the C3 copies are distributed, examine the following server failure scenario. In the following example, if server MBX1 fails, the active database copies will automatically move to servers MBX3 and MBX4. Notice that each of the servers in the alternate failure domain is now running with six active databases and the active databases are equally distributed across the two servers.
Database copy layout during server maintenance or failure
In the preceding table, the following applies:
C1 = active copy (activation preference value of 1) during normal operations
C2 = passive copy (activation preference value of 2) during normal operations
In a maintenance scenario, you could move the active mailbox databases from the servers in the first failure domain (MBX1, MBX2) to the servers in the second failure domain (MBX3, MBX4), complete maintenance activities, and then move the active database copies back to the C1 copies on the servers in the first failure domain. You can conduct maintenance activities on all servers in the primary datacenter in two passes.
Database copy layout during server maintenance
In the preceding table, the following applies:
C1 = active copy (activation preference value of 1) during normal operations
C2 = passive copy (activation preference value of 2) during normal operations
Step 5: Add database copies to secondary datacenter to support site resiliency
The next step is to add the C3 copies (or copies with an activation preference value of 3) to the servers in the secondary datacenter to provide site resiliency. You want to distribute the C3 copies across servers in both failure domains to ensure that any issues impacting multiple Mailbox servers in the primary datacenter have a minimal impact on the servers in the secondary datacenter. In a full site failure scenario, all C3 copies in the secondary datacenter will be activated, so the distribution of database copies in relation to servers in the primary datacenter is less important.
Database copy layout to support site resiliency with C3 copies added
DB | MBX1 | MBX2 | MBX3 | MBX4 | MBX5 | MBX6 | MBX7 | MBX8 | |
---|---|---|---|---|---|---|---|---|---|
DB1 |
C1 |
C2 |
C3 |
||||||
DB2 |
C1 |
C2 |
C3 |
||||||
DB3 |
C1 |
C2 |
C3 |
||||||
DB4 |
C1 |
C2 |
C3 |
||||||
DB5 |
C1 |
C2 |
C3 |
||||||
DB6 |
C1 |
C2 |
C3 |
||||||
DB7 |
C1 |
C2 |
C3 |
||||||
DB8 |
C1 |
C2 |
C3 |
||||||
DB9 |
C2 |
C1 |
C3 |
||||||
DB10 |
C2 |
C1 |
C3 |
||||||
DB11 |
C2 |
C1 |
C3 |
||||||
DB12 |
C2 |
C1 |
C3 |
||||||
DB13 |
C2 |
C1 |
C3 |
||||||
DB14 |
C2 |
C1 |
C3 |
||||||
DB15 |
C2 |
C1 |
C3 |
||||||
DB16 |
C2 |
C1 |
C3 |
In the preceding table, the following applies:
C1 = active copy (activation preference value of 1) during normal operations
C2 = passive copy (activation preference value of 2) during normal operations
C3 = remote passive copy (activation preference value of 3) during normal operations
The last step in the database copy layout is to add the C4 copies (or copies with an activation preference value of 4) to the servers in the secondary datacenter to provide server resiliency during a site failure event. Distribute the C4 copies across servers in the alternate failure domains to ensure that any issues impacting Mailbox servers in the secondary datacenter have a minimal impact on the servers in the alternate failure domain in the secondary datacenter.
Database copy layout to support site resiliency with C3 and C4 copies added
DB | MBX1 | MBX2 | MBX3 | MBX4 | MBX5 | MBX6 | MBX7 | MBX8 | |
---|---|---|---|---|---|---|---|---|---|
DB1 |
C1 |
C2 |
C3 |
C4 |
|||||
DB2 |
C1 |
C2 |
C3 |
C4 |
|||||
DB3 |
C1 |
C2 |
C4 |
C3 |
|||||
DB4 |
C1 |
C2 |
C4 |
C3 |
|||||
DB5 |
C1 |
C2 |
C3 |
C4 |
|||||
DB6 |
C1 |
C2 |
C3 |
C4 |
|||||
DB7 |
C1 |
C2 |
C4 |
C3 |
|||||
DB8 |
C1 |
C2 |
C4 |
C3 |
|||||
DB9 |
C2 |
C1 |
C3 |
C4 |
|||||
DB10 |
C2 |
C1 |
C3 |
C4 |
|||||
DB11 |
C2 |
C1 |
C4 |
C3 |
|||||
DB12 |
C2 |
C1 |
C4 |
C3 |
|||||
DB13 |
C2 |
C1 |
C3 |
C4 |
|||||
DB14 |
C2 |
C1 |
C3 |
C4 |
|||||
DB15 |
C2 |
C1 |
C4 |
C3 |
|||||
DB16 |
C2 |
C1 |
C4 |
C3 |
In the preceding table, the following applies:
C1 = active copy (activation preference value of 1) during normal operations
C2 = passive copy (activation preference value of 2) during normal operations
C3 = remote passive copy (activation preference value of 3) during normal operations
C4 = remote passive copy (activation preference value of 4) during normal operations
During a site failure event, the active database distribution should be equally distributed across the servers in the secondary datacenter.
Database copy layout to support site resiliency with equal active database distribution
In the preceding table, the following applies:
C1 = active copy (activation preference value of 1) during normal operations
C2 = passive copy (activation preference value of 2) during normal operations
C3 = remote passive copy (activation preference value of 3) during normal operations
C4 = remote passive copy (activation preference value of 4) during normal operations
During a site failure event and a multiple server failure or maintenance event, the active database distribution should be equally distributed across the servers in the alternate failure domain in the secondary datacenter.
Database copy layout to support site resiliency with equal active database distribution in alternate failure domain
In the preceding table, the following applies:
C1 = active copy (activation preference value of 1) during normal operations
C2 = passive copy (activation preference value of 2) during normal operations
C3 = remote passive copy (activation preference value of 3) during normal operations
C4 = remote passive copy (activation preference value of 4) during normal operations
Determine Storage Design
A well-designed storage solution is a critical aspect of a successful Exchange 2010 Mailbox server role deployment. For more information, see Mailbox Server Storage Design.
Step 1: Summarize storage requirements
The following table summarizes the storage requirements that have been calculated or determined in a previous design step.
Summary of disk space requirements
Disk space requirements | Value
---|---
Average mailbox size on disk (MB) | 623
Database capacity required (GB) | 14602
Log capacity required (GB) | 1855
Total capacity required (GB) | 16457
Total capacity required for two database copies (GB) | 32913
Total capacity required (terabytes) | 32
Step 2: Determine whether logs and databases will be co-located on the same LUN
In a previous step, it was decided to implement a VSS-based backup solution using EMC Replication Manager and EMC SnapView. To achieve the best restore granularity, EMC recommends you place the database and its corresponding logs in separate LUNs. With VSS-type backup solutions, this shortens the restore window and provides better performance.
*Design Decision Point*
Following an EMC best practice, databases and logs will be on different LUNs.
Step 3: Determine number of LUNs per Mailbox server
In a previous step, it was identified that each primary Mailbox server would support four active databases, and that each database would have separate LUNs for the database files and the transaction logs. In addition, a single LUN will be provisioned for the .vhd file (the file supporting the Hyper-V VM). There will be a total of nine LUNs for each Mailbox server.
Number of LUNs required per Mailbox server
LUN types | LUNs per server
---|---
Active database LUNs | 4
Active log LUNs | 4
.vhd file LUNs | 1
Total LUNs | 9
Step 4: Define building block
To simplify the remainder of the storage design steps, use a building block approach. In this solution, each database supports 625 active mailboxes. Each Mailbox server supports four databases or 2,500 active mailboxes on four database LUNs and four log LUNs. An eight-LUN building block will be used, which supports increments of 2,500 mailboxes.
Step 5: Determine IOPS requirements of the building block
In this step, calculate the transactional IOPS required to support the 2,500 active mailbox users in the building block. The maximum target concurrency is 70 percent. However, calculate IOPS for 100 percent of 2,500 mail users to ensure that IOPS capacity isn't under provisioned. In a subsequent step, you will use the IOPS requirements to determine the number of spindles to deploy for the building block. Use the following calculation:
Total transactional IOPS required for the building block = IOPS per mailbox user × number of mailboxes × (1 + I/O overhead factor)
= 0.15 × 2500 × (1 + 0.20)
= 450 IOPS per Exchange VM
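A short sketch of the building block IOPS figure, assuming the 0.15 IOPS per mailbox estimate and 20 percent I/O overhead factor used above:

```python
IOPS_PER_MAILBOX = 0.15
MAILBOXES_PER_BUILDING_BLOCK = 2500
IO_OVERHEAD_FACTOR = 0.20

required_iops = IOPS_PER_MAILBOX * MAILBOXES_PER_BUILDING_BLOCK * (1 + IO_OVERHEAD_FACTOR)
print(required_iops)   # 450.0
```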
Step 6: Determine initial database capacity requirements of building block
In this step, determine the database storage capacity required to support the 2,500 active mailbox users in the building block. Even though maximum concurrency is 70 percent, you need to account for capacity of 100 percent of the 2,500 mail users. In a subsequent step, you will use the capacity requirements to determine the number of spindles to deploy for the building block. Use the following calculations:
Database files capacity = (number of mailboxes × mailbox size on disk × database overhead growth factor) × (1 + 20% data overhead)
= (2500 × 623 × 1) × (1.2)
= 1869000 MB
= 1825 GB
Database catalog capacity = 10% of database files capacity
= 183 GB
Total database capacity = [(database size) + (index size)] ÷ 0.80 to provide 20% volume free space
= (1825 + 183) ÷ 0.8
= 2510 GB
The four databases in the building block require 2,510 GB of storage capacity.
Step 7: Determine initial log capacity requirements of building block
In this step, determine the log storage capacity required to support the 2,500 active mailbox users in the building block. Even though maximum concurrency is 70 percent, you need to account for capacity of 100 percent of the 2,500 mail users. In a subsequent step, you will use the capacity requirements to determine the number of spindles to deploy for the building block. Use the following calculations:
Building block log capacity required = (number of mailbox users × number of logs per mailbox per day × log size × number of days required to replace failed infrastructure) + (mailbox move percent overhead)
= (2500 × 30 × 1 MB × 3) + (2500 × 0.01 × 500 MB)
= 237500 MB
= 232 GB
Total log capacity = log capacity ÷ 0.80 to give 20% volume free space
= 232 ÷ 0.80
= 290 GB
The four sets of logs in the building block require 290 GB of storage capacity.
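The capacity math for the building block can be checked with the following Python sketch; the constants restate the assumptions from steps 6 and 7 (623 MB mailbox on disk, 20 percent database overhead, 10 percent content index, 20 percent free space, 30 one-megabyte logs per mailbox per day, three days of logs, and 1 percent of mailboxes moved at 500 MB each).

```python
MAILBOXES = 2500
MAILBOX_SIZE_MB = 623
DATA_OVERHEAD = 0.20        # database overhead growth factor
CATALOG_FRACTION = 0.10     # content index, ~10% of database size
FREE_SPACE = 0.20           # keep 20% free space on each volume

LOGS_PER_MAILBOX_PER_DAY = 30
LOG_SIZE_MB = 1
REPLAY_DAYS = 3             # days of logs retained to replace failed infrastructure
MOVE_OVERHEAD_MB = MAILBOXES * 0.01 * 500   # 1% of mailboxes moved at 500 MB each

database_gb = MAILBOXES * MAILBOX_SIZE_MB * (1 + DATA_OVERHEAD) / 1024
catalog_gb = database_gb * CATALOG_FRACTION
total_database_gb = (database_gb + catalog_gb) / (1 - FREE_SPACE)    # ~2,510 GB

log_mb = MAILBOXES * LOGS_PER_MAILBOX_PER_DAY * LOG_SIZE_MB * REPLAY_DAYS + MOVE_OVERHEAD_MB
total_log_gb = (log_mb / 1024) / (1 - FREE_SPACE)                    # ~290 GB
print(round(total_database_gb), round(total_log_gb))
```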
Step 8: Determine number of spindles required to support IOPS requirements of the building block
In this step, determine the number of spindles required to support the IOPS requirements. In the next step, you will determine the spindle count that meets the capacity requirements.
In a previous step, you determined that the IOPS required to support the 2,500 mailbox building block was 450. In this step, calculate the number of disks required to meet the IOPS requirements. Use the following calculation:
Disk count = [(user IOPS × read ratio) + write penalty × (user IOPS × write ratio)] ÷ IOPS capability of disk type chosen
= [(450 × 0.6) + 4 × (450 × 0.4)] ÷ 155
= 6.4
The IOPS requirements can be met by seven disks in a RAID-5 configuration.
Note
These calculations are specific to this EMC solution. You should consult your storage vendor for guidance on spindle requirements for your chosen storage solution.
Step 9: Determine number of spindles required to support the capacity requirements of the building block
In a previous step, you determined that the 2,500 mailbox building block for an initially provisioned mailbox of 500 MB required a storage capacity of 2,510 GB for database LUNs and 290 GB for log LUNs. A 450-GB spindle on the CX4 model 480 has a formatted capacity of approximately 402 GB; after RAID-5 parity overhead, the usable capacity per spindle is approximately 327 GB. To determine the number of spindles required, use the following calculation:
Disk count = (total capacity required) ÷ (useable capacity per spindle with RAID-5)
= (2510 GB + 290 GB) ÷ 327 GB
= 8.6
The database capacity requirements can be met with nine disks.
Step 10: Determine number of spindles per building block
In previous steps, you determined that the 2,500 mailbox users in the building block required seven spindles to support the IOPS requirements and nine spindles to support the capacity requirements. The spindle count is therefore driven by the capacity requirements. To make planning and layout easier, use 10 spindles per building block.
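The spindle calculations in steps 8 through 10 follow the pattern in this Python sketch; the disk IOPS rating and usable RAID-5 capacity are specific to this EMC configuration and should be confirmed with your storage vendor.

```python
import math

REQUIRED_IOPS = 450
READ_RATIO, WRITE_RATIO = 0.6, 0.4
RAID5_WRITE_PENALTY = 4
DISK_IOPS = 155                      # assumed IOPS capability of the chosen disk type
USABLE_GB_PER_SPINDLE = 327          # usable RAID-5 capacity per 450 GB spindle used above
CAPACITY_REQUIRED_GB = 2510 + 290    # database plus log capacity for the building block

iops_spindles = math.ceil(
    (REQUIRED_IOPS * READ_RATIO + RAID5_WRITE_PENALTY * REQUIRED_IOPS * WRITE_RATIO) / DISK_IOPS
)                                                                              # 7
capacity_spindles = math.ceil(CAPACITY_REQUIRED_GB / USABLE_GB_PER_SPINDLE)    # 9

# Capacity drives the count here; the design rounds up to 10 spindles per building block.
print(iops_spindles, capacity_spindles)
```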
Step 11: Determine number of spindles per array
Each site has a single array supporting storage requirements for eight Mailbox servers. The storage requirements of each Mailbox server are represented by a building block. In a previous step, you determined that each building block requires 10 spindles. Each array will require 80 spindles.
Determine Placement of the File Share Witness
In Exchange 2010, the DAG uses a minimal set of components from Windows failover clustering. One of those components is the quorum resource, which provides a means for arbitration when determining cluster state and making membership decisions. It's critical that each DAG member have a consistent view of how the DAG's underlying cluster is configured. The quorum acts as the definitive repository for all configuration information relating to the cluster. The quorum is also used as a tiebreaker to avoid split brain syndrome. Split brain syndrome is a condition that occurs when DAG members can't communicate with each other but are available and running. Split brain syndrome is prevented by always requiring a majority of the DAG members (and in the case of DAGs with an even number of members, the DAG witness server) to be available and interacting for the DAG to be operational.
A witness server is a server outside of a DAG that hosts the file share witness, which is used to achieve and maintain quorum when the DAG has an even number of members. DAGs with an odd number of members don't use a witness server. Upon creation of a DAG, the file share witness is added by default to a Hub Transport server (that doesn't have the Mailbox server role installed) in the same site as the first member of the DAG. If your Hub Transport server is running in a VM that resides on the same root server as VMs running the Mailbox server role, we recommend that you move the location of the file share witness to another highly available server. If the witness server and a DAG member are both guest VMs on the same root machine, you no longer have a highly available solution. You can move the file share witness to a domain controller, but because of security implications, do this only as a last resort.
*Design Decision Point*
This solution has dedicated servers in each site running the EMC Replication Manager software. For this solution, these servers are a good location for the file share witnesses. In site 1, the file share witness for DAG1 and the alternate file share witness for DAG2 will be hosted on server RM1. In site 2, the file share witness for DAG2 and the alternate file share witness for DAG1 will be hosted on server RM2.
Plan Namespaces
When you plan your Exchange 2010 organization, one of the most important decisions that you must make is how to arrange your organization's external namespace. A namespace is a logical structure usually represented by a domain name in Domain Name System (DNS). When you define your namespace, you must consider the different locations of your clients and the servers that house their mailboxes. In addition to the physical locations of clients, you must evaluate how they connect to Exchange 2010. The answers to these questions will determine how many namespaces you must have. Your namespaces will typically align with your DNS configuration. We recommend that each Active Directory site in a region that has one or more Internet-facing Client Access servers have a unique namespace. This is usually represented in DNS by an A record, for example, mail.contoso.com or mail.europe.contoso.com.
For more information, see Understanding Client Access Server Namespaces.
There are a number of different ways to arrange your external namespaces, but usually your requirements can be met with one of the following namespace models:
Consolidated datacenter model This model consists of a single physical site. All servers are located within the site, and there is a single namespace, for example, mail.contoso.com.
Single namespace with proxy sites This model consists of multiple physical sites. Only one site contains an Internet-facing Client Access server. The other sites aren't exposed to the Internet. There is only one namespace for the sites in this model, for example, mail.contoso.com.
Single namespace and multiple sites This model consists of multiple physical sites. Each site can have an Internet-facing Client Access server. Alternatively, there may be only a single site that contains Internet-facing Client Access servers. There is only one namespace for the sites in this model, for example, mail.contoso.com.
Regional namespaces This model consists of multiple physical sites and multiple namespaces. For example, a site located in New York City would have the namespace mail.usa.contoso.com, a site located in Toronto would have the namespace mail.canada.contoso.com, and a site located in London would have the namespace mail.europe.contoso.com.
Multiple forests This model consists of multiple forests that have multiple namespaces. An organization that uses this model could be made up of two partner companies, for example, Contoso and Fabrikam. Namespaces might include mail.usa.contoso.com, mail.europe.contoso.com, mail.asia.fabrikam.com, and mail.europe.fabrikam.com.
*Design Decision Point*
For this scenario, the regional namespaces model is selected because it's the best fit for organizations with active mailboxes in multiple sites.
The advantage of this model is that proxying is reduced because a larger percentage of users will be able to connect to a Client Access server in the same Active Directory site as their Mailbox server. This will improve the end-user experience and performance. Users who have mailboxes in a site that doesn't have an Internet-facing Client Access server will still be proxied.
This solution also has the following configuration requirements:
Multiple DNS records must be managed.
Multiple certificates must be obtained, configured, and managed.
Managing security is more complex because each Internet-facing site requires a Microsoft Forefront Threat Management Gateway computer or other reverse-proxy or firewall solution.
Users must connect to their own regional namespace. This may result in additional Help desk calls and training.
Determine Client Access Server Array and Load Balancing Strategy
In Exchange 2010, the RPC Client Access service and the Exchange Address Book service were introduced on the Client Access server role to improve the mailbox user's experience when the active mailbox database copy is moved to another Mailbox server (for example, during mailbox database failures and maintenance events). The connection endpoints for mailbox access from Microsoft Outlook and other MAPI clients have been moved from the Mailbox server role to the Client Access server role. Therefore, both internal and external Outlook connections must now be load balanced across all Client Access servers in the site to achieve fault tolerance. To associate the MAPI endpoint with a group of Client Access servers rather than a specific Client Access server, you can define a Client Access server array. You can only configure one array per Active Directory site, and an array can't span more than one Active Directory site. For more information, see Understanding RPC Client Access and Understanding Load Balancing in Exchange 2010.
*Design Decision Point*
Because this is a two site deployment with eight servers running the Client Access server role in each site, there will be two Client Access server arrays. Because the limits of Windows NLB may be reached with eight servers to load balance in each site, hardware load balancing options will be considered.
Determine Hardware Load Balancing Model
Use the following steps to determine a hardware load balancing model.
Step 1: Identify preferred hardware load balancing vendor
In this example, the preferred vendor is Brocade because they have been providing reliable, high performance solutions for the datacenter for the past 15 years. Brocade ServerIron intelligent application delivery and traffic management solutions have led the industry for over a decade, helping to mitigate costs and prevent losses by optimizing business-critical enterprise and service provider applications with high availability, security, multisite redundancy, acceleration, and scalability, in more than 3,000 of the world's most demanding organizations.
Step 2: Review available options from preferred vendor
Brocade offers the ServerIron ADX Series of load balancers and application delivery controllers. They support Layer 4 through 7 switching with industry leading performance. The ServerIron ADX Series comes in an intelligent, modular application delivery controller platform and is offered in various configurations.
The ServerIron ADX comes in three primary platforms: 1000, 4000, and 10000. For most Exchange environments, a version of the ADX 1000 should suffice. For organizations that need to load balance several applications in the datacenter or when large scale chassis reconfiguration or expansion is required, the unique design of the ServerIron ADX 4000 and 10000 provides a dedicated backplane to support application, data, and management functionality through specialized modules.
The ServerIron ADX 1000 offers an incremental deployment pricing model (known as pay as you grow) so that you can scale the capacity of the ServerIron ADX Series as your organization grows.
The ServerIron ADX 1000 Series includes four models of varying processor and port capacity, all based on the full hardware platform and operating software:
ADX 1008-1 1 application core and 8 x 1 gigabit Ethernet (GbE) ports
ADX 1016-2 2 application cores and 16 x 1 GbE ports
ADX 1016-4 4 application cores and 16 x 1 GbE ports
ADX 1216-4 4 application cores, 16 x 1 GbE ports, and 2 x 10 GbE ports
Depending on the model selected, a specific number of application cores, interface ports, hardware acceleration, and software capabilities are enabled. The remaining untapped capacity can be unlocked by applying license upgrade key codes.
For more information about the ADX 1000 and other ADC platforms from Brocade, see Brocade ServerIron ADX Series.
Step 3: Select a hardware load balancing model
The entry-level ServerIron ADX 1008-1 model with a single application core and 8 x 1 GbE ports is selected. This model will meet the current load balancing requirements and provide flexibility to add additional capacity and features to meet future business needs.
Determine Hardware Load Balancing Device Resiliency Strategy
The ServerIron ADX 1000 can be configured in one of the following configurations:
Stand-alone device
Active/hot-standby
Active/active
The recommended device resiliency strategy is active/hot-standby. In a typical hot-standby configuration, one ServerIron application delivery controller is the active device and performs all the Layer 2 switching as well as the Layer 4 server load balancing switching. The other ServerIron application delivery controller monitors the switching activities and remains in a hot-standby role. If the active ServerIron application delivery controller becomes unavailable, the standby ServerIron application delivery controller immediately assumes the unavailable ServerIron application delivery controller's responsibilities. The failover from the unavailable ServerIron application delivery controller to the standby ServerIron application delivery controller is transparent to users. Both ServerIron application delivery controller switches share a common MAC address known to the clients. If a failover occurs, the clients still know the ServerIron application delivery controller by the same MAC address. The active sessions running on the clients continue, and the clients and routers don't need an Address Resolution Protocol (ARP) request for the ServerIron MAC address.
*Design Decision Point*
The load balancer device shouldn't be a single point of failure, and there is no reason to deploy an active/active configuration. The load balancer resiliency strategy will include two ServerIron ADX 1000 load balancers in an active/hot-standby configuration.
Determine Hardware Load Balancing Methods
Exchange protocols and client access services have different load balancing requirements. Some Exchange protocols and client access services require client to Client Access server affinity. Others work without it, but display performance improvements from such affinity. Other Exchange protocols don't require client to Client Access server affinity, and performance doesn't decrease without affinity. For additional information, see Load Balancing Requirements of Exchange Protocols and Understanding Load Balancing in Exchange 2010.
Step 1: Determine load balancing method for Outlook clients
The recommended load balancing method for Outlook client traffic is Source IP Port Persistence. In this method, the load balancer looks at a client IP address and sends all traffic from a certain source/client IP to a given Client Access server. The source IP method has two limitations:
Whenever the IP address of the client changes, the affinity is lost. However, the user impact is acceptable as long as this occurs infrequently.
Having a large number of clients from the same IP address leads to uneven distribution. Distribution of traffic among the Client Access servers then depends on how many clients are arriving from a specific IP address. Clients may arrive from the same IP address because of the following:
Network address translation (NAT) or outgoing proxy servers (for example, Microsoft Forefront Threat Management Gateway) In this case, the original client IP addresses are masked by NAT or outgoing proxy server IP addresses.
Client Access server to Client Access server proxy traffic One Client Access server can proxy traffic to another Client Access server. This typically occurs between Active Directory sites, because most Exchange 2010 traffic needs to be handled by either a Client Access server in the same Active Directory site as the mailbox being accessed or a Client Access server with the same major version as the mailbox being accessed. In a single site configuration, this isn't an issue.
Step 2: Determine whether static ports will be used
When an Outlook client connects directly to the Client Access server using the RPC Client Access service and the Exchange Address Book service, the endpoint TCP ports for these services are allocated by the RPC endpoint manager. By default, this requires a large range of destination ports to be configured for load balancing without the ability to specifically target traffic for these services based on a port number. You can statically map these services to specific port numbers to simplify load balancing (and perhaps make it easier to enforce restrictions on network traffic via firewall applications or devices). If the ports for these services are statically mapped, the traffic will be restricted to port 135 (used by the RPC port mapper) and the two specific ports that were selected for these services.
*Design Decision Point*
It was decided to implement static port mapping for the RPC Client Access service and the Exchange Address Book service. These ports will be set to 60000 and 60001 respectively.
For information about how to configure static port mapping, see Load Balancing Requirements of Exchange Protocols.
Solution Overview
The previous section provided information about the design decisions that were made when considering an Exchange 2010 solution. The following section provides an overview of the solution.
Logical Solution Diagram
This solution consists of 32 Exchange 2010 servers deployed in a multiple site topology. Sixteen of the 32 servers are running both the Client Access and Hub Transport server roles. The other 16 servers are running the Mailbox server role. There are two namespaces, one for each site, which are load balanced across eight combination Client Access and Hub Transport servers in a Client Access server array in each site. Eight of the 16 Mailbox servers are members of one DAG, and the other eight servers are members of a second DAG. Each DAG spans both sites, with half of the Mailbox servers in the primary datacenter and the remaining servers in the secondary datacenter. The EMC Replication Manager servers in each site serve as file share witness servers, and host the primary and alternate file share witness servers for both DAGs.
Logical solution
Physical Solution Diagram
The solution consists of four Dell PowerEdge R910 servers connected to two EMC CLARiiON CX4 model 480 storage arrays via six redundant Brocade 300 Fibre Channel switches. Redundant pairs of Brocade ServerIron ADX 1000 series devices load balance client traffic across the Client Access server arrays in both sites. Redundant Brocade FastIron Ethernet switches and NetIron MLX routers provide the underlying network infrastructure.
Physical solution
Server Hardware Summary
The following table summarizes the physical server hardware used in this solution.
Server hardware
Component | Value or description
---|---
Server vendor | Dell
Server model | PowerEdge R910
Processor | 4 × eight-core Intel Xeon X7560 2.26 GHz
Chipset | Intel 7500 chipset
Memory | 192 GB
Operating system | Windows Server 2008 R2
Virtualization | Hyper-V
Internal disk | 2 × 2.5" 10,000 rpm SAS 300 GB
RAID controller | PERC H700 (6 gigabits per second (Gbps))
Host bus adapter | Brocade 825 dual-port 8 Gbps
Network interface | 4-port (4 × 1 gigabit Ethernet) embedded network interface card (NIC) Broadcom 5709c
Power | 4 × 750 watts (Energy Smart PSU)
Client Access and Hub Transport Server Configuration
The following table summarizes the Client Access and Hub Transport server configuration used in this solution.
Client Access and Hub Transport server configuration
Component | Value or description
---|---
Physical or virtual | Hyper-V VM
Virtual processors | 4
Memory | 8 GB
Storage | Virtual hard drive on SAN
Operating system | Windows Server 2008 R2
Exchange version | Exchange 2010 Update Rollup 3
Mailbox Server Configuration
The following table summarizes the Mailbox server configuration used in this solution.
Mailbox server configuration
Component | Value or description
---|---
Physical or virtual | Hyper-V VM
Virtual processors | 4
Memory | 40 GB
Storage | Virtual hard drive on SAN
Pass-through storage | Yes
Operating system | Windows Server 2008 R2
Exchange version | Exchange 2010 Update Rollup 3
Database Layout
The following diagrams summarize the database copy layout used in this solution during normal operating conditions.
Database layout of DAG1 for normal operating conditions
Database layout of DAG2 for normal operating conditions
EMC Replication Enabler Exchange 2010
When deploying a DAG with EMC Replication Enabler Exchange 2010, the function of active and passive copies differs from their function with native DAG replication. With Replication Enabler Exchange 2010, there's one primary shared storage (source images) system per Exchange database. This primary storage synchronously replicates to the secondary storage (target images) on the remote site. Source images are shared between multiple Mailbox servers within the same site, and the target images are shared between the Mailbox servers at the remote site. These shared images work similarly to a single copy cluster within the same site. When a switchover/failover is initiated, a best effort is made to move the database to one of the Mailbox servers present at the local site. An attempt to move the databases to the remote Mailbox servers happens only when Replication Enabler Exchange 2010 is unable to move the database to the servers within the local site. If there are multiple Mailbox servers within the same site, the activation preference setting determines which Mailbox server (within the site) is attempted first.
A DAG in third-party replication mode with Replication Enabler Exchange 2010 accommodates this site resiliency requirement by enabling remote images (copies) on the secondary storage at the remote site. This process is similar to performing a site switchover with a native DAG and can be accomplished by using integrated Windows PowerShell cmdlets that are available after the Replication Enabler Exchange 2010 installation.
EMC Replication Enabler Exchange 2010
Network Switch Hardware Summary
The following table summarizes the network switch hardware used in this solution.
Network switch hardware summary
Component | Value or description
---|---
Vendor | Brocade
Model | FastIron GS 624-P
Power over Ethernet (PoE) ports | 24
Port bandwidth | 10/100/1000 Mbps RJ45
10-gigabit Ethernet ports | 2
Load Balancing Hardware Summary
The following table summarizes the load balancing hardware used in this solution.
Load balancing hardware summary
Component | Value or description
---|---
Vendor | Brocade
Model | ADX 1000
Licensing option | ADX 1008-1
Application cores | 1
GbE ports | 8
For more information about this and other ADC platforms from Brocade, see Brocade ServerIron ADX Series.
Storage Hardware Summary
The following table provides information about storage hardware used in this solution.
Storage hardware summary
Item | Value or description
---|---
Storage | 2 CLARiiON CX4 model 480 (1 per site)
Storage connectivity (Fibre Channel, SAS, SATA, iSCSI) | Fibre Channel
Storage cache | 32 GB (600 MB read cache and 10,160 MB write cache per storage processor)
Storage controllers | 2 per storage frame
Storage ports available or used | 8 (4 per storage processor) available per storage frame, 4 used (2 per storage processor)
Maximum bandwidth of storage connectivity to host | 8 × 4 Gbps (4 used in this solution)
Total number of disks tested in solution | 80 disks per storage array (160 for both sites)
Maximum number of spindles that can be hosted in the storage | 480 in a single storage array
Storage Configuration
Each of the CX4 model 480 storage arrays used in the solution was configured as illustrated in the following table.
Storage configuration
Component | Value or description
---|---
Storage enclosures | 2
Volumes per enclosure | 72
Volumes per Mailbox server | 9
Volume size (database) | 590 GB
Volume size (log) | 80 GB
RAID level | RAID-5
Storage pools | 16
Storage pools per enclosure | 8
Disks per storage pool | 10
The following table illustrates how the available storage was designed and allocated between the two CX4 model 480 storage systems.
DAG1 storage configuration between CX4 model 480 storage systems
Database | Array1 | Database | Array2
---|---|---|---
DB1 | C1/C2 | DB1 | C3/C4
DB2 | C1/C2 | DB2 | C3/C4
DB3 | C1/C2 | DB3 | C3/C4
DB4 | C1/C2 | DB4 | C3/C4
DB5 | C1/C2 | DB5 | C3/C4
DB6 | C1/C2 | DB6 | C3/C4
DB7 | C1/C2 | DB7 | C3/C4
DB8 | C1/C2 | DB8 | C3/C4
DB9 | C1/C2 | DB9 | C3/C4
DB10 | C1/C2 | DB10 | C3/C4
DB11 | C1/C2 | DB11 | C3/C4
DB12 | C1/C2 | DB12 | C3/C4
DB13 | C1/C2 | DB13 | C3/C4
DB14 | C1/C2 | DB14 | C3/C4
DB15 | C1/C2 | DB15 | C3/C4
DB16 | C1/C2 | DB16 | C3/C4
DAG2 storage configuration between CX4 model 480 storage systems
Database | Array1 | Database | Array2
---|---|---|---
DB17 | C3/C4 | DB17 | C1/C2
DB18 | C3/C4 | DB18 | C1/C2
DB19 | C3/C4 | DB19 | C1/C2
DB20 | C3/C4 | DB20 | C1/C2
DB21 | C3/C4 | DB21 | C1/C2
DB22 | C3/C4 | DB22 | C1/C2
DB23 | C3/C4 | DB23 | C1/C2
DB24 | C3/C4 | DB24 | C1/C2
DB25 | C3/C4 | DB25 | C1/C2
DB26 | C3/C4 | DB26 | C1/C2
DB27 | C3/C4 | DB27 | C1/C2
DB28 | C3/C4 | DB28 | C1/C2
DB29 | C3/C4 | DB29 | C1/C2
DB30 | C3/C4 | DB30 | C1/C2
DB31 | C3/C4 | DB31 | C1/C2
DB32 | C3/C4 | DB32 | C1/C2
Fibre Channel Switch Hardware Summary
The following table summarizes the Fibre Channel switch hardware used in this solution.
Fibre Channel switch hardware summary
Vendor | Brocade
---|---
Model | 300 SAN
Ports | 24
Port bandwidth | 8 Gbps
Aggregate bandwidth | 192 Gbps
For more information about the Brocade 300 SAN switch or other Brocade SAN switches, see Brocade Switches.
The following table summarizes the host bus adapter (HBA) hardware used in this solution.
HBA hardware summary
Vendor | Brocade
---|---
Model | 825 8G Fibre Channel HBA
Ports | Dual
Port bandwidth | 8 Gbps (1,600 MBps full duplex)
Solution Validation Methodology
Prior to deploying an Exchange solution in a production environment, validate that the solution was designed, sized, and configured properly. This validation must include functional testing to ensure that the system is operating as desired as well as performance testing to ensure that the system can handle the desired user load. This section describes the approach and test methodology used to validate server and storage design for this solution. In particular, the following tests will be defined in detail:
Performance tests
Storage performance validation (Jetstress)
Server performance validation (Loadgen)
Functional tests
Database switchover validation
Server switchover validation
Server failover validation
Datacenter switchover validation
Storage Design Validation Methodology
The level of performance and reliability of the storage subsystem connected to the Exchange Mailbox server role has a significant impact on the overall health of the Exchange deployment. Additionally, poor storage performance will result in high transaction latency, primarily reflected in poor client experience when accessing the Exchange system. To ensure the best possible client experience, validate storage sizing and configuration via the method described in this section.
Tool Set
For validating Exchange storage sizing and configuration, we recommend the Microsoft Exchange Server Jetstress tool. The Jetstress tool is designed to simulate an Exchange I/O workload at the database level by interacting directly with the ESE, which is also known as Jet. The ESE is the database technology that Exchange uses to store messaging data on the Mailbox server role. Jetstress can be configured to test the maximum I/O throughput available to your storage subsystem within the required performance constraints of Exchange. Or, Jetstress can accept a target profile of user count and per-user IOPS, and validate that the storage subsystem is capable of maintaining an acceptable level of performance with the target profile. Test duration is adjustable and can be run for a minimal period of time to validate adequate performance or for an extended period of time to additionally validate storage subsystem reliability.
The Jetstress tool can be obtained from the Microsoft Download Center.
The documentation included with the Jetstress installer describes how to configure and execute a Jetstress validation test on your server hardware.
Approach to Storage Validation
There are two main types of storage configurations:
Direct-attached storage (DAS) or internal disk scenarios
Storage area network (SAN) scenarios
With DAS or internal disk scenarios, there's only one server accessing the disk subsystem, so the performance capabilities of the storage subsystem can be validated in isolation.
In SAN scenarios, the storage utilized by the solution may be shared by many servers and the infrastructure that connects the servers to the storage may also be a shared dependency. This requires additional testing, as the impact of other servers on the shared infrastructure must be adequately simulated to validate performance and functionality.
Test Cases for Storage Validation
The following storage validation test cases were executed against the solution and should be considered as a starting point for storage validation. Specific deployments may have other validation requirements that can be met with additional testing, so this list isn't intended to be exhaustive:
Validation of worst case database switchover scenario In this test case, the storage subsystem is driven with the level of I/O expected in a worst case switchover scenario (largest possible number of active copies on the fewest servers). Depending on whether the storage subsystem is DAS or SAN, this test may need to run on multiple hosts to ensure that the end-to-end solution load on the storage subsystem can be sustained.
Validation of storage performance under storage failure and recovery scenario (for example, failed disk replacement and rebuild) In this test case, the performance of the storage subsystem during a failure and rebuild scenario is evaluated to ensure that the necessary level of performance is maintained for optimal Exchange client experience. The same caveat applies for a DAS vs. SAN deployment: If multiple hosts are dependent on a shared storage subsystem, the test must include load from these hosts to simulate the entire effect of the failure and rebuild.
Analyzing the Results
The Jetstress tool produces a report file after each test is completed. To help you analyze the report, use the guidelines in Jetstress 2010 Test Summary Reports.
Specifically, you should use the guidelines in the following table when you examine data in the Test Results table of the report.
Jetstress results analysis
Performance counter instance | Guidelines for performance test |
---|---|
I/O Database Reads Average Latency (msec) | The average value should be less than 20 milliseconds (msec) (0.020 seconds), and the maximum values should be less than 50 msec. |
I/O Log Writes Average Latency (msec) | Log disk writes are sequential, so average write latencies should be less than 10 msec, with a maximum of no more than 50 msec. |
%Processor Time | Average should be less than 80%, and the maximum should be less than 90%. |
Transition Pages Repurposed/sec (Windows Server 2003, Windows Server 2008, Windows Server 2008 R2) | Average should be less than 100. |
The report file shows various categories of I/O performed by the Exchange system:
Transactional I/O Performance This table reports I/O that represents user activity against the database (for example, Outlook generated I/O). This data is generated by subtracting background maintenance I/O and log replication I/O from the total I/O measured during the test. This data provides the actual database IOPS generated along with I/O latency measurements required to determine whether a Jetstress performance test passed or failed.
Background Database Maintenance I/O Performance This table reports the I/O generated due to ongoing ESE database background maintenance.
Log Replication I/O Performance This table reports the I/O generated from simulated log replication.
Total I/O Performance This table reports the total I/O generated during the Jetstress test.
Server Design Validation
After the performance and reliability of the storage subsystem is validated, ensure that all of the components in the messaging system are validated together for functionality, performance, and scalability. This means moving up in the stack to validate client software interaction with the Exchange product as well as any server-side products that interact with Exchange. To ensure that the end-to-end client experience is acceptable and that the entire solution can sustain the desired user load, the method described in this section can be applied for server design validation.
Tool Set
For validation of end-to-end solution performance and scalability, we recommend the Microsoft Exchange Server Load Generator tool (Loadgen). Loadgen is designed to produce a simulated client workload against an Exchange deployment. This workload can be used to evaluate the performance of the Exchange system, and can also be used to evaluate the effect of various configuration changes on the overall solution while the system is under load. Loadgen is capable of simulating Microsoft Office Outlook 2007 (online and cached), Office Outlook 2003 (online and cached), POP3, IMAP4, SMTP, ActiveSync, and Outlook Web App (known in Exchange 2007 and earlier versions as Outlook Web Access) client activity. It can be used to generate a single protocol workload, or these client protocols can be combined to generate a multiple protocol workload.
You can get the Loadgen tool from the Microsoft Download Center.
The documentation included with the Loadgen installer describes how to configure and execute a Loadgen test against an Exchange deployment.
Approach to Server Validation
When validating your server design, test the worst case scenario under anticipated peak workload. Based on a number of data sets from Microsoft IT and other customers, peak load is generally equal to twice the average workload throughout the remainder of the work day. This is referred to as the peak-to-average workload ratio.
Peak load
In this Performance Monitor snapshot, which displays various counters that represent the amount of Exchange work being performed over time on a production Mailbox server, the average value for RPC operations per second (the highlighted line) is about 2,386 when averaged across the entire day. The average for this counter during the peak period from 10:00 through 11:00 is about 4,971, giving a peak-to-average ratio of 2.08.
To ensure that the Exchange solution is capable of sustaining the workload generated during the peak average, modify Loadgen settings to generate a constant amount of load at the peak average level, rather than spreading out the workload over the entire simulated work day. Loadgen task-based simulation modules (like the Outlook simulation modules) utilize a task profile that defines the number of times each task will occur for an average user within a simulated day.
The total number of tasks that need to run during a simulated day is calculated as the number of users multiplied by the sum of task counts in the configured task profile. Loadgen then determines the rate at which it should run tasks for the configured set of users by dividing the total number of tasks to run in the simulated day by the simulated day length. For example, if Loadgen needs to run 1,000,000 tasks in a simulated day, and a simulated day is equal to 8 hours (28,800 seconds), Loadgen must run 1,000,000 ÷ 28,800 = 34.72 tasks per second to meet the required workload definition. To increase the amount of load to the desired peak average, divide the default simulated day length (8 hours) by the peak-to-average ratio (2) and use this as the new simulated day length.
Using the task rate example again, 1,000,000 ÷ 14,400 = 69.44 tasks per second. This reduces the simulated day length by half, which results in doubling the actual workload run against the server and achieving our goal of a peak average workload. You don't adjust the run length duration of the test in the Loadgen configuration. The run length duration specifies the duration of the test and doesn't affect the rate at which tasks will be run against the Exchange server.
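The following Windows PowerShell sketch simply illustrates this arithmetic; the task count, simulated day length, and peak-to-average ratio are the example values from the preceding paragraphs, not output from Loadgen.
$tasksPerDay = 1000000                                            # total tasks Loadgen must run in a simulated day
$defaultDaySeconds = 8 * 3600                                     # default simulated day length (8 hours)
$peakToAverageRatio = 2
$adjustedDaySeconds = $defaultDaySeconds / $peakToAverageRatio    # 14,400 seconds
$taskRate = $tasksPerDay / $adjustedDaySeconds                    # 69.44 tasks per second
"Adjusted simulated day: {0} seconds; required task rate: {1:N2} tasks per second" -f $adjustedDaySeconds, $taskRate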
Test Cases for Server Design Validation
The following server design validation test cases were executed against the solution and should be considered as a starting point for server design validation. Specific deployments may have other validation requirements that can be met with additional testing, so this list isn't intended to be exhaustive:
Normal operating conditions In this test case, the basic design of the solution is validated with all components in their normal operating state (no failures simulated). The desired workload is generated against the solution, and the overall performance of the solution is validated against the metrics that follow.
Single server failure or single server maintenance (in site) In this test case, a single server is taken down to simulate either an unexpected failure of the server or a planned maintenance operation for the server. The workload that would normally be handled by the unavailable server is now handled by other servers in the solution topology, and the overall performance of the solution is validated.
Test Execution and Data Collection
Exchange performance data has some natural variation within test runs and among test runs. We recommend that you take the average of multiple runs to smooth out this variation. For Exchange tested solutions, a minimum of three separate test runs with durations of eight hours was completed. Performance data was collected for the full eight-hour duration of the test. Performance summary data was taken from a three to four hour stable period (excluding the first two hours of the test and the last hour of the test). For each Exchange server role, performance summary data was averaged between servers for each test run, providing a single average value for each data point. The values for each run were then averaged, providing a single data point for all servers of a like server role across all test runs.
Validation of Expected Load
Before you look at any performance counters or start your performance validation analysis, verify that the workload you expected to run matched the workload that you actually ran. Although there are many ways to determine whether the simulated workload matched the expected workload, the easiest and most consistent way is to look at the message delivery rate.
Calculating Expected Peak Message Delivery Rate
Every message profile consists of the sum of the average number of messages sent per day and the average number of messages received per day. To calculate the message delivery rate, select the average number of messages received per day from the following table.
Peak message delivery rate
Message profile | Messages sent per day | Messages received per day |
---|---|---|
50 | 10 | 40 |
100 | 20 | 80 |
150 | 30 | 120 |
200 | 40 | 160 |
This example assumes that each Mailbox server has 5,000 active mailboxes with a 150 messages per day profile (30 messages sent and 120 messages received per day), as shown in the following table.
Peak message delivery rate for 5,000 mailboxes
Description | Calculation | Value |
---|---|---|
Message profile | Number of messages received per day | 120 |
Mailbox server profile | Number of active mailboxes per Mailbox server | 5,000 |
Total messages received per day per Mailbox server | 5,000 × 120 | 600,000 |
Total messages received per second per Mailbox server | 600,000 ÷ 28,800 | 20.83 |
Total messages adjusted for peak load | 20.83 × 2 | 41.67 |
You expect 41.67 messages per second delivered on each Mailbox server running 5,000 active mailboxes with a message profile of 150 messages per day during peak load.
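The same calculation can be expressed as a short Windows PowerShell sketch; the mailbox count and message profile are the values used in this solution, and the 28,800-second simulated day and peak-to-average ratio of 2 come from the preceding sections.
$activeMailboxes = 5000
$messagesReceivedPerDay = 120        # 150-message profile: 30 sent, 120 received
$simulatedDaySeconds = 8 * 3600      # 28,800 seconds
$peakToAverageRatio = 2
$peakDeliveryRate = ($activeMailboxes * $messagesReceivedPerDay / $simulatedDaySeconds) * $peakToAverageRatio
"{0:N2} messages delivered per second per Mailbox server at peak" -f $peakDeliveryRate   # 41.67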
Measuring Actual Message Delivery Rate
The actual message delivery rate can be measured using the following counter on each Mailbox server: MSExchangeIS Mailbox(_Total)\Messages Delivered/sec. If the measured message delivery rate is within one or two messages per second of the target message delivery rate, you can be confident that the desired load profile was run successfully.
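As an example, the counter can be sampled and averaged with Get-Counter; this is a minimal sketch, and the server name MBX1 is a placeholder for one of the Mailbox server virtual machines in this solution.
Get-Counter -ComputerName MBX1 -Counter "\MSExchangeIS Mailbox(_Total)\Messages Delivered/sec" -SampleInterval 15 -MaxSamples 20 |
    ForEach-Object { $_.CounterSamples[0].CookedValue } |
    Measure-Object -Average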
Server Validation: Performance and Health Criteria
This section describes the Performance Monitor counters and thresholds used to determine whether the Exchange environment was sized properly and is able to run in a healthy state during extended periods of peak workload. For more information about counters relevant to Exchange performance, see Performance and Scalability Counters and Thresholds.
Hyper-V Root Servers
To validate the performance and health criteria of a Hyper-V root server and the applications running within VMs, you should have a basic understanding of the Hyper-V architecture and how that impacts performance monitoring.
Hyper-V has three main components: the virtualization stack, the hypervisor, and devices. The virtualization stack handles emulated devices, manages VMs, and services I/O. The hypervisor schedules virtual processors, manages interrupts, services timers, and controls other chip-level functions. The hypervisor doesn't handle devices or I/O (for example, there are no hypervisor drivers). The devices are part of the root server or installed in guest servers as part of integration services. Because the root server has a full view of the system and controls the VMs, it also provides monitoring information via Windows Management Instrumentation (WMI) and performance counters.
Processor
When validating physical processor utilization on the root server (or within the guest VM), the standard Processor\% Processor Time counter isn't very useful.
Instead, you can examine the Hyper-V Hypervisor Logical Processor\% Total Run Time counter. This counter shows the percentage of processor time spent in guest and hypervisor runtime and should be used to measure the total processor utilization for the hypervisor and all VMs running on the root server. This counter shouldn't exceed 80 percent, or whatever maximum utilization target you have designed for.
Counter | Target |
---|---|
Hyper-V Hypervisor Logical Processor\% Total Run Time | <80% |
If you're interested in what percentage of processor time is spent servicing the guest VMs, you can examine the Hyper-V Hypervisor Logical Processor\% Guest Run Time counter. If you're interested in what percentage of processor time is spent in the hypervisor, you can look at the Hyper-V Hypervisor Logical Processor\% Hypervisor Run Time counter. This counter should be below 5 percent. The Hyper-V Hypervisor Root Virtual Processor\% Guest Run Time counter shows the percentage of processor time spent in the virtualization stack. This counter should also be below 5 percent. These two counters can be used to determine what percentage of your available physical processor time is being used to support virtualization.
Counter | Target |
---|---|
Hyper-V Hypervisor Logical Processor\% Guest Run Time | <80% |
Hyper-V Hypervisor Logical Processor\% Hypervisor Run Time | <5% |
Hyper-V Hypervisor Root Virtual Processor\% Guest Run Time | <5% |
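These root server counters can be collected locally on the Hyper-V host with Get-Counter; a minimal sketch, assuming the Hyper-V counter sets are present on the root server and using the same sampling interval as the other examples in this section.
Get-Counter -Counter @(
    "\Hyper-V Hypervisor Logical Processor(_Total)\% Total Run Time",
    "\Hyper-V Hypervisor Logical Processor(_Total)\% Guest Run Time",
    "\Hyper-V Hypervisor Logical Processor(_Total)\% Hypervisor Run Time",
    "\Hyper-V Hypervisor Root Virtual Processor(_Total)\% Guest Run Time"
) -SampleInterval 15 -MaxSamples 20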
Memory
You need to ensure that your Hyper-V root server has enough memory to support the memory allocated to VMs. Hyper-V automatically reserves 512 MB (this may vary with different Hyper-V releases) for the root operating system. If you don't have enough memory, Hyper-V will prevent the last VM from starting. In general, don't worry about validating the memory on a Hyper-V root server. Be more concerned with ensuring that sufficient memory is allocated to the VMs to support the Exchange roles.
Application Health
An easy way to determine whether all the VMs are in a healthy state is to look at the Hyper-V Virtual Machine Health Summary counters.
Counter | Target |
---|---|
Hyper-V Virtual Machine Health Summary\Health OK | 1 |
Hyper-V Virtual Machine Health Summary\Health Critical | 0 |
Mailbox Servers
When validating whether a Mailbox server was properly sized, focus on processor, memory, storage, and Exchange application health. This section describes the approach to validating each of these components.
Processor
During the design process, you calculated the adjusted megacycle capacity of the server or processor platform. You then determined the maximum number of active mailboxes that could be supported by the server without exceeding 80 percent of the available megacycle capacity. You also determined what the projected CPU utilization should be during normal operating conditions and during various server maintenance or failure scenarios.
During the validation process, verify that the worst case scenario workload doesn't exceed 80 percent of the available megacycles. Also, verify that actual CPU utilization is close to the expected CPU utilization during normal operating conditions and during various server maintenance or failure scenarios.
For physical Exchange deployments, use the Processor(_Total)\% Processor Time counter and verify that this counter is less than 80 percent on average.
Counter | Target |
---|---|
Processor(_Total)\% Processor Time | <80% |
For virtual Exchange deployments, the Processor(_Total)\% Processor Time counter is measured within the VM. In this case, the counter isn't measuring the physical CPU utilization. It's measuring the utilization of the virtual CPU provided by the hypervisor. Therefore, it doesn't provide an accurate reading of the physical processor and shouldn't be used for design validation purposes. For more information, see Hyper-V: Clocks lie... which performance counters can you trust.
For validating Exchange deployments running on Microsoft Hyper-V, use the Hyper-V Hypervisor Virtual Processor\% Guest Run Time counter. This provides a more accurate value for the amount of physical CPU being utilized by the guest operating system. This counter should be less than 80 percent on average.
Counter | Target |
---|---|
Hyper-V Hypervisor Virtual Processor\% Guest Run Time | <80% |
Memory
During the design process, you calculated the amount of database cache required to support the maximum number of active databases on each Mailbox server. You then determined the optimal physical memory configuration to support the database cache and system memory requirements.
Validating whether an Exchange Mailbox server has sufficient memory to support the target workload isn't a simple task. Using available memory counters to view how much physical memory is remaining isn't helpful because the memory manager in Exchange is designed to use almost all of the available physical memory. The information store (store.exe) reserves a large portion of physical memory for database cache. The database cache is used to store database pages in memory. When a page is accessed in memory, the information doesn't have to be retrieved from disk, reducing read I/O. The database cache is also used to optimize write I/O.
When a database page is modified (known as a dirty page), the page stays in cache for a period of time. The longer it stays in cache, the better the chance that the page will be modified multiple times before those changes are written to the disk. Keeping dirty pages in cache also causes multiple pages to be written to the disk in the same operation (known as write coalescing). Exchange uses as much of the available memory in the system as possible, which is why there aren't large amounts of available memory on an Exchange Mailbox server.
It may not be easy to know whether the memory configuration on your Exchange Mailbox server is undersized. For the most part, the Mailbox server will still function, but your I/O profile may be much higher than expected. Higher I/O can lead to higher disk read and write latencies, which may impact application health and client user experience. In the results section, there isn't any reference to memory counters. Potential memory issues will be identified in the storage validation and application health result sections, where memory-related issues are more easily detected.
Storage
If you have performance issues with your Exchange Mailbox server, those issues may be storage-related issues. Storage issues may be caused by having an insufficient number of disks to support the target I/O requirements, having overloaded or poorly designed storage connectivity infrastructure, or by factors that change the target I/O profile like insufficient memory, as discussed previously.
The first step in storage validation is to verify that the database latencies are below the target thresholds. In previous releases, disk read and write latencies were determined using the logical disk counters. In Exchange 2010, the Exchange Mailbox server that you are monitoring is likely to have a mix of active and passive mailbox database copies. The I/O characteristics of active and passive database copies are different. Because the size of the I/O is much larger on passive copies, there are typically much higher latencies on passive copies. Latency targets for passive databases are 200 msec, which is 10 times higher than targets on active database copies. This isn't much of a concern because high latencies on passive databases have no impact on client experience. But if you are using the traditional logical disk counters to measure latencies, you must review the individual volumes and separate volumes containing active and passive databases. Instead, we recommend that you use the new MSExchange Database counters in Exchange 2010.
When validating latencies on Exchange 2010 Mailbox servers, we recommend you use the counters in the following table for active databases.
Counter | Target |
---|---|
MSExchange Database\I/O Database Reads (Attached) Average Latency | <20 msec |
MSExchange Database\I/O Database Writes (Attached) Average Latency | <20 msec |
MSExchange Database\IO Log Writes Average Latency | <1 msec |
We recommend that you use the counters in the following table for passive databases.
Counter | Target |
---|---|
MSExchange Database\I/O Database Reads (Recovery) Average Latency | <200 msec |
MSExchange Database\I/O Database Writes (Recovery) Average Latency | <200 msec |
MSExchange Database\IO Log Read Average Latency | <200 msec |
Note
To view these counters in Performance Monitor, you must enable the advanced database counters. For more information, see How to Enable Extended ESE Performance Counters.
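Once the extended ESE counters are enabled, you can confirm that the counter set is exposed and sample one of the latencies listed above; a minimal sketch (run on the Mailbox server; the instance wildcard returns whatever MSExchange Database instances exist on that server).
(Get-Counter -ListSet "MSExchange Database").Paths
Get-Counter -Counter "\MSExchange Database(*)\I/O Database Reads (Attached) Average Latency" -SampleInterval 15 -MaxSamples 4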
When you're validating disk latencies for Exchange deployments running on Microsoft Hyper-V, be aware that the I/O Database Average Latency counters (as with many time-based counters) may not be accurate because the concept of time within the VM is different than on the physical server. The following example shows that the I/O Database Reads (Attached) Average Latency is 22.8 msec in the VM and 17.3 msec on a physical server for the same simulated workload. Even if the values of time-based counters are over the target thresholds, your server may still be running correctly. Review all health criteria to make a decision regarding server health when your Mailbox server role is deployed within a VM.
In addition to disk latencies, review the Database\Database Page Fault Stalls/sec counter. This counter indicates the rate of page faults that can't be serviced because there are no pages available for allocation from the database cache. This counter should be 0 on a healthy server.
Counter | Target |
---|---|
Database\Database Page Fault Stalls/sec | <1 |
Also, review the Database\Log Record Stalls/sec counter, which indicates the number of log records that can't be added to the log buffers per second because the log buffers are full. This counter should average less than 10.
Counter | Target |
---|---|
Database\Log Record Stalls/sec | <10 |
Exchange Application Health
Even if there are no obvious issues with processor, memory, and disk, we recommend that you monitor the standard application health counters to ensure that the Exchange Mailbox server is in a healthy state.
The MSExchangeIS\RPC Averaged Latency counter provides the best indication of whether other counters with high database latencies are actually impacting Exchange health and client experience. Often, high RPC averaged latencies are associated with a high number of RPC requests, which should be less than 70 at all times.
Counter | Target |
---|---|
MSExchangeIS\RPC Averaged Latency | <10 msec on average |
MSExchangeIS\RPC Requests | <70 at all times |
Next, make sure that the transport layer is healthy. Any issues in transport or issues downstream of transport affecting the transport layer can be detected with the MSExchangeIS Mailbox(_Total)\Messages Queued for Submission counter. This counter should be less than 50 at all times. There may be temporary increases in this counter, but the counter value shouldn't grow over time and shouldn't be sustained for more than 15 minutes.
Counter | Target |
---|---|
MSExchangeIS Mailbox(_Total)\Messages Queued for Submission | <50 at all times |
Next, ensure that maintenance of the database copies is in a healthy state. Any issues with log shipping or log replay can be identified using the MSExchange Replication(*)\CopyQueueLength and MSExchange Replication(*)\ReplayQueueLength counters. The copy queue length shows the number of transaction log files waiting to be copied to the passive copy log file folder and should be less than 1 at all times. The replay queue length shows the number of transaction log files waiting to be replayed into the passive copy and should be less than 5. Higher values don't impact client experience, but result in longer store mount times when a handoff, failover, or activation is performed.
Counter | Target |
---|---|
MSExchange Replication(*)\CopyQueueLength | <1 |
MSExchange Replication(*)\ReplayQueueLength | <5 |
Client Access Servers
To determine whether a Client Access server is healthy, review processor, memory, and application health. For an extended list of important counters, see Client Access Server Counters.
Processor
For physical Exchange deployments, use the Processor(_Total)\% Processor Time counter. This counter should be less than 80 percent on average.
Counter | Target |
---|---|
Processor(_Total)\% Processor Time | <80% |
For validating Exchange deployments running on Microsoft Hyper-V, use the Hyper-V Hypervisor Virtual Processor\% Guest Run Time counter. This provides an accurate value for the amount of physical CPU being utilized by the guest operating system. This counter should be less than 80 percent on average.
Counter | Target |
---|---|
Hyper-V Hypervisor Virtual Processor\% Guest Run Time | <80% |
Application Health
To determine whether the MAPI client experience is acceptable, use the MSExchange RpcClientAccess\RPC Averaged Latency counter. This counter should be below 250 msec. High latencies can be associated with a large number of RPC requests. The MSExchange RpcClientAccess\RPC Requests counter should be below 40 on average.
Counter | Target |
---|---|
MSExchange RpcClientAccess\RPC Averaged Latency | <250 msec |
MSExchange RpcClientAccess\RPC Requests | <40 |
Transport Servers
To determine whether a transport server is healthy, review processor, disk, and application health. For an extended list of important counters, see Transport Server Counters.
Processor
For physical Exchange deployments, use the Processor(_Total)\% Processor Time counter. This counter should be less than 80 percent on average.
Counter | Target |
---|---|
Processor(_Total)\% Processor Time | <80% |
For validating Exchange deployments running on Microsoft Hyper-V, use the Hyper-V Hypervisor Virtual Processor\% Guest Run Time counter. This provides an accurate value for the amount of physical CPU being utilized by the guest operating system. This counter should be less than 80 percent on average.
Counter | Target |
---|---|
Hyper-V Hypervisor Virtual Processor\% Guest Run Time | <80% |
Disk
To determine whether disk performance is acceptable, use the Logical Disk(*)\Avg. Disk sec/Read and Write counters for the volumes containing the transport logs and database. Both of these counters should be less than 20 msec.
Counter | Target |
---|---|
Logical Disk(*)\Avg. Disk sec/Read | <20 msec |
Logical Disk(*)\Avg. Disk sec/Write | <20 msec |
Application Health
To determine whether a Hub Transport server is sized properly and running in a healthy state, examine the MSExchangeTransport Queues counters outlined in the following table. All of these queues will have messages at various times. You want to ensure that the queue length isn't sustained and growing over a period of time. If larger queue lengths occur, this could indicate an overloaded Hub Transport server. Or, there may be network issues or an overloaded Mailbox server that's unable to receive new messages. You will need to check other components of the Exchange environment to verify.
Counter | Target |
---|---|
MSExchangeTransport Queues(_total)\Aggregate Delivery Queue Length (All Queues) | <3000 |
MSExchangeTransport Queues(_total)\Active Remote Delivery Queue Length | <250 |
MSExchangeTransport Queues(_total)\Active Mailbox Delivery Queue Length | <250 |
MSExchangeTransport Queues(_total)\Retry Mailbox Delivery Queue Length | <100 |
MSExchangeTransport Queues(_total)\Submission Queue Length | <100 |
Functional Validation Tests
You can use the information in the following sections for functional validation tests.
Database Switchover (In-Site) Validation
A database switchover is the process by which an individual active database is switched over to another database copy (a passive copy), and that database copy is made the new active database copy.
Note
When using EMC Replication Enabler Exchange 2010, there's a separate set of Windows PowerShell cmdlets that must be used to perform database switchover tasks.
To validate that a passive copy of a database can be successfully activated on another server, run the following command.
Move-REEActiveMailboxDatabase -Identity <DatabaseName> -MailboxServer <TargetServer> -Mount
Success criteria: The active mailbox database is mounted on the specified target server. This result can be confirmed by running the following command.
Get-REEMailboxDatabaseCopyStatus <DatabaseName>
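For example, using one of the databases from this solution (the target Mailbox server shown here is hypothetical; specify any DAG member that holds a passive copy of the database), the switchover and its verification look like the following.
Move-REEActiveMailboxDatabase -Identity DB17 -MailboxServer MBX14 -Mount
Get-REEMailboxDatabaseCopyStatus DB17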
Server Switchover (In-Site) Validation
A server switchover is the process by which all active databases on a DAG member are activated on one or more other DAG members. Like database switchovers, a server switchover can be initiated by using the Exchange Management Shell and the EMC Replication Enabler Exchange 2010 cmdlets.
To validate that all passive copies of databases on a server can be successfully activated on another server hosting a passive copy, run the following command.
Get-MailboxDatabase -Server <ActiveMailboxServer> | Move-REEActiveMailboxDatabase -MailboxServer <TargetServer> -Mount
Success criteria: The active mailbox databases are mounted on the specified target server. This can be confirmed by running the following command.
Get-REEMailboxDatabaseCopyStatus <DatabaseName>
Server Failover Validation
A server failover occurs when the DAG member is no longer able to service the MAPI network, or when the Cluster service on a DAG member is no longer able to contact the remaining DAG members.
To validate that one copy of each of the active databases will be successfully activated on another Mailbox server hosting passive copies of the databases, turn off the server by performing one of the following actions:
Press and hold the power button on the server until the server turns off.
Pull the power cables from the server, which results in the server turning off.
Note
For EMC Replication Enabler Exchange 2010 to respond to Exchange notifications, the Replication Enabler Exchange 2010 Exchange Listener service must be running. To start the Replication Enabler Exchange 2010 Exchange Listener service, run the following cmdlet: Start-REEExchangeListener.
Success criteria: The active mailbox databases are moved and mounted on another Mailbox server in the DAG. This can be confirmed by running the following command.
Get-MailboxDatabase -Server <mailboxserver> | Get-REEMailboxDatabaseCopyStatus
Datacenter Switchover Validation
A datacenter or site failure is managed differently from the types of failures that can cause a server or database failover. In a high availability configuration, automatic recovery is initiated by the system, and the failure typically leaves the messaging system in a fully functional state. By contrast, a datacenter failure is considered to be a disaster recovery event, and as such, recovery must be manually performed and completed for the client service to be restored and for the outage to end. The process you perform is called a datacenter switchover. As with many disaster recovery scenarios, prior planning and preparation for a datacenter switchover can simplify your recovery process and reduce the duration of your outage.
For more information, including detailed steps for performing a datacenter switchover, see Datacenter Switchovers.
There are two basic steps that you need to complete to perform a datacenter switchover, after making the initial decision to activate the second datacenter:
Activate the Mailbox servers.
Activate the Client Access servers.
The following sections describe each of these steps.
Activate Mailbox Servers
Before activating the DAG members in the second datacenter, we recommend that you validate that the infrastructure services in the second datacenter are ready for messaging service activation.
If the DAG cluster loses quorum due to a disaster at the primary datacenter, the cluster service and the EMC Replication Enabler Exchange 2010 service on the surviving DAG member servers at the secondary site will be in a stopped state. Perform the following steps:
Start these services by running the following commands from an elevated command prompt on each surviving DAG member server at the secondary site.
net start clussvc /forcequorum
net start "EMC Replication Enabler for Exchange 2010"
Activate the databases by running the following Windows PowerShell cmdlet.
Move-REEActiveMailboxDatabase -Identity <Database name> -MailboxServer <DAGMemberInSecondSite>
Note
If the replication link is broken between the primary and secondary sites and the secondary images aren't in sync with primary images, retry the preceding command with a force switch.
Move-REEActiveMailboxDatabase -Identity <Database name> -MailboxServer <DAGMemberInSecondSite> -Force
Note
By running Move-REEActiveMailboxDatabase, Replication Enabler Exchange 2010 automatically handles the storage failover (for example, mirror promotion).
Check the event logs and review all error and warning messages to ensure that the secondary site is healthy. Follow up and correct all issues prior to mounting the databases.
Mount the databases using the following Windows PowerShell cmdlet.
Get-MailboxDatabase -Server <DAGMemberInSecondSite> | Mount-Database
Activate Client Access Servers
Clients connect to service endpoints to access the Microsoft Exchange services and data. Activating Internet-facing Client Access servers involves changing DNS records to point to the new IP addresses to be configured for the new service endpoints. Clients will then automatically connect to the new service endpoints in one of two ways:
Clients will continue to try to connect and should automatically connect after the Time to Live (TTL) has expired for the original DNS entry, and after the entry has expired from the client's DNS cache. Users can also run the ipconfig /flushdns command from a command prompt to manually clear their DNS cache. If Outlook Web App is used, the Web browser may need to be closed and restarted to clear the DNS cache used by the browser. In Exchange 2010 SP1, this browser caching issue can be mitigated by configuring the FailbackURL parameter on the owa (Outlook Web App) virtual directory, as shown in the example after this list.
As the Outlook clients start or restart, they perform a DNS lookup and obtain the new IP address for the service endpoint, which is a Client Access server or array in the second datacenter.
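As an illustration of the FailbackURL setting referenced in the list above, the following sketch sets the parameter on an Outlook Web App virtual directory; the virtual directory identity and URL are placeholders, not values from this solution.
Set-OwaVirtualDirectory -Identity "CAS01\owa (Default Web Site)" -FailbackUrl "https://failback.contoso.com/owa"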
To validate this scenario with Loadgen, perform the following steps:
Change the DNS entry for the Client Access server array to point to the virtual IP address of the hardware load balancing server in the secondary site.
Run the ipconfig /flushdns command on all Loadgen servers.
Restart the Loadgen load.
Verify that the Client Access servers in the secondary site are servicing the load.
Primary Datacenter Service Restoration Validation
Failback is the process of restoring service to a previously failed datacenter. The steps used to perform a datacenter failback are similar to the steps used to perform a datacenter switchover. A significant distinction is that datacenter failbacks are scheduled, and the duration of the outage is often much shorter.
Important
Don't perform the failback until the infrastructure dependencies for Exchange are reactivated, functioning, stable, and validated. If these dependencies aren't available or healthy, it's likely that the failback process will cause a longer than necessary outage, and it's possible that the process could fail altogether.
There are two basic steps that you need to complete to perform a datacenter failback, after making the initial decision to reactivate the primary datacenter:
Restore storage replication.
Perform Mailbox and Client Access servers failback.
The following sections describe each of these steps.
Restore Storage Replication
To restore your CLARiiON storage replication after a site failure, perform the following steps:
Turn on the storage at the failed site.
Restore the MirrorView and IP address links.
All consistency groups that aren't locally promoted are marked as Waiting on Admin in Navisphere. For each consistency group marked Waiting on Admin, do the following:
From Navisphere, right-click each consistency group and select Synchronize from the drop-down-menu.
Wait for the consistency groups to synchronize.
The process of restoring consistency groups that are locally promoted at the secondary site is more detailed. For each consistency group that's locally promoted, perform the following steps:
From Navisphere, destroy the consistency groups on both CLARiiON arrays. Open CG Properties and click the Force Destroy button.
Destroy the remote mirrors on the CLARiiON array at the failed site. Open Mirror Properties, select the Primary Image tab, and then click the Force Destroy button.
Remove the corresponding LUNs from the storage group on the CLARiiON array at the failed site.
Right-click each remote mirror on the CLARiiON array at the surviving site, and then choose Add Secondary Storage.
Choose the LUN from the CLARiiON array at the failed site.
Create a consistency group using the same name.
Add all remote mirrors that were part of the original consistency group.
Add the corresponding LUNs to the storage group on the CLARiiON array at the failed site.
Perform Mailbox and Client Access Server Failback
The Mailbox server role should be the first role to fail back to the primary datacenter. The following steps detail the Mailbox server role failback.
Start the Mailbox servers at the primary site, and then verify that the Cluster service and EMC Replication Enabler for Exchange 2010 service are started.
Update the Replication Enabler Exchange 2010 configuration by running the following Windows PowerShell cmdlet.
Update-REEDatabaseInfo
Dismount the databases being reactivated in the primary datacenter from the second datacenter using the following Windows PowerShell cmdlet.
Dismount-Database -Identity <Database name>
After dismounting the databases, move the Client Access server URLs from the second datacenter to the primary datacenter by changing the DNS record for the URLs to point to the Client Access server or array in the primary datacenter.
Important
Don't proceed to the next step until the Client Access server URLs have moved and the DNS TTL and cache entries are expired. Activating the databases in the primary datacenter prior to moving the Client Access server URLs to the primary datacenter results in an invalid configuration (for example, a mounted database that has no Client Access servers in its Active Directory site).
You can now activate or move the databases by running the following Windows PowerShell cmdlet.
Move-REEActiveMailboxDatabase -Identity <Database name> -MailboxServer <DAGMemberInPrimary Site>
Mount the databases using the following Windows PowerShell cmdlet:
Mount-Database -Identity <Database name>
Test Facility
Testing was conducted at the Microsoft Enterprise Engineering Center, a state-of-the-art enterprise solutions validation laboratory on the Microsoft main campus in Redmond, Washington.
With more than 125 million dollars in hardware and with ongoing strong partnerships with the industry's leading original equipment manufacturers (OEMs), virtually any production environment can be replicated at the EEC. The EEC offers an environment that enables extensive collaboration among customers, partners, and Microsoft product engineers. This helps ensure that Microsoft end-to-end solutions will meet the high expectations of customers.
Solution Validation Results
The following sections summarize the results of the functional and performance validation tests.
Functional Validation Results
The following table summarizes the functional validation test results.
Functional validation results
Test case | Result | Comments |
---|---|---|
Database switchover | Successful | Completed without errors |
Server switchover | Successful | Completed without errors |
Server failure | Successful | Completed without errors |
Site failure | Successful | Completed without errors |
Storage Design Validation Results
The following tables summarize the Jetstress storage validation results. This solution achieved higher than target transactional I/O while maintaining database latencies well under the 20 msec target.
Overall test result | Pass |
Transactional I/O per second
Server | Target | Tested result without MirrorView | Tested result with MirrorView |
---|---|---|---|
MBX1 (DAG1) DB1-4 | 450 | 654 | 644 |
MBX2 (DAG1) DB5-8 | 450 | 658 | 631 |
MBX3 (DAG1) DB9-12 | 450 | 650 | 622 |
MBX4 (DAG1) DB13-16 | 450 | 658 | 627 |
MBX13 (DAG2) DB17-20 | 450 | 660 | 657 |
MBX14 (DAG2) DB21-24 | 450 | 659 | 646 |
MBX15 (DAG2) DB25-28 | 450 | 658 | 647 |
MBX16 (DAG2) DB29-32 | 450 | 661 | 634 |
Transactional I/O performance: database reads
Server | Tested result without MirrorView (reads per second) | Tested result without MirrorView (average latency in msec) | Tested result with MirrorView (reads per second) | Tested result with MirrorView (average latency in msec) |
---|---|---|---|---|
MBX1 (DAG1) DB1-4 | 361 | 10.8 | 363 | 10.0 |
MBX2 (DAG1) DB5-8 | 364 | 10.7 | 355 | 10.0 |
MBX3 (DAG1) DB9-12 | 360 | 10.9 | 350 | 10.2 |
MBX4 (DAG1) DB13-16 | 364 | 10.7 | 353 | 10.1 |
MBX13 (DAG2) DB17-20 | 366 | 10.8 | 370 | 10.0 |
MBX14 (DAG2) DB21-24 | 364 | 10.8 | 363 | 10.0 |
MBX15 (DAG2) DB25-28 | 365 | 10.8 | 364 | 10.0 |
MBX16 (DAG2) DB29-32 | 365 | 10.7 | 357 | 10.2 |
Transactional I/O performance: database writes
Server | Tested result without MirrorView (writes per second) | Tested result without MirrorView (average latency in msec) | Tested result with MirrorView (writes per second) | Tested result with MirrorView (average latency in msec) |
---|---|---|---|---|
MBX1 (DAG1) DB1-4 | 293 | 8.5 | 283 | 12.2 |
MBX2 (DAG1) DB5-8 | 295 | 8.6 | 277 | 12.9 |
MBX3 (DAG1) DB9-12 | 290 | 8.6 | 273 | 13.0 |
MBX4 (DAG1) DB13-16 | 294 | 8.5 | 274 | 13.0 |
MBX13 (DAG2) DB17-20 | 294 | 8.6 | 287 | 11.4 |
MBX14 (DAG2) DB21-24 | 294 | 8.6 | 284 | 11.6 |
MBX15 (DAG2) DB25-28 | 294 | 8.6 | 283 | 11.5 |
MBX16 (DAG2) DB29-32 | 296 | 8.8 | 278 | 11.5 |
Transactional I/O performance: log writes
Server | Tested result without MirrorView (writes per second) | Tested result without MirrorView (average latency in msec) | Tested result with MirrorView (writes per second) | Tested result with MirrorView (average latency in msec) |
---|---|---|---|---|
MBX1 (DAG1) DB1-4 | 222 | 3.4 | 187 | 5.7 |
MBX2 (DAG1) DB5-8 | 223 | 3.5 | 181 | 6.1 |
MBX3 (DAG1) DB9-12 | 222 | 3.5 | 180 | 6.1 |
MBX4 (DAG1) DB13-16 | 223 | 3.5 | 178 | 6.2 |
MBX13 (DAG2) DB17-20 | 225 | 3.4 | 197 | 5.1 |
MBX14 (DAG2) DB21-24 | 224 | 3.5 | 188 | 5.5 |
MBX15 (DAG2) DB25-28 | 224 | 3.5 | 188 | 5.5 |
MBX16 (DAG2) DB29-32 | 224 | 3.4 | 185 | 5.6 |
Server Design Validation Results
The following sections summarize the server design validation results for the test cases.
Loadgen validation: test scenarios
Test | Description |
---|---|
Normal operation | A 70 percent concurrency load for 10,000 users was simulated at one site, with each Mailbox server handling 2,500 users. |
Single server failure or single server maintenance (in site) | The failure of a single Hyper-V host server per site was simulated. A 70 percent concurrency load was run against a single Hyper-V host with two Exchange Mailbox server VMs, each handling 5,000 users. Only three combined Client Access and Hub Transport servers handled the load. |
Site failure | A site failure was simulated, and secondary images on standby Mailbox servers were activated. A 70 percent concurrency load was run against 20,000 users. |
Test Case: Normal Operating Conditions
This test case represents peak workload during normal operating conditions. Normal operating conditions refer to a state where all of the active and passive databases reside on the servers they were planned to run on. Because this test case doesn't represent the worst case workload, it isn't the key performance validation test. It provides a good indication of how this environment should run outside of a server failure or maintenance event. In this case, each Mailbox server is running four active and four passive databases.
In this test, the objective was to validate the entire Exchange environment under normal operating condition with a peak load. All of the Exchange VMs were operating under normal conditions. Loadgen was configured to simulate peak load. The 150-message action profile running in peak mode was expected to generate double the sent and delivered messages per second.
Validation of Expected Load
The message delivery rate verifies that the tested workload matched the target workload. The actual message delivery rate was very close to target.
Counter | Target | Tested result |
---|---|---|
Message Delivery Rate Per Server | 14.6 | 14.9 |
Validation of Mailbox Servers
The following tables show the validation of Mailbox servers.
Processor
Processor utilization is below 70 percent, as expected.
Counter | Target | Tested result |
---|---|---|
Hyper-V Hypervisor Virtual Processor\% Guest Run Time | <70% | 43 |
Storage
The storage results are good. All latencies are well under target values.
Counter | Target | Tested result |
---|---|---|
MSExchange Database\I/O Database Reads (Attached) Average Latency | <20 msec | 9.0 |
MSExchange Database\I/O Database Writes (Attached) Average Latency | <20 msec (less than reads average) | 7.0 |
Database\Database Page Fault Stalls/sec | 0 | 0 |
MSExchange Database\IO Log Writes Average Latency | <20 msec | 5.0 |
Database\Log Record Stalls/sec | 0 | 0 |
Application Health
Exchange is very healthy, and all of the counters used to determine application health are well under target values.
Counter | Target | Tested result |
---|---|---|
MSExchangeIS\RPC Requests | <70 | 3.0 |
MSExchangeIS\RPC Averaged Latency | <10 msec | 2.0 |
MSExchangeIS Mailbox(_Total)\Messages Queued for Submission | <50 | 1.5 |
Validation of Client Access and Hub Transport Servers
Processor
Processor utilization is low, as expected.
Counter | Target | Tested result |
---|---|---|
Hyper-V Hypervisor Virtual Processor\% Guest Run Time | <70% | 26 |
Storage
The storage results look good. The very low latencies should have no impact on message transport.
Counter | Target | Tested result |
---|---|---|
Logical/Physical Disk(*)\Avg. Disk sec/Read | <20 msec | 0.008 |
Logical/Physical Disk(*)\Avg. Disk sec/Write | <20 msec | 0.004 |
Application Health
The low RPC Averaged Latency values confirm a healthy Client Access server with no impact on client experience.
Counter | Target | Tested result |
---|---|---|
MSExchange RpcClientAccess\RPC Averaged Latency | <250 msec | 5 |
MSExchange RpcClientAccess\RPC Requests | <40 | 3 |
The Transport Queues counters are all well under target, confirming that the Hub Transport server is healthy and able to process and deliver the required messages.
Counter | Target | Tested result |
---|---|---|
\MSExchangeTransport Queues(_total)\Aggregate Delivery Queue Length (All Queues) | <3000 | 2.5 |
\MSExchangeTransport Queues(_total)\Active Remote Delivery Queue Length | <250 | 0 |
\MSExchangeTransport Queues(_total)\Active Mailbox Delivery Queue Length | <250 | 2.0 |
\MSExchangeTransport Queues(_total)\Submission Queue Length | <100 | 0 |
\MSExchangeTransport Queues(_total)\Retry Mailbox Delivery Queue Length | <100 | 0.5 |
Validation of Root Server Health
Processor
As expected, the processor utilization is very low and well under target thresholds.
Counter | Target | Tested result |
---|---|---|
Hyper-V Hypervisor Logical Processor(_total)\% Guest Run Time | <75% | 20 |
Hyper-V Hypervisor Logical Processor(_total)\% Hypervisor Run Time | <5% | 2 |
Hyper-V Hypervisor Logical Processor(_total)\% Total Run Time | <80% | 22 |
Hyper-V Hypervisor Root Virtual Processor(_total)\% Guest Run Time | <5% | 3 |
Application Health
The Virtual Machine Health Summary counters indicate that all VMs are in a healthy state.
Counter | Target | Tested result |
---|---|---|
Hyper-V Virtual Machine Health Summary\Health Critical | 0 | 0 |
Test Case: Single Server Failure or Single Server Maintenance (In Site)
In this test, the objective was to validate the entire Exchange environment under physical Hyper-V host failure or maintenance operating conditions with a peak load. All VMs running on one of the Hyper-V hosts within the site were shut down to simulate a host maintenance condition. This resulted in database images (copies) being moved to other Mailbox servers, which created an operating condition of 5,000 users per Mailbox server. Only half of the combined Client Access and Hub Transport servers processed client access and mail delivery.
Validation of Expected Load
The actual message delivery rate was very close to target.
Counter | Target | Tested result |
---|---|---|
Message Delivery Rate Per Server | 29.2 | 29.4 |
Validation of Mailbox Servers
Processor
Processor utilization reaches the 80 percent target threshold, which is expected for this worst case scenario under peak load.
Counter | Target | Tested result |
---|---|---|
Hyper-V Hypervisor Virtual Processor\% Guest Run Time | <80% | 80 |
Storage
Storage results look acceptable. The average read latency is just over target, and the average database write latency is higher than preferred. This occurs during the worst case failure scenario under peak load, which is a low occurrence event. The high latencies don't put the application health counters over target, so user experience should still be acceptable.
Counter | Target | Tested result |
---|---|---|
MSExchange Database\I/O Database Reads (Attached) Average Latency | <20 msec | 20.5 |
MSExchange Database\I/O Database Writes (Attached) Average Latency | <20 msec | 27 |
Database\Database Page Fault Stalls/sec | 0 | 0 |
MSExchange Database\IO Log Writes Average Latency | <20 msec | 5 |
Database\Log Record Stalls/sec | 0 | 0 |
Application Health
Exchange is very healthy, and the counters used to determine application health are well under target values.
Counter | Target | Tested result |
---|---|---|
MSExchangeIS\RPC Requests | <70 | 6.0 |
MSExchangeIS\RPC Averaged Latency | <10 msec | 2.0 |
MSExchangeIS Mailbox(_Total)\Messages Queued for Submission | <50 | 3.3 |
Validation of Client Access and Hub Transport Servers
Processor
Processor utilization is low, as expected.
Counter | Target | Tested result |
---|---|---|
Hyper-V Hypervisor Virtual Processor\% Guest Run Time | <80% | 48 |
Storage
The storage results look good. The very low latencies should have no impact on message transport.
Counter | Target | Tested result |
---|---|---|
Logical/Physical Disk(*)\Avg. Disk sec/Read | <20 msec | 0.009 |
Logical/Physical Disk(*)\Avg. Disk sec/Write | <20 msec | 0.004 |
Application Health
The low RPC Averaged Latency values confirm a healthy Client Access server with no impact on client experience.
Counter | Target | Tested result |
---|---|---|
MSExchange RpcClientAccess\RPC Averaged Latency | <250 msec | 12 |
MSExchange RpcClientAccess\RPC Requests | <40 | 43 |
The Transport Queues counters are all well under target, confirming that the Hub Transport server is healthy and able to process and deliver the required messages.
Counter | Target | Tested result |
---|---|---|
\MSExchangeTransport Queues(_total)\Aggregate Delivery Queue Length (All Queues) | <3000 | 12 |
\MSExchangeTransport Queues(_total)\Active Remote Delivery Queue Length | <250 | 0 |
\MSExchangeTransport Queues(_total)\Active Mailbox Delivery Queue Length | <250 | 11.5 |
\MSExchangeTransport Queues(_total)\Submission Queue Length | <100 | 0 |
\MSExchangeTransport Queues(_total)\Retry Mailbox Delivery Queue Length | <100 | 0.5 |
Validation of Root Server Health
Processor
As expected, the processor utilization is very low and well under target thresholds.
Counter | Target | Tested result |
---|---|---|
Hyper-V Hypervisor Logical Processor(_total)\% Guest Run Time | <75% | 36 |
Hyper-V Hypervisor Logical Processor(_total)\% Hypervisor Run Time | <5% | 2 |
Hyper-V Hypervisor Logical Processor(_total)\% Total Run Time | <80% | 38 |
Hyper-V Hypervisor Root Virtual Processor(_total)\% Guest Run Time | <5% | 3 |
Application Health
The Virtual Machine Health Summary counters indicate that all VMs are in a healthy state.
Counter | Target | Tested result |
---|---|---|
Hyper-V Virtual Machine Health Summary\Health Critical | 0 | 0 |
Test Case: Site Failure
In this test case, site failure occurs.
Validation of Expected Load
Message delivery rate is slightly higher than target, resulting in slightly higher load than the desired profile.
Counter | Target | Tested result |
---|---|---|
Message Delivery Rate Per Server | 14.6 | 15.0 |
Validation of Mailbox Servers
Processor
Processor utilization is low, as expected.
Counter | Target | Tested result |
---|---|---|
Hyper-V Hypervisor Virtual Processor\% Guest Run Time | <70% | 43 |
Storage
The storage results look good with all latencies well under target.
Counter | Target | Tested result |
---|---|---|
MSExchange Database\I/O Database Reads (Attached) Average Latency | <20 msec | 9 |
MSExchange Database\I/O Database Writes (Attached) Average Latency | <20 msec (less than reads average) | 6 |
Database\Database Page Fault Stalls/sec | 0 | 0 |
MSExchange Database\IO Log Writes Average Latency | <20 msec | 5.0 |
Database\Log Record Stalls/sec | 0 | 0 |
Application Health
Exchange is very healthy, and all of the counters used to determine application health are well under target values.
Counter | Target | Tested result |
---|---|---|
MSExchangeIS\RPC Requests | <70 | 3 |
MSExchangeIS\RPC Averaged Latency | <10 msec | 2 |
MSExchangeIS Mailbox(_Total)\Messages Queued for Submission | <50 | 0.3 |
Validation of Client Access and Hub Transport Servers
Processor
Processor utilization is low, as expected.
Counter | Target | Tested result |
---|---|---|
Hyper-V Hypervisor Virtual Processor\% Guest Run Time | <70% | 41 |
Storage
The storage results look good. The very low latencies should have no impact on message transport.
Counter | Target | Tested result |
---|---|---|
Logical/Physical Disk(*)\Avg. Disk sec/Read | <20 msec | 0.010 |
Logical/Physical Disk(*)\Avg. Disk sec/Write | <20 msec | 0.005 |
Application Health
The low RPC Averaged Latency values confirm a healthy Client Access server with no impact on client experience.
Counter | Target | Tested result |
---|---|---|
MSExchange RpcClientAccess\RPC Averaged Latency | <250 msec | 12 |
MSExchange RpcClientAccess\RPC Requests | <40 | 8 |
The Transport Queues counters are all well under target, confirming that the Hub Transport server is healthy and able to process and deliver the required messages.
Counter | Target | Tested result |
---|---|---|
\MSExchangeTransport Queues(_total)\Aggregate Delivery Queue Length (All Queues) | <3000 | 13 |
\MSExchangeTransport Queues(_total)\Active Remote Delivery Queue Length | <250 | 0 |
\MSExchangeTransport Queues(_total)\Active Mailbox Delivery Queue Length | <250 | 12 |
\MSExchangeTransport Queues(_total)\Submission Queue Length | <100 | 0 |
\MSExchangeTransport Queues(_total)\Retry Mailbox Delivery Queue Length | <100 | 1 |
Validation of Root Server Health
Processor
As expected, the processor utilization is very low and well under target thresholds.
Counter | Target | Tested result |
---|---|---|
Hyper-V Hypervisor Logical Processor(_total)\% Guest Run Time | <75% | 41 |
Hyper-V Hypervisor Logical Processor(_total)\% Hypervisor Run Time | <5% | 2 |
Hyper-V Hypervisor Logical Processor(_total)\% Total Run Time | <80% | 43 |
Hyper-V Hypervisor Root Virtual Processor(_total)\% Guest Run Time | <5% | 3.5 |
Application Health
The Virtual Machine Health Summary counters indicate that all VMs are in a healthy state.
Counter | Target | Tested result |
---|---|---|
Hyper-V Virtual Machine Health Summary\Health Critical | 0 | 0 |
Conclusion
This white paper provides an example of how to design, test, and validate an Exchange Server 2010 solution for customer environments with 20,000 mailboxes in multiple sites deployed on Dell, EMC, and Brocade hardware. The step-by-step methodology in this document walks through the important design decision points that help address key challenges while ensuring that the customer's core business requirements are met.
Additional Information
For the complete Exchange 2010 documentation, see Exchange Server 2010.
For additional information related to EMC, Dell, and Brocade, see the following resources:
EMC CLARiiON CX4-480: EMC CLARiiON Virtual Provisioning
R910 product spec sheet: Dell PowerEdge R910
Dell Exchange 2010 page: Dell and Exchange Server 2010
Dell Exchange 2010 architecture models white paper: Exchange 2010 on Dell: Two Architecture Models for Improving User Productivity on a More Cost-Efficient Infrastructure
Dell Exchange 2010 advisor tool: Exchange 2010 Advisor
Brocade NetIron MLX and FastIron Ethernet switches and routers: Brocade MLX Series and Brocade FastIron SX Series
Brocade ServerIron ADX family of application delivery controllers: Application Delivery Controllers
Brocade ServerIron ADX and Microsoft Exchange Server 2010: Deploying the Brocade ServerIron ADX with Microsoft Exchange Server 2010
Brocade SAN switches (Brocade 300 SAN switch): Brocade 300 Switch
This document is provided "as-is." Information and views expressed in this document, including URL and other Internet Web site references, may change without notice. You bear the risk of using it.
This document does not provide you with any legal rights to any intellectual property in any Microsoft product. You may copy and use this document for your internal, reference purposes.