Best Practices for Configuring Exchange Back-End Storage
There are several recommendations for maximizing the availability of Exchange data on back-end servers:
Storage group configuration: Plan your storage group and database configurations to maximize performance and recoverability.
Mailbox server storage sizing: Plan the size of your databases to ensure adequate performance and recoverability.
Back-end server partitioning: Store your Windows operating system files, Exchange application files, Exchange database files, and transaction log files on separate disks to increase fault tolerance and optimize recovery.
Exchange data storage: Implement RAID solutions on your disks to match the type of data on each disk.
Hard disk space: Provide enough disk space to ensure performance and recoverability.
Disk performance and I/O throughput: Consider hard disk performance as part of your storage solution.
Disk defragmentation: Defragment your disks using online defragmentation and, if necessary, offline defragmentation.
Virtual memory optimization: Consider tuning your server to ensure efficient and reliable message transfer.
There are other ways to enhance availability and performance of your Exchange 2003 back-end servers. For information about tuning Exchange 2003 back-end servers, see the Exchange Server 2003 Performance and Scalability Guide.
Storage Group Configuration
The Exchange store uses two types of databases: mailbox stores and public folder stores. These stores are organized into storage groups. All of the databases in a storage group share a single set of transaction log files, a single backup schedule, and a single set of logging and backup-related settings.
Your disaster recovery strategy has an important role in determining how many storage groups and databases your storage solution should support. Specifically, your recovery plan should state your company's restore time requirements. It is this requirement that dictates your storage configuration. How you configure your storage groups depends on which edition of Exchange 2003 you use:
If you are using Exchange Server 2003 Standard Edition, each Exchange server can have one storage group that contains one mailbox store and one public folder store.
If you are using Exchange Server 2003 Enterprise Edition, each server can have as many as four storage groups, each of which contains as many as five databases (either mailbox stores or public folder stores).
Consider using descriptive naming conventions for the names of your storage groups, mailbox stores, and public folder stores. Using descriptive naming conventions can be useful for maintenance and troubleshooting purposes.
Using either Exchange Server 2003 Standard Edition or Exchange Server 2003 Enterprise Edition, you can create a recovery storage group in addition to your other storage groups. Recovery storage groups are used to recover mailbox data when restoring data from a backup. For more information about recovery storage groups, see Using Exchange Server 2003 Recovery Storage Groups.
If you are using Exchange Server 2003 Enterprise Edition, you can use multiple mailbox stores to increase the reliability and recoverability of your Exchange organization. If users are spread across multiple mailbox stores, the loss of a single store impacts only a subset of the users rather than the entire organization. In addition, reducing the number of mailboxes per store reduces the time to recover a damaged store from a backup.
In general, when you distribute your users among multiple databases, your system uses more resources than if you were to place all the users on a single database. However, due to the potential reduction in the time it takes to recover an individual mailbox store, the benefits of using multiple stores usually outweigh the resource costs. For more information about disk performance, see "Disk Performance and I/O Throughput" later in this topic.
Public Folder Storage
To disperse public folders across multiple servers, you can use multiple public folder stores. Furthermore, to increase your system's ability to handle user traffic, you can place multiple replicas of the same folder on several servers. If you have multiple routing groups, you may want to distribute folders among the routing groups. This provides users with easy access to folders that they use most often.
Mailbox Server Storage Sizing
Before you select the service design options for your mailbox servers, you must first determine how much data you need to store. By accurately determining the existing and projected volumes of your mail data, you can appropriately design your storage systems so that you do not have to expand them immediately following deployment.
Use the following formula to calculate the amount of data you can store on a single server:
(Number of mailboxes × maximum size of mailbox limit) × adjustment factor
The adjustment factor (for example, 1.5) provides extra space for data that does not count against mailbox quotas, including messages that are held in the deleted item retention store and the deleted mailbox retention store. Quotas make space usage patterns much more predictable. In Exchange environments where mailbox quotas are not in use, enabling quotas is an important consideration. Moreover, creating separate mailbox databases and using multiple system policies with different quota limits simplifies the administrative process.
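As a rough illustration, the sizing formula above can be expressed as a short calculation. This is a minimal sketch: the function name, the sample mailbox count and quota, and the use of decimal gigabytes (1 GB = 1,000 MB) are assumptions made for the example, not values from this guide.

```python
MB_PER_GB = 1000  # assumption: decimal gigabytes, to keep the figures round


def required_mailbox_storage_gb(mailboxes: int, quota_mb: float,
                                adjustment: float = 1.5) -> float:
    """(Number of mailboxes x mailbox size limit) x adjustment factor, in GB."""
    return mailboxes * quota_mb * adjustment / MB_PER_GB


# Hypothetical server: 2,000 mailboxes, each limited to 50 MB.
estimate_gb = required_mailbox_storage_gb(2000, 50)
print(f"Estimated mail data: {estimate_gb:.0f} GB")
```

Raising or lowering the adjustment factor is the simplest way to model longer or shorter deleted item retention periods.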
For users who need large mailboxes, it is possible to have one or more databases with no limits.
You can use the total volume of data to derive an estimate of how much disk space is required for the server. However, calculating this estimate depends on how many mailbox databases will be homed on the server.
Backup and Restore Performance Considerations
In Exchange 2003 Enterprise Edition, there is no set limit on the size of an individual database. The lack of built-in limits makes it difficult to decide on server sizing and how to divide data among databases. As a result, to determine how your server data should be sized and divided, first consider the following:
What are your time requirements for restoring a single database?
How fast is the restore mechanism you have selected?
To illustrate this process, consider the following scenario:
An SLA that provides for a maximum outage time of four hours is combined with a backup system that can back up and restore 75 GB per hour. Because recoveries, on average, take twice as long as backups, the maximum sizing for all data on a server would be 150 GB. (The 150 GB represents the amount of data that could feasibly be restored within the four-hour window.)
After determining the maximum sizing, you would decide how to partition the 150 GB into smaller chunks. Because all users on the server are in one database, placing all data in a single database provides no flexibility. The decision is not whether or not to partition, but how many partitions to use. A common solution would be to take the server's maximum data size and divide it equally into databases. For example, you could divide a 150-GB database into five 30-GB databases (five being the maximum number of databases possible within a single storage group). Unfortunately, this solution prevents concurrent backups and restores of multiple databases. This solution also prevents mounting additional databases for testing or repair purposes. A better solution would be to use multiple storage groups (in this case, two storage groups with four databases each), with each database expected to use approximately 20 GB. This solution would allow two databases from different storage groups to be backed up or restored simultaneously (assuming that you have suitable backup hardware). Also, the small database file size allows you to move the files easily between volumes or servers.
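The arithmetic behind this scenario can be sketched as follows. The function names are hypothetical, and the sketch assumes that the "twice as long" rule can be modeled by halving the rated backup throughput.

```python
def max_server_data_gb(sla_hours: float, backup_gb_per_hour: float,
                       restore_slowdown: float = 2.0) -> float:
    """Largest data set that can be restored inside the SLA window."""
    restore_gb_per_hour = backup_gb_per_hour / restore_slowdown
    return sla_hours * restore_gb_per_hour


def database_size_gb(total_gb: float, storage_groups: int,
                     databases_per_group: int) -> float:
    """Even split of the server's data across all databases."""
    return total_gb / (storage_groups * databases_per_group)


limit_gb = max_server_data_gb(sla_hours=4, backup_gb_per_hour=75)
per_database_gb = database_size_gb(limit_gb, storage_groups=2,
                                   databases_per_group=4)
print(limit_gb, per_database_gb)  # 150.0 18.75
```

The even split gives 18.75 GB per database, which corresponds to the "approximately 20 GB" figure in the scenario.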
Single-Instance Storage Considerations
It is also important to plan how you will group mailboxes into mailbox databases. Because Exchange can maintain single-instance storage within individual databases, workgroup users are kept in the same database when possible. With single-instance storage, you can restore certain user groups more quickly. For example, consider a bank whose currency-trading operations demand higher levels of availability than their other operations. Using the previous 150-GB server example, creating an additional storage group with one database for only currency traders' mailboxes would allow for faster backup and restore processes. For business-critical mailboxes, another option would be to use the Volume Shadow Copy service to provide even faster recoveries. Because only the dedicated storage group needs this protection, the cost disadvantage of the Volume Shadow Copy service hardware requirements is minimized.
Best Practices for Partitioning Back-End Servers
To increase fault tolerance and provide for easier troubleshooting, you should partition your disks so that the following files are located on separate disks:
Windows Server 2003 files
Exchange application files
Exchange database files
Exchange transaction log files
In general, partitioning your hard disks in this manner can increase performance and reduce the amount of data you need to recover. The remainder of this section describes the benefits of locating each of these files on separate disks.
Benefits to Locating Exchange Application Files and Windows Server 2003 Files on Their Own Disks
Locating your Exchange application files and Windows Server 2003 files on their own disks has the following benefits:
Improved performance: There are some noticeable performance benefits for Exchange 2003 servers. For example, the server can read Windows files or Exchange files in any order necessary without moving the disk drive head as much as if the applications were on one disk.
Improved fault tolerance: A single point of failure is no longer an issue. For example, if the disks where Exchange 2003 is installed fail, Windows Server 2003 continues to function.
Benefits to Locating Exchange Transaction Log Files and Database Files on Their Own Disks
In Exchange, transaction log files are accessed sequentially, and databases are accessed randomly. In accordance with general storage principles, you should separate the transaction log files (sequential I/O) from databases (random I/O) to maximize I/O performance and increase fault tolerance. Storing sequentially accessed files separately keeps the disk heads in position for sequential I/O, which reduces the amount of time required to locate data. Specifically, you should move each set of transaction log files to its own array, separate from storage groups and databases.
By default, Exchange stores database files and transaction log files in the following folder:
This folder exists in the same partition on which you install Exchange 2003.
Default location of Exchange database and transaction log files
Locating your Exchange transaction log files and database files on separate disks has the following benefits:
Easier management of Exchange data: Each set of files is assigned a separate drive letter. Having each set of files represented by its own drive letter helps you keep track of which partitions you must back up in accordance with your disaster recovery method.
Improved performance: You significantly increase hard disk I/O performance.
Minimized impact of a disaster: Depending on the type of failure, placing databases and transaction log files on separate disks can significantly minimize data loss. For example, if you keep your Exchange databases and transaction log files on the same physical hard disk and that disk fails, you can recover only the data that existed at your last backup. For more information, see "Considering Locations of Your Transaction Log Files and Database Files" in the Exchange Server 2003 Disaster Recovery Planning Guide.
The following table and figure illustrate a possible partitioning scheme for an Exchange server that has six hard disks and two storage groups, each containing four databases. Because the number of hard disks and storage groups on your Exchange server may be different from the number used in this example, apply the logic of this example as it relates to your own server configuration. In the following table, note that drives E, F, G, and H may point to external storage devices.
Exchange hard disk partitioning scheme
|Disk|Drive|Contents|
|Fixed Disk 1|Drive C (NTFS)|Windows operating system files and swap file|
|Fixed Disk 2|Drive D (NTFS)|Exchange files and additional server applications (such as antivirus software and resource kits)|
|Fixed Disk 3|Drive E (NTFS)|Transaction log files for storage group 1|
|Fixed Disk 4|Drive F (NTFS)|Database files for storage group 1|
|Fixed Disk 5|Drive G (NTFS)|Transaction log files for storage group 2|
|Fixed Disk 6|Drive H (NTFS)|Database files for storage group 2|
Fault tolerant hard disk setup with 6 disks
Whether you are storing Exchange database files on a server or on an advanced storage solution such as a SAN, you can apply the partitioning recommendations presented in this section. In addition, you should incorporate technologies such as disk mirroring (RAID-1) and disk striping with parity (RAID-5 or RAID-6, depending on the type of data that is being stored).
For more information about Exchange 2003 transaction log files, databases, and storage groups, see "Understanding Exchange 2003 Database Technology" in the Exchange Server 2003 Disaster Recovery Planning Guide.
Storing Exchange Data
This section provides information to help you properly configure the location and RAID levels for the following types of Exchange data:
Database files (.edb and .stm files)
Transaction log files
SMTP Queue directory data
Content indexing files
An Exchange database consists of a rich-text .edb file and a Multipurpose Internet Mail Extensions (MIME) content .stm file.
The .edb file stores the following items:
All of the MAPI messages
Tables used by the Store.exe process to locate all messages
Checksums of both the .edb and .stm files
Pointers to the data in the .stm file
The .stm file contains messages that are transmitted with their native Internet content. Because access to these files is generally random, they can be placed on the same disk.
By default, Exchange stores database files in the following folder:
This folder exists in the same partition on which you install Exchange 2003.
Database File Considerations
As you plan your storage solution for these files, implement a solution that ensures reliability. RAID-0 is not a recommended option. After reliability, your storage solution is based on a choice between optimizing performance (RAID-1) and optimizing capacity (RAID-5). If possible, use RAID-1 (or RAID-0+1) for these files.
You can store public folders on a RAID-5 array because data in public folders is usually written once and read many times. RAID-5 provides improved read performance.
Transaction Log Files
The most important aspect of a storage group is its transaction logs. Even if you use only the default First Storage Group, you need to consider your transaction log configuration to be sure that you can recover data if the stores are damaged.
In standard Exchange transaction logging, each store transaction (such as creating or modifying a message) in a storage group is written to a log file and then to the Exchange store. All of the stores in a storage group share a single set of transaction logs. The logging process ensures that records of transactions exist if a store is damaged between backups. In many cases, recovering a damaged store means restoring the store from a backup, replaying any backed up log files, and then replaying the most recent log files to recover transactions that were made after the last backup.
If a disaster occurs, and you must rebuild a server, you use the latest transaction log files to recover your databases. If you have access to the latest backup and the transaction log files since the backup, you can recover all of your data. However, if you lose any of the transaction log files, the data that was not committed to the database since the last backup is permanently lost.
For detailed information about how transaction logs function, see "Understanding Exchange 2003 Database Technology" in the Exchange Server 2003 Disaster Recovery Planning Guide.
By default, Exchange stores transaction log files in the following folder:
This folder exists in the same partition on which you install Exchange 2003.
Transaction Log File Considerations
As you plan the location of your Exchange transaction log files, consider the following:
You can significantly improve the performance and fault tolerance of Exchange servers by placing each set of transaction log files on a separate drive.
Because each storage group has its own set of transaction log files, the number of dedicated transaction log drives for your server should equal the number of planned storage groups. With a SAN solution, you can select a product to easily partition the virtualized space into separate virtual drives for storage groups and transaction log files.
In addition, because transaction log files are critical to the operation of a server, you should protect the drives against failure, ideally by hardware mirroring using RAID. A RAID-0+1 configuration (in which data is mirrored and then striped) is recommended.
Distribute the database drives across many Small Computer System Interface (SCSI) channels or controllers, but configure them as a single logical drive to minimize SCSI bus saturation.
An example disk configuration is as follows:
C:\ System and boot (mirror set)
D:\ Page file
E:\ Transaction log files for storage group 1 (mirror set)
F:\ Transaction log files for storage group 2 (mirror set)
G:\ Database files for both storage groups (multiple drives configured as hardware stripe set with parity)
The following drives should always be formatted for NTFS:
Partition containing Exchange binaries
Partitions containing transaction log files
Partitions containing database files
Partitions containing other Exchange files
SMTP Queue Directory
The SMTP Queue directory has an important role in the Simple Mail Transfer Protocol (SMTP) message queuing process. The SMTP Queue directory stores SMTP messages until they are written to a database (public or private, depending on the type of message) or sent to another server or connector. Because the SMTP queuing process is write-intensive, it is important to configure your system for maximum performance.
Typically, messages are stored in the SMTP queue for only a short time, so your storage solution for the SMTP queue should optimize performance first. However, in some situations (particularly when downstream processes fail), the SMTP queue could be required to store a large amount of data, so capacity and reliability must still be considered.
By default, Exchange stores SMTP messages in the following folder:
This folder exists in the same partition on which you install Exchange 2003. In some scenarios (for example, when you configure a bridgehead server), you can improve the performance of the Exchange 2003 server if you move the Mailroot folder to a different hard disk or partition.
SMTP Queue Considerations
As you plan the location of your SMTP queue data, consider the following:
Do not assume that a RAID-0 array is the best storage solution for SMTP queues. Generally, RAID-0 is acceptable only if mail loss is acceptable. RAID-1 is a good solution because it gives some measure of reliability while providing adequate throughput. However, if you are looking for the highest performance and reliability, using RAID-0+1 for the SMTP queue is worth the extra investment.
In Exchange 2003, you can use Exchange System Manager to change the location of the Queue directory. In Exchange System Manager, this option is available on the Messages tab of the SMTP virtual server object.
Content Indexing Files
Content indexing causes excessive paging while the databases are being scanned, as well as excessive writes to the content indexing file. As a result, the content indexing file should not be located on the same disk as the page file (although that is the default location). Because the content indexing file is a random-access file, it can be placed on the same drive as the databases, provided that the disk subsystem can handle the load.
Hard Disk Space Considerations
Ensure that you have adequate hard disk capacity for your Exchange servers. You should have enough space on your hard disk to restore both the database and the log files.
You could have a backup that is too large to restore to its original location. For example, a normal backup performed once a week, plus six days of differential backups, might require more disk space during a restore than your server has available. Whether the restore requires more disk space than you have available depends on how many log files are generated during a week. For example, a server generating 2,000 log files in a week amounts to 10 GB of log file space, in addition to the space required for the database.
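The log file figure in this example can be checked with a quick calculation. This is a sketch only: the 5-MB log file size is inferred from the 2,000-file and 10-GB figures above, the 20-GB database size is a hypothetical value, and decimal gigabytes are assumed.

```python
LOG_FILE_MB = 5   # assumption: inferred from 2,000 log files = 10 GB
MB_PER_GB = 1000  # assumption: decimal gigabytes


def log_space_gb(log_files: int, log_file_mb: int = LOG_FILE_MB) -> float:
    """Disk space taken by a given number of transaction log files."""
    return log_files * log_file_mb / MB_PER_GB


def restore_space_gb(database_gb: float, log_files: int) -> float:
    """Space needed to restore a database plus the logs to replay."""
    return database_gb + log_space_gb(log_files)


# One week of logs on the example server, restored alongside a
# hypothetical 20-GB database.
print(log_space_gb(2000))          # 10.0
print(restore_space_gb(20, 2000))  # 30.0
```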
Performing normal backups on a daily basis reduces the amount of space required to restore your Exchange databases, because a normal backup deletes the transaction log files up to the time that you perform the backup. With daily normal backups, if you need to restore your Exchange databases, you do not have to restore more than one day's worth of log files.
Also, you should never let your database drive (the hard disk containing the .edb and .stm files) become more than half full. Although a database drive that is half full results in unused disk space, it can still reduce extended server downtime for the following reasons:
You can restore databases faster than with a full drive (especially if the file system is fragmented).
You can perform offline defragmentation on the same physical disk instead of copying databases over to a maintenance server (a task that takes much longer than copying database files to a temporary directory on the same physical hard disk).
You can back up a copy of the databases to the same physical disk before you restore them, which enables you to attempt to repair the databases if a problem occurs during the restore process (for example, if the existing backup contains errors). For this reason, it is recommended that you move or copy the current database and log files before restoring a database. For information about restoring Exchange databases, see the Exchange Server 2003 Disaster Recovery Operations Guide.
Given the large size of the average database, copying your most current database to a different physical disk drive or to another server may add several hours to your downtime. However, if you have sufficient local disk space on the same physical drive, you can move the current database files to another folder by using a command prompt or Windows Explorer before you perform the restore.
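The half-full guideline for the database drive is simple to monitor. The following is a minimal sketch that uses Python's standard shutil.disk_usage call; the function names and the path passed in are placeholders, and the 0.5 threshold reflects the recommendation above.

```python
import shutil


def exceeds_half_full(used_bytes: int, total_bytes: int,
                      threshold: float = 0.5) -> bool:
    """True when the drive is fuller than the recommended threshold."""
    return used_bytes / total_bytes > threshold


def database_drive_needs_attention(path: str) -> bool:
    """Check the volume that holds `path` (for example, a database drive)."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return exceeds_half_full(usage.used, usage.total)


if __name__ == "__main__":
    # "." is a placeholder; point this at the database drive instead.
    print(database_drive_needs_attention("."))
```

A drive that passes this check has room for a local copy of the databases before a restore, or for the temporary database that offline defragmentation creates.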
Disk Performance and I/O Throughput
Having sufficient disk I/O throughput to support a specific number of users is just as important as having sufficient disk space. This is especially important for disk-intensive applications such as Exchange 2003. In general, the speed and number of physical disks have the largest influence on the overall storage system performance. If large or slow disks are used on a SAN to provide the required storage space, disk I/O requirements (and not storage space) become the deciding factor for sizing the storage configuration. In such a case, more disks may be required, not for the additional storage space, but for the increased I/O provided by the additional spindles.
For example, consider a properly sized and balanced logical unit number (LUN) composed of ten 18-GB 15,000 RPM disk drives for a database set with 50-megabyte (MB) mailbox quotas. Five 36-GB 15,000 RPM disk drives would provide the same amount of storage but with only half the disk spindles and, therefore, only half the I/O operations per second (which may be inadequate disk I/O performance). It is also important to note that doubling the LUN size by using ten 36-GB 15,000 RPM disk drives does not necessarily mean that the mailbox quota can be increased from 50 to 100 MB. Although such a configuration is likely to provide equal or better I/O per second performance, the potential doubling of database sizes will likely extend the database recovery window beyond what is called for in the SLA.
Following the performance assumptions in the Exchange 2003 MAPI Messaging Benchmark version 3 (MMB3), each user generates, on average, 0.5 I/O requests per second to the database that contains their mailbox. By analyzing disk I/O ratings, it is possible to estimate the required spindle counts from an I/O perspective.
For more information about MMB3, see Exchange Server 2003 MAPI Messaging Benchmark 3 (MMB3).
To help explain this concept, the following table lists sample disk I/O per second ratings.
Sample disk I/O per second ratings
|Disk speed|Rated I/O per second|MMB3 users per disk|
The disk I/O ratings for a 7,200 RPM disk (listed in Table 4.3) show that one 7,200 RPM spindle provides enough disk I/O for 200 concurrent MMB3 users. A stripe set for a single storage group containing 2,000 users would require ten 7,200 RPM disks. In a RAID 0+1 set (which offers vastly better recoverability than a plain striped array), 20 disks are required for the 2,000-user storage group's databases.
As per the data in Table 4.3, supporting 2,000 MMB3 users (each with a 50-MB mailbox quota) in a storage group using 36-GB 10,000 RPM disks requires eight disks in a stripe set (or 16 for a RAID 0+1 set). Eight 36-GB disks translate into a 288-GB LUN, which far exceeds the storage size requirement of a 110-GB LUN. In this case, with 36-GB 10,000 RPM disks available, the I/O requirements, and not the storage size, drive the overall storage requirements.
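The comparison between I/O-driven and capacity-driven sizing can be sketched as a small calculation. The function names are hypothetical, and the rated I/O values used here (100 and 125 I/O per second) are assumptions back-calculated from the users-per-disk figures quoted in the preceding paragraphs rather than vendor ratings.

```python
import math

IO_PER_USER = 0.5  # MMB3 assumption quoted in the text


def disks_for_io(users: int, rated_io_per_disk: float) -> int:
    """Spindles a plain stripe set needs to satisfy the I/O load."""
    return math.ceil(users * IO_PER_USER / rated_io_per_disk)


def disks_for_capacity(required_gb: float, disk_gb: float) -> int:
    """Spindles a plain stripe set needs to hold the data."""
    return math.ceil(required_gb / disk_gb)


def stripe_set_disks(users: int, rated_io_per_disk: float,
                     required_gb: float, disk_gb: float) -> int:
    """The larger of the two requirements decides the stripe-set size."""
    return max(disks_for_io(users, rated_io_per_disk),
               disks_for_capacity(required_gb, disk_gb))


# 2,000 users, a 110-GB LUN requirement, 36-GB 10,000 RPM disks.
# 125 I/O per second is an assumed rating (2,000 x 0.5 / 8 disks).
stripe = stripe_set_disks(2000, 125, 110, 36)
mirrored = stripe * 2  # RAID 0+1 doubles the spindle count
print(stripe, mirrored)  # 8 16
```

Capacity alone would need only four 36-GB disks, so the I/O requirement is what sets the stripe set at eight.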
Optimizing disk I/O on the SAN is one of the largest performance-enhancing steps you can take in your Exchange organization. Each SAN vendor has different options and requirements for doing so. Therefore, you should understand and design SAN implementations with specific I/O requirements in mind. For Exchange, these requirements include:
Exchange uses 4-KB pages as its native I/O size, even though many transactions may result in read or write requests for multiple pages.
Transaction log LUNs should be optimized for sequential writes because logs are always written to, and read from, sequentially.
Database LUNs should be optimized for an appropriate weight of random reads and writes. This weight can be experimentally determined by using the Exchange Stress and Performance (ESP) and Jetstress tools.
Recoverability and SLA requirements should be considered.
For more information about how to plan your storage system to maximize performance and availability, see the Solution Accelerator for MSA Enterprise Messaging.
For information about planning your storage architecture, see "MSA Storage Architecture" in the MSA Reference Architecture Kit.
Disk Defragmentation
Disk defragmentation involves rearranging data on a server's hard disks to make the files more contiguous for more efficient reads. Defragmenting your hard disks helps increase disk performance and helps ensure that your Exchange servers run smoothly and efficiently.
Because severe disk fragmentation can cause performance problems, run a disk defragmentation program (such as Disk Defragmenter) on a regular basis or when server performance levels fall below normal. Because more disk reads are necessary when backing up a heavily fragmented file system, make sure that your disks are recently defragmented.
Exchange databases also require defragmentation. However, fragmentation of Exchange data occurs within the Exchange database itself. Specifically, Exchange database defragmentation refers to rearranging mailbox store and public folder store data to fill database pages more efficiently, thereby eliminating unused storage space.
There are two types of Exchange database defragmentation: online and offline.
By default, on Exchange 2003 servers, online defragmentation occurs daily between 01:00 (1:00 A.M.) and 05:00 (5:00 A.M.). Online defragmentation automatically detects and deletes objects that are no longer being used. This process provides more database space without actually changing the file size of the databases that are being defragmented.
To increase the efficiency of defragmentation and backup processes, schedule your maintenance processes and backup operations to run at different times.
The following are two ways to schedule database defragmentation:
To schedule database defragmentation for an individual database, use the Maintenance interval option on the Database tab of a mailbox store or public folder store object.
To schedule database defragmentation for a collection of mailbox stores and public folder stores, use the Maintenance interval option on the Database (Policy) tab of a mailbox store or a public folder store policy.
For information about how to create a mailbox store policy or public folder policy, see "Create a Mailbox Store Policy" and "Create a Public Folder Store Policy" in Exchange 2003 Help.
Offline defragmentation involves using the Exchange Server Database Utilities (Eseutil.exe). Eseutil.exe creates a new database, copies the old database records to the new one, and then discards unused pages, resulting in a new compact database file. To reduce the physical file size of the databases, you must perform an offline defragmentation in the following situations:
After performing a database repair (using Eseutil /p)
After moving a considerable amount of data from an Exchange database.
When an Exchange database is much larger than it should be.
You should consider an offline defragmentation only if many users are moved from the Exchange 2003 server or after a database repair. Performing offline defragmentation when it is not needed could result in decreased performance.
When using Eseutil.exe to defragment your Exchange databases, consider the following:
To rebuild the new defragmented database in an alternate location, run Eseutil.exe in defragmentation mode (using the command Eseutil /d) and include the /p switch. Including the additional /p switch during a defragmentation operation allows you to preserve your original, fragmented database (in case you need to revert to this database). Using this switch also significantly reduces the amount of time it takes to defragment a database.
Because offline defragmentation alters the database pages completely, you should create new backups of Exchange 2003 databases immediately after offline defragmentation. If you use the Backup utility to perform your Exchange database backups, create new Normal backups of your Exchange databases. If you do not create new Normal backups, previous Incremental or Differential backups will not work because they reference database pages that were re-ordered by the defragmentation process.
Optimizing Memory Usage
Virtual address space is important to consider when deploying an Exchange messaging system. A server's virtual address space usage determines a mailbox server's overall performance and scalability. When virtual memory runs low, performance decreases dramatically. Although Exchange 2003 automatically optimizes usage for small-sized to medium-sized servers, additional tuning is necessary for servers with more than 1 GB of physical memory.
For information about the effects of virtual memory fragmentation, as well as guidelines for optimizing memory usage, see the Exchange Server 2003 Performance and Scalability Guide.
Other Windows and Exchange Configuration Issues
There are many configuration recommendations to consider as you plan a highly available messaging system. However, a detailed explanation of these configuration recommendations is beyond the scope of this guide. For complete information about configuring your messaging system for high availability, scalability, and performance, see the Exchange Server 2003 Performance and Scalability Guide.