Understanding Transaction Logging
This topic describes the details of transaction logging in Microsoft Exchange Server 2007 and includes a brief description of circular logging.
Exchange Server transaction logging is a robust recovery mechanism of the Extensible Storage Engine (ESE) that is designed to reliably restore an Exchange database to a consistent state after any sudden stop of the database. The logging mechanism is also used when restoring online backups.
Exchange Transaction Logging
Before changes are made to an Exchange database file, Exchange writes the changes to a transaction log file. After a change has been safely logged, it can then be written to the database file. It is common for these changes to become available to end users just after the changes have been secured to the transaction log, but before they have been written to the database file.
Exchange employs a sophisticated internal memory management system that is tuned for high performance and can efficiently manage the caching of dozens of gigabytes (GB) of database pages. Because so many changes can be held safely in the cache once they have been logged, physically writing out changes to the database file is a low-priority task during normal operation.
If a database suddenly stops, cached changes are not lost just because the memory cache was destroyed. When the database restarts, Exchange scans the log files, and reconstructs and applies any changes not yet written to the database file. This process is called replaying log files. The database is structured so that Exchange can determine whether any operation in any log file has already been applied to the database, needs to be applied to the database, or does not belong to the database.
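The log-first ordering and the replay step described above can be illustrated with a small model. This is only a sketch under simplifying assumptions, not how ESE is implemented: the ToyStore class and its methods are hypothetical names, the "log" is an in-memory list standing in for durable log files, and each database entry remembers the sequence number of the last change applied to it so that replay can tell whether an operation is already present.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStore:
    """Toy model: a durable log, a durable database file, and a volatile cache."""
    log: list = field(default_factory=list)        # durable: (seq, key, value)
    database: dict = field(default_factory=dict)   # durable: key -> (seq, value)
    cache: dict = field(default_factory=dict)      # volatile: lost on a sudden stop

    def write(self, key, value):
        seq = len(self.log) + 1
        self.log.append((seq, key, value))   # 1. secure the change in the log
        self.cache[key] = (seq, value)       # 2. make it visible from memory
        return seq

    def read(self, key):
        entry = self.cache.get(key) or self.database.get(key)
        return entry[1] if entry else None

    def flush(self):
        """Low-priority step: write cached changes out to the database."""
        self.database.update(self.cache)

    def crash(self):
        """Simulate a sudden stop: only the memory cache is destroyed."""
        self.cache.clear()

    def replay(self):
        """Re-apply logged changes that the database does not yet contain."""
        applied = 0
        for seq, key, value in self.log:
            current = self.database.get(key)
            if current is None or current[0] < seq:
                self.database[key] = (seq, value)
                applied += 1
        return applied
```

In this model, a change written after the last flush survives a crash because replay finds it in the log, and running replay a second time applies nothing because every operation is recognized as already applied.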
Rather than write all log information to a single large file, Exchange uses a series of log files, each exactly one megabyte, or 1,024 kilobytes (KB), in size. When a log file is full, Exchange closes it and renames it with a sequential number. The first log file that is filled is named Enn00000001.log, where nn is a two-digit number. The three-character prefix Enn is known as the base name or log prefix.
Log files for each storage group are distinguished by file names with numbered prefixes (for example, E00, E01, E02, or E03). The log file currently open for a storage group is simply named Enn.log—it does not have a sequence number until it has been filled and closed.
The checkpoint file (Enn.chk) tracks how far Exchange has progressed in writing logged information to the database files. There is a checkpoint file for each log stream, and a separate log stream for each storage group. Within a single storage group, all the databases share a single log stream. Thus, a single log file often contains operations for multiple databases.
Log file sequence numbers are hexadecimal, so the log file after E0000000009.log is E000000000A.log, not E0000000010.log. You can convert log file sequence numbers to their decimal values by using the Windows Calculator (Calc.exe) application in Scientific mode. To do this, run Calc.exe, and then, from the View menu, click Scientific.
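As an alternative to the calculator, the naming scheme can be sketched in a few lines of Python. The function names log_file_name and log_generation are hypothetical helpers introduced for illustration. The sketch assumes only what is described above: a prefix of E plus two digits, followed by an eight-digit hexadecimal sequence number.

```python
import re

# Closed log file: E + two-digit prefix number + eight hexadecimal digits.
_LOG_NAME = re.compile(r"^(E\d{2})([0-9A-F]{8})\.log$", re.IGNORECASE)


def log_file_name(prefix: str, generation: int) -> str:
    """Build a closed log file name from a prefix and a sequence number."""
    if not 0 < generation <= 0xFFFFFFFF:
        raise ValueError("sequence number must fit in eight hexadecimal digits")
    return f"{prefix.upper()}{generation:08X}.log"


def log_generation(file_name: str) -> int:
    """Return the decimal sequence number encoded in a closed log file name."""
    match = _LOG_NAME.match(file_name)
    if match is None:
        raise ValueError(f"not a closed log file name: {file_name!r}")
    return int(match.group(2), 16)
```

For example, log_file_name("E00", 10) gives E000000000A.log, and log_generation("E0000000010.log") gives 16, which shows why a name ending in 10 does not follow the one ending in 09.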
To view the decimal sequence number for a specific log file, you can examine its header by using the Exchange Server Database Utilities (Eseutil.exe) tool. The first 4-KB page of each log file contains header information that describes and identifies the log file and the databases it belongs to. The command Eseutil /ml [log file name] displays the header information. For more information about Eseutil, see Eseutil.
If you use the wrong switch for displaying a header (for example, by using /ml with a database header instead of /mh), an error is displayed or the header information that is displayed may be garbled or incorrect.
You cannot view the header of a database while it is mounted. You also cannot view the header of the current log file (Enn.log) while any database in the storage group is mounted. Exchange holds the current log file open as long as one database is using it. You can, however, view the checkpoint file header while databases are mounted. Exchange updates the checkpoint file every thirty seconds, and its header is viewable except during the moment when an update is occurring.
As an Exchange administrator, it is valuable to understand Exchange file headers. If you understand the file headers, you can determine which database and log files belong together and which files are needed for successful recovery.
In the following log file header example, note the first four lines.
Base name: e00
Log file: e00.log
lGeneration: 11 (0xB)
Checkpoint: (0xB,7DC,6F)
These log file header lines show that this log file is the current log file because the log file name does not have a sequence number. The lGeneration line shows that when the log is filled and closed, its sequence number will be B, corresponding to the decimal value 11. The base name is e00, and therefore the final log file name will be E000000000B.log.
The Checkpoint value in the previous header example is not actually read from the log file header, but it is displayed as if it were. Eseutil.exe reads the Checkpoint value directly from Enn.chk, so you do not have to enter a separate command to learn where the checkpoint file is. If the checkpoint file has been destroyed, the Checkpoint value reads NOT AVAILABLE. In this case, the checkpoint is in the current log file (0xB), and the numbers 7DC and 6F indicate how far into the log file the checkpoint is. Note that you will seldom have a practical need for this information.
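If you save header output to a text file, the four fields discussed above can be pulled out with a short script. The following is a sketch that assumes the text is laid out one "name: value" pair per line, as in the example. The function parse_log_header is a hypothetical helper, not part of Eseutil, and it treats the two trailing checkpoint numbers simply as hexadecimal position values without interpreting them further.

```python
import re


def parse_log_header(text: str) -> dict:
    """Extract a few fields from log header text shaped like the example."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            fields[name.strip()] = value.strip()

    result = {
        "base_name": fields.get("Base name"),
        "log_file": fields.get("Log file"),
        "generation": None,   # decimal sequence number
        "checkpoint": None,   # (generation, position, position) or None
    }
    gen = re.match(r"(\d+)\s+\(0x([0-9A-Fa-f]+)\)", fields.get("lGeneration", ""))
    if gen:
        result["generation"] = int(gen.group(1))
    chk = re.match(r"\(0x([0-9A-Fa-f]+),([0-9A-Fa-f]+),([0-9A-Fa-f]+)\)",
                   fields.get("Checkpoint", ""))
    if chk:  # stays None when the value reads NOT AVAILABLE
        result["checkpoint"] = tuple(int(part, 16) for part in chk.groups())
    return result


SAMPLE_HEADER = """Base name: e00
Log file: e00.log
lGeneration: 11 (0xB)
Checkpoint: (0xB,7DC,6F)"""
```

Calling parse_log_header(SAMPLE_HEADER) yields a generation of 11 and a checkpoint whose first element is also 11, which matches the observation that the checkpoint is in the current log file.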
If the checkpoint file is destroyed, Exchange can still recover and replay log files appropriately. But to do so, Exchange begins scanning log files, beginning with the oldest file available, instead of starting at the checkpoint log. Exchange skips data that has already been applied to the database and works sequentially through the logs until data that must be applied is encountered.
Typically, it takes only one or two seconds for Exchange to scan a log file that has already been applied to the database. If there are operations in a log file that must be written to the database, it can take anywhere from 10 seconds to several minutes to apply them. On average, a log file's contents can be written to the database in 30 seconds or less.
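The effect of a missing checkpoint file on recovery time can be estimated with simple arithmetic. The sketch below uses hypothetical helper names. It takes its default timings from the figures above: about two seconds to scan a log that has already been applied, and about 30 seconds to apply one that has not. Treat the result as a rough planning number, not a prediction.

```python
def logs_to_scan(available, checkpoint=None):
    """Return the log sequence numbers that recovery would scan, in order.

    With a checkpoint, scanning starts at the checkpoint log. Without one,
    scanning starts at the oldest log file available.
    """
    ordered = sorted(available)
    if checkpoint is None:
        return ordered
    return [generation for generation in ordered if generation >= checkpoint]


def rough_recovery_seconds(already_applied, to_replay,
                           scan_seconds=2.0, replay_seconds=30.0):
    """Rough estimate: quick scans for applied logs plus replay for the rest."""
    return already_applied * scan_seconds + to_replay * replay_seconds
```

For example, if 200 logs that were already applied must be scanned because the checkpoint file is gone, and 5 logs still need to be replayed, rough_recovery_seconds(200, 5) gives 550 seconds, compared with 150 seconds when the checkpoint lets Exchange skip the 200 older logs.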
When an Exchange database shuts down normally, all outstanding data is written to the database files. After normal shutdown, the database file set is considered consistent, and Exchange detaches it from its log stream. This means that the database files are now self-contained—they are completely up to date. The transaction logs are not required to start the database files.
You can tell whether a database has been shut down cleanly by running the command Eseutil /mh and examining the file headers.
With all databases in a storage group disconnected and in a Clean Shutdown state, all log files can be safely deleted without affecting the databases. If you were then to delete all log files, Exchange would generate a new sequence of logs starting with Enn00000001.log. You could even move the database files to a different server or storage group that has existing log files, and the databases would attach themselves to a different log stream.
Although you can delete the log files after all databases in a storage group have been shut down, doing so will affect your ability to restore older backups and roll forward. The current database no longer needs the existing log files, but they may be necessary if you must restore an older database.
If a database is in a Dirty Shutdown state, all existing transaction logs from the checkpoint forward must be present before you can mount the database again. If these logs are unavailable, you must repair the database by running the command Eseutil /p to make the database consistent and ready to start.
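Before attempting to mount a database that is in a Dirty Shutdown state, it can help to confirm that the log sequence is unbroken from the checkpoint forward. The following sketch assumes you already know the checkpoint sequence number, the highest sequence number in the log stream, and the sequence numbers of the log files that are present. The function names are hypothetical.

```python
def missing_logs(checkpoint, highest, present):
    """List sequence numbers from checkpoint through highest that are absent."""
    have = set(present)
    return [g for g in range(checkpoint, highest + 1) if g not in have]


def logs_sufficient(shutdown_state, checkpoint, highest, present):
    """A Clean Shutdown database needs no logs; otherwise none may be missing."""
    if shutdown_state == "Clean Shutdown":
        return True
    return not missing_logs(checkpoint, highest, present)
```

For example, missing_logs(0xB, 0xE, [0xB, 0xC, 0xE]) returns [13], which identifies the log with sequence number D as the gap that would prevent replay.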
If you have to repair a database, some data will be lost. Data loss is frequently minimal; however, it may be catastrophic. After running Eseutil /p on a database, you should completely repair the database with the following two operations. First, run Eseutil /d to defragment the database. This operation discards and rebuilds all database indexes and space trees. Second, run the Information Store Integrity Checker (Isinteg.exe) tool in its -fix mode. This tool scans the database for logical inconsistencies that are created by discarding outstanding transaction logs. For example, there may be references in the database that are not up to date with each other. Isinteg.exe attempts to correct such problems with the minimum data loss possible.
In addition to allowing Exchange to recover reliably from an unexpected database stop, transaction logging is also essential to making and restoring online backups. For more information about making and restoring online backups, see Database Backup and Restore.
Although it is not recommended as a best practice, you can configure Exchange to save disk space by enabling circular logging. Circular logging allows Exchange to overwrite transaction log files after the data that the log files contain has been committed to the database. However, if circular logging is enabled, you can recover data only up until the last full backup.
In the standard transaction logging that is used by Exchange 2007, each database transaction in a storage group is written to a log file and then to the database. When a log file reaches one megabyte (MB) in size, it is renamed and a new log file is created. Over time, this results in a set of log files. If Exchange stops unexpectedly, you can recover the transactions by replaying the data from these log files into the database. Circular logging overwrites and reuses the first log file after the data it contains has been written to the database.
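The difference in disk usage between the two modes can be sketched as follows. This is a deliberately simplified model, not a description of the exact rules ESE applies: it assumes that a closed log becomes eligible for reuse as soon as its sequence number is lower than the checkpoint, and the function names are hypothetical.

```python
LOG_FILE_MEGABYTES = 1  # each log file is exactly one megabyte


def retained_logs(closed, checkpoint, circular_logging):
    """Return the closed log sequence numbers that remain on disk."""
    ordered = sorted(closed)
    if not circular_logging:
        return ordered  # standard logging keeps every closed log
    # Simplified model: logs older than the checkpoint have been committed
    # to the database and can be overwritten.
    return [g for g in ordered if g >= checkpoint]


def log_disk_megabytes(generations):
    """Disk space used by the given closed log files."""
    return len(generations) * LOG_FILE_MEGABYTES
```

With ten closed logs and a checkpoint in log 8, standard logging retains all ten files (10 MB), whereas the circular model retains only logs 8 through 10 (3 MB). The seven logs that are no longer on disk are also the reason that data more recent than the last full backup cannot be recovered by rolling forward.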
In Exchange 2007, circular logging is disabled by default. By enabling it, you reduce drive storage space requirements. However, without a complete set of transaction log files, you cannot recover any data more recent than the last full backup. Therefore, in a normal production environment, circular logging is not recommended.
For information about how to enable and disable circular logging in Exchange 2007, see How to Enable or Disable Circular Logging for a Storage Group.
Continuous Replication and Circular Logging
You can combine circular logging with continuous replication. In this configuration, you have a new type of circular logging called continuous replication circular logging (CRCL), which is different from the ESE circular logging described earlier in this topic. Whereas ESE circular logging is performed and managed by the Microsoft Exchange Information Store service, CRCL is performed and managed by the Microsoft Exchange Replication Service.
When enabled, ESE circular logging does not generate additional log files and instead overwrites the current log file when needed. However, in a continuous replication environment, log files are needed for log shipping and replay. As a result, when you enable CRCL, the current log file is not overwritten and closed log files are generated for the log shipping and replay process. Specifically, the Microsoft Exchange Replication Service manages CRCL so that log continuity is maintained, and logs are not deleted by the log deleter if they are still needed for replication. Therefore, enabling CRCL should not negatively affect replication.
In the release to manufacturing (RTM) version of Exchange 2007, combining circular logging with cluster continuous replication (CCR) or local continuous replication (LCR) is supported. However, we do not recommend this because it does not allow a roll-forward recovery after a backup has been restored. Exchange 2007 Service Pack 1 (SP1) also allows storage groups in a CCR, LCR, or standby continuous replication (SCR) environment to have circular logging enabled. However, this practice is also not recommended for the reason indicated previously. When enabled in any of these environments, the functionality is CRCL and not ESE circular logging (also known as Joint Engine Technology (JET) circular logging). In a CCR, LCR, or SCR environment, you should always use the following process to enable or disable circular logging:
- Suspend continuous replication by using the Suspend-StorageGroupCopy cmdlet.
- Enable or disable circular logging. For detailed steps about how to enable or disable circular logging, see How to Enable or Disable Circular Logging for a Storage Group.
- Dismount and then mount the database in the storage group that is being enabled or disabled for circular logging.
- Resume continuous replication by using the Resume-StorageGroupCopy cmdlet.
For storage groups in an LCR environment, before you run the Enable-StorageGroupCopy cmdlet to turn on LCR for a storage group, dismount and then mount the database in the storage group so that the Microsoft Exchange Information Store service detects and uses the current circular logging setting. The Microsoft Exchange Information Store service detects a configuration change only when the database is dismounted and then mounted, whereas the Microsoft Exchange Replication Service detects and uses the change dynamically, without any restart. Therefore, if the preceding procedure is not performed, the Microsoft Exchange Replication Service can consider circular logging to be off (or on) while the Microsoft Exchange Information Store service considers it to be in the opposite state. This can result in log files being truncated prematurely.
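The mismatch described above can be made concrete with a small model. It is an illustration only: ToyStorageGroup and its members are hypothetical names, and the model captures just one behavior from the text, namely that one service picks up the setting dynamically while the other picks it up only when the database is dismounted and mounted.

```python
class ToyStorageGroup:
    """Toy model of how two services come to see the circular logging setting."""

    def __init__(self, circular_logging=False):
        self.configured = circular_logging   # the stored configuration value
        self.store_view = circular_logging   # what the Information Store is using

    def set_circular_logging(self, enabled):
        # Changing the configuration does not affect the Information Store
        # until the database is dismounted and mounted again.
        self.configured = enabled

    def remount_database(self):
        # Dismount and mount: the Information Store rereads the setting.
        self.store_view = self.configured

    @property
    def replication_view(self):
        # The replication service detects the change dynamically.
        return self.configured

    def services_agree(self):
        return self.store_view == self.replication_view
```

In this model, calling set_circular_logging(True) without remount_database() leaves services_agree() returning False, which is the state in which logs could be truncated prematurely. Remounting the database brings the two views back into agreement.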
Disabling LCR or SCR allows backups or circular logging to delete log files that have not been copied, but CCR has no disable option. In CCR, logs that have not been replicated are never deleted, regardless of whether CRCL is enabled.
To keep the number of transaction logs that are created under control, administrators sometimes enable circular logging during large mailbox moves. Because a mailbox move is an operation that involves two different databases, you can enable circular logging for both databases during the mailbox move.
Because circular logging interferes with the ability to bring the storage group up to date in case of failure, we do not recommend using circular logging for extended periods of time.
You can enable CRCL in CCR, LCR, and SCR environments for the purposes of mailbox moves or other scenarios that cause a large buildup of log files. However, during the time that CRCL is enabled, you cannot perform any backups. As a result, you must make sure that:
- Mailbox moves are scheduled so that they do not overlap with backup operations.
- CRCL is disabled after the mailbox moves have been completed so that backups can resume.