Transaction Log File Replay: Soft Recovery and Hard Recovery in Exchange Server 2003

 

As used in Microsoft® Exchange Server 2003, the word recovery must be distinguished from the word restore. Restore is the act of putting database and log files back into place on a server, and recovery is the act of replaying transaction logs into the restored database.

Types of Recovery

There are two forms of recovery:

  • Soft recovery   A transaction log replay process that occurs when a database is re-mounted after an unexpected stop, or when transaction logs are replayed into an offline file copy backup of a database.

  • Hard recovery   A transaction log replay process that occurs after restoring a database from an online backup.

Soft Recovery

In the default soft recovery scenario, an external event unexpectedly stops an Exchange database, but the database and log files remain intact and in place. When the database is mounted again, Exchange reads the checkpoint file and begins to replay the transaction log that is listed as the checkpoint log. If no checkpoint file exists, replay begins with the oldest log file available in the transaction log folder for the storage group.
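The starting-point rule in the preceding paragraph can be sketched as follows. This is an illustrative model only, not Exchange's internal logic: the function name choose_replay_start, the use of plain integers for log generations, and the decision to raise an error when the checkpoint log is absent are all assumptions made for the example.

```python
from typing import Optional, Sequence


def choose_replay_start(available_generations: Sequence[int],
                        checkpoint_generation: Optional[int]) -> int:
    """Return the log generation at which replay would begin.

    Hypothetical model: with a checkpoint, replay starts at the checkpoint
    log; without one, it starts at the oldest log that is available.
    """
    if not available_generations:
        raise ValueError("no transaction logs are available")
    if checkpoint_generation is None:
        # No checkpoint file: fall back to the oldest available log.
        return min(available_generations)
    if checkpoint_generation not in available_generations:
        # Assumption for this sketch: treat a checkpoint that points at a
        # log we do not have as an error rather than guessing.
        raise ValueError("checkpoint log is not among the available logs")
    return checkpoint_generation
```

For example, with logs 3 through 5 on disk, a checkpoint at 4 yields a start of 4, and no checkpoint yields a start of 3.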

Exchange writes to the database files any completed transactions found in the log files that have not already been written, and reverses any incomplete transactions. Exchange never begins writing a transaction into the database files until all the operations composing it have been secured to the log files. You do not need to physically undo or back out a transaction in the database if all uncommitted transaction logs present at the time of the unexpected stop are present when replay begins.

Important

A fundamental assumption of the soft recovery process is that no database or log files have been moved, deleted, or destroyed by the failure—or by the administrator after the failure.

If you remove any required transaction logs from the replay sequence, Exchange fails soft recovery immediately. If needed transaction logs are missing, you must either perform recovery with an older, restored copy of the database (one that does not require those logs), or you must repair the database with the Exchange Server Database Utilities (Eseutil.exe) tool.

Some of the fundamental rules of transaction log file replay are:

  • You cannot replay log files from one database against a different one.   The operations inside a log file are low-level. You will not see anything inside a log file such as "Deliver Message A to Mailbox B." A better example of a log file operation is "Write this stream of 123 bytes to byte offset 456 on database page 7890."

    Imagine that you gave someone instructions for editing a document, and your instructions are "On page five, paragraph four, in the third sentence, insert the phrase 'to be or not to be' after the second word." If these instructions were applied to a document other than the one intended, the result would be random corruption of the document. Likewise, if the wrong log files were played against an Exchange database, a similar result would occur. Exchange therefore has multiple safeguards to prevent such corruption.

    If you defragment or repair an Exchange database, transaction logs that previously were associated with this database can no longer be replayed into it. If you try to replay log files after a defragmentation or repair, Exchange skips the inappropriate transaction logs. Again, consider the analogy of editing the document. If a paragraph has been moved, edited, or deleted since the instructions were created, applying the out-of-date instructions would be as destructive as applying them to an entirely different document.

  • You cannot replay log files unless all uncommitted log files from the time the database was last running are available.   You must have all log files starting from the checkpoint at the time the database was backed up. You can then replay log files from this point as long as they follow an unbroken sequence. If there is a single log file missing in the middle or from the beginning of the sequence, replay stops there.

  • You cannot replay log files if the database files have been moved to a different file path location.   This restriction does not apply if you are using Exchange 2000 Server SP2 or later because Eseutil.exe handles replay even if there has been a path change. The sections below describe how the replay process works in more detail.

  • You cannot replay log files if the checkpoint file points to the wrong log.   Exchange treats a checkpoint log as if it were the first log available and ignores all older log files. If you restore an older file copy of the database, the checkpoint will be too far ahead, and Exchange tries to start log file replay from a log file that is too new. You can solve this problem by removing the checkpoint file, which forces Exchange to scan all available logs. (If you restore an online backup, hard recovery ignores the checkpoint file.)

  • You cannot replay log files if any database files for the storage group have been removed.   All databases that were running at the time of an unexpected stop must still be present for soft recovery to succeed. This limitation can be overcome by using Eseutil.exe to run soft recovery.

    If soft recovery runs for other databases in a storage group while one database is missing, future log file replay situations may be complicated. By failing soft recovery, Exchange gives the administrator a chance to analyze the situation and decide whether to proceed without the database.
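The "unbroken sequence" rule above lends itself to a quick pre-check before attempting replay. The following is a minimal sketch, not a Microsoft tool. It assumes that numbered log files are named with the storage group prefix (for example, E00) followed by five hexadecimal digits and a .log extension; the function name first_missing_generation is hypothetical. It inspects file names only and does not examine the internal markers that Exchange uses to confirm that logs belong together.

```python
import re
from typing import Iterable, Optional

# Assumed naming: prefix such as "E00" + 5 hexadecimal digits + ".log".
LOG_NAME = re.compile(r"^(E\d{2})([0-9A-F]{5})\.log$", re.IGNORECASE)


def first_missing_generation(file_names: Iterable[str],
                             prefix: str,
                             start_generation: int) -> Optional[int]:
    """Return the first generation missing from the run that begins at
    start_generation, or None if the numbered logs are unbroken from
    start_generation up to the highest generation present."""
    generations = set()
    for name in file_names:
        match = LOG_NAME.match(name)
        if match and match.group(1).upper() == prefix.upper():
            generations.add(int(match.group(2), 16))
    if start_generation not in generations:
        # The very first log needed is absent.
        return start_generation
    for generation in range(start_generation, max(generations) + 1):
        if generation not in generations:
            return generation
    return None
```

Files that do not match the assumed pattern, such as E00.log or E00.chk, are ignored by this check.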

Advanced Soft Recovery Scenarios

In most cases, the best way to run soft recovery is to mount any database in a storage group. Because all databases in a storage group share a single stream of log files, soft recovery occurs at the level of the entire storage group and not at the level of the individual database.

In some special circumstances, there are advantages to running soft recovery using Eseutil.exe. The most common scenarios are:

  • You want to recover a storage group that has a missing database.

  • You want to recover an individual database "out of place" without affecting other databases or the storage group's log files.

The complete syntax for the Eseutil.exe soft recovery function, listing all possible switches, is:

ESEUTIL /r enn /L[path to log files] /s[path to checkpoint file] /d[path to database file] /i

Example: ESEUTIL /r e01 /Lf:\mdbdata /sc:\exchsrvr\mdbdata /dg:\mdbdata /i

Note

Eseutil.exe command line parameters are not case sensitive; they are shown in mixed case above to avoid confusion between the lowercase "l" and uppercase "I" characters, which look alike in many fonts.

The above example shows the recovery of the databases for a storage group in which the log file prefix is E01, the log files reside in f:\mdbdata, the checkpoint file resides in c:\exchsrvr\mdbdata, the database and streaming files reside in g:\mdbdata, and missing databases are ignored (because of the /i switch at the end of the command).
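If you script recovery, assembling the command from named parts can reduce typing mistakes. The following is a minimal sketch that uses only the switches shown above (/r, /L, /s, /d, and /i). The function name build_soft_recovery_args and its parameter names are hypothetical. The function only builds the argument list and does not run Eseutil.exe, and it does not cover the use of /d with no path.

```python
from typing import List, Optional


def build_soft_recovery_args(log_prefix: str,
                             log_path: Optional[str] = None,
                             checkpoint_path: Optional[str] = None,
                             database_path: Optional[str] = None,
                             ignore_missing: bool = False) -> List[str]:
    """Build an argument list for an Eseutil soft recovery run."""
    args = ["eseutil", "/r", log_prefix]
    if log_path:
        args.append("/L" + log_path)          # path to log files
    if checkpoint_path:
        args.append("/s" + checkpoint_path)   # path to checkpoint file
    if database_path:
        args.append("/d" + database_path)     # path to database files
    if ignore_missing:
        args.append("/i")                     # ignore missing databases
    return args


# Reproduces the example command shown above.
print(" ".join(build_soft_recovery_args(
    "e01", r"f:\mdbdata", r"c:\exchsrvr\mdbdata", r"g:\mdbdata", True)))
```

Omitting every optional argument yields the minimum form, which relies on the current directory for the log and checkpoint paths.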

The minimum Eseutil.exe command line needed to run soft recovery is:

ESEUTIL /r Enn

This command works only if run from a prompt set to the transaction logs directory. You should also be aware of the following when using Eseutil.exe to run soft recovery:

  • If you do not specify any file paths on the command line, Eseutil.exe uses your current command prompt directory as the default for both log files and the checkpoint file.

  • Database files do not have to be in the log file path. The log files record the database paths, and therefore Eseutil.exe discovers all database paths by reading the log files. Use the /D switch to override the paths stored in the log files only when you are sure the paths in the log files are incorrect.

  • If the checkpoint file is not present in the same path as the transaction logs, all log files are scanned during replay, rather than starting replay from the checkpoint log. You can copy an existing checkpoint file temporarily to the log file path. After soft recovery is complete, Exchange no longer uses this copy of the checkpoint file in normal database operation.

    If the information in the checkpoint file is incorrect, soft recovery fails but does not harm the database. You can try recovery again after removing the checkpoint file or finding the correct one. A checkpoint file is not essential to successful recovery, but it can save significant time if you have a large number of log files.

If you want to begin recovery when a database is missing from the storage group, you can use the command:

ESEUTIL /r Enn /i

The /i switch means ignore missing databases. If you use this switch and then mount the missing database, Exchange prompts you to create a new database. If you intend to restore the old database at some point, you will not be able to replay the new data into it. You now have two separate versions of the same logical database.

This scenario, where one database in the storage group has been replaced by an empty database, is one in which the recovery storage group can help. You can mount the extra database in the recovery storage group, and use ExMerge to add the contents of one database to the other.

If you want to begin recovery "out of place" to recover a single database without affecting other databases in the storage group, you should create a new, empty folder and move the database files that you want to recover, the transaction logs that you want to replay, and a checkpoint file (if desired) into this path. This path must not contain other database files.

Once you have isolated your databases and logs together into a folder by themselves, run the following command from that folder:

ESEUTIL /r Enn /i /d

By using the /d switch with no path designation, you override the database path set in the log files. In addition, because no other databases are available in this folder, you hide the other databases on the server from this particular recovery process.

If you do not use the /d parameter correctly, the recovery process can affect other databases on the server. Even in the worst case, the recovery process will not damage other databases. However, recovery may fail on the database that you are working with. This recovery operation may even impact future log file replay capabilities against other databases.
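The staging step for an out-of-place recovery can be sketched as follows. This is an illustration under stated assumptions, not a supported procedure: the function name stage_out_of_place_recovery is hypothetical, and the sketch copies files rather than moving them so that the originals stay in place, which is a cautious choice made for the example. It refuses to use a folder that already contains files, which is one simple way to guarantee that no other database files are present.

```python
import shutil
from pathlib import Path
from typing import Iterable, Optional


def stage_out_of_place_recovery(staging_dir: Path,
                                database_files: Iterable[Path],
                                log_files: Iterable[Path],
                                checkpoint_file: Optional[Path] = None) -> Path:
    """Copy the database, log, and optional checkpoint files into an
    empty folder, ready for running Eseutil from that folder."""
    staging_dir.mkdir(parents=True, exist_ok=True)
    if any(staging_dir.iterdir()):
        # The recovery folder must not contain other database files, so
        # this sketch insists on a completely empty folder.
        raise ValueError(f"{staging_dir} is not empty")
    sources = list(database_files) + list(log_files)
    if checkpoint_file is not None:
        sources.append(checkpoint_file)
    for source in sources:
        shutil.copy2(source, staging_dir / source.name)
    return staging_dir
```

After staging, you would change to the returned folder and run the Eseutil command shown above from there.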

Note

The likelihood of errors increases as the command line becomes more complex. As a general rule, then, minimize the specified path information on the command line when using Eseutil.exe. In this case, change to the directory where the files are located and include the \exchsrvr\bin directory in your system path.

To run soft recovery, the last log file in the replay sequence must be named Enn.log. If the final log file has already been closed and numbered, you must rename the log before soft recovery will succeed. This requirement does not mean that, where the current Enn.log file has been damaged or destroyed, you can ignore it and rename the previous log in the sequence to Enn.log. In Exchange 2000, the Logs Required value in the database header lists the minimum sequence of logs required for recovery, starting from the checkpoint log and continuing to the current log. In earlier versions of Exchange, although no Logs Required value existed to enforce the presence of required logs, recovery still failed if the last log needed was not found. The only difference is that, in those earlier versions, recovery would fail at the end of log replay instead of at the beginning.

Hard Recovery

Hard recovery must be completed after restoring from online backup. Hard recovery is a log file replay process that is similar to soft recovery, but there are some important differences. In hard recovery:

  • Patch information must be applied to the database during log file replay.

  • The checkpoint file is ignored. Restore.env is used instead of the checkpoint file to determine from which log file recovery should start.

    Exchange 5.5 administrators may be familiar with the Restore in Progress registry key. Restore.env replaces the functionality of that key in Exchange 2000. You can view the contents of the Restore.env file by running the command Eseutil /cm.

  • If the database has been restored to a different path than that from which it was backed up, log file replay succeeds, ignoring the database paths listed in the log files.

  • Restored transaction log files replay first from a temporary folder designated by the administrator before restore. Log files from the normal transaction log folder may also be replayed.

  • Hard recovery does not fail if other databases in the storage group are missing.

Database files (.edb and .stm) restored from an online backup set are restored to the normal paths defined for the database. Restore begins by overwriting existing database files. If there is any chance that you might need the existing database files in the future, you must move them or back them up before restoring from online backup. Take into consideration that restore of the online backup could fail for any number of reasons. Even if the existing database files cannot be started at the moment, they are probably still repairable, and data could still be salvaged if necessary.

As you begin restoration of an online backup, Exchange prompts you to provide a temporary folder location. The backup program restores transaction log files from the backup set to this location, not to the normal transaction log file path. The backup program also creates the Restore.env file in the temporary folder.

The function of Restore.env in hard recovery is similar to that of the checkpoint file in soft recovery. Restore.env defines the range of transaction log files that should be present in the temporary folder for hard recovery. If you place extra logs in the temporary folder—logs that are outside the range listed in Restore.env—they are not replayed and the recovery process may delete them automatically.

You may have extra log files to replay that are not from an online backup set. In this case, place those logs in the normal transaction logs folder for the storage group and not in the temporary folder. After hard recovery finishes replaying the logs restored from the backup set, the process checks the normal transaction log folder to see if the next log in sequence is available.

Note

If you are restoring to an alternate server, or you have deleted and re-created the original database, only transaction logs in the temporary folder are replayed. Transaction logs in the normal database folder are not replayed. This distinction avoids transaction log replay conflicts in cases where Exchange knows that the database to which it is restoring is not the same as that from which it was backed up. A database restored in this circumstance is called a victimized database.

Note

You can play additional transaction logs for a victimized database by placing them in the temporary folder. In this special case, the recovery process does not delete or ignore them, but does replay them. If you are in doubt about the environment to which you are restoring, place copies of additional transaction logs in both the temporary folder and the normal database folder. Regardless of the victimization status of the database, the recovery process will replay one or the other set of logs. When you restore to a recovery storage group, replay works the same as if you were restoring to an original storage group. You can place additional logs in the recovery storage group database folder, and additional logs placed in the temporary folder will be ignored and deleted.

For example, suppose that six log files, E0000003.log through E0000008.log, are restored from backup into the temporary folder. After these log files have been replayed, recovery then looks in the running log folder for an E0000009.log file that belongs to the same log sequence. Internal markers in the log files identify them as belonging together. The decision to keep replaying is not made only on the basis of the log file name.

If log file 9 is found, replay continues as long as the next log in the series is available. If log file 9 does not exist, the recovery process creates a new log file called E00.log in the temporary folder. This log file is used only to record the changes in the database needed to shut it down in a consistent state. At this point, the database is completely recovered. It automatically restarts and attaches to the most recent log file in the storage group. The recovery process then deletes all files in the temporary directory.
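The replay order described in this section can be modeled roughly as follows. This is a simplified sketch for reasoning about which logs are replayed, not a description of Exchange internals. The function name hard_recovery_replay_order and its parameters are hypothetical, log generations are represented as plain integers, and the internal markers that tie logs to one sequence are not modeled.

```python
from typing import Iterable, List, Tuple


def hard_recovery_replay_order(restore_env_range: Tuple[int, int],
                               temp_folder: Iterable[int],
                               normal_folder: Iterable[int],
                               victimized: bool = False) -> List[int]:
    """Return the log generations that would be replayed, in order."""
    first, last = restore_env_range
    if first > last:
        raise ValueError("invalid Restore.env range")
    temp = set(temp_folder)
    normal = set(normal_folder)
    if any(g not in temp for g in range(first, last + 1)):
        raise ValueError("temporary folder is missing a log listed in Restore.env")
    # Logs inside the Restore.env range are replayed from the temporary folder.
    order = list(range(first, last + 1))
    # Afterwards, replay continues while the next log in sequence exists:
    # in the normal log folder for an ordinary restore, or in the temporary
    # folder for a victimized database.
    extra_source = temp if victimized else normal
    next_generation = last + 1
    while next_generation in extra_source:
        order.append(next_generation)
        next_generation += 1
    return order
```

In the example above, with logs 3 through 8 restored to the temporary folder and logs 9 and 10 in the normal folder, this model replays 3 through 10. An extra log 9 placed only in the temporary folder is replayed only when the database is victimized.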