How it works: Exchange 2007 CCR and transaction logs

The background workings of Cluster Continuous Replication (CCR) can seem quite mysterious but are, all in all, not that difficult to understand. First of all we need to know the following things:

  • There are two fundamental elements of the Exchange store:
    the transaction logs and the database
  • Transaction logs are 1 MB in size (in Exchange 2003 this was 5 MB), except for Transport logs, which are still 5 MB
  • Log files are stored in sequence and in the following format:
    E(number of the storage group)(8 digit hexadecimal number).log
  • The checkpoint file (EDB.chk) stores which log files have been written to the database
  • Log files are only cleared after a successful full backup has been performed
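
To make the naming format above concrete, here is a minimal sketch that builds a log file name from a storage group number and a log sequence number. The function name log_file_name is hypothetical, and the two-digit zero padding of the storage group number (E00, E01, and so on) is an assumption on top of the format described above.

```python
def log_file_name(storage_group: int, generation: int) -> str:
    """Build 'E' + storage group number + 8 hex digits + '.log'."""
    if storage_group < 0 or not 0 <= generation <= 0xFFFFFFFF:
        raise ValueError("values do not fit the naming scheme")
    # Assumption: the storage group number is zero-padded to two digits.
    # The sequence number is written as 8 uppercase hexadecimal digits.
    return f"E{storage_group:02d}{generation:08X}.log"


print(log_file_name(0, 1))       # first log of the first storage group
print(log_file_name(1, 0x1A2B))  # a later log of the second storage group
```

Because the sequence part has a fixed width, sorting the file names alphabetically also sorts them in log sequence order.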

The Exchange ESE

There are five (5) subcomponents that make up the ESE and define the process of moving data into the database!

Log buffers

When a transaction is first received it is stored in the log buffer. These log buffers hold the received information in the server's memory before the data is written to the transaction logs. The following parameters define a log buffer:

  • Each buffer unit is the size of a disk sector (512 bytes)
  • JET performs a sanity check on the setting so that the log buffers are a minimum of 128 sectors, a maximum of 10,240 sectors, and aligned to the largest 64 KB boundary.
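
The check on the log buffer setting can be sketched as a clamp followed by an alignment step. This is only an illustration under stated assumptions: the function name sanitize_log_buffers is made up, the boundary is taken to be 64 KB (128 sectors of 512 bytes), and "aligned" is taken to mean rounding down to a multiple of that boundary. It is not the actual JET code.

```python
SECTOR_BYTES = 512
MIN_SECTORS = 128
MAX_SECTORS = 10_240
# Assumption: a 64 KB boundary, which equals 128 sectors of 512 bytes.
BOUNDARY_SECTORS = 64 * 1024 // SECTOR_BYTES


def sanitize_log_buffers(requested_sectors: int) -> int:
    """Clamp the requested size, then round down to the boundary."""
    clamped = max(MIN_SECTORS, min(MAX_SECTORS, requested_sectors))
    return (clamped // BOUNDARY_SECTORS) * BOUNDARY_SECTORS


for requested in (50, 9000, 20000):
    sectors = sanitize_log_buffers(requested)
    print(requested, "->", sectors, "sectors =", sectors * SECTOR_BYTES // 1024, "KB")
```

Since both the minimum and the maximum are multiples of 128 sectors, the rounding step can never push a value outside the allowed range.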

Log writer

Once the log buffers are filled up the data is moved on to the disk and into the log files. This committing of the data is performed synchronously and very quickly to ensure that a system failure does not cause data loss.

IS Buffers

The IS buffer is the first step in turning the transactions into actual data. Grouped in 8 KB pages (4 KB in Exchange 2003) allocated from memory by Exchange, the IS buffers serve the purpose of caching the database pages before they are written to disk.

When the pages are first created they are marked as clean, as there is no data written to them. Once the ESE plays the transactions from the logs into the empty pages in memory it changes their status to “dirty”.

Version store

Since ESE writes multiple different transactions to a single page in memory, the version store is needed not only to keep track of these transactions but also to manage them. It structures the changes to the pages in the order in which they occur.

Lazy writer

When the ESE reaches the point where the dirty pages need to be flushed out of memory it calls the lazy writer to move the pages from the cache buffers to the disk. As a large number of transactions are going on at once, it is the job of the lazy writer to prioritize the transactions and handle its task without overloading the disk I/O subsystem.

At this point the transactions in the memory have become static information on the disk and the dirty memory caches are cleaned up and ready to be used again!

The checkpoint file

There are two notable components of the database checkpoint, namely the checkpoint file and the checkpoint depth.

The checkpoint file (edb.chk) is a file maintained by the database engine for every log file sequence to keep track of the data that has not yet been written to the database file on the disk. It is a pointer in the log sequence that indicates where in the log file the information store needs to start recovery in case of a failure. Without the checkpoint file the information store would start replay from the beginning of the oldest log file on the disk and it would have to check every page in every log file to determine whether it had already been written to the disk.
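
The effect of the checkpoint on recovery can be shown with a small sketch. It assumes the checkpoint can be reduced to a single log sequence number, which is a simplification, and the function name logs_to_replay is hypothetical.

```python
def logs_to_replay(log_generations, checkpoint_generation=None):
    """Return the log sequence numbers that recovery has to walk through."""
    ordered = sorted(log_generations)
    if checkpoint_generation is None:
        # No checkpoint file: start from the oldest log on disk.
        return ordered
    # With a checkpoint: start at the log the checkpoint points to.
    return [g for g in ordered if g >= checkpoint_generation]


logs_on_disk = [3, 4, 5, 6]
print(logs_to_replay(logs_on_disk, checkpoint_generation=5))
print(logs_to_replay(logs_on_disk))
```

With the checkpoint only the last two logs are processed; without it, all four have to be checked page by page.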

The checkpoint depth is a threshold which defines when the ESE begins to aggressively flush dirty pages to the database.

ESE Cache

The ESE cache is an area of memory reserved for the information store process (running under store.exe). It is used to store database pages in memory, reducing read I/Os and increasing the performance of Exchange.

The EDB file

The EDB file is the main repository for the mailbox data, and its fundamental construct is that of a table. Hence the need to run ISINTEG if the EDB file gets corrupted, as ISINTEG is the tool that fixes the tables in the EDB file.

How it comes together

Imagine a client sends a new message. The page that requires updating is read from the file and placed in the ESE cache, whilst the log buffer gets notified and records the new transaction in the memory of the server.

These changes are recorded by the database engine but not immediately written to disk. The changed pages are held in the ESE cache and marked as dirty to signal that they have not yet been written to the database (committed, if you would like to call it that). The version store is used to keep track of the changes, guaranteeing consistency.

As the database pages are changed, the log buffer is notified to commit the changes and the changes get written to a transaction log file. Eventually the dirty database pages are flushed to the database files and the checkpoint is advanced.
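
The sequence described in this section can be played through with a toy model. This is a sketch only: the class MiniStore and all of its attributes are invented for illustration, the version store is left out, and the checkpoint is simplified to a count of log records whose pages have reached the database.

```python
class MiniStore:
    """Toy model of the write path; not how store.exe is implemented."""

    def __init__(self):
        self.database = {}    # page number -> contents on "disk"
        self.cache = {}       # page number -> contents in memory
        self.dirty = set()    # pages changed in memory, not yet on disk
        self.log_buffer = []  # transactions held in memory
        self.log_file = []    # transactions written to the "log file"
        self.checkpoint = 0   # log records already reflected on disk

    def update(self, page, contents):
        # Read the page into the cache if it is not there yet.
        self.cache.setdefault(page, self.database.get(page))
        self.cache[page] = contents
        self.dirty.add(page)
        self.log_buffer.append((page, contents))

    def commit(self):
        # Log writer: move the buffered transactions to the log file.
        self.log_file.extend(self.log_buffer)
        self.log_buffer.clear()

    def lazy_flush(self):
        # The log is always written before the database pages.
        self.commit()
        for page in sorted(self.dirty):
            self.database[page] = self.cache[page]
        self.dirty.clear()
        self.checkpoint = len(self.log_file)


store = MiniStore()
store.update(7, "new message")
print(store.database, store.dirty)       # nothing on disk yet, page 7 is dirty
store.commit()
print(store.log_file)                    # the change is now safe in the log
store.lazy_flush()
print(store.database, store.checkpoint)  # page on disk, checkpoint advanced
```

The point to take away is the ordering: the change reaches the log file first, and only later does the dirty page reach the database and the checkpoint move forward.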

The CCR functionality

Now that we know how the transactions get written to the log files we can go further and explain how the CCR replicates and replays these changes!

In a CCR we have two nodes, an active and a passive node, and the passive node pulls the transaction log files over from the active node. The active node is where all the clients with their MAPI sessions connect; the passive node, with its own copy of the database and log files, stands by in case the active node fails.

The technology behind the CCR is based on an asynchronous copy, meaning that the transactions are not replayed into the passive copy of the database at the same time as they are applied on the active node. In fact, the passive node cannot pull a transaction log file over to replay it until the active node's store process has released the log file (that is, it is done replaying the log file to the database).

So if we continue on our earlier premise, once the checkpoint on the active node is advanced the passive node pulls over the log files it still needs and replays them into its database.
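
The pull-and-replay behaviour can be sketched in a few lines. This is a simplified model and not the actual replication service: the function name replicate, the dictionaries and the released_through marker (the highest log sequence number the active node has released) are all assumptions made for illustration.

```python
def replicate(active_logs, released_through, passive_logs, passive_db):
    """Copy every released log the passive node lacks, then replay it."""
    pulled = []
    for generation in sorted(active_logs):
        if generation > released_through:
            continue  # still held by the active node's store process
        if generation in passive_logs:
            continue  # already copied and replayed earlier
        # Copy the log file over to the passive node.
        passive_logs[generation] = list(active_logs[generation])
        # Replay its transactions into the passive copy of the database.
        for page, contents in passive_logs[generation]:
            passive_db[page] = contents
        pulled.append(generation)
    return pulled


active_logs = {1: [(7, "a")], 2: [(7, "b"), (8, "c")], 3: [(9, "d")]}
passive_logs, passive_db = {}, {}

print(replicate(active_logs, 2, passive_logs, passive_db))  # log 3 is not released yet
print(passive_db)
print(replicate(active_logs, 3, passive_logs, passive_db))  # only the new log is pulled
```

Because the copy is asynchronous, the passive database always trails the active one by at least the log file that has not been released yet.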
