Q&A on Latches in the SQL Server Engine….

Article
07/08/2009

I recently received a request to shed some light on a few advanced questions about latches and SQL Server. These were good questions so I thought the information might be useful to share with the community. I’ll provide this information in the form of Q&A as I was asked (I’ve paraphrased the questions):

Question: What kind of latch does SQL Server use when reading a page from disk?

Answer: Anytime you talk about latching and database pages, think in terms of BUF (buffer) latches. So to read a page from disk and put this into cache, the worker thread will take out a EX (Exclusive) latch on the BUF associated with the page. Then the read is issues for the page. Then a SH latch is taken out by this same thread on this BUF structure. Since these are not compatible, the thread will wait. Makes sense since we need to wait for the page to be read in from disk before trying to look at it. Any other thread needing to read this page will also wait since you need a SH latch on a BUF to read the page in cache. When the read has completed, the EX latch is released, and now the SH latch is acquired and the thread(s) can read the page in cache.

When a thread has to wait on a BUF latch, the engine will look at the BUF structure to see if a special bit is turned on called BUF_IO. This is set when a page associated with the BUF is currently being read from or written to disk. This is how the engine knows that if your thread is waiting on a latch, that it is waiting on an “IO Latch”. The wait_type will therefore look something like PAGEIOLATCH_SH as opposed to PAGELATCH_SH.

Notice above, I said a SH latch is not compatible with an EX latch. Latches have compatibilities just like locks. Because I couldn’t find this documented anywhere, I published here a table that describes latch compatibility (Y=Compatible;N=Not Compatible)

	KP	SH	UP	EX	DT
KP	Y	Y	Y	Y	N
SH	Y	Y	Y	N	N
UP	Y	Y	N	N	N
EX	Y	N	N	N	N
DT	N	N	N	N	N

You can see from the table, that a SH and EX latch are not compatible. So what are UP, KP, and DT latches?. Well, I will answer that as you read on for the other questions I was asked.

Question: I have a question about latches and writing database pages to disk. The SQL Books Online in a section called Writing Pages says “Just before a page is written, the form of page protection specified in the database is added to the page. If torn page protection is added, the page must be latched EX(clusively) for the I/O. This is because the torn page protection modifies the page, making it unsuitable for any other thread to read. If checksum page protection is added, or the database uses no page protection, the page is latched with an UP(date) latch for the I/O. This latch prevents anyone else from modifying the page during the write, but still allows readers to use it.” Can you explain this further especially why you us an EX latch in one case and an UP latch in another?

Answer: Remember the compatibility table above. An EX latch prevents writers and readers. For torn-page protection, the engine must actually modify contents of the page because we place a 2-bit signature at every 512 bytes on the page. While we are doing this, we don’t want anyone to write or read the page. If you are using checksum though, we only have to change a specific field in the header. No problem to let someone read the page, but we don’t want someone else to write it. So we use an UP latch, because this allows others to read it (i.e. obtain a SH latch) but others can’t write it. Same thing applies for the situation when we have no data protection on the page (PAGE_VERIFY=NONE)

Question: I see in sys.dm_os_wait_stats something called PAGELATCH_KP and PAGELATCH_DT. What the heck (remember I’m from Texas. You are allowed to say “heck”) are these for?

KP stands for “Keep” latch and DT stands for “Destroy” latch. Take a look back above at the compatibility table for KP. Notice that KP is compatible with all latch types except for DT. This means it is very “lightweight” and can be acquired with just about every latch type, except for DT. So what is a keep latch used for? Well, think of it as a “reference count” on the structure associated with the latch such as a BUF. When a thread just wants to “look” at a BUF it can acquire a KP latch and the acquire whatever other latch type it needs (SH for read, etc). The advantage here is that a KP latch ensures that no other thread destroys the BUF it is trying to look at. This is because to destroy a BUF, you need to acquire a DT latch. Notice that DT latch is not compatible with any other type of latch. This make sense because when you need to “destroy” the structure, you don’t want any other thread even accessing it. So what does “destroy” means when talking about a BUF (i.e. a page). Well one example is simply taking a BUF that is “clean” (not modified) and putting it on the free list of buffers to allow another thread to use it for a new page or a page to read in from disk. Moving an existing BUF to the “free list” is kind of like getting rid of a page out of cache. We don’t want any other thread even trying to access this page, so we take out a DT latch during this operation. If another thread then needed this page we removed from cache, the operation of reading it from disk would then kick in. Now you can see how a KP latch helps. It allows anyone to look at a BUF knowing it will not be destroyed “underneath it”.

Bob Ward
Microsoft

Q&A on Latches in the SQL Server Engine….

Additional resources