COLD, WARM, HOT … CLEAN, DIRTY, FREE … What Does All This Mean?

I was asked about these terms from a blog reader.  As I thought about these terms and how to blog about them I became concerned that I might just be as clear as mud, but I will do my best.

Each of the terms means something within a specific context.   For example, when your teenager says something is “BAD” they really mean is it GOOD.  You have to understand the context in which the term is being used to fully understand the meaning.

COLD, WARM, HOT

Take a metal pole as an example.   If I tell you the metal pole is hot you don’t want to touch it and get burnt.  If I tell you it is cold you might envision my fathers favorite Christmas movie where Ralphy’s tongue gets stuck to a frozen pole. 

Let’s talk about these terms in light of a computer and more specifically SQL Server.  There are two main avenues that these three terms are associated with; Synchronization and Resources (Memory, I/O, CPU, etc.)

When talking about synchronization objects (Lock, Latch, Spinlock, Mutex, Semaphore, Event, …) the term HOT is analogous to contention and bottlenecks.   You will often hear this described in terms of ‘The lock is hot’, ‘The latch is hot’, ‘The spinlock is hot.’  When expressed in this context it means you are experiencing a bottleneck on a specific resource.   Usually hot indicates you have many threads/workers attempting to access the same, shared resource.  To simplify this discussion think about your CPU.  The more work on the system the more CPU is consumed.   The CPU has a property of requiring electricity and the physical translation of that energy into CPU activity results in heat.   The more activity the HOTTER the CPU which generally kicks in the cooling fans.   The less work on the system the COOLER the CPU.

image

When you experience a HOT object the goal is to reduce the contention and bottleneck.   This could be something such as partitioning the table or adding an index.  There are is a wide range of remediation activities depending on the type of bottleneck.    For internal structures SQL Server often applies partitioning designs.   For example, SQL Server may employ a per NODE or per CPU based design.   SQL Server can even dynamically partition Latches and Memory Objects to reduce a HOT spot.

COLD synchronization objects are the opposite of HOT.  For example, a value may need to be protected but it is not written to or read frequently.  It is protected by a synchronization object but seldom accessed.

If you talk about database pages HOT is a good thing. If the database page is held in the SQL Server Buffer Pool, main ram it is called HOT.  In fact, during SQL Server startup, as pages are read in and hashed it is termed the warm up phase.   The concept is fast access to the data on the page.  

The closest to the executing code is RAM.   The local SSD used for Buffer Pool Extension is WARM because it is commonly faster than going to the stable storage (COLD, Disk.)

image 

You may hear the same discussion around CPU access to memory.   If you study a CPU you will find various cache levels (L1, L2, …)  In fact, the further away from the CPU the colder the memory is considered.   Having the memory the instructions need in CPU cache allows for faster response.   Having to load data from remote memory locations or secondary cache lines is considered COLD (takes longer.)

You may have experienced this in your own environment.   The production server supports the application (HOT), a secondary site allows (WARM) access and in a disaster you can access a backup (COLD.)

The best way to describe this concept is syrup.   The colder the temperature of syrup the longer you have to wait for the pour to enjoy your breakfast.  When accessing resources your often want HOT access.

CLEAN, DIRTY, FREE

SQL Server tracks database pages in RAM or BPE (Buffer Pool Extension) with a BUF structure.   The BUF structure contains a pointer to the database page in RAM or BPE, a latch used to protect physical changes, … and a status.   The status is where the CLEAN, DIRTY, FREE, ON LRU, HASHED, … state is stored.

When the database page is

  • On the free list the status if FREE
  • Loaded in memory and associated with a database (HASHED for fast lookup), available for LRU (removal based on memory pressure and reference activity)
  • Loaded in memory, associated with a database and data has been modified (DIRTY) – DIRTY implies the database page in memory does not match what is one disk
  • Loaded in memory, associated with a database and data matches the version on disk (CLEAN)

The status is used by various activities.  As an example, Checkpoint and Lazy Writer only write DIRTY pages after making sure the associated log records have been secured.   As soon as the write completes the DIRTY status is removed and the BUF status can again indicate CLEAN.

In the background, lazy writer performs constant page scanning.   Randomly, CLEAN pages are selected and the checksum is re-validated.  If the page is DIRTY the checksum can’t be re-validated as the checksum is only updated when output to disk and until the write is completes it can’t be considered CLEAN.

Lazy Writer helps maintain the free list.  When Lazy Writer encounters a DIRTY status, the page has to be written to disk first.   When the status is CLEAN the page can be removed from the HASH table and directly returned to the the free list (FREE.)

Summary

It could easily take several chapters in a book to fully describe each of these topics and describe the various contexts.  Hopefully it is a bit clearer then mud but I won’t claim that I made it crystal clear.   When you are talking to someone and hear these types of terms be sure to get the correct context to help you make the right assumptions and decisions.

Bob Dorr - Principal Software Engineer SQL Server