Design: Ephemeral cache explored
Many of .net developers are familiar with the ephemeral model included in the garbage collector. The idea of using generations in order to manage the lifespan of the objects is extremely helpful when you need to control the life of an object. Incidentally, this is applicable to custom cache models where content expiration is an issue.
There are many ways to control the content expiration on a cache and all of them bring to the table pros and cons. Most of the challenges come from the cache size and how the expiration can be controlled using time stamped data. Time structures are expensive and bulky, and browsing the content in order to find the expired ones can be a painful tasks. Many techniques have evolved in order to solve the problem like date indexing or date bucking, but they require careful control over the manipulation and sometimes even index persistence.
The main problem arises when we manipulate a big cache, if we use time stamped a secondary thread, responsible for removing garbage, will analyse the list of entries. If the entry time stamp does meet the expiration criteria the entry is removed. This not only can take resources, also affects the storage fragmentation.
The same concept of the garbage collection can be also applied to our cache service, if we split the cache in generations we can control the cache quality, moving commonly content to a more secure and less intrusive list. When a new query arrives the cache service will retrieve the information from the resources (i.e. a DB). The call is responded and a secondary process inserts the content on the generation 0 cache. This generation tends to grow rapidly as all the call responses are cached. As this cache will grow very quickly using time stamp expiration will be very expensive.
Now, most of this cache will be loaded in memory just because a random single request read from a record that maybe will not be used for the rest of the cache lifespan. This is a real waste of resources. Applying cleaning processes to the cache will leave it very fragmented, reducing the performance. If we want to differentiate commonly used data and non common we can implement a new generation, where the data accessed on the generation 0 is promoted to the generation 1. The second cache will be smaller and will have more useful information, making the first choice for a new query.
We can see on the graphic that the call first go to the older generation, if the result is not there it will go to the first generation. Once is found the entry can be promoted. With this model the first generation can be purged quite often, as is information only requested once (note that an access counter can also be added to the generations adding rate rules like “you need 5 requests in order to be promoted”). What is more, the model can be correctly designed to support more generations keeping a healthy cache with accurate data loaded.
As the higher generations contain valuable information, it is a good idea to persist those caches, otherwise when the service is restarted we are going to lose the valuable tracking information. I tend to use file storages on generation 1 and even DB storage for generation 2. This means that the most valuable generation (2) will be able to be loaded by other caches in our distributed environment. This increases efficiency as the caches will contain only valuable information.
I have personally tried this model on large cache (4 million entries) and the performance and resource consumption are balanced with the response time. The volatile cache is dropped by slides avoiding fragmentation and can be triggered by time or even by resource pressure. If you need more information regarding the implementation do not hesitate to contact me.