Caching Implies Policy

A colleague of mine once said those words in a meeting and they really struck a chord with me. I think there's a lot of meat in those three words.

We often reach for caches to improve performance. However, it is vitally important to make a deliberate, thoughtful, and justifiable-on-the-numbers choice about policy. It is often the case with framework components that you find yourself too “low” in the architecture stack to understand the usage patterns, or to see uniform ones. Hence you cannot make excellent choices about caching policy, which ultimately dooms your cache to mediocrity. Under those circumstances it's almost always a bad idea to do implicit caching.

In the managed world, caching has three hidden costs:

Cached Object Age

This one is very hard to avoid. For your cache to be useful, you are almost certainly saving some object or objects under a cache key. Rather than letting those objects die when whatever code is using them no longer needs them, the cache keeps them around at least a bit longer in case those very same objects are used again. The danger is that by letting the objects age you may be letting them get into an older generation, increasing the cost of reclaiming that memory. Depending on the cache-hit rate and the volume of objects going through the cache, that could turn into a very bad idea. Extra generation 2 collects could easily erase all your savings.

To do the best job, you need to have a good idea of what the lifetime of the objects should be and choose your cache policy accordingly.
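To make "choose your cache policy accordingly" a little more concrete, here is a minimal sketch of a cache whose policy is stated up front: a cap on the number of entries, a cap on how long any entry may live, and hit/miss counters so the choice can be checked against the numbers. It is written in Python rather than for the managed runtime discussed here, and `BoundedCache`, `max_entries`, `max_age_s`, and the `create` callback are all hypothetical names chosen for illustration.

```python
import time
from collections import OrderedDict


class BoundedCache:
    """Cache with an explicit policy: at most max_entries items,
    each served for at most max_age_s seconds after it was stored."""

    def __init__(self, max_entries, max_age_s, clock=time.monotonic):
        self.max_entries = max_entries
        self.max_age_s = max_age_s
        self._clock = clock
        self._items = OrderedDict()  # key -> (stored_at, value), least recently used first
        self.hits = 0
        self.misses = 0

    def get(self, key, create):
        now = self._clock()
        entry = self._items.get(key)
        if entry is not None and now - entry[0] <= self.max_age_s:
            self._items.move_to_end(key)  # mark as recently used
            self.hits += 1
            return entry[1]
        # Missing or too old: build a fresh value and store it.
        self.misses += 1
        value = create()
        self._items[key] = (now, value)
        self._items.move_to_end(key)
        # Enforce the size cap by dropping the least recently used entries.
        while len(self._items) > self.max_entries:
            self._items.popitem(last=False)
        return value

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


# Hypothetical usage: the numbers are placeholders, not recommendations.
cache = BoundedCache(max_entries=1000, max_age_s=30.0)
page = cache.get("home", lambda: "rendered home page")
page = cache.get("home", lambda: "rendered home page")
print(f"hit rate so far: {cache.hit_rate():.0%}")
```

The point of the counters is that a low hit rate combined with long-lived entries is exactly the situation where the aging cost can outweigh the savings.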

Cached Object Finalization

This is sort of the same as the previous one. Many caching schemes use weak references and finalizers to arrange for the recycling of objects. This isn't automatically a bad idea, but again the presence of finalizers causes objects to live longer (and creates work for the finalizer thread). Additionally, because at least one more thread is involved (the finalizer thread), it may be necessary to add synchronization features to your class that could otherwise be avoided. See this posting for more thoughts on finalizers.
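For readers who have not seen a weak-reference cache, here is a minimal sketch of the idea in Python using the standard `weakref.WeakValueDictionary`. It shows only the weak-reference half: the cache hands back an object while some other code still holds it, and forgets it once nobody does. Python's collector and finalization rules differ from the managed runtime described above, so the finalizer-thread costs are not reproduced here. `Page` and `get_page` are hypothetical names.

```python
import gc
import weakref


class Page:
    """Hypothetical expensive-to-build object."""

    def __init__(self, name):
        self.name = name


# Values stay in this mapping only while something else references them.
cache = weakref.WeakValueDictionary()


def get_page(name):
    page = cache.get(name)
    if page is None:
        page = Page(name)
        cache[name] = page
    return page


first = get_page("home")
same = get_page("home")   # hit: 'first' is keeping the object alive
del first, same           # drop the last strong references
gc.collect()              # CPython frees the object on del; collect() covers other runtimes
again = get_page("home")  # miss: a new Page has to be built
```

Note the policy this implies: how long an entry survives is decided by whoever happens to hold the object, not by the cache.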

Transparency of Implementation

Once you decide to put implicit caching into your class, you may be stuck with it forever, or you may find your hands tied on the policy. The darned thing about class features is that customers tend to use them -- the nerve :) -- so if you've made a choice that turned out to be not everything you wanted, you might have to live with it, because changing it would affect some of your customers very negatively.

On the other hand, if your caching policy choice happens far enough up the architecture stack then there's a good chance you had it right in the first place, because you have more context about what your customer needs at that point, and also a good chance you can change it for the better later, for the same reason.
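One way to keep the policy choice high in the stack is for the low-level component to do no caching of its own and accept a cache from its caller. The following is a minimal Python sketch under that assumption; `TemplateLoader`, `NoCache`, `DictCache`, and `read_source` are hypothetical names, not part of any real framework.

```python
class NoCache:
    """Default policy: never keep anything."""

    def get(self, key, create):
        return create()


class DictCache:
    """Simplest possible policy: keep everything forever."""

    def __init__(self):
        self._items = {}

    def get(self, key, create):
        if key not in self._items:
            self._items[key] = create()
        return self._items[key]


class TemplateLoader:
    """Low-level component: it has no caching policy of its own.
    The caller, who knows the usage pattern, may pass one in."""

    def __init__(self, read_source, cache=None):
        self._read_source = read_source
        self._cache = cache if cache is not None else NoCache()

    def load(self, name):
        return self._cache.get(name, lambda: self._read_source(name))


# Hypothetical usage: the application layer decides on the policy.
reads = []


def read_source(name):
    reads.append(name)  # record every real read so the effect is visible
    return f"<template {name}>"


plain = TemplateLoader(read_source)
cached = TemplateLoader(read_source, cache=DictCache())
plain.load("index")
plain.load("index")    # read again: no policy was supplied
cached.load("index")
cached.load("index")   # served from the caller's cache
print(len(reads))
```

Because the policy object belongs to the caller, it can be swapped or removed later without changing the behavior that other customers of `TemplateLoader` depend on.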

Cache Wisely.

Comments

  • Anonymous
    January 22, 2004
    As always, some really useful advice.
    Thanks, Rico.
  • Anonymous
    February 09, 2004
    Rico,

    We are having many problems with GC spiking to 60% under moderate load. Our application is very object heavy. Needless to say, I've been reading a lot of your blogs. =)

    We have tons of un-utilized RAM, so I was thinking instead of recreating new objects on every page view, we could create them once then cache them for future use. What is the performance pro/con of this tactic?

    Tyson
  • Anonymous
    February 09, 2004
    Tell you what: I think I should do a full blog entry on the pros and cons of recycling rather than just a little one-off response. I think it's a worthwhile topic.

    Very soon (like today or tomorrow). I promise.
  • Anonymous
    May 04, 2004
    Hi Rico,
    It would be really worth doing a full blog entry, and it would help if you could also throw some light on the issues below. These are my observations and concerns.
    In Whidbey, which is coming a year later, MS added database cache invalidation etc.
    But what are the different approaches that I can use now, and what does Microsoft recommend that is of real value?
    I.e. should I create a remote cache server and access the cache over remoting,
    or should I manually do now what Whidbey does, i.e. every time an item is retrieved from the cache, ping the DB to see if it has changed?
    What are the pros and cons of each?
    Regards,
    Mandeep
  • Anonymous
    May 04, 2004