March 2011

Volume 26 Number 03

Cache Integration - Building and Using Custom OutputCache Providers in ASP.NET

By Brandon Satrom | March 2011

If you’re a Web developer, in the past you may have utilized the output-caching facility provided by ASP.NET. Introduced with the first version of the Microsoft .NET Framework, ASP.NET output caching can improve the performance of serving content to site visitors by retrieving that content from a cache, bypassing re-execution of pages or controllers. This saves your application expensive database calls when returning data that doesn’t update frequently, or that can be stale for periods of time.

The ASP.NET output cache uses an in-memory storage mechanism and, until the .NET Framework 4, it wasn’t possible to override or replace the default cache with your own implementation. With the new OutputCacheProvider type, it’s now possible to implement your own mechanism for caching page output in ASP.NET.

In this article, I’ll discuss two such custom mechanisms. First, using MongoDB—a popular document-oriented database—I’ll create my own provider to facilitate output caching in a simple ASP.NET MVC application. Then, using the same application, I’ll quickly swap out my custom provider to leverage features of Azure—specifically, the new DistributedCache provider that leverages Azure infrastructure to provide a distributed, in-memory cache in the cloud.

Output Caching in ASP.NET

In ASP.NET Web Forms applications, output caching can be configured by adding an OutputCache Page directive to any ASP.NET page or user control:

<%@ OutputCache Duration="60" Location="Any" VaryByParam="name" %>

For ASP.NET MVC applications, output caching is available using an action filter that ships with ASP.NET MVC, and which can be leveraged as an attribute on any controller action:

[OutputCache(Duration=60, VaryByParam="none")]

“Duration” and “VaryByParam” are required in ASP.NET MVC 1 and 2 applications (VaryByParam is optional in ASP.NET MVC 3), and both mechanisms provide several other attributes and parameters that enable developers to control how content is cached (several VaryByX parameters), where it’s cached (Location) and capabilities for setting cache invalidation dependencies (SqlDependency).

For traditional output caching, nothing else is needed to implement this functionality in your applications. The OutputCache type is an HttpModule that runs when your application starts and goes to work when a page directive or action filter is encountered. Upon the first request of the page or controller in question, ASP.NET will take the resulting content (HTML, CSS, JavaScript files and so on) and place each item in an in-memory cache, along with an expiration and a key to identify that item. The expiration is determined by the Duration property, and the key is determined by a combination of the path to the page and any necessary VaryBy values—for example, query string or parameter values if the VaryByParam property is provided. So, consider a controller action defined in this manner:

[OutputCache(Duration=20, VaryByParam="vendorState")]
public ActionResult GetVendorList(string vendorState)
{
  // Action logic here.
}

In this case, ASP.NET will cache an instance of the resulting HTML view for each occurrence of vendorState (for example, one for Texas, one for Washington and so on) as that state is requested. The key by which each instance is stored, in this case, will be a combination of the path and the vendorState in question.

If, on the other hand, the VaryByParam property is set to “none,” ASP.NET will cache the result of the first execution of GetVendorList and will deliver the same cached version to all subsequent requests, regardless of the value of the vendorState parameter passed into that action. The key by which this instance is stored, when VaryByParam is “none,” would just be the path. A simplified view of this process is depicted in Figure 1.

image: The ASP.NET Output Caching Process

Figure 1 The ASP.NET Output Caching Process
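To make the key behavior concrete, here’s a minimal sketch of how a path and the VaryByParam values might combine into a cache key. The CacheKeySketch class, the BuildKey helper and the key format are all hypothetical, used only for illustration; the format ASP.NET actually uses is internal to the framework.

```csharp
using System;
using System.Collections.Generic;

public static class CacheKeySketch
{
  // Hypothetical helper: illustrates the idea only, not ASP.NET's real key format.
  public static string BuildKey(string path,
    IDictionary<string, string> requestParams, params string[] varyByParams)
  {
    var key = path;
    foreach (var name in varyByParams)
    {
      string value;
      // Only parameters named in VaryByParam contribute to the key.
      if (requestParams.TryGetValue(name, out value))
      {
        key += "|" + name + "=" + value;
      }
    }
    return key;
  }
}
```

When BuildKey is told to vary by vendorState, requests for Texas and Washington produce two distinct keys and therefore two cached copies. When no vary parameters are passed, both requests collapse to the path alone and share a single cached entry.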

Beyond the Duration parameter—used to control the life of the item in the cache—and a handful of VaryBy parameters (VaryByParam, VaryByHeader, VaryByCustom, VaryByControl and VaryByContentEncoding) used to control the granularity of cached items, the output cache can be configured to control the location of cached content (client, server or downstream proxy). In addition, ASP.NET 2.0 introduced a SqlDependency attribute, which allows developers to specify database tables that a page or control depends upon so that, in addition to time expiration, updates to your underlying source data can also cause cached items to expire.

Although the .NET Framework 2.0 and 3.0 introduced several enhancements to the default cache provider, the provider itself remained the same: an in-memory store, with no extension points or way to provide your own implementation. The in-memory cache is a perfectly acceptable option in most cases, but can, at times, contribute to diminished site performance as server resources are maxed out and memory becomes scarce. What’s more, the default caching provider mechanism automatically evicts cached resources—regardless of specified duration—when memory does become scarce, which leaves the developer with little control over how cached resources are managed.

Extensible Output Caching in ASP.NET

The release of the .NET Framework 4 introduced a new facility that enables developers to create their own output cache providers and easily plug those providers into new or existing applications with only minor changes to the application and its configuration. These providers are free to use whatever storage mechanism for cached information that they choose, such as local disks, relational and non-relational databases, the cloud or even distributed caching engines such as that provided in Windows Server AppFabric. It’s even possible to use multiple providers for different pages in the same application.

Creating your own output cache provider is as simple as creating a new class that derives from the new System.Web.Caching.OutputCacheProvider abstract class and overriding the four methods that ASP.NET requires to work with cached items. The framework definition for the OutputCacheProvider class is listed here (see bit.ly/fozTLc for more information):

public abstract class OutputCacheProvider : ProviderBase
{
  public abstract object Get(string key);
  public abstract object Add(string key, object entry, DateTime utcExpiry);
  public abstract void Set(string key, object entry, DateTime utcExpiry);
  public abstract void Remove(string key);
}

Once you’ve implemented these four methods, all that’s left is to add the new provider to your web.config, specify it as the default and add an OutputCache directive or attribute to your application. I’ll cover these steps in detail as I walk through the creation of our own output cache provider that uses a document database called MongoDB. But first, it may be helpful to introduce a little context around the tool we’ll be using to build our custom provider.

NoSQL, Document Databases and MongoDB

For much of the past few decades, the preferred application storage mechanism has been the relational database management system (RDBMS), which stores data and relationships in tables. SQL Server and Oracle are examples of RDBMSes, as are most of the popular commercial and open source databases currently in use.

However, not all problems requiring storage fit into the same transactional mold. In the late ’90s, as the Internet expanded and many sites grew to manage large volumes of data, it became obvious that the relational model provided less-than-ideal performance on certain types of data-intensive applications. Examples include indexing large volumes of documents, delivering Web pages on high-traffic sites or streaming media to consumers.

Many companies addressed their growing storage needs by turning to NoSQL databases, a class of lightweight database that doesn’t expose a SQL interface, fixed schemas or pre-defined relationships. NoSQL databases are used heavily by companies such as Google Inc. (BigTable), Amazon.com Inc. (Dynamo) and Facebook (which has a store of more than 50TB for inbox searches) and are experiencing steady growth in popularity and use.

It’s important to note that, while some have used the term NoSQL as a rallying cry to call for the abandonment of all RDBMSes, others emphasize the value of utilizing both types of storage. NoSQL databases were conceived to solve a class of problem that RDBMSes couldn’t—not to replace these systems outright. The discriminating developer would be wise to understand both systems and utilize each where appropriate, even at times mixing both types of storage in a single application.

One situation well-suited for a NoSQL database is output caching. NoSQL databases are ideal for working with transient or temporary data, and cached pages from an ASP.NET application certainly fit that bill. One popular NoSQL option is MongoDB (mongodb.org), a document-oriented NoSQL database used by Shutterfly, Foursquare, The New York Times and many others. MongoDB is a fully open source database written in C++, with drivers for nearly every major programming language, C# included. We’ll use MongoDB as the storage mechanism for our custom output cache provider.

Building a Custom OutputCacheProvider Using MongoDB

To get started, you’ll want to go to mongodb.org to download and install the tool. The documents at mongodb.org/display/DOCS/Quickstart should tell you everything you need to know to install MongoDB on Windows, Mac OS X and Unix. Once you’ve downloaded MongoDB and tested things out with the shell, I recommend installing the database as a service using the following command from the installation directory (be sure to run cmd.exe as an administrator):

C:\Tools\MongoDB\bin>mongod.exe --logpath C:\Tools\MongoDB\Logs --directoryperdb --install

MongoDB will install itself as a service on your computer and will use C:\Data\db as the default directory for all its databases. The option --directoryperdb tells MongoDB to create a root directory for every database you create.

After running the previous command, type the following to start the service:

net start MongoDB

Once you have things up and running, you’ll need to install a driver library to work with MongoDB in .NET. There are several options available; I’ll be using the mongodb-csharp driver created by Sam Corder (github.com/samus/mongodb-csharp).

We have MongoDB installed, and we have a driver that we can use within a .NET application, so now it’s time to create our custom output cache provider. To do this, I created a new class library called DocumentCache and added two classes: DocumentDatabaseOutputCacheProvider and CacheItem.

The first is my provider, a public class that subclasses the abstract OutputCacheProvider. The beginning implementation is depicted in Figure 2.

Figure 2 A Starter OutputCacheProvider Class

public class DocumentDatabaseOutputCacheProvider : OutputCacheProvider  
{ 
  readonly Mongo _mongo; 
  readonly IMongoCollection<CacheItem> _cacheItems; 
         
  public override object Get(string key) 
  { 
    return null; 
  } 
  
  public override object Add(string key, object entry, DateTime utcExpiry) 
  { 
    return null; 
  } 
  
  public override void Set(string key, object entry, DateTime utcExpiry) 
  { 
    return; 
  } 
  
  public override void Remove(string key) 
  { 
    return; 
  } 
}

Notice that the second private variable in Figure 2 references CacheItem, the other class I need to create in my project. CacheItem exists to contain the relevant details that my output cache provider needs to work with both ASP.NET and my database, but it isn’t an object needed external to my provider. As such, I define CacheItem as an internal class, as shown here:

[Serializable] 
internal class CacheItem 
{ 
  public string Id { get; set; } 
  public byte[] Item { get; set; } 
  public DateTime Expiration { get; set; } 
}

Id maps to the key provided to me by ASP.NET. You’ll recall that the key is a combination of the path and any VaryBy conditions defined in your page directive or action attribute. The Expiration property holds the UTC expiry time that ASP.NET derives from the Duration parameter, and the Item property is the serialized item to be cached.

We’ll start implementing our provider by setting things up in the constructor of our DocumentDatabaseOutputCacheProvider class. Because we know that ASP.NET maintains a single instance of our provider for the entire life of the application, we can perform some setup work in our constructor, like this:

readonly Mongo _mongo; 
readonly IMongoCollection<CacheItem> _cacheItems; 
         
public DocumentDatabaseOutputCacheProvider() 
{ 
  _mongo = new Mongo(); 
  _mongo.Connect(); 
             
  var store = _mongo.GetDatabase("OutputCacheDB"); 
  _cacheItems = store.GetCollection<CacheItem>(); 
}

The constructor creates a new instance of the Mongo type and connects to the server using the default location (localhost). It then asks MongoDB for the OutputCacheDB database and for an IMongoCollection of our CacheItem type. Because MongoDB is a schema-less database, creating databases on the fly is supported. Your first call to _mongo.GetDatabase("OutputCacheDB") will return an instance of a new database, and that database will be created on disk when the first insert occurs.

Now let’s implement the Add method, as shown in Figure 3.

Figure 3 Implementing the Add Method

public override object Add(string key, object entry, DateTime utcExpiry) 
{ 
  key = MD5(key); 
  var item = _cacheItems.FindOne(new { _id = key }); 
  if (item != null) { 
    if (item.Expiration.ToUniversalTime() <= DateTime.UtcNow) { 
      _cacheItems.Remove(item); 
    } else { 
      return Deserialize(item.Item); 
    } 
  } 
  
  _cacheItems.Insert(new CacheItem 
  { 
    Id = key, 
    Item = Serialize(entry), 
    Expiration = utcExpiry 
  }); 
  
  return entry;
}

The first thing I do in each method is call the MD5 method on the passed-in key. This method (omitted for brevity, but available in the online source code download) generates a database-friendly MD5 hash based on the key that ASP.NET provides to me. Then, I use my IMongoCollection<CacheItem> field, _cacheItems, to query the underlying database for the key in question. Notice the anonymous type (new { _id = key }) passed into the FindOne method. Querying MongoDB is primarily done via selector objects or template documents that specify one or more fields in a document to match in the database. _id is the key that MongoDB uses to store documents, and, by convention of the driver I’m using, that property is automatically mapped to the Id property of my CacheItem class. So when I save a new cache item, as you see in the _cacheItems.Insert method shown in Figure 3, the key is assigned using the Id property, which MongoDB uses to populate the internal _id field of the record. MongoDB is a document database, so each CacheItem object is stored using binary-serialized JSON that looks like the following:

{ "_id" : ObjectId(Id), "CacheItem": new CacheItem { Id = key, Item = entry, Expiration = utcExpiry } }

If I find a CacheItem with the same key as the one passed in, I check the expiration of that item against the current UTC time. If the item hasn’t expired, I binary deserialize it using a private method (available in the online source code) and return the existing item. If it has expired, I remove it. Then, whenever no unexpired item exists, I binary serialize the passed-in entry, insert it into my store and return that entry.
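Because the MD5 helper is left to the source code download, here’s a minimal sketch of what such a method could look like. It’s an assumption rather than the downloadable implementation: the KeyHasher class and HashKey method names are made up for illustration, and the choice of UTF-8 encoding and lowercase hex output is mine.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class KeyHasher
{
  // Hypothetical stand-in for the article's MD5(key) helper.
  public static string HashKey(string key)
  {
    using (var md5 = MD5.Create())
    {
      // Hash the UTF-8 bytes of the ASP.NET-supplied key.
      byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(key));

      // Render the 16 hash bytes as a 32-character lowercase hex string.
      var hex = new StringBuilder(hash.Length * 2);
      foreach (byte b in hash)
      {
        hex.Append(b.ToString("x2"));
      }
      return hex.ToString();
    }
  }
}
```

Hashing gives every key the same fixed length and a restricted character set, whatever path or query string values ASP.NET folds into the original key. MD5 is used here only to derive an identifier, not for security.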

Once I’ve implemented adding items to the cache, I can add the Get method, which will find and return a cached item by key (or null if a result isn’t found) as shown in Figure 4.

Figure 4 Implementing the Get Method

public override object Get(string key) 
{ 
  key = MD5(key);  
  var cacheItem = _cacheItems.FindOne(new { _id = key }); 
  
  if (cacheItem != null) {
    if (cacheItem.Expiration.ToUniversalTime() <= DateTime.UtcNow) {
      _cacheItems.Remove(cacheItem);
    } else {
      return Deserialize(cacheItem.Item);
    }
  }
    
  return null; 
}

As with the Add method, the Get method also checks the expiration of the item if it exists in the database and, if it has expired, removes it and returns null. If the item exists and hasn’t expired, it’s returned.
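Both Add and Get lean on Serialize and Deserialize helpers that live in the online source code. The following is a minimal sketch of what such helpers could look like, and the downloadable versions may differ. It assumes the .NET Framework 4 that this article targets, where BinaryFormatter is available (later versions of .NET have deprecated it), and it assumes the entry handed to the provider is marked [Serializable]. The CacheSerializer class name is hypothetical.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

public static class CacheSerializer
{
  // Converts a [Serializable] object into a byte array for storage.
  public static byte[] Serialize(object entry)
  {
    var formatter = new BinaryFormatter();
    using (var stream = new MemoryStream())
    {
      formatter.Serialize(stream, entry);
      return stream.ToArray();
    }
  }

  // Rebuilds the original object from the stored bytes.
  public static object Deserialize(byte[] data)
  {
    var formatter = new BinaryFormatter();
    using (var stream = new MemoryStream(data))
    {
      return formatter.Deserialize(stream);
    }
  }
}
```

Storing the entry as a byte array keeps the CacheItem document simple, since MongoDB only has to persist an id, a date and a binary field. Binary deserialization should only ever be applied to data your own application wrote, which is the case for a cache that the provider alone populates.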

Now, let’s implement the Remove method, which accepts a key and removes the item matching that key from the database, as shown here:

public override void Remove(string key) 
{ 
  key = MD5(key); 
  _cacheItems.Remove(new { _id = key }); 
}

Just as with the code our driver uses to get a database that doesn’t yet exist, MongoDB doesn’t complain if we attempt to remove an item that isn’t found in our database. It simply does nothing.

According to our abstract base class, there’s still one final method we need to implement to have a functional custom output cache provider, the Set method. I’ve included it in Figure 5.

Figure 5 Implementing the Set Method

public override void Set(string key, object entry, DateTime utcExpiry)
{
  key = MD5(key); 
  var item = _cacheItems.FindOne(new { _id = key }); 
  
  if (item != null) 
  { 
    item.Item = Serialize(entry); 
    item.Expiration = utcExpiry; 
    _cacheItems.Save(item); 
  } 
  else 
  { 
    _cacheItems.Insert(new CacheItem 
    { 
      Id = key, 
      Item = Serialize(entry), 
      Expiration = utcExpiry 
    }); 
  }
}

At a glance, it may seem that the Add and Set methods are identical, but there’s a key difference between their intended implementation. According to the MSDN Library documents on the OutputCacheProvider class (bit.ly/fozTLc), the Add method of a custom provider should look for a value in the cache that matches the specified key and, if it exists, do nothing to the cache and return the saved item. If that item doesn’t exist, Add should insert it.

The Set method, on the other hand, should always put its value into the cache, inserting the item if it doesn’t exist and overwriting it if it does. You’ll notice, in Figure 3 for Add and Figure 5 for Set, that these methods behave as specified.
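The difference between Add and Set is easiest to see with the database taken out of the picture. Here’s a minimal sketch that swaps MongoDB for a plain dictionary. The InMemoryCacheSketch class is hypothetical and, so that it compiles without a reference to System.Web, it only mirrors the four method signatures rather than deriving from OutputCacheProvider.

```csharp
using System;
using System.Collections.Generic;

// Illustration only: same method shapes as OutputCacheProvider,
// backed by a dictionary instead of MongoDB.
public class InMemoryCacheSketch
{
  private sealed class Entry
  {
    public object Value;
    public DateTime UtcExpiry;
  }

  private readonly Dictionary<string, Entry> _items = new Dictionary<string, Entry>();
  private readonly object _sync = new object();

  // Add: keep an existing, unexpired item and return it; otherwise insert.
  public object Add(string key, object entry, DateTime utcExpiry)
  {
    lock (_sync)
    {
      Entry existing;
      if (_items.TryGetValue(key, out existing) && existing.UtcExpiry > DateTime.UtcNow)
      {
        return existing.Value;
      }
      _items[key] = new Entry { Value = entry, UtcExpiry = utcExpiry };
      return entry;
    }
  }

  // Set: always overwrite, whether or not the key already exists.
  public void Set(string key, object entry, DateTime utcExpiry)
  {
    lock (_sync)
    {
      _items[key] = new Entry { Value = entry, UtcExpiry = utcExpiry };
    }
  }

  // Get: return the item if present and unexpired; evict it if expired.
  public object Get(string key)
  {
    lock (_sync)
    {
      Entry existing;
      if (!_items.TryGetValue(key, out existing)) return null;
      if (existing.UtcExpiry > DateTime.UtcNow) return existing.Value;
      _items.Remove(key);
      return null;
    }
  }

  public void Remove(string key)
  {
    lock (_sync) { _items.Remove(key); }
  }
}
```

Calling Add twice with the same key leaves the first value in place and returns it both times, while a call to Set replaces whatever was stored.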

With those four methods implemented, we’re now ready to put our provider to work.

Using the MongoDB OutputCacheProvider in ASP.NET MVC

Once we’ve compiled our custom provider, we can add that provider to any ASP.NET application with a few lines of configuration. After adding a reference to the assembly that contains the provider, add the following text to your web.config file in the <system.web> section:

<caching> 
  <outputCache defaultProvider="DocumentDBCache"> 
    <providers> 
      <add name="DocumentDBCache" 
       type="DocumentCache.DocumentDatabaseOutputCacheProvider, DocumentCache" /> 
    </providers> 
  </outputCache>
</caching>

The <providers> element defines all of the custom providers you want to add to your application and defines a name and type for each. Because you can have multiple custom providers in a single application, you’ll also want to specify the defaultProvider attribute, as I do in the preceding code snippet.
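When more than one provider is registered, the choice can also be made per request by overriding the GetOutputCacheProviderName method of HttpApplication in Global.asax. Here’s a minimal sketch of that idea. The ProviderSelector class and its path-based rule are hypothetical, and the selection logic is kept in a plain static method so it can be exercised outside of ASP.NET.

```csharp
using System;

public static class ProviderSelector
{
  // Hypothetical rule: customer pages use the MongoDB-backed provider
  // registered as "DocumentDBCache"; everything else uses the fallback.
  public static string ChooseProvider(string requestPath, string fallbackProvider)
  {
    if (requestPath != null &&
        requestPath.StartsWith("/Customers", StringComparison.OrdinalIgnoreCase))
    {
      return "DocumentDBCache";
    }
    return fallbackProvider;
  }
}

// In Global.asax.cs the rule could be wired up as follows (shown as a
// comment because it needs System.Web and a running ASP.NET application):
//
// public override string GetOutputCacheProviderName(HttpContext context)
// {
//   return ProviderSelector.ChooseProvider(
//     context.Request.Path, base.GetOutputCacheProviderName(context));
// }
```

Passing the base implementation’s result as the fallback means any request the rule doesn’t match still goes to whatever defaultProvider names in web.config.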

My sample application is a simple ASP.NET MVC site with a CustomersController. In that controller is an action called TopCustomers, which returns a list of the top customers for my business. This information is the result of complex calculations and several database queries in my SQL Server database and is only updated once an hour. For these reasons, it’s an ideal candidate for caching. So I add an OutputCache attribute to my action, like so:

[OutputCache(Duration = 3600, VaryByParam = "none")] 
public ActionResult TopCustomers() 
{ 
  var topCustomers = _repository.GetTopCustomers(); 
  return View(topCustomers); 
}

Now, if I run the site and navigate to my TopCustomers page, my custom provider will roll into action. First, my Get method will be called, but because this page isn’t yet cached, nothing will be returned. The controller action will then execute and return the TopCustomers view, as depicted in Figure 6.

image: The Cached TopCustomers View

Figure 6 The Cached TopCustomers View

ASP.NET will then call my custom cache provider, executing the Set method, and the item will be cached. I’ve set the duration to 3,600 seconds (60 minutes), and every subsequent request for that time period will use the cached item returned by my Get method, bypassing re-execution of my controller action. If any underlying data is changed, updates will be reflected on the first execution after expiration, and that new information will then be cached for an hour.

If you want to see MongoDB in action, you have a couple of options. You can open your browser and navigate to http://localhost:28017/, which displays the log, as well as recent queries and statistics about your database. Or, you can run mongo.exe from the bin directory of your MongoDB installation and query your database via the Mongo Shell. For more information about using these tools, see mongodb.org.

Using the DistributedCache Provider

So what if everything I’ve discussed so far is more than you’d care to dive into? Perhaps you want to leverage an alternative caching mechanism, but you have neither the time nor the desire to roll your own? You’ll be happy to know that, since the introduction of extensible output caching, many alternatives (commercial, open source and provided by Microsoft) are either available or in development. One such example is a cloud-based, distributed in-memory cache: the DistributedCache provider currently available as a part of Azure. If you’re already building cloud-based applications, Azure Caching can speed up access to data for those applications and, because caching is delivered as a cloud service, the setup is simple and requires no overhead to maintain.

At the time of this writing, Azure Caching is part of the Azure Community Technology Preview October Release, so you can access caching features without an active Azure account. However, if you’re an MSDN subscriber, I highly recommend activating your Azure benefits at Microsoft Azure. Go to azure.microsoft.com/ and create an account to use the developer preview features.

Once you’ve created a Labs account, click on the Add Service Namespace link to enable Azure services (see Figure 7).

image: Azure Labs Summary Page

Figure 7 Azure Labs Summary Page

After you’ve set up your service namespace, click on the Cache link, and take note of the service URL and authentication token listed in the cache section (see Figure 8). You’ll need this information to configure your application to use the DistributedCache provider.

image: Azure Labs Cache Settings Page

Figure 8 Azure Labs Cache Settings Page

Next, you’ll need to download and install the Azure SDK (click the download link on the Cache page in the portal). After the installation is complete, you’re ready to configure Azure Caching for your application.

You’ll need to add references to several assemblies that the SDK installation placed on your machine. Using the same ASP.NET MVC application you used for your custom Document database cache, navigate to the SDK install location (default is C:\Program Files*\Azure SDK\V2.0\Assemblies\Cache) and add references to each assembly contained within.

Once you’ve done that, open your web.config and add the following <configSections> entry, keeping any existing configuration sections:

<configSections> 
  <section name="dataCacheClient"  
     type="Microsoft.ApplicationServer.Caching.DataCacheClientSection, 
           Microsoft.ApplicationServer.Caching.Core"  
     allowLocation="true" allowDefinition="Everywhere"/>
</configSections>

Next, create the <dataCacheClient> section, replacing the <host> name, cachePort and <messageSecurity> authorizationInfo properties with details from your portal account, like so:

<dataCacheClient deployment="Simple"> 
  <hosts> 
    <host name="yournamespace.cache.appfabriclabs.com" cachePort="your port" /> 
  </hosts> 
  <securityProperties mode="Message"> 
    <messageSecurity authorizationInfo="your authentication token"> 
    </messageSecurity> 
  </securityProperties> 
</dataCacheClient>

Then, find the <caching> section under <system.web> and add the following provider entry after the entry you created for your custom provider:

<add name="DistributedCache" 
     type="Microsoft.Web.DistributedCache.DistributedCacheOutputCacheProvider, 
      Microsoft.Web.DistributedCache" 
     cacheName="default" />

Finally, change the defaultProvider attribute on the <outputCache> element to “DistributedCache.” The DistributedCacheOutputCacheProvider is a subclass type of the abstract OutputCacheProvider, just like our MongoDB implementation. Now, build and run your application and navigate to the Top Customers page. Try adding a customer while the list is still cached and notice that, as with our MongoDB implementation, the list will remain cached as long as you specify.

Wrapping Up

In this article, I discussed ASP.NET output caching, classical uses of the default in-memory cache, and new extensible caching facilities provided using the OutputCacheProvider abstract class in the .NET Framework 4. I talked about NoSQL and document databases and how these types of systems are ideal for working with transient data, such as cached output. We used MongoDB to build a sample output cache and used that within an ASP.NET MVC application. Finally, we moved our output cache to the cloud, and with minor setup and configuration and no code changes whatsoever, we were able to swap out caching mechanisms in our application.

Extensible output caching is just one of the many great new features in ASP.NET 4, and I hope this exploration of the feature—and of the technologies that can be leveraged along with it—has been useful. To learn more about MongoDB, go to mongodb.org. To learn more about Microsoft Azure, go to windowsazure.com.


Brandon Satrom works as a developer evangelist for Microsoft in Austin, Texas. He blogs at userinexperience.com and can be found on Twitter: @BrandonSatrom.

Thanks to the following technical experts for reviewing this article: Brian H. Prince and Clark Sell