Token cache serialization

After Microsoft Authentication Library (MSAL) acquires a token, it caches that token. Public client applications (desktop and mobile apps) should try to get a token from the cache before acquiring a token by another method. Acquisition methods on confidential client applications manage the cache themselves. This article discusses default and custom serialization of the token cache in MSAL.NET.
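The cache-first pattern for public client applications can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class and method names (CacheFirstTokenAcquisition, GetTokenAsync) are hypothetical, and the sketch assumes an already-built IPublicClientApplication and an interactive prompt as the fallback method.

```csharp
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Identity.Client;

public static class CacheFirstTokenAcquisition
{
    // Hypothetical helper: try the token cache first, then fall back to an interactive prompt.
    public static async Task<AuthenticationResult> GetTokenAsync(
        IPublicClientApplication app, string[] scopes)
    {
        // Accounts that MSAL already knows about from the user token cache.
        var accounts = await app.GetAccountsAsync();
        IAccount account = accounts.FirstOrDefault();

        try
        {
            // Served from the cache when possible; MSAL refreshes the token silently if needed.
            return await app.AcquireTokenSilent(scopes, account).ExecuteAsync();
        }
        catch (MsalUiRequiredException)
        {
            // Nothing usable in the cache (or no account yet): ask the user to sign in.
            return await app.AcquireTokenInteractive(scopes).ExecuteAsync();
        }
    }
}
```

Catching MsalUiRequiredException, rather than checking the cache by hand, leaves the decision about whether a cached token is still usable to MSAL.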

Summary

The recommendation is:

  • When writing mobile apps, you don't need to configure caching; MSAL preconfigures it.
  • When writing a desktop application, use the cross-platform token cache as explained in desktop apps.
  • When writing new confidential client applications (web apps, web APIs, or service-to-service or daemon apps), use Microsoft.Identity.Web as a higher-level API. It integrates with ASP.NET Core and ASP.NET Classic, and also works standalone.
  • Existing confidential client applications that leverage MSAL.NET directly can continue to do so.
  • Web apps and web APIs should use a distributed token cache (e.g., Redis, SQL Server, Azure Cosmos DB) in conjunction with a constrained memory cache.
  • Encryption at rest can be optionally configured using ASP.NET Core Data Protection.
  • Web apps may also rely on session cookies; however, this option is not recommended due to cookie size.
  • Service-to-service and daemon apps may rely on memory caching only. If your app serves many tenants, configure an eviction policy.
  • Managed identity tokens are cached in memory only.

The Microsoft.Identity.Web.TokenCache NuGet package provides token cache serialization within the Microsoft.Identity.Web library. The library provides integration with both ASP.NET Core and ASP.NET Classic, and its abstractions can be used to drive other web app or API frameworks.
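Outside ASP.NET Core, the package can attach a cache directly to a confidential client application. The following is a minimal sketch under stated assumptions: it uses the AddInMemoryTokenCache extension method from the Microsoft.Identity.Web.TokenCache package, the helper name (DaemonTokenProvider) and its parameters are hypothetical, and a client secret stands in for whatever credential your app uses.

```csharp
using System.Threading.Tasks;
using Microsoft.Identity.Client;
using Microsoft.Identity.Web; // Assumed source of the AddInMemoryTokenCache extension method

public static class DaemonTokenProvider
{
    // Hypothetical helper for a service-to-service or daemon app.
    public static async Task<AuthenticationResult> GetAppTokenAsync(
        string clientId, string clientSecret, string authority, string[] scopes)
    {
        IConfidentialClientApplication app = ConfidentialClientApplicationBuilder
            .Create(clientId)
            .WithClientSecret(clientSecret) // Assumption: a certificate can be used instead
            .WithAuthority(authority)
            .Build();

        // Attach the in-memory token cache provided by Microsoft.Identity.Web.TokenCache.
        app.AddInMemoryTokenCache();

        // AcquireTokenForClient manages the application token cache itself.
        return await app.AcquireTokenForClient(scopes).ExecuteAsync();
    }
}
```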

Note

The examples below are for ASP.NET Core. For ASP.NET, the code is similar. See the ms-identity-aspnet-webapp-openidconnect web app sample for a reference implementation.

Extension method | Description
AddInMemoryTokenCaches | Creates a temporary cache in memory for token storage and retrieval. In-memory token caches are faster than other cache types, but their tokens aren't persisted between application restarts, and you can't control the cache size. In-memory caches are good for applications that don't require tokens to persist between app restarts. Use an in-memory token cache in apps that participate in machine-to-machine auth scenarios like services, daemons, and others that use AcquireTokenForClient (the client credentials grant). In-memory token caches are also good for sample applications and during local app development. Microsoft.Identity.Web versions 1.19.0+ share an in-memory token cache across all application instances.
AddSessionTokenCaches | The token cache is bound to the user session. This option isn't ideal if the ID token contains many claims, because the cookie becomes too large.
AddDistributedTokenCaches | The token cache is an adapter against the ASP.NET Core IDistributedCache implementation. It enables you to choose between a distributed memory cache, a Redis cache, a distributed NCache, or a SQL Server cache. For details about the IDistributedCache implementations, see Distributed memory cache.

In-memory token cache

Here's an example of code that uses the in-memory cache in the ConfigureServices method of the Startup class in an ASP.NET Core application:

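using Microsoft.AspNetCore.Authentication.OpenIdConnect; // OpenIdConnectDefaults
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Identity.Web;

public class Startup
{
  const string scopesToRequest = "user.read";

  // Configuration is the app's IConfiguration, typically injected
  // through the Startup constructor (not shown).
  public void ConfigureServices(IServiceCollection services)
  {
    // code before
    services.AddAuthentication(OpenIdConnectDefaults.AuthenticationScheme)
            .AddMicrosoftIdentityWebApp(Configuration)
              .EnableTokenAcquisitionToCallDownstreamApi(new string[] { scopesToRequest })
                .AddInMemoryTokenCaches();
    // code after
  }
  // other Startup members
}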

AddInMemoryTokenCaches is suitable in production if you request app-only tokens. If you use user tokens, consider using a distributed token cache.

Token cache configuration code is similar between ASP.NET Core web apps and web APIs.

Distributed token caches

Here are examples of possible distributed caches:

// Or use a distributed token cache by adding
   services.AddAuthentication(OpenIdConnectDefaults.AuthenticationScheme)
           .AddMicrosoftIdentityWebApp(Configuration)
             .EnableTokenAcquisitionToCallDownstreamApi(new string[] { scopesToRequest })
               .AddDistributedTokenCaches();

// Distributed token caches have an L1/L2 mechanism.
// L1 is in memory, and L2 is the distributed cache
// implementation that you will choose below.
// You can configure them to limit the memory of the 
// L1 cache, encrypt, and set eviction policies.
services.Configure<MsalDistributedTokenCacheAdapterOptions>(options => 
  {
    // Optional: Disable the L1 cache in apps that don't use session affinity
    //                 by setting DisableL1Cache to 'true'.
    options.DisableL1Cache = false;
    
    // Or limit the memory (by default, this is 500 MB)
    options.L1CacheOptions.SizeLimit = 1024 * 1024 * 1024; // 1 GB

    // You can choose if you encrypt or not encrypt the cache
    options.Encrypt = false;

    // And you can set eviction policies for the distributed
    // cache.
    options.SlidingExpiration = TimeSpan.FromHours(1);
  });

// Then, choose your implementation of distributed cache
// -----------------------------------------------------

// good for prototyping and testing, but this is NOT persisted and it is NOT distributed - do not use in production
services.AddDistributedMemoryCache();

// Or a Redis cache
// Requires the Microsoft.Extensions.Caching.StackExchangeRedis NuGet package
services.AddStackExchangeRedisCache(options =>
{
    options.Configuration = "localhost";
    options.InstanceName = "SampleInstance";
});

// You can even decide if you want to repair the connection
// with Redis and retry on Redis failures. 
services.Configure<MsalDistributedTokenCacheAdapterOptions>(options => 
{
  options.OnL2CacheFailure = (ex) =>
  {
    if (ex is StackExchange.Redis.RedisConnectionException)
    {
      // Take corrective action here, for example, try to reconnect
      return true; // retry the cache operation
    }
    return false;
  };
});

// Or even a SQL Server token cache
// Requires the Microsoft.Extensions.Caching.SqlServer NuGet package
services.AddDistributedSqlServerCache(options =>
{
    options.ConnectionString = Configuration["DistCache_ConnectionString"];
    options.SchemaName = "dbo";
    options.TableName = "TestCache";
});

// Or an Azure Cosmos DB cache
// Requires the Microsoft.Extensions.Caching.Cosmos NuGet package
services.AddCosmosCache((CosmosCacheOptions cacheOptions) =>
{
    cacheOptions.ContainerName = Configuration["CosmosCacheContainer"];
    cacheOptions.DatabaseName = Configuration["CosmosCacheDatabase"];
    cacheOptions.ClientBuilder = new CosmosClientBuilder(Configuration["CosmosConnectionString"]);
    cacheOptions.CreateIfNotExists = true;
});

For more information, see the ASP.NET Core web app tutorial, which features the usage of a distributed cache in phase 2-2 (token cache).

Monitor cache hit ratios and cache performance

MSAL exposes important metrics as part of the AuthenticationResult.AuthenticationResultMetadata object. You can log these metrics to assess the health of your application.

Metric | Meaning | When to trigger an alarm?
DurationTotalInMs | Total time spent in MSAL, including network calls and cache. | Alarm on overall high latency (> 1 second). Value depends on token source. From the cache: one cache access. From Microsoft Entra ID: two cache accesses plus one HTTP call. First ever call (per-process) takes longer because of one extra HTTP call.
DurationInCacheInMs | Time spent loading or saving the token cache, which is customized by the app developer (for example, save to Redis). | Alarm on spikes.
DurationInHttpInMs | Time spent making HTTP calls to Microsoft Entra ID. | Alarm on spikes.
TokenSource | Source of the token. Tokens are retrieved from the cache much faster (for example, ~100 ms versus ~700 ms). Can be used to monitor and alarm the cache hit ratio. | Use with DurationTotalInMs.
CacheRefreshReason | Reason for fetching the access token from the identity provider. | Use with TokenSource.
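One way to log these metrics is sketched below. The helper name (TokenMetricsLogger) is hypothetical, Console output stands in for your logging framework, and the 1-second threshold is taken from the table above.

```csharp
using System;
using Microsoft.Identity.Client;

public static class TokenMetricsLogger
{
    // Hypothetical helper: writes the metrics from the table above for one token request.
    public static void Log(AuthenticationResult result)
    {
        AuthenticationResultMetadata metadata = result.AuthenticationResultMetadata;

        // A cache hit means the token was not fetched from the identity provider.
        bool cacheHit = metadata.TokenSource == TokenSource.Cache;

        Console.WriteLine(
            $"source={metadata.TokenSource} cacheHit={cacheHit} " +
            $"totalMs={metadata.DurationTotalInMs} " +
            $"cacheMs={metadata.DurationInCacheInMs} " +
            $"httpMs={metadata.DurationInHttpInMs} " +
            $"refreshReason={metadata.CacheRefreshReason}");

        // Flag overall high latency (more than 1 second).
        if (metadata.DurationTotalInMs > 1000)
        {
            Console.WriteLine("WARNING: token acquisition took longer than 1 second.");
        }
    }
}
```

Counting cache hits against total requests over time gives the cache hit ratio to monitor.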

Size approximations

When using a token cache, it's important to consider the potential size of the cache, especially for highly available and distributed applications. When users sign in, there is one cache entry for each user, around 7 KB in size. The entry is larger if you call several downstream APIs. For service-to-service authentication, there is one cache entry for each tenant and downstream API, around 2 KB in size.

Detailed estimates are listed below.

Application flows (AcquireTokenForClient, AcquireTokenForManagedIdentity)

  • Only access tokens are cached. Each token is about 2-3 KB when persisted. There is one token per app client ID * tenant * downstream resource. For example, a multitenant app that serves 1,000 tenants and needs tokens for Microsoft Graph and SharePoint uses about 3 KB * 1,000 * 2, or approximately 6 MB.
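The arithmetic in this estimate can be reproduced with a few lines of code. This is a rough sketch: the names (AppTokenCacheEstimate, EstimateKilobytes) are hypothetical, and the 3 KB default is the upper end of the per-token size given above.

```csharp
using System;

public static class AppTokenCacheEstimate
{
    // One access token per tenant and downstream resource, for a single app client ID.
    public static long EstimateKilobytes(int tenants, int resources, int kilobytesPerToken = 3)
        => (long)tenants * resources * kilobytesPerToken;

    public static void Main()
    {
        // 1,000 tenants, two downstream resources (Microsoft Graph and SharePoint).
        long kilobytes = EstimateKilobytes(tenants: 1000, resources: 2);
        Console.WriteLine($"{kilobytes} KB (~{kilobytes / 1000.0:F0} MB)"); // 6000 KB (~6 MB)
    }
}
```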

Web site calling downstream web API (AcquireTokenByAuthCode)

  • Access tokens – 4 KB; 1 token per app client ID * user * tenant * downstream resource.
  • Refresh token – 2 KB; 1 token per app client ID * user.
  • ID token – 2 KB; 1 token per app client ID * user * number of tenants where that user signs in.

Note

We strongly recommend using the higher-level APIs from Microsoft.Identity.Web for this scenario rather than using MSAL directly. The caching considerations are the same.

Web API calling other web API (AcquireTokenOnBehalfOf)

The same as for the web site scenario, but there is one cache node for each session rather than for each user. By default, MSAL identifies a session by hashing the upstream assertion, but you can change this behavior. See Long Running OBO Processes.

Note

We strongly recommend using the higher-level APIs from Microsoft.Identity.Web for this scenario rather than using MSAL directly. The caching considerations are the same.

Token cache types

MSAL.NET operates with two types of token caches: user and application.

The application token cache holds access tokens for the application itself. It's maintained and updated silently when calling AcquireTokenForClient.

The user token cache holds ID tokens, access tokens, and refresh tokens for the accounts that MSAL.NET interacts with. It's read, and updated silently if needed, when calling AcquireTokenSilent. Every other token acquisition method also updates it, with the exception of AcquireTokenForClient, which uses only the application cache.
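Both caches are exposed on a confidential client application as AppTokenCache and UserTokenCache, and both accept the same serialization callbacks. The following is a minimal sketch of custom serialization, assuming a static in-process dictionary as a stand-in for real storage. The class and member names (TokenCacheTypesSketch, s_store, AttachTo, Attach) are hypothetical. For production web apps, the Microsoft.Identity.Web cache providers described earlier remain the recommended option.

```csharp
using System.Collections.Concurrent;
using Microsoft.Identity.Client;

public static class TokenCacheTypesSketch
{
    // Stand-in for durable storage, keyed by the cache key that MSAL suggests.
    private static readonly ConcurrentDictionary<string, byte[]> s_store =
        new ConcurrentDictionary<string, byte[]>();

    public static void AttachTo(IConfidentialClientApplication app)
    {
        Attach(app.AppTokenCache);  // access tokens acquired with AcquireTokenForClient
        Attach(app.UserTokenCache); // ID, access, and refresh tokens for user accounts
    }

    private static void Attach(ITokenCache cache)
    {
        // Before MSAL reads the cache, load the persisted blob (if any).
        cache.SetBeforeAccess(args =>
        {
            if (args.SuggestedCacheKey != null &&
                s_store.TryGetValue(args.SuggestedCacheKey, out byte[] blob))
            {
                args.TokenCache.DeserializeMsalV3(blob, shouldClearExistingCache: true);
            }
        });

        // After MSAL writes to the cache, persist the new state.
        cache.SetAfterAccess(args =>
        {
            if (args.HasStateChanged && args.SuggestedCacheKey != null)
            {
                s_store[args.SuggestedCacheKey] = args.TokenCache.SerializeMsalV3();
            }
        });
    }
}
```

Keying the store by SuggestedCacheKey keeps each user's or tenant's tokens in a separate entry instead of one shared blob.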

Next steps

The following samples illustrate token cache serialization.

Sample | Platform | Description
active-directory-dotnet-desktop-msgraph-v2 | Desktop (WPF) | Windows Desktop .NET (WPF) application that calls the Microsoft Graph API.
active-directory-dotnet-v1-to-v2 | Desktop (console) | Set of Visual Studio solutions that illustrate the migration of Azure AD v1.0 applications (using ADAL.NET) to Microsoft identity platform applications (using MSAL.NET).
ms-identity-aspnet-webapp-openidconnect | ASP.NET (net472) | Example of token cache serialization in an ASP.NET MVC application (using MSAL.NET).