Monitoring applications using MSAL.NET

To ensure authentication services using MSAL.NET are running correctly, MSAL provides many ways to monitor its behavior so that issues can be identified and addressed before they occur in production. The incorrect use of MSAL (as it relates to token lifecycle and cache) doesn't lead to immediate failures, however sometimes they'll bubble up under high traffic scenarios after the app is in production for a period of time.

For example, if only one instance of a confidential client application is used and MSAL isn't configured to serialize the token cache, the cache will grow forever. Another issue arises when creating a new confidential client application and not utilizing the cache which will lead to issues such as throttling from the identity provider. For recommendations on how to utilize MSAL appropriately, see High Availability.

Logging

One of the tools MSAL provides to combat production issues is logging errors when MSAL isn't properly configured. It's critical to enable logging whenever possible to monitor logs for errors and help in the diagnosis of problematic events. See Logging in MSAL.NET for details.

The following errors will be logged in MSAL:

  • When using an authority ending in /common or /organizations for client credential authentication (AcquireTokenForClient(IEnumerable<String>)).
    • The current authority is targeting the /common or /organizations endpoint which is not recommended. See Client credential flows for more details.
  • When the default internal token cache is used while using confidential client applications.
    • The default token cache provided by MSAL is not designed to be performant when used in confidential client applications. Refer to Token cache serialization in MSAL.NET for more details.

Metrics

In addition to logging, MSAL exposes important metrics in AuthenticationResult.AuthenticationResultMetadata. See Add monitoring around MSAL operations for more details.

  • DurationTotalInMs - total time spent in MSAL acquiring a token, including network calls and cache operations. Create an alert on overall high latency (more than 1 second). Note that the first ever token acquisition call usually makes an extra HTTP call.

  • DurationInCacheInMs - time spent loading or saving the token cache, which is customized by the app developer (for example, save to Redis). Create an alert on spikes.

    Note

    To understand how to customize token caching, see Token cache serialization in MSAL.NET.

  • DurationInHttpInMs - time spent making HTTP calls to the identity provider. Create an alert on spikes.

  • TokenSource- indicates the source of the token - typically the cache or the identity provider. Tokens are retrieved from the cache much faster (for example, ~100 ms versus ~700 ms). This metric can be used to monitor the cache hit ratio.

  • CacheRefreshReason - specifies the reason for fetching the access token from the identity provider. See CacheRefreshReason. Use in conjunction with TokenSource.

  • TokenEndpoint - the actual token endpoint URI used to fetch the token. Useful to understand how MSAL resolves the tenant in silent calls and the region in regional calls.

    Note

    Regionalization is available only to internal Microsoft applications.

  • RegionDetails - the details about the region used to make call, such as the region used and any auto-detection error.

    Note

    Regionalization is available only to internal Microsoft applications.

OpenTelemetry

Starting with MSAL 4.58.0, the library supports OpenTelemetry - a set of APIs that enable instrumentation, generation, and collection of telemetry data in a consistent and standardized manner. To get started, ensure that you;

  1. Install the latest version of MSAL.NET.
  2. Add the OpenTelemetry package dependency to your project.
  3. Add an exporter dependency, that allows you to export logs, for example, the Console exporter for OpenTelemetry.NET.

Note

While the console exporter is a good start for local debugging and diagnostics, it's not the best choice for production-deployed applications. We recommend checking out the official exporter documentation to learn more about available options. If you are hosting applications on Azure, you may consider ingesting OpenTelemetry data in Azure Data Explorer or Azure Monitor.

In your application initialization code, prior to bootstrapping the MSAL authentication client (e.g., PublicClientApplication or ConfidentialClientApplication), declare a new MeterProvider instance, using the following code.

using var meterProvider = Sdk.CreateMeterProviderBuilder()
    .AddMeter("MicrosoftIdentityClient_Common_Meter")
    .AddConsoleExporter()
    .Build();

This will initialize the meter provider and use the built-in MSAL.NET meter (MicrosoftIdentityClient_Common_Meter) that captures a series of counters and histograms. When a console exporter is used, you should see the output being piped directly in the terminal:

Example of OpenTelemetry outputting metrics to the terminal

The following section outlines the supported counters and histograms for the default meter.

Counters

msalsuccess_counter

Counter to capture aggregation of successful requests in MSAL.

Metadata
Field Description
MsalVersion Version of MSAL used.
Platform .NET SKU used.
ApiId ID for the API used for token acquisition.
TokenSource Source of token (e.g., identity provider or cache).
CacheRefreshReason Reason for cache refresh.
CacheLevel L1, L2, or Unknown when the custom cache is used but level is not recorded.

msalfailure_counter

Counter to capture aggregation of failed requests in MSAL.

Metadata
Field Description
MsalVersion Version of MSAL used.
Platform .NET SKU used.
ErrorCode Microsoft Entra ID error code in case of MsalServiceException, MsalErrorCode in case of MsalClientException or name of the exception in case it isn't an MsalException.

Histograms

MsalTotalDuration_1a_histogram

Histogram to capture total latency in milliseconds for token acquisition through MSAL.

Metadata
Field Description
MsalVersion Version of MSAL used.
Platform .NET SKU used.
ApiId ID for the API used for token acquisition.
CacheLevel L1, L2, or Unknown when the custom cache is used but level is not recorded.
TokenSource Source of token (e.g., identity provider or cache).

MsalDurationInL1CacheInUs_1b_histogram

Histogram to capture latency when an L1 cache is used. Values are in microseconds for token acquisition through MSAL.

Metadata
Field Description
MsalVersion Version of MSAL used.
Platform .NET SKU used.
ApiId ID for the API used for token acquisition.

MsalDurationInL2Cache_1a_histogram

Histogram to capture L2 cache latency in milliseconds for token acquisition through MSAL.

Metadata
Field Description
MsalVersion Version of MSAL used.
Platform .NET SKU used.
ApiId ID for the API used for token acquisition.

MsalDurationInHttp_1a_histogram

Histogram to capture HTTP latency in milliseconds for token acquisition through MSAL.

Metadata
Field Description
MsalVersion Version of MSAL used.
Platform .NET SKU used.
ApiId ID for the API used for token acquisition.

Additional information

For additional information on the use of OpenTelemetry with .NET applications, refer to .NET observability with OpenTelemetry.