Monitoring Azure Key Vault

When you have critical applications and business processes relying on Azure resources, you want to monitor those resources for their availability, performance, and operation. For Azure Key Vault, it is important to monitor your service as you start to scale, because the number of requests sent to your key vault will rise. This has a potential to increase the latency of your requests and, in extreme cases, cause your requests to be throttled, which will impact the performance of your service.

This article describes the monitoring data generated by Key Vault. Key Vault uses Azure Monitor. If you are unfamiliar with the features of Azure Monitor common to all Azure services that use it, read Monitoring Azure resources with Azure Monitor.

Monitoring overview page in Azure portal

The Overview page in the Azure portal for each key vault includes the following metrics on the "Monitoring" tab:

  • Total requests
  • Average Latency
  • Success ratio

You can select "additional metrics" (or the "Metrics" tab in the left-hand sidebar, under "Monitoring") to view these metrics as well:

  • Overall service API latency
  • Overall vault availability
  • Overall vault saturation
  • Total service API hits
  • Total service API results

Key Vault insights

Some services in Azure have a special focused pre-built monitoring dashboard in the Azure portal that provides a starting point for monitoring your service. These special dashboards are called "insights".

Key Vault insights provides comprehensive monitoring of your key vaults by delivering a unified view of your Key Vault requests, performance, failures, and latency. For full details, see Monitoring your key vault service with Key Vault insights.

Monitoring data

Key Vault collects the same kinds of monitoring data as other Azure resources that are described in Monitoring data from Azure resources.

See Monitoring Key Vault data reference for detailed information on the metrics and logs metrics created by Key Vault.

Collection and routing

Platform metrics and the Activity log are collected and stored automatically, but can be routed to other locations by using a diagnostic setting.

Resource Logs are not collected and stored until you create a diagnostic setting and route them to one or more locations.

See Create diagnostic setting to collect platform logs and metrics in Azure for the detailed process for creating a diagnostic setting using the Azure portal, CLI, or PowerShell. When you create a diagnostic setting, you specify which categories of logs to collect. The categories for Key Vault are listed in Key Vault monitoring data reference.

To create a diagnostic setting for you key vault, see Enable Key Vault logging. The metrics and logs you can collect are discussed in the following sections.

Analyzing metrics

You can analyze metrics for Key Vault with metrics from other Azure services using metrics explorer by opening Metrics from the Azure Monitor menu. See Analyze metrics with Azure Monitor metrics explorer for details on using this tool.

For a list of the platform metrics collected for Key Vault, see Monitoring Key Vault data reference metrics

Analyzing logs

Data in Azure Monitor Logs is stored in tables where each table has its own set of unique properties.

All resource logs in Azure Monitor have the same fields followed by service-specific fields. The common schema is outlined in Azure Monitor resource log schema

The Activity log is a type of platform log in Azure that provides insight into subscription-level events. You can view it independently or route it to Azure Monitor Logs, where you can do much more complex queries using Log Analytics.

For a list of the types of resource logs collected for Key Vault, see Monitoring Key Vault data reference

For a list of the tables used by Azure Monitor Logs and queryable by Log Analytics, see Monitoring Key Vault data reference

Sample Kusto queries

Important

When you select Logs from the Key Vault menu, Log Analytics is opened with the query scope set to the current key vault. This means that log queries will only include data from that resource. If you want to run a query that includes data from other key vaults, or data from other Azure services, select Logs from the Azure Monitor menu. See Log query scope and time range in Azure Monitor Log Analytics for details.

Here are some queries that you can enter into the Log search bar to help you monitor your Key Vault resources. These queries work with the new language.

  • Are there any clients using old TLS version (<1.2)?

    AzureDiagnostics
    | where TimeGenerated > ago(90d) 
    | where ResourceProvider =="MICROSOFT.KEYVAULT" 
    | where isnotempty(tlsVersion_s) and strcmp(tlsVersion_s,"TLS1_2") <0
    | project TimeGenerated,Resource, OperationName, requestUri_s, CallerIPAddress, OperationVersion,clientInfo_s,tlsVersion_s,todouble(tlsVersion_s)
    | sort by TimeGenerated desc
    
  • Are there any slow requests?

    // List of KeyVault requests that took longer than 1sec. 
    // To create an alert for this query, click '+ New alert rule'
    let threshold=1000; // let operator defines a constant that can be further used in the query
    
    AzureDiagnostics
    | where ResourceProvider =="MICROSOFT.KEYVAULT" 
    | where DurationMs > threshold
    | summarize count() by OperationName, _ResourceId
    
  • Are there any failures?

    // Count of failed KeyVault requests by status code. 
    // To create an alert for this query, click '+ New alert rule'
    
    AzureDiagnostics
    | where ResourceProvider =="MICROSOFT.KEYVAULT" 
    | where httpStatusCode_d >= 300 and not(OperationName == "Authentication" and httpStatusCode_d == 401)
    | summarize count() by requestUri_s, ResultSignature, _ResourceId
    // ResultSignature contains HTTP status, e.g. "OK" or "Forbidden"
    // httpStatusCode_d contains HTTP status code returned
    
  • Input deserialization errors

    // Shows errors caused due to malformed events that could not be deserialized by the job. 
    // To create an alert for this query, click '+ New alert rule'
    
    AzureDiagnostics
    | where ResourceProvider == "MICROSOFT.KEYVAULT" and parse_json(properties_s).DataErrorType in ("InputDeserializerError.InvalidData", "InputDeserializerError.TypeConversionError", "InputDeserializerError.MissingColumns", "InputDeserializerError.InvalidHeader", "InputDeserializerError.InvalidCompressionType")
    | project TimeGenerated, Resource, Region_s, OperationName, properties_s, Level, _ResourceId
    
  • How active has this KeyVault been?

    // Line chart showing trend of KeyVault requests volume, per operation over time. 
    // KeyVault diagnostic currently stores logs in AzureDiagnostics table which stores logs for multiple services. 
    // Filter on ResourceProvider for logs specific to a service.
    
    AzureDiagnostics
    | where ResourceProvider =="MICROSOFT.KEYVAULT" 
    | summarize count() by bin(TimeGenerated, 1h), OperationName // Aggregate by hour
    | render timechart
    
    
  • Who is calling this KeyVault?

    // List of callers identified by their IP address with their request count.  
    // KeyVault diagnostic currently stores logs in AzureDiagnostics table which stores logs for multiple services. 
    // Filter on ResourceProvider for logs specific to a service.
    
    AzureDiagnostics
    | where ResourceProvider =="MICROSOFT.KEYVAULT"
    | summarize count() by CallerIPAddress
    
  • How fast is this KeyVault serving requests?

    // Line chart showing trend of request duration over time using different aggregations. 
    
    AzureDiagnostics
    | where ResourceProvider =="MICROSOFT.KEYVAULT" 
    | summarize avg(DurationMs) by requestUri_s, bin(TimeGenerated, 1h) // requestUri_s contains the URI of the request
    | render timechart
    
  • What changes occurred last month?

    // Lists all update and patch requests from the last 30 days. 
    // KeyVault diagnostic currently stores logs in AzureDiagnostics table which stores logs for multiple services. 
    // Filter on ResourceProvider for logs specific to a service.
    
    AzureDiagnostics
    | where TimeGenerated > ago(30d) // Time range specified in the query. Overrides time picker in portal.
    | where ResourceProvider =="MICROSOFT.KEYVAULT" 
    | where OperationName == "VaultPut" or OperationName == "VaultPatch"
    | sort by TimeGenerated desc
    

Alerts

Azure Monitor alerts proactively notify you when important conditions are found in your monitoring data. They allow you to identify and address issues in your system preemptively. You can set alerts on metrics, logs, and the activity log.

If you are creating or running an application which runs on Azure Key Vault, Azure Monitor Application Insights may offer additional types of alerts.

Here are some common and recommended alert rules for Azure Key Vault -

  • Key Vault Availability drops below 100% (Static Threshold)
  • Key Vault Latency is greater than 1000ms (Static Threshold)
  • Overall Vault Saturation is greater than 75% (Static Threshold)
  • Overall Vault Saturation exceeds average (Dynamic Threshold)
  • Total Error Codes higher than average (Dynamic Threshold)

See Alerting for Azure Key Vault for more details.

Next steps