Collect Prometheus metrics from AKS cluster (preview)
This article describes how to configure your Azure Kubernetes Service (AKS) cluster to send data to Azure Monitor managed service for Prometheus. When you configure your AKS cluster to send data to Azure Monitor managed service for Prometheus, a containerized version of the Azure Monitor agent is installed with a metrics extension. You just need to specify the Azure Monitor workspace that the data should be sent to.
Note
The process described here doesn't enable Container insights on the cluster even though the Azure Monitor agent installed in this process is the same one used by Container insights. See Enable Container insights for different methods to enable Container insights on your cluster. See Collect Prometheus metrics with Container insights for details on adding Prometheus collection to a cluster that already has Container insights enabled.
Open the Azure Monitor workspaces menu in the Azure portal and select your workspace.
Select Managed Prometheus to display a list of AKS clusters.
Select Configure next to the cluster you want to enable.
Prerequisites
Register the AKS-PrometheusAddonPreview feature flag in the AKS cluster's subscription with the following Azure CLI command: az feature register --namespace Microsoft.ContainerService --name AKS-PrometheusAddonPreview.
The aks-preview extension needs to be installed using the command az extension add --name aks-preview. For more information on how to install a CLI extension, see Use and manage extensions with the Azure CLI.
The aks-preview extension version 0.5.122 or higher is required for this feature. You can check the aks-preview version by using the az version command.
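The prerequisite checks above can also be scripted. The following is a minimal sketch, assuming GNU sort (for the -V option) is available. The function names version_ge, feature_state, and check_aks_preview are hypothetical helpers introduced for illustration, not part of the Azure CLI, and the --query paths should be checked against the output of your CLI version.

```shell
# Minimum aks-preview version stated in the prerequisites.
MIN_AKS_PREVIEW="0.5.122"

# Succeeds when version $1 is greater than or equal to version $2.
# Assumption: GNU `sort -V` (version sort) is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Prints the registration state of the preview feature flag.
feature_state() {
  az feature show \
    --namespace Microsoft.ContainerService \
    --name AKS-PrometheusAddonPreview \
    --query properties.state --output tsv
}

# Reports whether the installed aks-preview extension is new enough.
check_aks_preview() {
  installed="$(az extension show --name aks-preview --query version --output tsv)"
  if version_ge "$installed" "$MIN_AKS_PREVIEW"; then
    echo "aks-preview $installed is new enough"
  else
    echo "aks-preview $installed is older than $MIN_AKS_PREVIEW" >&2
    return 1
  fi
}
```

After signing in with az login, call feature_state and check_aks_preview before you install the addon. Feature registration can take some time to complete, so the state might not change immediately after you run az feature register.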
Install metrics addon
Use az aks update with the --enable-azuremonitormetrics option to install the metrics addon. The following options are available, depending on the Azure Monitor workspace and Grafana workspace you want to use.
Create a new default Azure Monitor workspace.
If no Azure Monitor workspace is specified, a default Azure Monitor workspace is created in the resource group DefaultRG-<cluster_region> with a name in the format DefaultAzureMonitorWorkspace-<mapped_region>.
This Azure Monitor workspace is in the region specified in Region mappings.
az aks update --enable-azuremonitormetrics -n <cluster-name> -g <cluster-resource-group>
Use an existing Azure Monitor workspace.
If the Azure Monitor workspace is linked to one or more Grafana workspaces, then the data is available in Grafana.
az aks update --enable-azuremonitormetrics -n <cluster-name> -g <cluster-resource-group> --azure-monitor-workspace-resource-id <workspace-name-resource-id>
Use an existing Azure Monitor workspace and link with an existing Grafana workspace.
This creates a link between the Azure Monitor workspace and the Grafana workspace.
az aks update --enable-azuremonitormetrics -n <cluster-name> -g <cluster-resource-group> --azure-monitor-workspace-resource-id <azure-monitor-workspace-name-resource-id> --grafana-resource-id <grafana-workspace-name-resource-id>
The output for each command looks similar to the following:
The following optional parameters can be used with the previous commands.
--ksm-metric-annotations-allow-list is a comma-separated list of Kubernetes annotation keys that are used in the resource's labels metric. By default, the metric contains only name and namespace labels. To include more annotations, provide a list of resource names in their plural form and the Kubernetes annotation keys that you want to allow for them. A single * can be provided per resource instead to allow any annotations, but that has severe performance implications.
--ksm-metric-labels-allow-list is a comma-separated list of additional Kubernetes label keys that are used in the resource's labels metric. By default, the metric contains only name and namespace labels. To include more labels, provide a list of resource names in their plural form and the Kubernetes label keys that you want to allow for them. A single * can be provided per resource instead to allow any labels, but that has severe performance implications.
--enable-windows-recording-rules enables the recording rule groups required for the Windows dashboards to function properly.
Use annotations and labels.
az aks update --enable-azuremonitormetrics -n <cluster-name> -g <cluster-resource-group> --ksm-metric-labels-allow-list "namespaces=[k8s-label-1,k8s-label-n]" --ksm-metric-annotations-allow-list "pods=[k8s-annotation-1,k8s-annotation-n]"
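The allow-list values in the command above follow the pattern resource=[key1,key2]. As a minimal sketch, a small helper can build that string from a plural resource name and a list of keys. The function name allow_list_entry is hypothetical, and the az aks update call is shown only as a comment.

```shell
# Builds one allow-list entry in the form resource=[key1,key2].
# $1: resource name in plural form (for example, namespaces or pods)
# remaining arguments: label or annotation keys to allow
allow_list_entry() {
  resource="$1"
  shift
  # Join the remaining arguments with commas inside a subshell.
  keys="$(IFS=,; printf '%s' "$*")"
  printf '%s=[%s]' "$resource" "$keys"
}

# Example (not executed here):
#   az aks update --enable-azuremonitormetrics -n <cluster-name> -g <cluster-resource-group> \
#     --ksm-metric-labels-allow-list "$(allow_list_entry namespaces k8s-label-1 k8s-label-n)"
```

Building the value in one place keeps the brackets and commas consistent, which matters because malformed values can prevent the kube-state-metrics pod from running.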
Register the AKS-PrometheusAddonPreview feature flag in the AKS cluster's subscription with the following Azure CLI command: az feature register --namespace Microsoft.ContainerService --name AKS-PrometheusAddonPreview.
If the Azure Managed Grafana instance is in a subscription other than the Azure Monitor workspace's subscription, register the Azure Monitor workspace's subscription with the Microsoft.Dashboard resource provider following this documentation.
The Azure Monitor workspace and Azure Managed Grafana workspace must already be created.
The template needs to be deployed in the same resource group as the Azure Managed Grafana workspace.
Users with the 'User Access Administrator' role in the subscription of the AKS cluster can enable the 'Monitoring Data Reader' role directly by deploying the template.
Retrieve required values for Grafana resource
From the Overview page for the Azure Managed Grafana instance in the Azure portal, select JSON view.
If you're using an existing Azure Managed Grafana instance that already has been linked to an Azure Monitor workspace, then you need the list of Grafana integrations. Copy the value of the azureMonitorWorkspaceIntegrations field. If it doesn't exist, then the instance hasn't been linked with any Azure Monitor workspace.
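As an alternative to the JSON view, the same field can be read with the Azure CLI. This is a minimal sketch: the function name grafana_integrations is hypothetical, and the query path assumes that azureMonitorWorkspaceIntegrations sits under properties.grafanaIntegrations in the resource JSON, which you should confirm against the JSON view of your own instance.

```shell
# Prints the existing Azure Monitor workspace integrations of a Grafana
# instance as JSON.
# $1: full resource ID of the Azure Managed Grafana instance
# Assumption: the field lives at properties.grafanaIntegrations in the
# resource JSON; verify this in the portal's JSON view.
grafana_integrations() {
  az resource show --ids "$1" \
    --query "properties.grafanaIntegrations.azureMonitorWorkspaceIntegrations" \
    --output json
}

# Example (not executed here):
#   grafana_integrations "<grafana-workspace-name-resource-id>"
```

Keep the returned entries: they need to be carried into the template so that existing links aren't removed during deployment.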
| Parameter | Value |
|---|---|
| azureMonitorWorkspaceResourceId | Resource ID for the Azure Monitor workspace. Retrieve from the JSON view on the Overview page for the Azure Monitor workspace. |
| azureMonitorWorkspaceLocation | Location of the Azure Monitor workspace. Retrieve from the JSON view on the Overview page for the Azure Monitor workspace. |
| clusterResourceId | Resource ID for the AKS cluster. Retrieve from the JSON view on the Overview page for the cluster. |
| clusterLocation | Location of the AKS cluster. Retrieve from the JSON view on the Overview page for the cluster. |
| metricLabelsAllowlist | Comma-separated list of Kubernetes label keys that will be used in the resource's labels metric. |
| metricAnnotationsAllowList | Comma-separated list of Kubernetes annotation keys that will be used in the resource's labels metric. |
| grafanaResourceId | Resource ID for the managed Grafana instance. Retrieve from the JSON view on the Overview page for the Grafana instance. |
| grafanaLocation | Location for the managed Grafana instance. Retrieve from the JSON view on the Overview page for the Grafana instance. |
| grafanaSku | SKU for the managed Grafana instance. Retrieve from the JSON view on the Overview page for the Grafana instance. Use the sku.name. |
Open the template file and update the grafanaIntegrations property at the end of the file with the values that you retrieved from the Grafana instance. This is similar to the following:
In this JSON, full_resource_id_1 and full_resource_id_2 were already in the Azure Managed Grafana resource JSON, and they're added here to the ARM template. If you have no existing Grafana integrations, then don't include these entries for full_resource_id_1 and full_resource_id_2.
The final azureMonitorWorkspaceResourceId entry is already in the template and is used to link to the Azure Monitor Workspace resource ID provided in the parameters file.
Prerequisites
Register the AKS-PrometheusAddonPreview feature flag in the AKS cluster's subscription with the following Azure CLI command: az feature register --namespace Microsoft.ContainerService --name AKS-PrometheusAddonPreview.
The Azure Monitor workspace and Azure Managed Grafana workspace must already be created.
The template needs to be deployed in the same resource group as the Azure Managed Grafana workspace.
Users with the 'User Access Administrator' role in the subscription of the AKS cluster can enable the 'Monitoring Data Reader' role directly by deploying the template.
Minor limitation when deploying through Bicep
Currently, Bicep has no way to explicitly scope the Monitoring Data Reader role assignment to a string parameter that holds the resource ID of the Azure Monitor workspace, as the ARM template does. Bicep expects a value of type "resource | tenant", and there's currently no REST API spec for Azure Monitor workspaces. As a workaround, the Monitoring Data Reader role is scoped to the resource group by default, so the role applies to the Azure Monitor workspace by inheritance, which is the expected behavior. As a result, after you deploy this Bicep template, the Grafana resource gets read permissions on all the Azure Monitor workspaces under the subscription.
Retrieve required values for Grafana resource
From the Overview page for the Azure Managed Grafana instance in the Azure portal, select JSON view.
If you're using an existing Azure Managed Grafana instance that already has been linked to an Azure Monitor workspace, then you need the list of Grafana integrations. Copy the value of the azureMonitorWorkspaceIntegrations field. If it doesn't exist, then the instance hasn't been linked with any Azure Monitor workspace.
The main Bicep template creates all the required resources and uses two modules, defined in the other two Bicep files, to create the data collection rule association (DCRA) and monitor metrics profile resources.
| Parameter | Value |
|---|---|
| azureMonitorWorkspaceResourceId | Resource ID for the Azure Monitor workspace. Retrieve from the JSON view on the Overview page for the Azure Monitor workspace. |
| azureMonitorWorkspaceLocation | Location of the Azure Monitor workspace. Retrieve from the JSON view on the Overview page for the Azure Monitor workspace. |
| clusterResourceId | Resource ID for the AKS cluster. Retrieve from the JSON view on the Overview page for the cluster. |
| clusterLocation | Location of the AKS cluster. Retrieve from the JSON view on the Overview page for the cluster. |
| metricLabelsAllowlist | Comma-separated list of Kubernetes label keys that will be used in the resource's labels metric. |
| metricAnnotationsAllowList | Comma-separated list of Kubernetes annotation keys that will be used in the resource's labels metric. |
| grafanaResourceId | Resource ID for the managed Grafana instance. Retrieve from the JSON view on the Overview page for the Grafana instance. |
| grafanaLocation | Location for the managed Grafana instance. Retrieve from the JSON view on the Overview page for the Grafana instance. |
| grafanaSku | SKU for the managed Grafana instance. Retrieve from the JSON view on the Overview page for the Grafana instance. Use the sku.name. |
Open the template file and update the grafanaIntegrations property at the end of the file with the values that you retrieved from the Grafana instance. This is similar to the following:
In this JSON, full_resource_id_1 and full_resource_id_2 were already in the Azure Managed Grafana resource JSON, and they're added here to the template. If you have no existing Grafana integrations, then don't include these entries for full_resource_id_1 and full_resource_id_2.
The final azureMonitorWorkspaceResourceId entry is already in the template and is used to link to the Azure Monitor Workspace resource ID provided in the parameters file.
Prerequisites
Register the AKS-PrometheusAddonPreview feature flag in the AKS cluster's subscription with the following Azure CLI command: az feature register --namespace Microsoft.ContainerService --name AKS-PrometheusAddonPreview.
The Azure Monitor workspace and Azure Managed Grafana workspace must already be created.
Download Azure policy rules and parameters and deploy
Download the main Azure policy rules template from here and save it as AddonPolicyMetricsProfile.rules.json.
Download the parameter file from here and save it as AddonPolicyMetricsProfile.parameters.json in the same directory as the rules template.
Create the policy definition using a command like: az policy definition create --name "(Preview) Prometheus Metrics addon" --display-name "(Preview) Prometheus Metrics addon" --mode Indexed --metadata version=1.0.0 category=Kubernetes --rules .\AddonPolicyMetricsProfile.rules.json --params .\AddonPolicyMetricsProfile.parameters.json
After creating the policy definition, go to Azure portal -> Policy -> Definitions and select the Policy definition you created.
Select 'Assign' and then go to the 'Parameters' tab and fill in the details. Then select 'Review + Create'.
Now that the policy is assigned to the subscription, whenever you create a new cluster that doesn't have Prometheus enabled, the policy runs and deploys the resources. To apply the policy to an existing AKS cluster, go to the policy assignment and create a remediation task for that AKS cluster resource.
You should now see metrics flowing in the existing Grafana resource that's linked with the corresponding Azure Monitor workspace.
If you create a new Managed Grafana resource from the Azure portal, link it with the corresponding Azure Monitor workspace from the 'Linked Grafana Workspaces' tab of the relevant Azure Monitor workspace page. Then assign the 'Monitoring Data Reader' role to the Grafana MSI on the Azure Monitor workspace resource so that it can read data for displaying the charts, using the following steps.
From the Overview page for the Azure Managed Grafana instance in the Azure portal, select JSON view.
Copy the value of the principalId field for the SystemAssigned identity.
From the Access control (IAM) page for the Azure Monitor workspace in the Azure portal, select Add and then Add role assignment.
Select Monitoring Data Reader.
Select Managed identity and then Select members.
Select the system-assigned managed identity with the principalId from the Grafana resource.
Choose Select, and then select Review + assign.
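The portal steps above can also be sketched with the Azure CLI. This is a hedged example rather than the documented procedure: the function name grant_monitoring_data_reader is hypothetical, and it assumes that the Grafana instance has a system-assigned identity exposed at identity.principalId and that the role is scoped to the Azure Monitor workspace.

```shell
# Grants the Grafana system-assigned identity read access to the workspace.
# $1: resource ID of the Azure Managed Grafana instance
# $2: resource ID of the Azure Monitor workspace (used as the role scope)
grant_monitoring_data_reader() {
  # Look up the principalId of the system-assigned managed identity.
  principal_id="$(az resource show --ids "$1" \
    --query identity.principalId --output tsv)"
  az role assignment create \
    --assignee-object-id "$principal_id" \
    --assignee-principal-type ServicePrincipal \
    --role "Monitoring Data Reader" \
    --scope "$2"
}

# Example (not executed here):
#   grant_monitoring_data_reader "<grafana-workspace-name-resource-id>" \
#     "<azure-monitor-workspace-name-resource-id>"
```

The account that runs this needs permission to create role assignments on the workspace, for example the User Access Administrator role mentioned in the prerequisites.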
Deploy template
Deploy the template with the parameter file using any valid method for deploying Resource Manager templates. See Deploy the sample templates for examples of different methods.
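One such method is a resource group deployment with the Azure CLI. The following is a minimal sketch, assuming the template and parameter files are saved locally. The function name deploy_metrics_addon_template and the file paths you pass to it are hypothetical placeholders.

```shell
# Deploys a template with its parameter file to a resource group.
# $1: resource group (the prerequisites require the resource group of the
#     Azure Managed Grafana workspace)
# $2: path to the template file (ARM JSON or Bicep)
# $3: path to the parameter file
deploy_metrics_addon_template() {
  az deployment group create \
    --resource-group "$1" \
    --template-file "$2" \
    --parameters "@$3"
}

# Example (not executed here):
#   deploy_metrics_addon_template "<grafana-resource-group>" \
#     "<template-file>" "<parameter-file>"
```

Wrapping the call in a function keeps the resource group explicit, which helps avoid deploying to a group other than the one that contains the Grafana workspace.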
Limitations
Ensure that you update the kube-state-metrics annotations and labels lists with proper formatting. Resource Manager template deployments have a limitation that requires exact values in the kube-state-metrics pods. If the Kubernetes pod has any issues with malformed parameters and isn't running, then the feature won't work as expected.
A data collection rule and data collection endpoint are created with the name MSProm-<short-cluster-region>-<cluster-name>. These names can't currently be modified.
You must get the existing Azure Monitor workspace integrations for a Grafana workspace and add them to the Resource Manager template. Otherwise, the deployment overwrites and removes the existing integrations from the Grafana workspace.
Enable windows metrics collection
As of version 6.4.0-main-02-22-2023-3ee44b9e, Windows metric collection has been enabled for AKS clusters. Onboarding to the Azure Monitor Metrics addon enables the Windows DaemonSet pods to start running on your node pools. Both Windows Server 2019 and Windows Server 2022 are supported. Follow these steps to enable the pods to collect metrics from your Windows node pools.
Manually install the Windows exporter on AKS nodes to access Windows metrics.
Enable the following collectors:
While onboarding, enable the recording rules required for the default dashboards.
For CLI, include the option --enable-windows-recording-rules.
For ARM template, Bicep, or Policy, set enableWindowsRecordingRules to true in the parameters file.
If the cluster is already onboarded to Azure Monitor Metrics, use this ARM template and parameters file to create the Windows recording rule groups.
Verify Deployment
Run the following command to verify that the DaemonSet was deployed properly on the Linux node pools:
kubectl get ds ama-metrics-node --namespace=kube-system
The number of pods should be equal to the number of nodes on the cluster. The output should resemble the following:
User@aksuser:~$ kubectl get ds ama-metrics-node --namespace=kube-system
NAME               DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
ama-metrics-node   1         1         1       1            1           <none>          10h
Run the following command to verify that the DaemonSet was deployed properly on the Windows node pools:
kubectl get ds ama-metrics-win-node --namespace=kube-system
The output should resemble the following:
User@aksuser:~$ kubectl get ds ama-metrics-win-node --namespace=kube-system
NAME                   DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
ama-metrics-win-node   3         3         3       3            3           <none>          10h
Run the following command to verify that the ReplicaSets were deployed properly:
kubectl get rs --namespace=kube-system
The output should resemble the following:
User@aksuser:~$ kubectl get rs --namespace=kube-system
NAME                         DESIRED   CURRENT   READY   AGE
ama-metrics-5c974985b8       1         1         1       11h
ama-metrics-ksm-5fcf8dffcd   1         1         1       11h
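The DaemonSet checks above compare the DESIRED and READY columns by eye. As a minimal sketch, the same comparison can be scripted by reading the two counts from the DaemonSet status. The function name daemonset_ready is a hypothetical helper, and it assumes kubectl is already configured for the cluster.

```shell
# Succeeds when every scheduled pod of a DaemonSet in kube-system is ready.
# $1: DaemonSet name, for example ama-metrics-node or ama-metrics-win-node
daemonset_ready() {
  # Read "<desired> <ready>" from the DaemonSet status.
  counts="$(kubectl get ds "$1" --namespace=kube-system \
    --output jsonpath='{.status.desiredNumberScheduled} {.status.numberReady}')"
  desired="${counts% *}"
  ready="${counts#* }"
  echo "$1: $ready of $desired pods ready"
  # Fail when the lookup returned nothing or the counts differ.
  [ -n "$desired" ] && [ "$desired" = "$ready" ]
}

# Example (not executed here):
#   daemonset_ready ama-metrics-node && daemonset_ready ama-metrics-win-node
```

A nonzero exit status means that either the DaemonSet wasn't found or some pods aren't ready yet, so the check can be retried in a loop or used as a gate in a deployment script.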
Feature Support
ARM64 and Mariner nodes are supported.
HTTP Proxy is supported and will use the same settings as the HTTP Proxy settings for the AKS cluster configured with these instructions.
Limitations
CPU and Memory requests and limits can't be changed for Container insights metrics addon. If changed, they'll be reconciled and replaced by original values in a few seconds.
Azure Monitor Private Link (AMPLS) isn't currently supported.
Only public clouds are currently supported.
Uninstall metrics addon
Currently, Azure CLI is the only option to remove the metrics addon and stop sending Prometheus metrics to Azure Monitor managed service for Prometheus.
Install the aks-preview extension using the following command:
az extension add --name aks-preview
Upgrade the Azure CLI to the latest version and ensure that the aks-preview version you're using is at least 0.5.132. Find your current version by using the az version command.
Use the following command to remove the agent from the cluster nodes and delete the recording rules created for the data being collected from the cluster along with the Data Collection Rule Associations (DCRA) that link the DCE or DCR with your cluster. This doesn't remove the DCE, DCR, or the data already collected and stored in your Azure Monitor workspace.
az aks update --disable-azuremonitormetrics -n <cluster-name> -g <cluster-resource-group>
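After the command completes, you might want to confirm that the agent is gone from the cluster. The following is a minimal sketch that assumes the agent corresponds to the ama-metrics-node and ama-metrics-win-node DaemonSets shown in the verification steps. The function name daemonset_removed is a hypothetical helper.

```shell
# Succeeds when the named DaemonSet no longer exists in kube-system.
# $1: DaemonSet name, for example ama-metrics-node
daemonset_removed() {
  # With --ignore-not-found, kubectl prints nothing for a missing resource.
  found="$(kubectl get ds "$1" --namespace=kube-system \
    --ignore-not-found --output name)"
  [ -z "$found" ]
}

# Example (not executed here):
#   daemonset_removed ama-metrics-node && echo "metrics agent removed"
```

Removal might not be immediate, so a failed check shortly after the update doesn't necessarily mean that the uninstall failed.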
Region mappings
When you allow a default Azure Monitor workspace to be created when you install the metrics addon, it's created in the region listed in the following table.