Collect Prometheus metrics from an AKS cluster

This article describes how to configure your Azure Kubernetes Service (AKS) cluster to send data to Azure Monitor managed service for Prometheus. When you perform this configuration, a containerized version of the Azure Monitor agent is installed with a metrics extension. This sends data to the Azure Monitor workspace that you specify.

Note

The process described here doesn't enable Container insights on the cluster. However, both processes use the Azure Monitor agent. For different methods to enable Container insights on your cluster, see Enable Container insights.

The Azure Monitor metrics agent uses a ReplicaSet and a DaemonSet. The ReplicaSet pod scrapes cluster-wide targets such as kube-state-metrics and any custom application targets that you specify. Each DaemonSet pod scrapes only the targets on the node where it's deployed, such as node-exporter. This split lets the agent scale as the number of nodes and pods on a cluster increases.

Prerequisites

Note

Contributor permission is enough to enable the addon to send data to the Azure Monitor workspace. You need Owner-level permission if you want to link your Azure Monitor workspace to view metrics in Azure Managed Grafana. This is required because the user who runs the onboarding step must be able to grant the Azure Managed Grafana system identity the Monitoring Reader role on the Azure Monitor workspace so that it can query the metrics.
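
For reference, the role assignment behind the Owner requirement can be sketched with the Azure CLI. This is a minimal sketch, not what the portal runs: the function name `grant_grafana_reader`, the `DRY_RUN` switch, and both arguments are hypothetical, and it assumes you already know the object ID of the Azure Managed Grafana system-assigned identity and the resource ID of the Azure Monitor workspace.

```shell
# Hypothetical helper: grants the Monitoring Reader role on an Azure Monitor
# workspace to the Grafana managed identity. The caller needs permission to
# create role assignments on the scope (for example, Owner).
# With DRY_RUN=1 the command is printed instead of executed.
grant_grafana_reader() {
  principal_id="$1"   # object ID of the Grafana system-assigned identity
  scope="$2"          # resource ID of the Azure Monitor workspace
  if [ -z "$principal_id" ] || [ -z "$scope" ]; then
    echo "usage: grant_grafana_reader <principal-object-id> <workspace-resource-id>" >&2
    return 2
  fi
  # Build the command as positional parameters so it can be printed or run.
  set -- az role assignment create \
    --assignee-object-id "$principal_id" \
    --assignee-principal-type ServicePrincipal \
    --role "Monitoring Reader" \
    --scope "$scope"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

Running it once with `DRY_RUN=1` lets you review the exact scope and principal before any assignment is created.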

Enable Prometheus metric collection

Use any of the following methods to install the Azure Monitor agent on your AKS cluster and send Prometheus metrics to your Azure Monitor workspace.

Note

Azure Managed Grafana is not available in the Azure US Government cloud currently.

There are multiple options to enable Prometheus metrics on your cluster from the Azure portal.

New cluster

When you create a new AKS cluster in the Azure portal, you can enable Prometheus, Container insights, and Grafana from the Integrations tab.

Screenshot of integrations tab for new AKS cluster.

From the Azure Monitor workspace

This option enables Prometheus metrics on a cluster without enabling Container insights.

  1. Open the Azure Monitor workspaces menu in the Azure portal and select your workspace.

  2. Select Monitored clusters in the Managed Prometheus section to display a list of AKS clusters.

  3. Select Configure next to the cluster you want to enable.

    Screenshot that shows an Azure Monitor workspace with a Prometheus configuration.

From an existing cluster monitored with Container insights

This option adds Prometheus metrics to a cluster already enabled for Container insights.

  1. Open the Kubernetes services menu in the Azure portal and select your AKS cluster.

  2. Click Insights.

  3. Click Monitor settings.

    Screenshot of button for monitor settings for an AKS cluster.

  4. Click the checkbox for Enable Prometheus metrics and select your Azure Monitor workspace.

  5. To send the collected metrics to Grafana, select a Grafana workspace. See Create an Azure Managed Grafana instance for details on creating a Grafana workspace.

    Screenshot of monitor settings for an AKS cluster.

  6. Click Configure to complete the configuration.

See Collect Prometheus metrics from AKS cluster (preview) for details on verifying your deployment and on limitations.

From an existing cluster

This option enables Prometheus, Grafana, and Container insights on a cluster.

  1. Open the clusters menu in the Azure portal and select Insights.

  2. Select Configure monitoring.

  3. Container insights is already enabled. Select the checkboxes for Enable Prometheus metrics and Enable Grafana. If you have an existing Azure Monitor workspace and Grafana workspace, they're selected for you. Click Advanced settings to select alternate workspaces or create new ones.

    Screenshot that shows the dialog box to configure Container insights with Prometheus and Grafana.

  4. Click Configure to save the configuration.

Enable Windows metrics collection

Note

There is no CPU/memory limit in windows-exporter-daemonset.yaml, so it may over-provision the Windows nodes.
For more details, see Resource reservation.

As you deploy workloads, set resource memory and CPU limits on containers. This also subtracts from NodeAllocatable and helps the cluster-wide scheduler in determining which pods to place on which nodes. Scheduling pods without limits may over-provision the Windows nodes and in extreme cases can cause the nodes to become unhealthy.

As of version 6.4.0-main-02-22-2023-3ee44b9e of the Managed Prometheus addon container (prometheus_collector), Windows metric collection has been enabled for the AKS clusters. Onboarding to the Azure Monitor Metrics add-on enables the Windows DaemonSet pods to start running on your node pools. Both Windows Server 2019 and Windows Server 2022 are supported. Follow these steps to enable the pods to collect metrics from your Windows node pools.

  1. Manually install windows-exporter on AKS nodes to access Windows metrics. Enable the following collectors:

    • [defaults]
    • container
    • memory
    • process
    • cpu_info

    Deploy the windows-exporter-daemonset YAML file:

        kubectl apply -f windows-exporter-daemonset.yaml
    
  2. Apply the ama-metrics-settings-configmap to your cluster. Set the windowsexporter and windowskubeproxy Booleans to true. For more information, see Metrics add-on settings configmap.

  3. Enable the recording rules that are required for the out-of-the-box dashboards:

    • If onboarding using the CLI, include the option --enable-windows-recording-rules.
    • If onboarding using an ARM template, Bicep, or Azure Policy, set enableWindowsRecordingRules to true in the parameters file.
    • If the cluster is already onboarded, use this ARM template and this parameter file to create the rule groups.
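
Step 2 can be scripted. The sketch below assumes you have saved a local copy of the settings ConfigMap as `ama-metrics-settings-configmap.yaml` (a hypothetical file name) and that it contains lines of the form `windowsexporter = false` and `windowskubeproxy = false`. Check your copy first, because its layout may differ. The function name `enable_windows_targets` is illustrative.

```shell
# Prints the given ConfigMap file with the two Windows scrape targets set to
# true. Assumes "key = false" lines; every other line passes through unchanged.
enable_windows_targets() {
  sed -E 's/^([[:space:]]*(windowsexporter|windowskubeproxy)[[:space:]]*=[[:space:]]*)false/\1true/' "$1"
}

# Example usage (requires kubectl access to the cluster):
#   enable_windows_targets ama-metrics-settings-configmap.yaml > settings-windows.yaml
#   kubectl apply -f settings-windows.yaml
```

Writing the result to a separate file keeps the original copy available for comparison before you apply the change.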

Verify deployment

  1. Run the following command to verify that the DaemonSet was deployed properly on the Linux node pools:

    kubectl get ds ama-metrics-node --namespace=kube-system
    

    The number of pods should be equal to the number of Linux nodes on the cluster. The output should resemble the following example:

    User@aksuser:~$ kubectl get ds ama-metrics-node --namespace=kube-system
    NAME               DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
    ama-metrics-node   1         1         1       1            1           <none>          10h
    
  2. Run the following command to verify that the DaemonSet was deployed properly on the Windows node pools:

    kubectl get ds ama-metrics-win-node --namespace=kube-system
    

    The number of pods should be equal to the number of Windows nodes on the cluster. The output should resemble the following example:

    User@aksuser:~$ kubectl get ds ama-metrics-win-node --namespace=kube-system
    NAME                   DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
    ama-metrics-win-node   3         3         3       3            3           <none>          10h
    
  3. Run the following command to verify that the two ReplicaSets were deployed properly:

    kubectl get rs --namespace=kube-system
    

    The output should resemble the following example:

    User@aksuser:~$ kubectl get rs --namespace=kube-system
    NAME                            DESIRED   CURRENT   READY   AGE
    ama-metrics-5c974985b8          1         1         1       11h
    ama-metrics-ksm-5fcf8dffcd      1         1         1       11h
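
The checks above can be automated. The following is a minimal sketch, not part of the add-on: `all_ready` is a hypothetical helper that reads `kubectl get ds` or `kubectl get rs` output on standard input and succeeds only when every row's READY count equals its DESIRED count. It assumes the default column layout shown in the examples (NAME, DESIRED, CURRENT, READY).

```shell
# Succeeds (exit 0) when at least one row is present and READY == DESIRED
# for every row; fails (exit 1) otherwise, including on empty input.
all_ready() {
  awk '
    NR == 1 { next }                      # skip the header row
    { rows++; if ($2 != $4) bad++ }       # column 2 = DESIRED, column 4 = READY
    END { exit ((rows > 0 && bad == 0) ? 0 : 1) }
  '
}

# Example usage (requires kubectl access to the cluster):
#   kubectl get ds ama-metrics-node --namespace=kube-system | all_ready && echo "Linux DaemonSet ready"
#   kubectl get ds ama-metrics-win-node --namespace=kube-system | all_ready && echo "Windows DaemonSet ready"
```

The helper only compares counts. It doesn't confirm that the DESIRED value matches the number of nodes, so that comparison still needs to be made against your node list.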
    

Resources provisioned when the metrics add-on is enabled for an AKS cluster

When you enable the metrics add-on, the following resources are provisioned:

Resource Name | Resource Type | Resource Group | Region/Location | Description
MSPROM-<aksclusterregion>-<clustername> | Data Collection Rule | Same resource group as the AKS cluster resource | Same region as the Azure Monitor workspace | Data collection rule for Prometheus metrics collection by the metrics add-on. Its destination is the chosen Azure Monitor workspace, and it is associated with the AKS cluster resource.
MSPROM-<aksclusterregion>-<clustername> | Data Collection Endpoint | Same resource group as the AKS cluster resource | Same region as the Azure Monitor workspace | Data collection endpoint used by the data collection rule above to ingest Prometheus metrics from the metrics add-on.

When you create a new Azure Monitor workspace, the following additional resources are created as part of it:

Resource Name | Resource Type | Resource Group | Region/Location | Description
<azuremonitor-workspace-name> | System Data Collection Rule | MA_<azuremonitor-workspace-name>_<azuremonitor-workspace-region>_managed | Same region as the Azure Monitor workspace | System data collection rule that customers can use when an OSS Prometheus server remote-writes to the Azure Monitor workspace.
<azuremonitor-workspace-name> | System Data Collection Endpoint | MA_<azuremonitor-workspace-name>_<azuremonitor-workspace-region>_managed | Same region as the Azure Monitor workspace | System data collection endpoint that customers can use when an OSS Prometheus server remote-writes to the Azure Monitor workspace.

HTTP Proxy

The Azure Monitor metrics add-on supports HTTP proxy and uses the same HTTP proxy settings that are configured for the AKS cluster with these instructions.

Network firewall requirements

Azure public cloud

The following table lists the firewall configuration required for Azure Monitor Prometheus metrics ingestion in the Azure public cloud. All network traffic from the agent is outbound to Azure Monitor.

Agent resource | Purpose | Port
global.handler.control.monitor.azure.com | Access control service / Azure Monitor control plane service | 443
*.ingest.monitor.azure.com | Azure Monitor managed service for Prometheus, metrics ingestion endpoint (DCE) | 443
*.handler.control.monitor.azure.com | For querying data collection rules | 443

Azure US Government cloud

The following table lists the firewall configuration required for Azure Monitor Prometheus metrics ingestion in the Azure US Government cloud. All network traffic from the agent is outbound to Azure Monitor.

Agent resource | Purpose | Port
global.handler.control.monitor.azure.us | Access control service / Azure Monitor control plane service | 443
*.ingest.monitor.azure.us | Azure Monitor managed service for Prometheus, metrics ingestion endpoint (DCE) | 443
*.handler.control.monitor.azure.us | For querying data collection rules | 443
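
To spot-check outbound access to the global control plane endpoint listed in the tables above, something like the following can help. It is a sketch under assumptions: the helper names `global_control_host` and `check_https` are hypothetical, `curl` is assumed to be installed, and only the global host is probed because the wildcard entries resolve to workspace-specific and region-specific names that aren't listed here.

```shell
# Maps a cloud label ("public" or "usgov", both hypothetical labels) to the
# global control plane host from the firewall tables.
global_control_host() {
  case "$1" in
    public) echo "global.handler.control.monitor.azure.com" ;;
    usgov)  echo "global.handler.control.monitor.azure.us" ;;
    *)      echo "unknown cloud: $1" >&2; return 1 ;;
  esac
}

# Succeeds if an HTTPS request to <host> on port 443 completes.
# Any HTTP status counts as reachable; connection or TLS failures return non-zero.
check_https() {
  curl --silent --output /dev/null --connect-timeout 5 --max-time 10 "https://$1:443/"
}

# Example usage (run from a machine or pod behind the firewall):
#   check_https "$(global_control_host public)" && echo "port 443 reachable"
```

A successful probe from a workstation doesn't prove that the cluster's own egress path is open, so run it from a node or pod where possible.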

Uninstall the metrics add-on

To uninstall the metrics add-on, see Disable Prometheus metrics collection on an AKS cluster.

Supported regions

The regions that support Azure Monitor Metrics and Azure Monitor workspaces are listed here under the Managed Prometheus tag.

Next steps