How to create Native Prometheus Alert on Managed AKS/Prometheus

KaushikKumar Patel 41 Reputation points Microsoft Employee
2024-11-08T17:11:34.38+00:00

Hey folks, I have Managed Grafana and Prometheus on AKS. Native open-source Prometheus has been in use for a few years, so we have many custom (application/service) alerting and recording rules set up.

Is there any way in Azure to deploy these Prometheus alert rules without using the Azure Prometheus Terraform/Bicep logic?

The goal is for each app/service team to manage its own set of alerts, deployed as part of the app deployment along with other K8s manifests such as HPA, Deployment, KEDA scaleset, ConfigMap, and so on.

This is an important requirement for us, since migrating these rules to tf/Bicep would add significant operational overhead.

I already have Helm templates for the Prometheus alerts, but I'm not sure how to get them applied/deployed to managed AKS/Prometheus.

Azure Kubernetes Service (AKS)

Accepted answer
  1. Akshay kumar Mandha 1,500 Reputation points Microsoft Vendor
    2024-11-28T12:28:37.7066667+00:00

    Hi KaushikKumar Patel,
    Glad to hear you found an alternative solution to this query. Since the original poster cannot accept their own answer, we are reposting your solution in the answer section so that you can accept it. An accepted answer helps other community members find the appropriate solution.

    Issue: I have Managed Grafana and Prometheus on AKS. Native open-source Prometheus has been in use for a few years, so we have many custom (application/service) alerting and recording rules set up. Is there any way in Azure to deploy these Prometheus alert rules without using the Azure Prometheus Terraform/Bicep logic?

    The goal is for each app/service team to manage its own set of alerts, deployed as part of the app deployment along with other K8s manifests such as HPA, Deployment, KEDA scaleset, ConfigMap, and so on.

    This is an important requirement for us, since migrating these rules to tf/Bicep would add significant operational overhead. I already have Helm templates for the Prometheus alerts, but I'm not sure how to get them applied/deployed to managed AKS/Prometheus.

    Solution:
    We decided to use the az-prom-rules-converter tool: https://github.com/Azure/prometheus-collector/tree/main/tools/az-prom-rules-converter

    We will also use the az CLI, which creates the rules from a supplied YAML file.

    az alerts-management prometheus-rule-group create --name
                                                      --resource-group
                                                      --rules
                                                      --scopes
                                                      [--cluster-name]
                                                      [--description]
                                                      [--enabled {0, 1, f, 
    

    Azure Managed Prometheus Rule Group Argument Details

    --rules [Required] : List<Object> Defines the rules in the Prometheus rule group. See https://github.com/Azure/azure-cli/tree/dev/doc/shorthand_syntax.md for more about shorthand syntax.

    Azure List Element Properties

    • expression [Required] (string) The PromQL expression to evaluate. Evaluated periodically as given by 'interval', and the result recorded as a new set of time series with the metric name as given by 'record'.
    • actions (list of map) Actions that are performed when the alert rule becomes active, and when an alert condition is resolved.
    • alert (string) Alert rule name.
    • annotations (object) The annotations clause specifies a set of informational labels that can be used to store longer additional information such as alert descriptions or runbook links. Only valid for alerts. The annotation values can be templated.
    • enabled (bool) Enable/disable rule. Allowed values: 0, 1, f, false, n, no, t, true, y, yes.
    • for (string) The amount of time the alert must be active before firing.
    • labels (object) Labels to add or overwrite before storing the result.
    • record (string) Recorded metrics name.
    • resolve-configuration (map) Defines the configuration for resolving fired alerts. Only relevant for alerts.
      • autoResolved (bool) The flag that indicates whether or not to auto-resolve a fired alert.
      • timeToResolve (string) The duration a rule must evaluate as healthy before the fired alert is automatically resolved, represented in ISO 8601 duration format. Should be between 1 and 15 minutes.
    • severity (int) The severity of the alerts fired by the rule. Must be one of 0 (critical), 1 (error), 2 (warning), 3 (info), or 4 (verbose).
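
The constraints in the property list above can be checked before handing rules to the CLI. The following is only a minimal sketch: `validate_rule` is a hypothetical helper (not part of the Azure CLI or any Azure SDK), the dictionary keys simply mirror the property names listed above, and only the simple `PT<n>M` form of an ISO 8601 duration is recognised. The alert/record check reflects the note in the comparison in this answer that `record` is not allowed when `alert` is used.

```python
import re

# Hypothetical helper for illustration; keys mirror the property list in this answer.
# Assumption: timeToResolve is written in the simple "PT<n>M" ISO 8601 form.
_MINUTES = re.compile(r"^PT(\d+)M$")


def validate_rule(rule: dict) -> list:
    """Return a list of problems found in one rule; an empty list means none were found."""
    problems = []
    # 'expression' is the only property marked [Required].
    if not rule.get("expression"):
        problems.append("expression is required")
    # A rule is either an alert rule or a recording rule, not both.
    if "alert" in rule and "record" in rule:
        problems.append("a rule cannot set both alert and record")
    # Severity runs from 0 (critical) to 4 (verbose).
    severity = rule.get("severity")
    if severity is not None and severity not in range(0, 5):
        problems.append("severity must be between 0 and 4")
    # timeToResolve should be between 1 and 15 minutes.
    time_to_resolve = rule.get("resolve-configuration", {}).get("timeToResolve")
    if time_to_resolve is not None:
        match = _MINUTES.match(time_to_resolve)
        if not match or not 1 <= int(match.group(1)) <= 15:
            problems.append("timeToResolve should be between PT1M and PT15M")
    return problems


# Example rule; the alert name and expression are made up for illustration.
example_rule = {
    "alert": "HighErrorRate",
    "expression": "sum(rate(http_requests_total[5m])) > 100",
    "severity": 2,
    "resolve-configuration": {"autoResolved": True, "timeToResolve": "PT10M"},
}
example_problems = validate_rule(example_rule)
```

A check like this could run in each team's pipeline before the rule group is created, so that malformed rules are caught early.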

    Native Prometheus Rule Group vs Azure Managed Prometheus Rule Group Comparison

    Here's a table comparing native Prometheus recording and alert rules with Azure Managed Prometheus rule groups:

    | Rule Group Specification | Native Prometheus Rule Group | Azure Managed Prometheus Rule Group |
    |---|---|---|
    | alert (optional) | Name of the alert | Alert rule name (optional) |
    | expr (required) | PromQL expression to trigger alert | PromQL expression (required) |
    | for (optional) | How long the expression must be true before firing | Time the alert must be active before firing |
    | keep_firing_for | Duration alert remains active after triggering | Not directly supported |
    | labels | Key-value pairs added as labels | Labels to add or overwrite (object) |
    | annotations | Descriptive information about the alert | Annotations for additional information |
    | record (optional) | Recorded metrics name | Recorded metrics name (not allowed if alert spec is used) |
    | actions (optional) | N/A | Actions performed on alert state changes |
    | enabled (optional) | N/A | Enable/disable rule |
    | resolve-configuration | N/A | Configuration for resolving alerts |
    | autoResolved | N/A | Flag for auto-resolving alerts |
    | timeToResolve | N/A | Duration for auto-resolve (ISO 8601) |
    | severity | N/A | The severity of the alerts fired by the rule. Must be between 0 and 4 |

    Prometheus Rule to Azure Prometheus Rule Group Converter

    A tool to convert a Prometheus rules YAML file to an Azure Prometheus rule groups ARM template.
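
To make the field mapping in the comparison concrete, here is a rough sketch of how one native rule might be translated into the Azure property names. This is not the implementation of az-prom-rules-converter; `to_iso8601` and `to_azure_rule` are hypothetical helpers written for illustration. It assumes the Azure side expects `for` as an ISO 8601 duration (for example `PT5M`), and it only handles single-unit Prometheus durations such as `30s`, `5m`, or `1h`.

```python
import re

# Prometheus duration unit -> ISO 8601 time designator (single-unit values only).
_UNITS = {"s": "S", "m": "M", "h": "H"}


def to_iso8601(duration: str) -> str:
    """Convert a single-unit Prometheus duration such as '5m' to ISO 8601 ('PT5M')."""
    match = re.fullmatch(r"(\d+)([smh])", duration)
    if not match:
        raise ValueError(f"unsupported duration: {duration!r}")
    value, unit = match.groups()
    return f"PT{value}{_UNITS[unit]}"


def to_azure_rule(native: dict) -> dict:
    """Map one native Prometheus rule to the Azure rule property names."""
    # 'expr' in native Prometheus corresponds to 'expression' on the Azure side.
    azure = {"expression": native["expr"]}
    # These fields keep the same name in both formats.
    for key in ("alert", "record", "labels", "annotations"):
        if key in native:
            azure[key] = native[key]
    # Assumption: Azure expects 'for' in ISO 8601 duration format.
    if "for" in native:
        azure["for"] = to_iso8601(native["for"])
    # 'keep_firing_for' is not directly supported, so it is dropped here.
    return azure


# Example native rule; the names and values are made up for illustration.
native_rule = {
    "alert": "HighErrorRate",
    "expr": "up == 0",
    "for": "5m",
    "keep_firing_for": "10m",
    "labels": {"team": "payments"},
}
azure_rule = to_azure_rule(native_rule)
```

Azure-only properties such as severity, actions, and resolve-configuration have no native equivalent, so they would need to be supplied separately.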

    Create a Prometheus rule group using the Azure CLI with a rule group file

    az alerts
    

    Create a Prometheus rule group using the Azure CLI with rule details on the command line

    az alerts-management prometheus-rule-group create -n TestPrometheusRuleGroup -g TestResourceGroup \
      -l westus --enabled 
    

    Native Prometheus Rule Group Spec

    Reference

    Customize Azure Managed Prometheus

    • Minimal ingestion profile for Prometheus metrics in Azure Monitor
    • Create and validate custom configuration file for Prometheus metrics in Azure Monitor
    • Customize collection using CRDs (Service and Pod Monitors)
    • Integrate KEDA with your Azure Kubernetes Service cluster
