Get started: configure observability with a script in Azure IoT Operations Preview

Important

Azure IoT Operations Preview – enabled by Azure Arc is currently in preview. You shouldn't use this preview software in production environments.

You'll need to deploy a new Azure IoT Operations installation when a generally available release is made available. You won't be able to upgrade a preview installation.

See the Supplemental Terms of Use for Microsoft Azure Previews for legal terms that apply to Azure features that are in beta, preview, or otherwise not yet released into general availability.

Observability provides visibility into every layer of your Azure IoT Operations configuration. It gives you insight into how issues actually behave, which makes site reliability engineering more effective. Azure IoT Operations offers observability through custom, curated Grafana dashboards that are hosted in Azure. These dashboards are powered by Azure Monitor managed service for Prometheus and by Container Insights. This article shows you how to configure the services you need for observability.

Prerequisites

Configure your subscription

Run the following commands to register providers with the subscription where your cluster is located.

Note

This step only needs to be run once per subscription. To register resource providers, you need permission to do the /register/action operation, which is included in subscription Contributor and Owner roles. For more information, see Azure resource providers and types.

az account set -s <subscription-id>
az provider register -n "Microsoft.Insights"
az provider register -n "Microsoft.AlertsManagement"
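
Provider registration is asynchronous, so the `az provider register` calls can return before a provider is ready to use. The following is a minimal sketch, assuming you want to wait until each provider reports `Registered` before you continue. The function name `wait_for_provider` and its default retry count and interval are illustrative and not part of the Azure IoT Operations tooling.

```shell
# Hypothetical helper: poll a resource provider until it reports "Registered".
# Usage: wait_for_provider <namespace> [max_attempts] [sleep_seconds]
wait_for_provider() {
  local namespace="$1"
  local max_attempts="${2:-30}"
  local sleep_seconds="${3:-10}"
  local attempt=1
  local state=""

  while [ "$attempt" -le "$max_attempts" ]; do
    # registrationState is a property of the object returned by `az provider show`.
    state="$(az provider show --namespace "$namespace" \
      --query registrationState --output tsv)"
    if [ "$state" = "Registered" ]; then
      echo "$namespace: Registered"
      return 0
    fi
    echo "$namespace: $state (attempt $attempt of $max_attempts)" >&2
    sleep "$sleep_seconds"
    attempt=$((attempt + 1))
  done

  # The provider never reached "Registered" within the allowed attempts.
  return 1
}

# Example, run after the `az provider register` commands:
#   wait_for_provider "Microsoft.Insights"
#   wait_for_provider "Microsoft.AlertsManagement"
```

The function returns a nonzero status if the provider doesn't reach `Registered` in time, so a setup script can stop early instead of failing later in the deployment.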

Install observability components

The steps in this section deploy an OpenTelemetry (OTel) Collector, install shared monitoring resources, and configure your Arc-enabled cluster to emit observability signals to those resources. The shared monitoring resources include Azure Managed Grafana, Azure Monitor Workspace, Azure Managed Prometheus, Azure Log Analytics, and Container Insights.

Deploy OpenTelemetry Collector

  1. Clone or download the Azure IoT Operations repo to your local machine: azure-iot-operations.git.

    Note

    The repo contains the deployment definition of Azure IoT Operations, and samples that include the sample dashboards used in this article.

  2. Browse to the following path in your local copy of the repo:

    azure-iot-operations\tools\setup-3p-obs-infra

  3. Create a file called otel-collector-values.yaml and paste the following code into it to define an OpenTelemetry Collector:

    mode: deployment
    fullnameOverride: aio-otel-collector
    image:
      repository: otel/opentelemetry-collector
      tag: 0.107.0
    config:
      processors:
        memory_limiter:
          limit_percentage: 80
          spike_limit_percentage: 10
          check_interval: '60s'
      receivers:
        jaeger: null
        prometheus: null
        zipkin: null
        otlp:
          protocols:
            grpc:
              endpoint: ':4317'
            http:
              endpoint: ':4318'
      exporters:
        prometheus:
          endpoint: ':8889'
          resource_to_telemetry_conversion:
            enabled: true
      service:
        extensions:
          - health_check
        pipelines:
          metrics:
            receivers:
              - otlp
            exporters:
              - prometheus
          logs: null
          traces: null
        telemetry: null
      extensions:
        memory_ballast:
          size_mib: 0
    resources:
      limits:
        cpu: '100m'
        memory: '512Mi'
    ports:
      metrics:
        enabled: true
        containerPort: 8889
        servicePort: 8889
        protocol: 'TCP'
      jaeger-compact:
        enabled: false
      jaeger-grpc:
        enabled: false
      jaeger-thrift:
        enabled: false
      zipkin:
        enabled: false
    
  4. In the otel-collector-values.yaml file, note the following values. You use them in the az iot ops init command when you deploy Azure IoT Operations on the cluster:

    • fullnameOverride
    • grpc.endpoint
    • check_interval

  5. Save and close the file.

  6. Deploy the collector by running the following commands:

    kubectl get namespace azure-iot-operations || kubectl create namespace azure-iot-operations
    helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
    
    helm repo update
    helm upgrade --install aio-observability open-telemetry/opentelemetry-collector -f otel-collector-values.yaml --namespace azure-iot-operations
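
To double-check the values noted in step 4, you can read them back from the file instead of copying them by hand. The following is a minimal sketch, assuming the file keeps the layout shown in step 3, with one key per line. The function name `show_noted_values` is hypothetical. The commented `kubectl` and `helm` commands assume that the chart creates a Deployment named after `fullnameOverride`, which is worth confirming on your cluster.

```shell
# Hypothetical helper: print the three values from step 4 of a values file.
# Assumes each key sits on its own line, as in the file created in step 3.
show_noted_values() {
  local file="${1:-otel-collector-values.yaml}"
  awk '
    # Top-level key: fullnameOverride
    /^fullnameOverride:/        { print "fullnameOverride=" $2 }
    # Nested key: strip the single quotes (octal 047) around the value
    /^[ ]+check_interval:/      { gsub(/\047/, "", $2); print "check_interval=" $2 }
    # Remember that the next endpoint line belongs to the grpc protocol
    /^[ ]+grpc:/                { in_grpc = 1; next }
    in_grpc && /^[ ]+endpoint:/ { gsub(/\047/, "", $2); print "grpc.endpoint=" $2; in_grpc = 0 }
  ' "$file"
}

# Example, run from the folder that contains the values file:
#   show_noted_values otel-collector-values.yaml
#
# After the helm command finishes, these commands check the release and rollout:
#   helm status aio-observability --namespace azure-iot-operations
#   kubectl rollout status deployment/aio-otel-collector \
#     --namespace azure-iot-operations --timeout=120s
```

The helper uses `awk` rather than a YAML parser to avoid extra dependencies. It's a convenience check only, and it doesn't validate the file.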
    

Deploy observability components

  • Deploy the observability components by running one of the following commands. Use the subscription ID and resource group of the Arc-enabled cluster that you want to monitor.

    Note

    To discover other optional parameters you can set, see the Bicep file. The optional parameters can specify settings such as alternative locations for cluster resources.

    The following command grants admin access for the newly created Grafana instance to the user:

    az deployment group create \
         --subscription <subscription-id> \
         --resource-group <cluster-resource-group> \
         --template-file observability-full.bicep \
         --parameters grafanaAdminId=$(az ad user show --id $(az account show --query user.name --output tsv) --query=id --output tsv) \
                      clusterName=<cluster-name> \
                      sharedResourceGroup=<shared-resource-group> \
                      sharedResourceLocation=<shared-resource-location> \
         --query=properties.outputs
    

    If that access isn't what you want, use the following command, which doesn't configure permissions. Then, set up permissions manually by using role assignments before anyone can access the Grafana instance. Assign one of the Grafana roles (Grafana Admin, Grafana Editor, Grafana Viewer) depending on the level of access you want.

    az deployment group create \
         --subscription <subscription-id> \
         --resource-group <cluster-resource-group> \
         --template-file observability-full.bicep \
         --parameters clusterName=<cluster-name> \
                      sharedResourceGroup=<shared-resource-group> \
                      sharedResourceLocation=<shared-resource-location> \
         --query=properties.outputs
    

    If the deployment succeeds, the end of the command output includes the Grafana URL and the resource IDs of the Log Analytics and Azure Monitor resources that were created. Use the Grafana URL to open the Grafana instance that you configure in Deploy dashboards to Grafana. Use the two resource IDs to configure other Arc-enabled clusters by following the steps in Add an Arc-enabled cluster to existing observability infrastructure.
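
If you deploy without `grafanaAdminId`, you assign access yourself. The following is a minimal sketch of one way to do that with `az role assignment create`, assuming you already know the resource ID of the Grafana instance. The function name `grant_grafana_role` and the argument values in the example are placeholders, not values from this article.

```shell
# Hypothetical helper: assign one of the three Grafana roles.
# Usage: grant_grafana_role <assignee> <role> <grafana-resource-id>
grant_grafana_role() {
  local assignee="$1"
  local role="$2"
  local scope="$3"

  # Accept only the built-in Grafana roles so a typo fails fast.
  case "$role" in
    "Grafana Admin" | "Grafana Editor" | "Grafana Viewer") ;;
    *)
      echo "Unsupported role: $role" >&2
      return 1
      ;;
  esac

  az role assignment create \
    --assignee "$assignee" \
    --role "$role" \
    --scope "$scope"
}

# Example (all three arguments are placeholders):
#   grant_grafana_role "<user-object-id>" "Grafana Viewer" "<grafana-resource-id>"
```

Scoping the assignment to the Grafana resource ID, rather than to the resource group or subscription, limits the access to that one instance.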

Configure Prometheus metrics collection

  1. Copy and paste the following configuration to a new file named ama-metrics-prometheus-config.yaml, and save the file:

    apiVersion: v1
    data:
      prometheus-config: |2-
        scrape_configs:
          - job_name: e4k
            scrape_interval: 1m
            static_configs:
              - targets:
                - aio-internal-diagnostics-service.azure-iot-operations.svc.cluster.local:9600
          - job_name: nats
            scrape_interval: 1m
            static_configs:
            - targets:
              - aio-dp-msg-store-0.aio-dp-msg-store-headless.azure-iot-operations.svc.cluster.local:7777
          - job_name: otel
            scrape_interval: 1m
            static_configs:
            - targets:
              - aio-otel-collector.azure-iot-operations.svc.cluster.local:8889
          - job_name: aio-annotated-pod-metrics
            kubernetes_sd_configs:
            - role: pod
            relabel_configs:
            - action: drop
              regex: true
              source_labels:
              - __meta_kubernetes_pod_container_init
            - action: keep
              regex: true
              source_labels:
              - __meta_kubernetes_pod_annotation_prometheus_io_scrape
            - action: replace
              regex: ([^:]+)(?::\\d+)?;(\\d+)
              replacement: $1:$2
              source_labels:
              - __address__
              - __meta_kubernetes_pod_annotation_prometheus_io_port
              target_label: __address__
            - action: replace
              source_labels:
              - __meta_kubernetes_namespace
              target_label: kubernetes_namespace
            - action: keep
              regex: 'azure-iot-operations'
              source_labels:
              - kubernetes_namespace
            scrape_interval: 1m
    kind: ConfigMap
    metadata:
      name: ama-metrics-prometheus-config
      namespace: kube-system
    
  2. Apply the configuration file by running the following command:

    kubectl apply -f ama-metrics-prometheus-config.yaml
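
Before or after you apply the file, a quick local check can confirm that all four scrape jobs are present. The following is a minimal sketch, assuming each job is declared on its own `- job_name:` line as in the configuration above. The function name `list_scrape_jobs` is hypothetical.

```shell
# Hypothetical helper: list the scrape job names declared in the config file.
list_scrape_jobs() {
  local file="${1:-ama-metrics-prometheus-config.yaml}"
  # Print whatever follows "- job_name:" on each matching line.
  sed -n 's/^[[:space:]]*-[[:space:]]*job_name:[[:space:]]*//p' "$file"
}

# Example:
#   list_scrape_jobs ama-metrics-prometheus-config.yaml
#
# To validate the manifest without changing the cluster, then confirm that the
# ConfigMap exists after the real apply:
#   kubectl apply --dry-run=client -f ama-metrics-prometheus-config.yaml
#   kubectl get configmap ama-metrics-prometheus-config --namespace kube-system
```

For the configuration in this article, the expected job names are e4k, nats, otel, and aio-annotated-pod-metrics.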
    

Deploy dashboards to Grafana

Azure IoT Operations provides a collection of dashboards designed to give you many of the visualizations you need to understand the health and performance of your Azure IoT Operations deployment.

Complete the following steps to install the Azure IoT Operations curated Grafana dashboards.

  1. Sign in to the Grafana console, then in the upper-right area of the Grafana application, select the + icon.

  2. Select Import dashboard, follow the prompts to browse to the samples\grafana-dashboards path in your local cloned copy of the repo, and select a JSON dashboard file.

  3. When the application prompts you, select your managed Prometheus data source.

  4. Select Import.
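
If you have many dashboards, repeating the import in the UI can be tedious. The following is a minimal sketch of a scripted alternative, assuming the Azure CLI `amg` extension is installed so that `az grafana dashboard import` is available. The function name `import_dashboards` and the argument values in the example are placeholders. This approach isn't described in the steps above, and the data source selection might still need a check in the Grafana UI afterward.

```shell
# Hypothetical helper: import every JSON dashboard in a folder.
# Usage: import_dashboards <grafana-name> <resource-group> <dashboards-dir>
import_dashboards() {
  local grafana_name="$1"
  local resource_group="$2"
  local dashboards_dir="$3"
  local file=""

  for file in "$dashboards_dir"/*.json; do
    # Skip the unexpanded pattern when the folder has no JSON files.
    [ -e "$file" ] || continue
    echo "Importing $file"
    az grafana dashboard import \
      --name "$grafana_name" \
      --resource-group "$resource_group" \
      --definition "$file" || return 1
  done
}

# Example (names are placeholders; the path is relative to the cloned repo):
#   import_dashboards "<grafana-name>" "<shared-resource-group>" samples/grafana-dashboards
```

The loop stops at the first failed import, which leaves the failing file as the last one printed.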