Hello @RajKumar Kannan, thank you for posting your query on the Microsoft Q&A platform.
Thanks for your question. This is a valid concern and the behavior you are seeing is expected with Azure GPU‑enabled virtual machines.
Azure does not expose GPU utilization or GPU memory as platform (host) metrics for virtual machines (including NC, ND, and NV series). Only CPU, disk, network, and similar host‑level metrics are available by default.
GPU metrics must be collected from inside the guest OS using guest‑based monitoring.
This is by design in Azure Monitor.
Here is why GPU metrics don’t appear by default.
Azure Monitor separates VM monitoring into:
- Platform metrics – collected by Azure automatically (CPU, disk, network)
- Guest metrics – collected from inside the VM using Azure Monitor Agent (AMA)
GPU telemetry is only available inside the guest OS through NVIDIA drivers and tooling. Azure Monitor will not collect it unless you explicitly configure guest‑level data collection.
Reference: https://learn.microsoft.com/en-us/azure/azure-monitor/vm/data-collection-performance
The correct and supported way to collect GPU metrics is as follows:
- Prerequisites (mandatory):
  - GPU VM (NC / ND / NV series)
  - NVIDIA GPU drivers installed (Microsoft recommends using GPU‑optimized marketplace images)
- For Linux GPU VMs, Microsoft and NVIDIA support collecting GPU metrics using:
  - NVIDIA DCGM (Data Center GPU Manager) to expose GPU metrics
  - Azure Monitor Agent (AMA) or a Prometheus‑compatible collector to ingest them
DCGM exposes GPU utilization, memory usage, and other metrics from inside the VM.
Official reference describing this architecture and Azure ingestion:
For Windows GPU VMs:
- Azure Monitor Agent must be installed
- Custom Data Collection Rules (DCRs) are required
- GPU metrics are collected only if NVIDIA drivers expose counters
- Alternatively, tools such as nvidia-smi combined with a collector can be used
Azure Monitor does not auto‑discover GPU counters on Windows.
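As a rough illustration of the nvidia-smi route, the sketch below parses the CSV output of an nvidia-smi query into simple records that a collector could forward. It assumes the NVIDIA driver is installed and nvidia-smi is on the PATH; the function names and record keys are hypothetical, and the sample text is made-up data used only so that the parser can be run without a GPU.

```python
import csv
import io
import subprocess

# Fields requested from nvidia-smi; the order must match the parsing below.
QUERY_FIELDS = ["index", "utilization.gpu", "memory.used", "memory.total"]


def parse_nvidia_smi_csv(text):
    """Parse output of: nvidia-smi --query-gpu=... --format=csv,noheader,nounits"""
    samples = []
    for row in csv.reader(io.StringIO(text)):
        values = [cell.strip() for cell in row]
        if len(values) != len(QUERY_FIELDS):
            continue  # skip blank or malformed lines
        try:
            samples.append({
                "gpu_index": int(values[0]),
                "utilization_percent": float(values[1]),
                "memory_used_mib": float(values[2]),
                "memory_total_mib": float(values[3]),
            })
        except ValueError:
            continue  # skip rows with non-numeric values such as "[N/A]"
    return samples


def read_gpu_samples():
    """Run nvidia-smi inside the guest OS and return the parsed samples."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    output = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_nvidia_smi_csv(output)


# Illustrative sample text (not real measurements) so the parser runs anywhere.
SAMPLE_OUTPUT = "0, 37, 10240, 16384\n1, 0, 3, 16384\n"
samples = parse_nvidia_smi_csv(SAMPLE_OUTPUT)
```

On a GPU VM, read_gpu_samples() would return the same kind of records from the live driver. How those records reach Azure Monitor (for example, a log file picked up through a Data Collection Rule) depends on the collector you choose.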
Microsoft DCR documentation:
Where GPU metrics appear depends entirely on how they are collected:
- VM Insights with the default configuration: GPU metrics are not included and will not appear anywhere, because VM Insights only collects a predefined set of guest metrics.
- Azure Monitor Agent (AMA) with custom performance counters or Data Collection Rules: GPU metrics, if exposed by the GPU driver, are ingested into the Perf table in Log Analytics.
- NVIDIA DCGM or other custom GPU collectors: the GPU data is also written to the Perf table or to a custom Log Analytics table, depending on how ingestion is set up.
- Telegraf sending GPU metrics to Azure Monitor Metrics: the metrics appear in Metrics Explorer under a custom namespace (for example, a Telegraf or NVIDIA-related namespace), rather than in VM Insights or platform metrics.
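If the GPU counters do land in the Perf table, a query along the following lines could be used to chart them. This is only a sketch: it builds a KQL query string over the standard Perf columns (TimeGenerated, Computer, ObjectName, CounterName, CounterValue), and the object name passed in is a placeholder, because the actual name depends on what your driver or collector reports. The helper name build_perf_query is hypothetical.

```python
def build_perf_query(object_name, lookback="1h", bin_size="5m"):
    """Build a KQL query that averages Perf counters for one object name."""
    # Escape backslashes and double quotes for a KQL string literal.
    escaped = object_name.replace("\\", "\\\\").replace('"', '\\"')
    return "\n".join([
        "Perf",
        f"| where TimeGenerated > ago({lookback})",
        f'| where ObjectName == "{escaped}"',
        "| summarize AvgValue = avg(CounterValue) "
        f"by Computer, CounterName, bin(TimeGenerated, {bin_size})",
        "| order by TimeGenerated asc",
    ])


# Placeholder object name: replace it with whatever your collector actually writes.
query = build_perf_query("GPU (hypothetical object name)")
print(query)
```

The printed text can be pasted into the Logs blade of the Log Analytics workspace. If the query returns nothing, check the ObjectName and CounterName values that are actually present in Perf before assuming the GPU data is missing.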
GPU metrics do not appear in InsightsMetrics by default.
VM Insights only includes a predefined set of guest metrics.
Reference: https://learn.microsoft.com/azure/azure-monitor/vm/monitor-virtual-machine-data-collection
Example of GPU metric names when using NVIDIA DCGM.
These are vendor‑defined metrics exposed inside the VM:
- DCGM_FI_DEV_GPU_UTIL – GPU utilization (%)
- DCGM_FI_DEV_FB_USED – GPU memory used (MiB)
- DCGM_FI_PROF_PIPE_TENSOR_ACTIVE – Tensor core activity
Reference: https://www.ibm.com/docs/en/tarm/8.15.x?topic=resources-azure-vm-gpu-metrics-collection
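To show how the metric names above might be consumed, here is a minimal sketch that pulls them out of text in the Prometheus exposition format. It assumes DCGM metrics are published in that format by an exporter and that each sample carries a "gpu" label. The sample text is made up for illustration, and the function name is hypothetical.

```python
import re

# Metric names taken from the list above.
WANTED = {
    "DCGM_FI_DEV_GPU_UTIL",
    "DCGM_FI_DEV_FB_USED",
    "DCGM_FI_PROF_PIPE_TENSOR_ACTIVE",
}

# metric_name{optional labels} value
LINE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_dcgm_metrics(text):
    """Return (metric name, gpu label, value) tuples for the wanted metrics."""
    results = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None or match.group(1) not in WANTED:
            continue
        name, raw_labels, raw_value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        results.append((name, labels.get("gpu", ""), float(raw_value)))
    return results


# Illustrative exposition text (assumed shape, not captured from a real VM).
SAMPLE = """# HELP DCGM_FI_DEV_GPU_UTIL GPU utilization (in %).
# TYPE DCGM_FI_DEV_GPU_UTIL gauge
DCGM_FI_DEV_GPU_UTIL{gpu="0"} 37
DCGM_FI_DEV_FB_USED{gpu="0"} 10240
DCGM_FI_DEV_GPU_TEMP{gpu="0"} 54
"""

metrics = parse_dcgm_metrics(SAMPLE)
```

In practice a Prometheus-compatible collector does this parsing for you. The sketch is mainly useful for checking by hand that the exporter inside the VM is emitting the metrics you expect before you wire up ingestion.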
Seeing CPU and disk metrics but no GPU metrics is expected when:
- Only platform metrics are enabled
- VM Insights is enabled without GPU‑specific configuration
- Azure Monitor Agent is installed but no GPU exporter, counters, or scripts are configured
Azure intentionally requires explicit guest configuration for GPU telemetry.
So, GPU utilization and GPU memory metrics are not available as Azure platform metrics for NC/ND/NV virtual machines. To collect GPU metrics, guest‑based monitoring must be configured using NVIDIA GPU drivers and tools such as DCGM or equivalent, together with Azure Monitor Agent and Data Collection Rules. These metrics are collected as guest performance data and do not appear in VM Insights or InsightsMetrics by default.
Thanks,
Suchitra.