Hello Abimbola Adeniran,
Greetings! Welcome to Microsoft Q&A Platform.
1. Azure offers metrics in the Azure portal that provide insight into how your virtual machines (VMs) and disks perform. The metrics can also be retrieved through an API call. They fall into three groups:
- Disk IO, throughput, queue depth and latency metrics - These metrics allow you to see the storage performance from the perspective of a disk and a virtual machine.
- Disk bursting metrics - These metrics provide observability into the bursting feature on premium disks.
- Storage IO utilization metrics - These metrics help diagnose bottlenecks in your storage performance with disks.
Here are the key metrics related to disk performance:
OS Disk Latency (Preview): The average time to complete IOs during monitoring for the OS disk (values in milliseconds).
OS Disk Queue Depth: The number of current outstanding IO requests waiting to be read from or written to the OS disk.
OS Disk Read Bytes/Sec: Bytes read per second from the OS disk (inclusive of cache if enabled).
OS Disk Read Operations/Sec: Read operations per second from the OS disk (inclusive of cache if enabled).
OS Disk Write Bytes/Sec: Bytes written per second to the OS disk.
OS Disk Write Operations/Sec: Write operations per second to the OS disk.
Data Disk Latency (Preview): Average time to complete IOs during monitoring for data disks.
Data Disk Queue Depth: Outstanding IO requests waiting to be read from or written to data disks.
Data Disk Read Bytes/Sec: Bytes read per second from data disks (inclusive of cache if enabled).
Data Disk Read Operations/Sec: Read operations per second from data disks (inclusive of cache if enabled).
Data Disk Write Bytes/Sec: Bytes written per second to data disks.
Data Disk Write Operations/Sec: Write operations per second to data disks.
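As a rough illustration of the "API call" route mentioned earlier, the sketch below only builds a request URL for the Azure Monitor metrics REST endpoint; it does not send the request or handle authentication. The function name `build_metrics_url`, the resource ID, and the `api-version` value are assumptions for illustration, so please check them against the current Azure Monitor REST API reference before relying on them.

```python
from urllib.parse import quote, urlencode


def build_metrics_url(resource_id, metric_names, timespan, interval="PT1M",
                      aggregations=("Average", "Maximum"),
                      api_version="2018-01-01"):
    """Build a metrics query URL for one Azure resource.

    resource_id is the full ARM ID of the VM, starting with "/subscriptions/".
    api_version is an assumed value; confirm it in the REST API reference.
    """
    query = urlencode(
        {
            "api-version": api_version,
            "metricnames": ",".join(metric_names),
            "timespan": timespan,      # ISO 8601 "start/end"
            "interval": interval,      # ISO 8601 duration, e.g. PT1M
            "aggregation": ",".join(aggregations),
        },
        quote_via=quote,               # encode spaces as %20 instead of "+"
    )
    return (f"https://management.azure.com{resource_id}"
            f"/providers/Microsoft.Insights/metrics?{query}")


# Hypothetical resource ID, for illustration only.
vm_id = ("/subscriptions/0000/resourceGroups/rg/providers/"
         "Microsoft.Compute/virtualMachines/vm1")
url = build_metrics_url(
    vm_id,
    ["Data Disk Queue Depth"],
    "2024-01-01T00:00:00Z/2024-01-01T01:00:00Z",
)
print(url)
```

Requesting both Average and Maximum in the same call makes it easier to compare typical values with peaks later on.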
Interpretation:
Latency: Lower latency values indicate better performance.
Throughput (Bytes/Sec): Higher throughput values mean faster data transfer.
IOPS: Higher IOPS values indicate better input/output performance.
IOPS (Input/Output Operations Per Second): IOPS measures the number of read or write operations a storage device can perform in a single second. Higher IOPS generally indicates better performance. You can estimate IOPS by dividing the combined read and write throughput by the block size, with both expressed in bytes:
IOPS = (ReadThroughput + WriteThroughput) / BlockSize
Throughput: Throughput represents the amount of data transferred per unit of time. It’s typically measured in megabytes per second (MB/s). The relationship between IOPS and throughput depends on the average IO size. Because the IO size is in bytes, divide by 1,000,000 to get MB/s:
Throughput (MB/s) = Average IO size (in bytes) × IOPS / 1,000,000
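Here is a small worked example of the two formulas above. The function names and the sample numbers are made up for illustration, and 1 MB is taken as 1,000,000 bytes, which is an assumption about the unit convention.

```python
def iops_from_throughput(read_bytes_per_sec, write_bytes_per_sec, block_size_bytes):
    """Estimate IOPS from combined read/write throughput and a fixed block size."""
    if block_size_bytes <= 0:
        raise ValueError("block_size_bytes must be positive")
    return (read_bytes_per_sec + write_bytes_per_sec) / block_size_bytes


def throughput_mb_per_sec(avg_io_size_bytes, iops):
    """Convert average IO size (bytes) and IOPS into MB/s (1 MB = 1,000,000 bytes)."""
    return avg_io_size_bytes * iops / 1_000_000


# Hypothetical sample: 6,000,000 B/s read + 2,192,000 B/s write at 8 KiB blocks.
estimated_iops = iops_from_throughput(6_000_000, 2_192_000, 8192)
estimated_mbps = throughput_mb_per_sec(8192, estimated_iops)
print(estimated_iops, estimated_mbps)
```

With these sample values the estimate works out to 1,000 IOPS, which at an 8,192-byte IO size is about 8.19 MB/s.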
Latency: Latency is the average time taken to complete an I/O operation. Lower latency is desirable. It’s measured in milliseconds (ms). Key latency metrics include:
Disk sec/Transfer: A guest OS counter (Windows Performance Monitor) that shows the time taken by one read/write operation, that is, disk latency. Ideally, it should be below 10 ms for high-load servers.
Data Disk Latency: Average time for I/Os during monitoring for data disks.
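Queue Depth: Queue depth indicates the number of outstanding I/O requests waiting to be read from or written to a disk. A sustained high queue depth suggests that requests are arriving faster than the disk can complete them, which usually shows up as higher latency.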
Interpreting Metrics:
Average vs. Maximum: Portal charts typically show averages, which can hide short spikes. To capture maximum values over time, use the Max aggregation in Azure Monitor or a guest tool such as Performance Co-Pilot (PCP).
Thresholds: Set performance thresholds based on your workload requirements. For example, if latency exceeds a certain value, investigate further.
Bursting Metrics: Premium disks support bursting. Monitor the burst credit percentage to make sure credits are available during peak loads. On-demand bursting currently applies only to premium disks; with it, a disk can burst as often as the workload needs, up to its maximum burst target, on a pay-as-you-go model.
2. To identify peaks, consider these approaches:
- Azure Monitor: Use Azure Monitor to set up alerts based on thresholds (e.g., high latency or low throughput).
- Custom Metrics: You can create custom metrics to track specific thresholds (e.g., peak latency) using Azure Monitor.
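To show why averages alone can hide peaks, here is a minimal sketch that takes a list of latency samples you have already exported (for example from Azure Monitor) and reports the average, the maximum, and every sample above a threshold. The function name, the sample values, and the 10 ms threshold are all hypothetical.

```python
def summarize_latency(samples_ms, threshold_ms):
    """Return (average, maximum, peaks) for a list of latency samples.

    peaks is a list of (index, value) pairs for samples above threshold_ms.
    """
    if not samples_ms:
        raise ValueError("samples_ms must not be empty")
    average = sum(samples_ms) / len(samples_ms)
    maximum = max(samples_ms)
    peaks = [(i, v) for i, v in enumerate(samples_ms) if v > threshold_ms]
    return average, maximum, peaks


# Hypothetical one-minute latency samples in milliseconds.
samples = [4.0, 5.0, 6.0, 25.0, 5.0]
average, maximum, peaks = summarize_latency(samples, threshold_ms=10.0)
print(f"avg={average} ms, max={maximum} ms, peaks={peaks}")
```

In this made-up series the average stays under the 10 ms threshold even though one sample is at 25 ms, so an alert on the average alone would miss the spike.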
Azure virtual machines have input/output operations per second (IOPS) and throughput performance limits based on the virtual machine type and size. OS disks and data disks can be attached to virtual machines. The disks have their own IOPS and throughput limits.
Azure offers the ability to boost disk storage IOPS and MB/s performance. This is referred to as bursting, and it is available for both virtual machines (VMs) and disks. Using VM bursting and disk bursting together helps your workload handle short spikes in IO demand.
For more details, see https://learn.microsoft.com/en-us/azure/virtual-machines/disks-performance and https://learn.microsoft.com/en-us/azure/virtual-machines/disk-bursting.
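Because both the VM and each attached disk have their own limits, the lower of the two caps is the one your workload actually hits. The sketch below is a simplified way to reason about that; it ignores caching and bursting, and the function name and the limit values are hypothetical, so take the real numbers from the documentation for your VM size and disk SKU.

```python
def effective_limit(vm_limit, disk_limits):
    """Return (limit, bottleneck) for a VM with the given attached disks.

    vm_limit    -- the VM's IOPS (or MB/s) cap
    disk_limits -- the cap of each attached disk, in the same unit
    Simplification: caching and bursting are not modelled.
    """
    disks_total = sum(disk_limits)
    if disks_total <= vm_limit:
        return disks_total, "disks"
    return vm_limit, "vm"


# Hypothetical limits: a VM capped at 12,800 IOPS with two 7,500 IOPS disks.
limit, bottleneck = effective_limit(12_800, [7_500, 7_500])
print(limit, bottleneck)
```

In this example the disks together could deliver 15,000 IOPS, but the VM caps the workload at 12,800, so a larger VM size would help more than faster disks.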
Azure Disk Backup is a native, cloud-based backup solution that protects your data in managed disks. It's a simple, secure, and cost-effective solution that enables you to configure protection for managed disks. Consider the following factors when analyzing the performance of disks used for backup:
- Ensure you’re not hitting data ingress/egress limits.
- Storage Throttling: Premium disks are throttled when the requested IOPS or throughput exceeds the limits of the VM size or the disk SKU.
- VM Scalability Targets: Choose a VM SKU that meets your workload’s performance demands.
- Cache Restriction: Optimize VMs using Premium storage by configuring disk caching.
For detailed guidance on performance and cost-effective steps, see this blog: https://bluexp.netapp.com/blog/azure-disk-performance-how-to-analyze-and-monitor-issues.
If your backup files are consistently smaller than 1 TB, consider switching to Standard SSD disks. They offer cost savings while still providing reasonable performance.
Hope this answer helps! Please let us know if you have any further queries. I’m happy to assist you further.
Please "Accept the answer" and "up-vote" wherever the information provided helps you. This can be beneficial to other community members.