Hello,
Welcome to Microsoft Q&A!
This behavior is typically tied to the nuances of Azure's disk I/O architecture and to how Kafka workloads stress disks differently in virtualized environments than on physical hardware.
If your Kafka brokers are using Standard SSDs or Premium SSDs below P30, you're likely I/O bound rather than CPU/memory bound. Each Azure VM size has a maximum aggregate throughput/IOPS limit across all attached disks, so even if a disk itself supports more IOPS, the VM size can throttle total disk performance. Azure managed disks are also network-attached, which generally means higher latency than local storage; the lowest-latency option is an Lsv3-series VM with ephemeral NVMe disks.
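If you want to check the cap for your current VM size, the per-size limits are published as SKU capabilities. Below is a minimal sketch assuming the azure-identity and azure-mgmt-compute Python packages; the subscription ID, region, and VM size are placeholders you would substitute with your own values:

```python
# Sketch: read the per-VM-size disk limits from Azure's SKU catalog.
# Assumes `pip install azure-identity azure-mgmt-compute` and an
# authenticated session (e.g. via `az login`).
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

SUBSCRIPTION_ID = "<your-subscription-id>"  # placeholder
LOCATION = "eastus"                         # region to inspect (placeholder)
VM_SIZE = "Standard_D8s_v5"                 # your broker VM size (placeholder)

client = ComputeManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

for sku in client.resource_skus.list(filter=f"location eq '{LOCATION}'"):
    if sku.resource_type == "virtualMachines" and sku.name == VM_SIZE:
        for cap in sku.capabilities or []:
            # Capabilities such as UncachedDiskIOPS and
            # UncachedDiskBytesPerSecond are the VM-level caps that
            # throttle attached managed disks.
            if "Disk" in cap.name or "IOPS" in cap.name:
                print(f"{cap.name}: {cap.value}")
        break
```

Comparing those numbers against the sum of your attached disks' rated IOPS/throughput shows whether the VM size or the disks are the limiting factor.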
Here are some recommendations you can try to improve disk performance (a quick way to verify the effect is sketched after the list):
- Upgrade to higher-tier Premium SSDs (P30+) or Ultra Disks
- Move to Lsv3 series VMs for local NVMe disks (ephemeral, extremely fast)
- Split Kafka log directories across multiple disks for better parallelism (via the `log.dirs` broker setting)
- Ensure disk caching settings are optimal (usually `None` for Kafka log disks)
- Use Azure Disk bursting cautiously – burst credits may deplete and throttle performance
- Scale horizontally – more brokers with smaller load each
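Before and after making any of these changes, it helps to confirm the disk is really the bottleneck. The sketch below is a crude, stdlib-only sequential-write probe against a log directory; the path is a placeholder, and a purpose-built tool such as fio will give more trustworthy numbers:

```python
# Crude sequential-write probe for a Kafka log directory.
# Illustrative only -- use a real benchmark tool like fio for tuning.
import os
import time

LOG_DIR = "/var/lib/kafka/logs"   # placeholder: one of your log.dirs paths
CHUNK = b"x" * (1024 * 1024)      # 1 MiB writes, roughly segment-append-like
TOTAL_MB = 512                    # total data to write

path = os.path.join(LOG_DIR, "io_probe.tmp")
start = time.monotonic()
with open(path, "wb") as f:
    for _ in range(TOTAL_MB):
        f.write(CHUNK)
    f.flush()
    os.fsync(f.fileno())          # force data to the disk before timing stops
elapsed = time.monotonic() - start
os.remove(path)

print(f"Sequential write: {TOTAL_MB / elapsed:.1f} MiB/s")
# If this lands near the disk's or the VM's throughput cap,
# the broker is I/O bound and the recommendations above apply.
```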
Please upvote and accept the answer if it helps!