Hello @Greg Toews
These issues often occur when scaling PostgreSQL past 10k TPS on AKS, and focusing on NVMe is the right direction.
1. Can NVMe get us the performance we need without Ultra Disk's cost?
Yes, this is the recommended approach for your workload. According to Microsoft’s benchmarks, PostgreSQL can reach nearly 15,000 TPS with single-digit millisecond latency on a Standard_L16s_v3 VM using local NVMe and Azure Container Storage, which exceeds your 10,000 TPS requirement.
It is also important to consider cost efficiency: an 8-vCPU Standard_L8s_v3 VM with local NVMe achieves approximately 400,000 IOPS, whereas achieving similar IOPS with Ultra Disk would require a 112-vCPU VM. For a dataset of 6–8 TB, the Lsv3 series offers a much more cost-effective solution.
While Ultra Disks can deliver high performance, they generally come at a significantly higher cost and may not match the latency provided by NVMe.
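To make the comparison concrete, here is a small arithmetic sketch in Python. It only uses the figures quoted above (10,000 TPS target, roughly 15,000 TPS benchmarked, about 400,000 IOPS on 8 vCPUs versus 112 vCPUs); the variable and function names are mine, and it does not model actual pricing.

```python
# Figures quoted in the answer above; nothing here is newly measured.
TARGET_TPS = 10_000        # workload requirement
BENCHMARK_TPS = 15_000     # "nearly 15,000 TPS" on Standard_L16s_v3
IOPS = 400_000             # approximate IOPS in both scenarios
NVME_VCPUS = 8             # Standard_L8s_v3 with local NVMe
ULTRA_DISK_VCPUS = 112     # VM size needed for similar IOPS with Ultra Disk


def iops_per_vcpu(iops: int, vcpus: int) -> float:
    """IOPS obtained per vCPU you pay for."""
    return iops / vcpus


tps_headroom = BENCHMARK_TPS / TARGET_TPS - 1
vcpu_ratio = ULTRA_DISK_VCPUS / NVME_VCPUS

print(f"TPS headroom over target: {tps_headroom:.0%}")
print(f"IOPS per vCPU, local NVMe: {iops_per_vcpu(IOPS, NVME_VCPUS):,.0f}")
print(f"IOPS per vCPU, Ultra Disk: {iops_per_vcpu(IOPS, ULTRA_DISK_VCPUS):,.0f}")
print(f"vCPUs needed, Ultra Disk vs NVMe: {vcpu_ratio:.0f}x")
```

In other words, the Ultra Disk route needs about 14 times the vCPUs for similar IOPS, before the disk charges themselves are counted.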
Please refer to this link: https://azure.microsoft.com/en-us/blog/running-high-performance-postgresql-on-azure-kubernetes-service/
2. How do we deal with NVMe being ephemeral?
This is a crucial design consideration. Rather than focusing on making NVMe durable, it is advisable to design your application for resilience.
The recommended solution is to deploy PostgreSQL with the CloudNativePG (CNPG) operator, which offers the following benefits:
• Native PostgreSQL streaming replication across three nodes (one primary and two replicas) distributed across multiple Availability Zones
• Automated failover with an RTO of less than 10 seconds and an RPO of zero
• Continuous WAL archiving to Azure Blob Storage, providing a reliable backup layer
PostgreSQL's WAL mechanism records every transaction before it is applied, ensuring that replicas maintain a consistent and current copy, even if a node fails. Azure Container Storage natively manages NVMe volume orchestration within Kubernetes, eliminating the need for manual RAID configuration.
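As a rough illustration of the setup described above, the sketch below builds a CloudNativePG Cluster manifest as a Python dict and prints it as JSON, which kubectl also accepts. Please treat it as a starting point, not a tested configuration: the cluster name, storage class, volume size, Blob path, and secret name/key are all placeholders I made up, and the field names reflect the CNPG v1 Cluster API as I understand it, so verify them against the operator version you install.

```python
import json


def build_cnpg_cluster(name, storage_class, size, blob_path, secret_name):
    """Return a CloudNativePG Cluster manifest as a plain dict (sketch only)."""
    return {
        "apiVersion": "postgresql.cnpg.io/v1",
        "kind": "Cluster",
        "metadata": {"name": name},
        "spec": {
            # One primary and two streaming replicas.
            "instances": 3,
            # Assumption: one synchronous replica, aiming at an RPO of zero.
            "minSyncReplicas": 1,
            "maxSyncReplicas": 1,
            # Spread the instances across Availability Zones.
            "affinity": {
                "enablePodAntiAffinity": True,
                "podAntiAffinityType": "required",
                "topologyKey": "topology.kubernetes.io/zone",
            },
            # Local NVMe volume provided through an Azure Container Storage class.
            "storage": {"storageClass": storage_class, "size": size},
            # Continuous WAL archiving and base backups to Azure Blob Storage.
            "backup": {
                "barmanObjectStore": {
                    "destinationPath": blob_path,
                    "azureCredentials": {
                        "connectionString": {
                            "name": secret_name,  # hypothetical Kubernetes Secret
                            "key": "AZURE_CONNECTION_STRING",  # hypothetical key
                        }
                    },
                }
            },
        },
    }


manifest = build_cnpg_cluster(
    name="pg-nvme",                              # placeholder
    storage_class="acstor-ephemeraldisk-nvme",   # placeholder
    size="1Ti",                                  # placeholder, not a sizing advice
    blob_path="https://<account>.blob.core.windows.net/<container>/",
    secret_name="pg-backup-creds",               # placeholder
)
print(json.dumps(manifest, indent=2))
```

You could save the output to a file and apply it with kubectl apply -f once the placeholders are replaced with your own values.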
3. When exactly would a VM be recycled, and would disabling updates prevent it?
Data on local NVMe is at risk whenever the node is recycled or removed:
• Node image upgrades (weekly Linux patches) → node is reimaged
• Kubernetes version upgrades → nodes are recreated
• Node failure or hardware fault → data lost
• VM deallocation (scale-in event) → data lost
• VM redeploy → data lost
AKS performs weekly node image updates and recommends using auto-upgrade channels, so recycling is expected; it keeps the cluster secure and reliable.
Turning off updates is not the way to go, since that could leave your cluster exposed to OS security issues and isn't supported long-term. Instead, rely on CloudNativePG's high availability setup: when a node gets recycled, CNPG automatically creates a new replica on a fresh node and syncs it with the primary. Your data stays safe; you will just have one less replica for a short time while it catches up.
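The reasoning can be sketched with a toy model in Python. This is not CNPG code or its API; the Instance class and survives_loss_of function are hypothetical names I am using only to show why recycling one node at a time is safe, while losing a node during a rebuild of the others is not.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    node: str               # Kubernetes node hosting this PostgreSQL instance
    has_current_data: bool  # True for the primary and for caught-up replicas


def survives_loss_of(node: str, instances: list[Instance]) -> bool:
    """True if an instance with a current copy of the data remains elsewhere."""
    remaining = [i for i in instances if i.node != node]
    return any(i.has_current_data for i in remaining)


# Normal state: primary on node-a, two caught-up replicas.
healthy = [
    Instance("node-a", True),
    Instance("node-b", True),
    Instance("node-c", True),
]

# Both replica nodes were just recycled and are still resyncing.
rebuilding = [
    Instance("node-a", True),
    Instance("node-b", False),
    Instance("node-c", False),
]

print(survives_loss_of("node-a", healthy))     # True
print(survives_loss_of("node-a", rebuilding))  # False
```

With three instances, recycling a single node always leaves a current copy elsewhere. The second case is the one to avoid, which is why node upgrades should roll one node at a time and why WAL archiving to Blob Storage remains the last line of defense.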
Reference links:
1. AKS Engineering Blog: PostgreSQL + NVMe deep dive
https://blog.aks.azure.com/2025/07/09/postgresql-nvme
2. Best practices for ephemeral NVMe data disks on AKS
https://learn.microsoft.com/en-us/azure/aks/best-practices-storage-nvme
3. Deploy highly available PostgreSQL on AKS
https://learn.microsoft.com/en-us/azure/aks/deploy-postgresql-ha?tabs=azuredisk
4. PostgreSQL HA overview with CloudNativePG
https://learn.microsoft.com/en-us/azure/aks/postgresql-ha-overview
If you have any questions, please feel free to comment and we will be happy to assist you.
Thanks,
Manish.