Thank you for reaching out on Microsoft Q&A!
Root Cause Analysis: Based on your description, using a burstable SKU (B-series) can lead to performance degradation once CPU credits are depleted, which often causes the server to become unresponsive. I recommend checking the CPU and memory usage metrics around the times of the incidents to see whether resource exhaustion was a factor.
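As a rough illustration of that check, here is a minimal Python sketch that summarizes resource usage in the window before an incident. It assumes you have already exported metric data points from the portal or Azure Monitor into a list; the `Sample` fields, the 30-minute lookback, and the `credit_floor` value are hypothetical choices for this example, not part of any Azure API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Sample:
    """One exported metric data point (field names are illustrative)."""
    timestamp: datetime
    cpu_percent: float
    memory_percent: float
    cpu_credits_remaining: float


def summarize_before_incident(samples, incident_time, lookback=timedelta(minutes=30)):
    """Summarize CPU, memory, and credit levels in the window before an incident."""
    window = [
        s for s in samples
        if incident_time - lookback <= s.timestamp <= incident_time
    ]
    if not window:
        return None  # no metric data covers this incident
    return {
        "max_cpu_percent": max(s.cpu_percent for s in window),
        "max_memory_percent": max(s.memory_percent for s in window),
        "min_cpu_credits": min(s.cpu_credits_remaining for s in window),
    }


def likely_credit_exhaustion(summary, credit_floor=1.0):
    """Heuristic only: credits at or below the floor suggest throttling."""
    return summary is not None and summary["min_cpu_credits"] <= credit_floor
```

If the summary for each incident shows credits near zero together with CPU pinned near 100%, that supports credit exhaustion as the cause; high memory with healthy credits would point elsewhere.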
Here are a few steps you can consider:
- Upgrade Your SKU: Upgrading from the Burstable SKU to a General Purpose or Memory Optimized SKU should provide you with more consistent performance and prevent the outages you’re experiencing. This aligns with your current strategy for upgrading your database compute tier.
- Enable High Availability: Configuring high availability can help mitigate the impact of unexpected outages. This setup will allow a standby replica of your database to take over in case your primary server goes down.
- Regular Monitoring: Utilize Azure’s built-in monitoring tools to regularly assess your database’s performance. Setting up alerts for high CPU usage or potential resource exhaustion can help you take action before it leads to downtime.
- Custom Maintenance Windows: Consider using custom maintenance windows to avoid unexpected downtime during peak application usage times.
After you finish upgrading the SKU and enabling high availability, continuously monitor the performance using Azure Monitor and set alerts for any anomalies.
- If issues persist, consider reaching out to Azure Support for deeper insights specific to your resource and incidents.
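To make the alerting idea concrete, the following Python sketch shows the kind of rule a high-CPU alert typically encodes: fire when the average over a sliding window exceeds a threshold. This is only an illustration of the logic, not how Azure Monitor is implemented; the function name, the 80% threshold, and the window size are assumptions to tune for your workload.

```python
def windows_over_threshold(values, threshold, window_size):
    """Return the start index of each window whose average exceeds the threshold."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    hits = []
    # Slide a fixed-size window across the samples, one step at a time.
    for start in range(len(values) - window_size + 1):
        window = values[start:start + window_size]
        if sum(window) / window_size > threshold:
            hits.append(start)
    return hits


# Example: CPU percent sampled at regular intervals (made-up values).
cpu_samples = [10, 20, 90, 95, 85, 30]
sustained = windows_over_threshold(cpu_samples, threshold=80, window_size=3)
```

Averaging over a window rather than alerting on single samples helps avoid noise from brief spikes while still catching sustained load.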
Reference Documents:
- Manage scheduled maintenance settings for Azure Database for PostgreSQL – Flexible server
- High availability (Reliability) in Azure Database for PostgreSQL
- Handling transient connectivity errors in Azure Database for PostgreSQL
Thanks!
Kalyani