The Azure Database for PostgreSQL flexible server in one of our environments has been down for over an hour. No connection can be made, whether through the Azure Portal, via pgAdmin4, or through the application.
There was a critical health event when our issues began. Its log shows:
"properties": {
"title": "Unknown Reason",
"details": "",
"currentHealthStatus": "Unavailable",
"previousHealthStatus": "Available",
"type": "Downtime",
"cause": "PlatformInitiated"
},
As far as we can tell, the only related activity was a connection request made to the database. The query is one that has been run many times before: a simple SQL SELECT with a single WHERE clause.
What we have tried to troubleshoot:
- Restarting the server: this failed with an Internal Server Error.
- Stopping the server completely: the server is still in a "Stopping" state, and has been for over an hour. The in-progress health event has:
"properties": {
"title": "Stopped",
"details": "",
"currentHealthStatus": "Degraded",
"previousHealthStatus": "Unavailable",
"type": "Downtime",
"cause": "UserInitiated"
},
Even though the health event is titled "Stopped", the status on the overview page is still "Stopping".
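To make the sequence of status changes easier to follow, here is a minimal Python sketch that summarizes the two health events quoted above. It assumes those two "properties" objects are the complete set of events. The function names `summarize` and `first_non_user_event` are our own (hypothetical) helpers, not part of any Azure SDK.

```python
import json

# The two "properties" objects from the health events above, copied verbatim.
EVENTS = json.loads("""
[
  {"title": "Unknown Reason", "details": "",
   "currentHealthStatus": "Unavailable", "previousHealthStatus": "Available",
   "type": "Downtime", "cause": "PlatformInitiated"},
  {"title": "Stopped", "details": "",
   "currentHealthStatus": "Degraded", "previousHealthStatus": "Unavailable",
   "type": "Downtime", "cause": "UserInitiated"}
]
""")


def summarize(event):
    """Return a one-line description of a health status transition."""
    return (
        f"{event['previousHealthStatus']} -> {event['currentHealthStatus']} "
        f"({event['cause']}, title: {event['title']})"
    )


def first_non_user_event(events):
    """Return the first event that was not triggered by us, or None."""
    for event in events:
        if event["cause"] != "UserInitiated":
            return event
    return None


if __name__ == "__main__":
    for event in EVENTS:
        print(summarize(event))
```

Read this way, the only transition we did not trigger ourselves is the first one (Available to Unavailable, PlatformInitiated), and its "details" field is empty, which is why we cannot tell what started the outage.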
This is happening in only one of our three environments (each in a separate resource group). All three environments are running the same version of the code and are currently on the same database migration. All three database servers have the same settings (Burstable, Standard B1ms compute, 32 GB storage). The other two environments are working fine.
Our region is Australia East with 1 availability zone. High availability is not enabled.
What could be causing this issue?