Azure Database for PostgreSQL - Flexible server is currently unavailable

Michał Grodzki 0 Reputation points
2025-10-21T09:19:36.7366667+00:00

Hello Microsoft Team,

For the last 3 days I have been facing critical issues with the database. Memory usage seems to climb until everything freezes, and then the server no longer accepts connections.

In Resource Health I am getting this error (a few times a day):
[Image: screenshot of the Resource Health error]

Trying to restart the server fails.
Only stopping it and then starting it again works, but after some time the problem occurs once more.

What's the problem?

Azure Database for PostgreSQL

2 answers

  1. Sina Salam 30,166 Reputation points Volunteer Moderator
    2025-10-21T14:27:10.4366667+00:00

    Hello Michał Grodzki,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    I understand that your Azure Database for PostgreSQL - Flexible Server is currently unavailable.

    The first step is to triage whether the issue is platform-related or workload-driven. Check Azure Resource Health for an Unavailable (Platform event) or Unknown status, as these often indicate underlying host maintenance or platform incidents. If you see one, raise a support case directly from the Resource Health blade. This option is available even without a paid plan and is the proper channel for stalled control-plane operations. In parallel, review Azure Service Health to identify any broader regional service incidents – https://learn.microsoft.com/en-us/azure/service-health/overview.

    If no platform event is found, treat the issue as workload saturation. Immediately reduce connection pressure by validating the application's connection pool settings or enabling PgBouncer on the Flexible Server. To restore stability quickly, scale up to a higher compute tier (e.g., General Purpose or Memory Optimized) and avoid enabling Query Store on Burstable tiers during heavy load. Note that changing the compute tier restarts the server, so expect a brief interruption and pick a quiet window if you can – https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/concepts-compute-storage.
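
    As a rough way to judge connection pressure before changing pool settings, the sketch below counts client sessions by state and compares the total with max_connections. It only uses the standard pg_stat_activity view and current_setting(); it is an illustration, not an official diagnostic, and the result depends on when you sample.

    ```sql
    -- Sessions per state; many 'idle' or 'idle in transaction' rows
    -- usually point at pooling problems rather than real work.
    SELECT state,
           count(*) AS sessions
    FROM pg_stat_activity
    WHERE backend_type = 'client backend'
    GROUP BY state
    ORDER BY sessions DESC;

    -- Total client sessions next to the configured limit.
    SELECT count(*) AS client_sessions,
           current_setting('max_connections')::int AS max_connections
    FROM pg_stat_activity
    WHERE backend_type = 'client backend';
    ```

    If client_sessions sits close to max_connections, or most sessions are idle, a pooler is likely to help more than raising the limit, because every backend process consumes memory even when idle.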

    For diagnosis, follow the high memory utilization troubleshooting guide – https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/how-to-troubleshoot-high-memory-usage – and use Azure Monitor to track metrics such as Memory percent, CPU percent, Active connections, and Temp file bytes written. Combine this with in-database queries like:

    SELECT pid, state, wait_event_type, wait_event, query_start,
           (now() - query_start) AS runtime, query
    FROM pg_stat_activity
    WHERE state <> 'idle'
    ORDER BY runtime DESC
    LIMIT 20;
    

    These checks help correlate spikes and identify long-running queries or memory-heavy operations. Tune parameters like work_mem, maintenance_work_mem, shared_buffers, and max_connections, following Microsoft’s recommended limits for each tier.
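
    Before tuning, it helps to see what those parameters are currently set to. The query below is a minimal sketch against the standard pg_settings view; nothing in it is specific to Azure, and the right values for your tier still have to come from the documentation.

    ```sql
    -- Current values of the memory-related parameters named above.
    -- 'unit' tells you how to read 'setting' (e.g. kB or 8kB pages);
    -- 'pending_restart' is true if a change is waiting for a restart.
    SELECT name,
           setting,
           unit,
           source,
           pending_restart
    FROM pg_settings
    WHERE name IN ('work_mem',
                   'maintenance_work_mem',
                   'shared_buffers',
                   'max_connections')
    ORDER BY name;
    ```

    Keep in mind that work_mem applies per sort or hash operation, not per connection, so a single complex query can use several multiples of it. A high work_mem combined with a high max_connections is a common way to run out of memory.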

    If the server again gets stuck in “Stopping,” re-check Resource Health and open a support case with incident details, activity logs, and metric snapshots to allow Microsoft to investigate control-plane behavior. This process is outlined in the Resource Health documentation – https://learn.microsoft.com/en-us/azure/service-health/resource-health-overview.

    For long-term hardening, right-size the server (avoid Burstable for production workloads), enforce connection governance via pooling, enable zone-redundant high availability (HA), and configure Azure Monitor alerts for memory, connection count, and resource state changes. These practices support early detection, more stable performance, and faster recovery during both platform and workload events.

    I hope this is helpful! Do not hesitate to let me know if you have any other questions or clarifications.


    Please remember to close the thread by upvoting and accepting this as the answer if it was helpful.

  2. Anonymous
    2025-10-21T09:28:22.3566667+00:00

    Hi @Michał Grodzki

    Thank you for posting your question on Microsoft Q&A. Here are some troubleshooting steps that may help address your issue.

    You are experiencing intermittent unavailability and high memory usage on your Azure Database for PostgreSQL – Flexible Server, with Resource Health showing UnknownReason (Unplanned) outages. Restart attempts fail unless you stop and start the server, and the issue recurs after some time.

    Possible Causes

    ·         Resource saturation: high memory usage often indicates:

    o    Inefficient queries or missing indexes causing large in-memory operations.

    o    Connection pooling misconfiguration leading to excessive active sessions.

    o    Workload exceeding the compute/memory tier of your Flexible Server.

    o    Under the Metrics section, monitor CPU and memory usage. If CPU percent is over 90% or memory percent is over 95%, the server can become unresponsive.

    o    If you're using a burstable (B-series) SKU, consider upgrading to a General Purpose or Memory Optimized SKU, which can handle production workloads better and may help in managing high resource usage.

    ·         Underlying platform issue: the Resource Health message suggests an unplanned outage, which can occur due to:

    o    Host-level failures or transient infrastructure issues.

    o    Maintenance or failover events triggered by Azure.
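
    The first cause in the list above (queries whose sorts and hashes outgrow the memory they are allowed) can be checked from inside the database. The following is a minimal sketch that reads the standard pg_stat_database view. The counters are cumulative since stats_reset, so compare two samples taken some minutes apart rather than reading one value in isolation.

    ```sql
    -- Temporary files written per database. Steadily growing values
    -- mean queries are spilling sorts/hashes to disk.
    SELECT datname,
           temp_files,
           pg_size_pretty(temp_bytes) AS temp_written,
           stats_reset
    FROM pg_stat_database
    WHERE datname IS NOT NULL
    ORDER BY temp_bytes DESC;
    ```

    Heavy temp file activity alongside high memory usage suggests a few expensive queries are responsible, which is a different problem from too many connections and calls for query or index work rather than a larger pool.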

    Recommended Actions

    ·         Check Resource Health and Service Status

    o    Review Resource Health (https://learn.microsoft.com/en-us/azure/service-health/resource-health-overview) for outage details.

    o    Confirm whether any Azure Service Health incidents (https://learn.microsoft.com/en-us/azure/service-health/overview) are impacting PostgreSQL Flexible Server.

    ·         Monitor and Optimize Memory Usage

    o    Use Query Performance Insight (https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/concepts-query-performance-insight) to identify heavy queries.

    o    Enable pg_stat_statements (https://learn.microsoft.com/en-us/azure/postgresql/concepts-pg-stat-statements) for query-level diagnostics.

    o    Consider adding indexes or optimizing queries to reduce memory footprint.

    ·         Scale Up or Adjust Configuration

    o    If the workload exceeds the current tier, scale up (https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/concepts-compute-storage) to a higher SKU.

    o    Validate connection pooling settings (https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/concepts-connection-pooling) to avoid resource exhaustion.

    ·         Enable High Availability

    o    Configure high availability (https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/concepts-high-availability) to minimize downtime during failovers.

    ·         Review Logs and Metrics

    o    Check Azure Monitor metrics (https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/concepts-monitoring) for memory and CPU trends.

    o    Inspect PostgreSQL logs for errors or long-running transactions.
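
    The query-level checks above can be sketched in SQL. This example assumes PostgreSQL 13 or later (where the timing columns are named total_exec_time and mean_exec_time) and that the pg_stat_statements extension is enabled and visible in the database you are connected to. Treat it as a starting point, not a definitive procedure.

    ```sql
    -- Statements that spill the most to temp files, then the slowest overall.
    SELECT queryid,
           calls,
           round(total_exec_time::numeric, 1) AS total_ms,
           round(mean_exec_time::numeric, 1)  AS mean_ms,
           temp_blks_written,
           left(query, 80) AS query_snippet
    FROM pg_stat_statements
    ORDER BY temp_blks_written DESC, total_exec_time DESC
    LIMIT 10;

    -- Oldest open transactions, including sessions idle in a transaction.
    SELECT pid,
           usename,
           state,
           xact_start,
           now() - xact_start AS xact_age,
           left(query, 80) AS query_snippet
    FROM pg_stat_activity
    WHERE xact_start IS NOT NULL
    ORDER BY xact_start
    LIMIT 10;
    ```

    Statements with a high temp_blks_written are natural candidates for index or query tuning. Transactions that stay open for a long time keep resources pinned and are worth tracing back to the application code that opened them.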

    References

    ·         https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/overview

    ·         https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/how-to-troubleshoot-connectivity

    ·         https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/concepts-performance-best-practices

    Summary: The recurring issue likely stems from resource saturation combined with transient platform events. Start by analyzing query patterns, scaling resources, and enabling HA. If the problem persists after optimization, feel free to reach out to us and include the Resource Health logs.

    Hope the above steps were helpful. If you have any other questions, please feel free to contact us.

    Thanks,
    Vrishabh
