Peformance issue with postgres fdw in azure postgres flexible server

Mani Mahesh S 56 Reputation points
2022-11-24T06:33:09.217+00:00

Hi team,

We have a application which uses two databases. Database A and Database B. Both databases are present in azure Postgresql flexible server. We are migrating on-prem oracle database to azure cloud Postgresql. In our application, we have a reporting module, which needs to access all tables in Database B and 7 tables in Database A. In on-prem oracle, this requirement was fulfilled by replicating 7 tables logically between Database A and Database B. After replicating, they have created necessary indexes for reporting, so all the queries where performing well in On-prem Oracle.

In Azure Postgres Flexible, we have enabled High- Availability setup. As a postgres limitation, in case of HA failover, Logical replication slots will not be created automatically in Salve, Due to this issue, whenever HA- Failover happens, we were forced to recreate all 7 tables and also recreating indexes. Note: All 7 tables are bigger in size and it will take around 7 to 8 hours for replicating entire data and indexes in it. Due to this, our team is forced notv to use logical replication for this scenario.

As an alternate solution, we chose, PostgresFDW and tried to generate reports, initially it took some time, but tried some optimization with fdw options. Using Fetch size, Use Remote Estimate ,batch size and fdw_startup_cost. After making these optimizations, time reduced to 1 min for some reports. But still may reports are not getting completed. Is there any way to force indexes in foreign tables, Or any ther suggestions to improve performance of these reports. Kindly suggest.

Thanks and regards,
Manimahesh S

Azure Database for PostgreSQL
{count} votes

1 answer

Sort by: Most helpful
  1. gennady 1 Reputation point Microsoft Employee
    2022-12-06T17:18:47.757+00:00

    Hello,
    Without really troubleshooting an issue, seeing explain plan or understanding if there is something here particular to local loopback adapter, its hard to make definitive statement. I would recommend looking at opening a support case as well.
    As far as general recommendations following can be stated:

    1. FETCH_SIZE
      The FDW retrieves data from the remote server using row bunches and the size of these bunches is defined by the fetch_size option (default 100). This option can be specified for a SERVER or FOREIGN TABLE, and the values defined for the last one have priority. Sometimes a possible optimization would be to increase the values of fetch_size to try to reduce the number of fetch activity. Hard to state since we dont have any explain plan or other data on fetch costs. for more see - https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=%20fetch_size%20This%20option%20specifies%20the%20number%20of,for%20the%20server.%20The%20default%20is%20100.%20batch_size
    2. USE_REMOTE_ESTIMATE
      PostgreSQL’s planner decides which is the better strategy to execute the queries, this decision is based on the table’s statistics, so, if the statistics are outdated the planner will choose a bad execution strategy. Autovacuum process will help to keep statistics updated, but does not execute ANALYZE commands on foreign tables, hence to keep statistics updated on foreign tables it is necessary to run manually ANALYZE on those tables. To avoid outdated statistics, the FDW has an option to get the statistics from the remote server on the fly, this option is precisely use_remote_estimate and can be specified for a foreign table or a foreign server. For more see - https://postgrespro.com/list/thread-id/1881055#:~:text=use_remote_estimate%20This%20option%2C%20which%20can%20be%20specified%20for,for%20its%20server%2C%20but%20only%20for%20that%20table.
    3. Pooling connections
      Another improvement could be using a pooling solution, like pgpool or pgbouncer, with static connections for FDWs. This would reduce the overhead of establishing new connections each time is required.
      For more see - https://learn.microsoft.com/azure/postgresql/flexible-server/concepts-pgbouncer
    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.