On compute scaling, https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/concepts-business-continuity#planned-downtime-events states:
During compute scaling operation, active checkpoints are allowed to complete, client connections are drained, any uncommitted transactions are canceled, storage is detached, and then it's shut down. A new flexible server with the same database server name is provisioned with the scaled compute configuration. The storage is then attached to the new server and the database is started which performs recovery if necessary before accepting client connections.
...
When the flexible server is configured with high availability, the flexible server performs the scaling and the maintenance operations on the standby server first. For more information, see Concepts - High availability.
and https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/concepts-high-availability states:
For other user initiated operations such as scale-compute or scale-storage, the changes are applied at the standby first, followed by the primary. Currently, the service is not failed over to the standby and hence while the scale operation is carried out on the primary server, applications will encounter a short downtime.
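For reference, the operation that triggers this downtime is an ordinary compute scale, roughly like the following (a minimal sketch that drives the Azure CLI from Python; the resource group, server name, and target SKU are placeholders):

```python
import subprocess

# Placeholders: substitute your own resource group, server name, and target SKU.
RESOURCE_GROUP = "my-resource-group"
SERVER_NAME = "my-flexible-server"
TARGET_SKU = "Standard_D4s_v3"

# Scaling compute restarts the flexible server on new compute,
# which is where the short downtime described above comes from.
subprocess.run(
    [
        "az", "postgres", "flexible-server", "update",
        "--resource-group", RESOURCE_GROUP,
        "--name", SERVER_NAME,
        "--sku-name", TARGET_SKU,
    ],
    check=True,
)
```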
Is there a way to avoid the downtime?
For managed maintenance, avoiding the downtime seems to be possible through an HA failover:
For flexible servers configured with high availability, these maintenance activities are performed on the standby replica first and the service is failed over to the standby to which applications can reconnect.
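For scale operations, all I can do today is mask the short downtime on the client side with a reconnect loop (a minimal sketch assuming psycopg2; the connection string is a placeholder), which hides the interruption rather than avoiding it:

```python
import time
import psycopg2

# Placeholder connection string; substitute real host, user, and password.
DSN = ("host=my-flexible-server.postgres.database.azure.com "
       "dbname=postgres user=myadmin password=... sslmode=require")

def query_with_retry(sql, attempts=20, delay=3):
    """Run a query, reconnecting while the server restarts on the new compute."""
    last_error = None
    for _ in range(attempts):
        try:
            conn = psycopg2.connect(DSN, connect_timeout=5)
            try:
                with conn.cursor() as cur:
                    cur.execute(sql)
                    return cur.fetchall()
            finally:
                conn.close()
        except psycopg2.OperationalError as exc:
            # Connections are dropped/refused while the scale operation runs.
            last_error = exc
            time.sleep(delay)
    raise last_error

# rows = query_with_retry("SELECT now()")
```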
Read replicas would be a natural fit, but the docs at https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/concepts-read-replicas give no hint of an (automated) process that uses the replicas to carry out the scaling without downtime.
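The best I can imagine myself is a manual sequence like the sketch below. This is purely my own assumption, not something the docs describe, and all names are placeholders; I am also not certain the `replica promote` subcommand and independent replica scaling behave as I assume here.

```python
import subprocess

# Placeholders; this is only what I imagine a manual, replica-based
# workaround could look like -- the docs do not describe such a process.
RESOURCE_GROUP = "my-resource-group"
PRIMARY = "my-flexible-server"
REPLICA = "my-flexible-server-replica"
TARGET_SKU = "Standard_D8s_v3"

def az(*args):
    subprocess.run(["az", *args], check=True)

# 1. Create a read replica of the primary.
az("postgres", "flexible-server", "replica", "create",
   "--resource-group", RESOURCE_GROUP,
   "--source-server", PRIMARY,
   "--replica-name", REPLICA)

# 2. Scale the replica while it is not serving application traffic,
#    so its restart does not affect the application (assuming replicas
#    can be scaled independently of the primary).
az("postgres", "flexible-server", "update",
   "--resource-group", RESOURCE_GROUP,
   "--name", REPLICA,
   "--sku-name", TARGET_SKU)

# 3. Promote the replica to a standalone server (assuming the
#    `replica promote` subcommand is available in the installed CLI).
az("postgres", "flexible-server", "replica", "promote",
   "--resource-group", RESOURCE_GROUP,
   "--name", REPLICA)

# 4. Repoint the application at the promoted server (connection strings,
#    DNS, etc.); writes against the old primary during the cut-over would
#    be lost, which is exactly why I am asking for a managed process.
```

Even if something along these lines works, it is manual and risky around the cut-over, so a managed or automated path would be much preferable.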