Logical replication and logical decoding in Azure Database for PostgreSQL - Flexible Server
APPLIES TO:
Azure Database for PostgreSQL - Flexible Server
Azure Database for PostgreSQL - Flexible Server supports the following logical data extraction and replication methodologies:
- Logical replication
- Using PostgreSQL native logical replication to replicate data objects. Logical replication allows fine-grained control over the data replication, including table-level data replication.
- Using pglogical extension that provides logical streaming replication and more capabilities such as copying initial schema of the database, support for TRUNCATE, ability to replicate DDL etc.
- Logical decoding which is implemented by decoding the content of write-ahead log (WAL).
Comparing logical replication and logical decoding
Logical replication and logical decoding have several similarities. They both:
- Allow you to replicate data out of Postgres.
- Use the write-ahead log (WAL) as the source of changes.
- Use logical replication slots to send out data. A slot represents a stream of changes.
- Use a table's REPLICA IDENTITY property to determine what changes can be sent out.
- Don't replicate DDL changes.
The two technologies have their differences:
Logical replication:
- Allows you to specify a table or set of tables to be replicated.
Logical decoding:
- Extracts changes across all tables in a database.
Pre-requisites for logical replication and logical decoding
- Go to server parameters page on the portal.
- Set the server parameter
wal_level
tological
. - If you want to use pglogical extension, search for the
shared_preload_libraries
, andazure.extensions
parameters, and selectpglogical
from the drop-down box. - Update
max_worker_processes
parameter value to at least 16. Otherwise, you may run into issues likeWARNING: out of background worker slots
. - Save the changes and restart the server to apply the
wal_level
change. - Confirm that your PostgreSQL instance allows network traffic from your connecting resource.
- Grant the admin user replication permissions.
ALTER ROLE <adminname> WITH REPLICATION;
- You may want to make sure the role you are using has privileges on the schema that you're replicating. Otherwise, you may run into errors such as
Permission denied for schema
.
Note
It is always a good practice to separate your replication user from regular admin account.
Using logical replication and logical decoding
Native logical replication
Logical replication uses the terms 'publisher' and 'subscriber'.
- The publisher is the PostgreSQL database you're sending data from.
- The subscriber is the PostgreSQL database you're sending data to.
Here's some sample code you can use to try out logical replication.
Connect to the publisher database. Create a table and add some data.
CREATE TABLE basic(id SERIAL, name TEXT); INSERT INTO basic(name) VALUES ('apple'); INSERT INTO basic(name) VALUES ('banana');
Create a publication for the table.
CREATE PUBLICATION pub FOR TABLE basic;
Connect to the subscriber database. Create a table with the same schema as on the publisher.
CREATE TABLE basic(id SERIAL, name varchar(40));
Create a subscription that will connect to the publication you created earlier.
CREATE SUBSCRIPTION sub CONNECTION 'host=<server>.postgres.database.azure.com user=<admin> dbname=<dbname> password=<password>' PUBLICATION pub;
You can now query the table on the subscriber. You'll see that it has received data from the publisher.
SELECT * FROM basic;
You can add more rows to the publisher's table and view the changes on the subscriber.
If you are not able to see the data, enable the login privilege for
azure_pg_admin
and check the table content.ALTER ROLE azure_pg_admin login;
Visit the PostgreSQL documentation to understand more about logical replication.
pglogical extension
Here is an example of configuring pglogical at the provider database server and the subscriber. Refer to pglogical extension documentation for more details. Also make sure you have performed pre-requisite tasks listed above.
Install pglogical extension in the database in both the provider and the subscriber database servers.
\c myDB CREATE EXTENSION pglogical;
If the replication user is other than the server administration user (who created the server), make sure that you grant membership in a role
azure_pg_admin
to the user and assign REPLICATION and LOGIN attributes to the user. See pglogical documentation for details.GRANT azure_pg_admin to myUser; ALTER ROLE myUser REPLICATION LOGIN;
At the provider (source/publisher) database server, create the provider node.
select pglogical.create_node( node_name := 'provider1', dsn := ' host=myProviderServer.postgres.database.azure.com port=5432 dbname=myDB user=myUser password=myPassword');
Create a replication set.
select pglogical.create_replication_set('myreplicationset');
Add all tables in the database to the replication set.
SELECT pglogical.replication_set_add_all_tables('myreplicationset', '{public}'::text[]);
As an alternate method, you can also add tables from a specific schema (for example, testUser) to a default replication set.
SELECT pglogical.replication_set_add_all_tables('default', ARRAY['testUser']);
At the subscriber database server, create a subscriber node.
select pglogical.create_node( node_name := 'subscriber1', dsn := ' host=mySubscriberServer.postgres.database.azure.com port=5432 dbname=myDB user=myUser password=myPasword' );
Create a subscription to start the synchronization and the replication process.
select pglogical.create_subscription ( subscription_name := 'subscription1', replication_sets := array['myreplicationset'], provider_dsn := 'host=myProviderServer.postgres.database.azure.com port=5432 dbname=myDB user=myUser password=myPassword');
You can then verify the subscription status.
SELECT subscription_name, status FROM pglogical.show_subscription_status();
Caution
Pglogical does not currently support an automatic DDL replication. The initial schema can be copied manually using pg_dump --schema-only. DDL statements can be executed on the provider and subscriber at the same time by using the pglogical.replicate_ddl_command function. Please be aware of other limitations of the extension listed here.
Logical decoding
Logical decoding can be consumed via the streaming protocol or SQL interface.
Streaming protocol
Consuming changes using the streaming protocol is often preferable. You can create your own consumer / connector, or use a third-party service like Debezium.
Visit the wal2json documentation for an example using the streaming protocol with pg_recvlogical.
SQL interface
In the example below, we use the SQL interface with the wal2json plugin.
Create a slot.
SELECT * FROM pg_create_logical_replication_slot('test_slot', 'wal2json');
Issue SQL commands. For example:
CREATE TABLE a_table ( id varchar(40) NOT NULL, item varchar(40), PRIMARY KEY (id) ); INSERT INTO a_table (id, item) VALUES ('id1', 'item1'); DELETE FROM a_table WHERE id='id1';
Consume the changes.
SELECT data FROM pg_logical_slot_get_changes('test_slot', NULL, NULL, 'pretty-print', '1');
The output will look like:
{ "change": [ ] } { "change": [ { "kind": "insert", "schema": "public", "table": "a_table", "columnnames": ["id", "item"], "columntypes": ["character varying(40)", "character varying(40)"], "columnvalues": ["id1", "item1"] } ] } { "change": [ { "kind": "delete", "schema": "public", "table": "a_table", "oldkeys": { "keynames": ["id"], "keytypes": ["character varying(40)"], "keyvalues": ["id1"] } } ] }
Drop the slot once you are done using it.
SELECT pg_drop_replication_slot('test_slot');
Visit the PostgreSQL documentation to understand more about logical decoding.
Monitoring
You must monitor logical decoding. Any unused replication slot must be dropped. Slots hold on to Postgres WAL logs and relevant system catalogs until changes have been read. If your subscriber or consumer fails or if it is improperly configured, the unconsumed logs will pile up and fill your storage. Also, unconsumed logs increase the risk of transaction ID wraparound. Both situations can cause the server to become unavailable. Therefore, it is critical that logical replication slots are consumed continuously. If a logical replication slot is no longer used, drop it immediately.
The 'active' column in the pg_replication_slots view will indicate whether there is a consumer connected to a slot.
SELECT * FROM pg_replication_slots;
Set alerts on the Maximum Used Transaction IDs and Storage Used flexible server metrics to notify you when the values increase past normal thresholds.
Limitations
- Logical replication limitations apply as documented here.
- Slots and HA failover - Logical replication slots on the primary server are not available on the standby server in your secondary AZ. This situation applies to you if your server uses the zone-redundant high availability option. In the event of a failover to the standby server, logical replication slots will not be available on the standby.
Important
You must drop the logical replication slot in the primary server if the corresponding subscriber no longer exists. Otherwise the WAL files start to get accumulated in the primary filling up the storage. If the storage threshold exceeds certain threshold and if the logical replication slot is not in use (due to non-available subscriber), Flexible server automatically drops that unused logical replication slot. That action releases accumulated WAL files and avoids your server becoming unavailable due to storage getting filled situation.
Next steps
- Learn more about networking options
- Learn about extensions available in flexible server
- Learn more about high availability
Feedback
Submit and view feedback for