Copy only new data via a copy data pipeline from PostgreSQL to Azure PostgreSQL

Andres Esteban 156 Reputation points
2022-04-27T16:20:06.9+00:00

Hello community:

I am trying to copy a number of tables from PostgreSQL to Azure PostgresSQL using Azure DataFactory. To do this, I create a pipeline and configure the input and output, the source table and the copied table are exactly the same. As I have a primary key, when the trigger is executed, I get an error because it copies all the data and not only the new data and therefore duplicates the primary key values, which is impossible.

My question is how to configure this pipeline so that it only copies the new data. The Primary Key column is the ID column, whose values are not incremental but unique.

It is not possible to make a DataFlow because it is not allowed as a PostgreSQL data source.

Here is a schematic of what I have set up

196989-image.png

Thank you in advance

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
Azure Database for PostgreSQL
0 comments No comments
{count} votes

2 answers

Sort by: Most helpful
  1. Nandan Hegde 36,716 Reputation points MVP Volunteer Moderator
    2022-04-27T16:31:28.183+00:00

    Hey,
    you can use the concept of watermark :
    https://www.sqlservercentral.com/articles/incremental-data-loading-using-azure-data-factory
    for incremental data loading.


  2. AnnuKumari-MSFT 34,566 Reputation points Microsoft Employee Moderator
    2022-04-29T16:14:26.027+00:00

    Hi @Andres Esteban ,
    Thankyou for posting your query on Microsoft Q&A platform .
    My understanding about your query is that you want to load incremental data from OnPremise PostGreSQL server to Az PostGreSQL server. The problem statement here is that there is no watermark column and the primary key is also not incremental column .

    In order to achieve this requirement, you might need to introduce a temp table which can be treated as a full load table for the source data.
    Post doing a full load to that table , you can perform MERGE i.e. Upsert on the destination table.

    Click here to know about MERGE statement in PostGreSQL .

    Hope this will help. Please let us know if any further queries.

    ------------------------------

    • Please don't forget to click on 130616-image.png or upvote 130671-image.png button whenever the information provided helps you.
      Original posters help the community find answers faster by identifying the correct answer. Here is how
    • Want a reminder to come back and check responses? Here is how to subscribe to a notification
    • If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators

Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.