How to extract newly added, unmodified and modified data from two similar tables

Pratim Das, Partha C 266 Reputation points
2024-03-07T07:04:45.3066667+00:00

Hi Team,

Please help me to find a solution to below situation -

I have two tables with Persons and Persons_History

Table_Structure

Persons_History as the name suggests, stores previous date's data.

PersonHistoryData

Persons table contains latest data

PersonData

As you can see the yellow highlighted row is modified. Unselected rows are unchanged and the green row is newly added.

I need to separate out changed, unchanged and new rows.

I'm using Azure Data Factory and Azure SQL database as storage.

Please guide me, how I can achieve that.

I've attached the script to create and populate the tables -> Scripts.txt

Thanking you in advance.

Regards,

Partha

Azure SQL Database
Azure
Azure
A cloud computing platform and infrastructure for building, deploying and managing applications and services through a worldwide network of Microsoft-managed datacenters.
914 questions
Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,444 questions
{count} votes

Accepted answer
  1. Smaran Thoomu 8,885 Reputation points Microsoft Vendor
    2024-03-08T05:03:59.4566667+00:00

    Hi @Pratim Das, Partha C

    Thank you for reaching out to the community forum with your query.

    As I understand you want to extract newly added, unmodified, and modified data from two similar tables, you can use the Azure Data Factory and Azure SQL database as storage. If my understanding is correct, please let me know.

    It is possible to separate out changed, unchanged, and new rows in Azure Data Factory (ADF) using Azure SQL Database. You can do this by using the Alter Row Transformation in the Mapping Data Flow. Here's a summary of the steps:

    1. Use the Alter Row transformation in your ADF data flow to set insert, delete, update, and upsert policies on rows.
    2. Add one-to-many conditions as expressions in order of priority, as each row will be marked with the policy corresponding to the first-matching expression.
    3. Each of those conditions can result in a row (or rows) being inserted, updated, deleted, or upserted.
    4. Alter Row can produce both DDL & DML actions against your database.
    5. Create an Alter Row transformation and specify a row policy with a condition of true to mark each row that doesn’t match any of the previously defined expressions for the specified row policy.
    6. Use debug mode to view the results of your alter row policies in the data preview pane.
    7. For the alter row policies to work, the data stream must write to a database or Azure Cosmos DB sink.
    8. In the Settings tab of your sink, enable which alter row policies are allowed for that sink.

    Note that if you’re using native Change Data Capture (CDC) sources like SQL Server or SAP, ADF will automatically detect the row marker, so Alter Row policies are unnecessary for those cases.

    Hope this helps. Do let us know if you any further queries.


    If this answers your query, do click Accept Answer and Yes for was this answer helpful. And, if you have any further query do let us know.


0 additional answers

Sort by: Most helpful