Data factory - find delta between records present in 2 excel files

Pratim Das, Partha C 346 Reputation points
2024-01-18T05:08:03.32+00:00

Hi, I have an use case - I have 2 excel files - structure wise same as below but number of rows are different. Lets assume, file ABC.xlsx has 1 row and file XYZ.xlsx has that same row and one extra row Screenshot

I need to extract the delta i.e. the row in XYZ.xlsx file where Name is 'B'. I need to accomplish it using Azure Data Factory. Please guide me. Note: There is a possibility that Timestamp column might not be present. Regards, Partha

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
11,641 questions
0 comments No comments
{count} votes

Accepted answer
  1. Dieter Gasser 416 Reputation points
    2024-01-18T08:50:53.8166667+00:00

    Hi, you can achieve this with a data flow:

    User's image

    1. Join the two dataset using Union
    2. Use Aggregate to add a count column which holds the number of times a row exists. In the Aggregate, under "Group by", select all columns, then under "Aggregate" add a column "Count" with expression "count()"
    3. Use Filter to only return the rows where Count equals 1. The expression is "Count==1"

    I hope this helps you get started.


0 additional answers

Sort by: Most helpful

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.