How can we validate the destination file data against the source table after running an ADF pipeline?

kuljeet panag 1 Reputation point
2021-10-06T19:39:07.14+00:00

Is there any way to validate the parquet file data (row by row) in blob storage against the source table?

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

1 answer

  1. HimanshuSinha-msft 19,381 Reputation points Microsoft Employee
    2021-10-07T18:03:50.473+00:00

    Hello @kuljeet panag ,
    Thanks for the ask and for using the Microsoft Q&A platform.
    I think one way to go is to use a Mapping Data Flow. In a mapping data flow you can create two sources (the source table and the parquet file) and then use a Join transformation with the NOT EXIST option to surface the rows that don't match.

    If you review this thread: https://learn.microsoft.com/en-us/answers/questions/578219/data-flow-34delete-if34-setting-in-alter-row.html?childToView=580571#answer-580571 , it uses a similar join and should give you some idea of the setup.
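    Outside of ADF, a quick way to spot-check the two sides row by row is to load both into DataFrames and anti-join them on all columns. A minimal sketch in pandas, where the column names and sample rows are hypothetical stand-ins (in practice you would load the file with pd.read_parquet and the table with a SQL connector):

    ```python
    import pandas as pd

    def find_mismatches(source_df: pd.DataFrame, dest_df: pd.DataFrame) -> pd.DataFrame:
        """Full outer merge on all shared columns; keep rows present on only one side."""
        merged = source_df.merge(dest_df, how="outer", indicator=True)
        return merged[merged["_merge"] != "both"]

    # Hypothetical sample data standing in for the SQL source and the parquet sink.
    source = pd.DataFrame({"id": [1, 2, 3], "val": ["a", "b", "c"]})
    dest   = pd.DataFrame({"id": [1, 2, 3], "val": ["a", "b", "X"]})  # row 3 differs

    mismatches = find_mismatches(source, dest)
    print(mismatches)
    ```

    Any divergent row shows up twice, flagged left_only (in the table but not the file) and right_only (in the file but not the table); an empty result means the two sides match row for row.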


    Please do let me know how it goes.
    Thanks,
    Himanshu
