Hello @Ershad Nozari,
Thanks for the question and for using the MS Q&A platform.
As we understand it, you are asking for a recommendation on how to join two datasets (stored in two different files), find the delta, and then process it. Please let us know if that is not accurate.
As I understand it, you have at least two options.
Option #1
Use a mapping data flow (MDF): MDF is used mostly for transformations. You can read the data from the two files, join them, and then use a function like CRC32 to compute a fingerprint for each row of the incoming files. Do the same for the previous file, then compare the fingerprints to determine the delta.
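To make the fingerprint idea concrete, here is a minimal plain-Python sketch of the same logic (it is not data flow expression code). The sample data, the function names, and the choice of separator are all assumptions made for illustration.

```python
import zlib

def fingerprint(row):
    # Join the field values with a separator that should not occur in the data,
    # then compute a 32-bit CRC over the resulting bytes.
    joined = "\x1f".join(str(v) for v in row)
    return zlib.crc32(joined.encode("utf-8"))

def join_rows(left, right):
    # Inner-join two {key: values} mappings into flat row tuples.
    return [(k, *left[k], *right[k]) for k in sorted(left.keys() & right.keys())]

def delta(current_rows, past_rows):
    # Keep only the current rows whose fingerprint was not seen in the past run.
    past = {fingerprint(r) for r in past_rows}
    return [r for r in current_rows if fingerprint(r) not in past]

# Hypothetical sample data standing in for the two input files.
customers = {1: ("Ann",), 2: ("Bob",), 3: ("Cy",)}
orders_today = {1: (100,), 2: (250,), 3: (75,)}
orders_past = {1: (100,), 2: (200,)}

current = join_rows(customers, orders_today)
past = join_rows(customers, orders_past)
changed = delta(current, past)
print(changed)
```

Keep in mind that CRC32 is only 32 bits wide, so on large datasets two different rows can occasionally share a fingerprint. If that matters for your scenario, a wider hash is the safer choice.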
Option #2
You can also use Synapse Analytics. It offers Spark pools (which run on Apache Spark), so you can read the files, join the dataframes, and use a hash function to determine the delta.
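The hash comparison itself can be sketched in plain Python as below. This is only an illustration of the logic, not Spark code: the record keys, the sample values, and the function names are hypothetical. In a Spark pool you would apply the same idea to dataframes; PySpark provides `sha2` and `concat_ws` in `pyspark.sql.functions` for building the per-row hash.

```python
import hashlib

def row_hash(values):
    # Treat None as an empty string so missing values hash consistently.
    payload = "\x1f".join("" if v is None else str(v) for v in values)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def classify(current, previous):
    # current / previous map a record key to a tuple of column values.
    cur = {k: row_hash(v) for k, v in current.items()}
    prev = {k: row_hash(v) for k, v in previous.items()}
    return {
        "inserted": sorted(cur.keys() - prev.keys()),
        "deleted": sorted(prev.keys() - cur.keys()),
        "updated": sorted(k for k in cur.keys() & prev.keys() if cur[k] != prev[k]),
    }

# Hypothetical joined rows from the previous run and the current run.
previous = {"A1": ("Ann", 100), "B2": ("Bob", 200), "C3": ("Cy", 75)}
current = {"A1": ("Ann", 100), "B2": ("Bob", 250), "D4": ("Dee", 40)}

result = classify(current, previous)
print(result)
```

Comparing by key, rather than by hash alone, lets you tell updated rows apart from newly inserted ones, which is usually what the downstream processing needs.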
Please do let me know if you have any queries.
Thanks
Himanshu