Hello! I'm ingesting two "Account" tables from two different Salesforce instances in two different countries, and joining them together on an ID key.
A number of columns are supposed to match between the two records, because Salesforce is supposed to synchronise them across the instances.
I have built a data flow that filters the joined account table on the columns that do not equal each other (i.e. column1 != column2 OR column3 != column4 OR etc.).
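For clarity, the mismatch filter above can be sketched in plain Python. The column names and values here are made up for illustration; they are not the actual Salesforce field names:

```python
# Each record is the result of the join on the shared ID key.
# The *_A / *_B suffixes (assumed names) distinguish the two instances.
joined = [
    {"Id": "001", "Name_A": "Acme", "Name_B": "Acme",  "Phone_A": "1", "Phone_B": "1"},
    {"Id": "002", "Name_A": "Beta", "Name_B": "BetaX", "Phone_A": "2", "Phone_B": "2"},
]

# Column pairs that Salesforce is supposed to keep in sync:
synced_pairs = [("Name_A", "Name_B"), ("Phone_A", "Phone_B")]

def has_mismatch(row):
    # True if any synced pair disagrees, i.e. col1 != col2 OR col3 != col4 OR ...
    return any(row[a] != row[b] for a, b in synced_pairs)

mismatches = [row for row in joined if has_mismatch(row)]
# Only record "002" survives the filter, because Name_A != Name_B.
```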
Now, I'm writing the results to a sink "tracking file", and I'd like this data flow to run daily. At the moment it overwrites the file on each run, and it's possible that a previous day's mismatch has already been fixed. However, I want to track how often this data quality issue arises.
My question therefore: how do I append new rows to my tracking file when a new record has a mismatch, while keeping the previous cases where there was a mismatch (even if they are no longer an issue)?
I'm thinking of parameterising the sink to write to a new file on each run, then building another data flow that unions the existing tracking file with each new day's file and writes the result back to the tracking file to update it.
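The union idea can be sketched like this, with file I/O simulated by in-memory lists (in ADF these would be the daily sink file and the tracking file in Data Lake Storage Gen2; the `RunDate` column and the row values are assumptions for illustration):

```python
# Cumulative tracking data from previous runs, each row tagged with
# the date it was detected, so history survives even after a fix.
existing_tracking = [
    {"Id": "001", "RunDate": "2024-01-01"},
]

# Today's mismatches from the filter data flow:
todays_mismatches = [{"Id": "002"}, {"Id": "003"}]

# In ADF this would come from a pipeline parameter or an expression
# such as utcNow(); hard-coded here for the sketch.
run_date = "2024-01-02"
new_rows = [{**row, "RunDate": run_date} for row in todays_mismatches]

# Union keeps the old rows even if yesterday's mismatch is now fixed.
tracking = existing_tracking + new_rows
```

Tagging each row with the run date also makes it easy to count later how often the issue recurred per day or per record.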
Is there a better way? Perhaps filtering on "not equal" columns isn't the right approach? Another thought was to use a Lookup function, but the sink would still overwrite the file.
Thanks in advance!
P.S. Using ADF v2, a Salesforce linked service, and Data Lake Storage Gen2.