update if activity is used between source and sink
Hello,
At present I have created a pipeline which dynamically loads data from source (any number of objects, i.e. SQL Server tables) to sink (Parquet files). This seems to be working fine. So, instead of creating a pipeline for each object (for example, a SQL Server table), we have one pipeline which handles all the objects and loads each into its own .parquet file appropriately…
So, on a daily basis, the .parquet files (destination) have the same data as the SQL tables (source).
So far so good.
Question:
The next stage is to load the .parquet files into another blob storage, BUT making sure that the final destination receives the necessary upserts and deletes, if any (i.e. comparing the first .parquet file to the final one).
For example,
source --> .parquet1 (has the same data as source, loaded daily) --> upsert/delete against the next .parquet file on each load, so that the final destination always has the latest data.
…
Instead of creating a separate data flow for each .parquet file, what is the best way to handle the upserts/deletes dynamically, so that one data flow can take any of the .parquet files and insert/update/delete in the final destination accordingly?
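Not an answer on the ADF side (there, this would typically be a single parameterized Mapping Data Flow with an Alter Row transformation marking rows for upsert/delete), but the comparison logic itself can be sketched in Python with pandas. The key column name `id` and the function name are my own assumptions; the sketch assumes each table has a single primary-key column.

```python
import pandas as pd


def compute_changes(source: pd.DataFrame, dest: pd.DataFrame, key: str):
    """Compare a daily source snapshot to the current destination and return
    (upserts, deleted_keys) needed to bring the destination up to date."""
    src = source.set_index(key)
    dst = dest.set_index(key)

    deleted_keys = dst.index.difference(src.index)   # rows removed at source
    new_keys = src.index.difference(dst.index)       # rows added at source
    common = src.index.intersection(dst.index)
    # rows present in both but whose values changed
    changed = common[(src.loc[common] != dst.loc[common]).any(axis=1)]

    upserts = src.loc[new_keys.union(changed)].reset_index()
    return upserts, list(deleted_keys)
```

Because the key column is a parameter, the same function (like a single parameterized data flow) works for any of the .parquet files rather than needing one per table.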
Hope you see what I mean
Thank you