Hi @pankaj chaturvedi ,
Thankyou for using Microsoft Q&A platform and thanks for posting your question here.
As I understand your ask, you are trying to understand what is the need of 'BenchmarkKey' and 'BenchmarkValue' in the incremental load process when we have the keycolumns for upsert. Please let me know if my understanding is incorrect.
Benchmark/Watermark concept is for finding out the incremental records. A watermark is a column in the source table that has the last updated time stamp or an incrementing key.
Suppose the last run of the ETL was a full load having 1GB of data. Now, the incremental records (eg. 20 records) needs to be either inserted or updated.
The way to find out those 20 newly created or updated records is that we need to store the max(modifieddatetime) value from source for the last run and get all the modifieddatetime > the 'max(modifieddatetime) value from source for the last run'
This value i.e 'max(modifieddatetime) value from source for the last run' needs to be stored somewhere for us to make the comparison . That's where benchmarkey and value comes into picture.
So here Benchmarkkey would be modifieddatetime and the max(modifieddatetime) value would be stored as benchmarkvalue at the end of every transaction.
If we don't make use of Benchmarkkey and Benchmarkvalue , and directly use Keycolumns to perform upsert , it would do the operation on whole dataset ie. 1GB+20 new records.
To get more information, please watch:
How to do full load from On Premise SQL Server till ADLS using Azure Synapse Pipelines
How to perform incremental load from OnPremise SQL server to Dedicated Sql pool
Additional resources: Incrementally load data from Azure SQL Database to Azure Blob storage using the Azure portal
Concept is same for On-Premise to Azure SQL sink as well.
Hope this will help. Please let us know if any further queries.
------------------------------
- Please don't forget to click on or upvote button whenever the information provided helps you.
Original posters help the community find answers faster by identifying the correct answer. Here is how - Want a reminder to come back and check responses? Here is how to subscribe to a notification
- If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators