Pipeline for few incremental tables & few flush and load tables.

Question

Pipeline for few incremental tables & few flush and load tables.

RJ 326

Hi there,

How to design ADF pipeline where few tables are incremental from source to target and few other tables are flush and load. I'm thinking I need to create separate pipelines 1 for incremental tables and other for flush and load.

Is there any other better way if its 100s of table.

Thanks

Subashri Vasudevan 11,231 Reputation points

2023-09-01T01:50:48.86+00:00

What is your Source? Please let us know about it, so that we can suggest the approach
RJ
AnnuKumari-MSFT 34,556 Reputation points Microsoft Employee Moderator

2023-09-14T06:44:38.2133333+00:00

Hi RJ ,

Just following up to see if the below answer helped. Please do consider clicking Accept Answer as accepted answers help community as well. Also, please click on Yes for the survey 'Was the answer helpful'

1 answer

Your answer

Subashri Vasudevan 11,231 Reputation points

2023-09-01T01:50:48.86+00:00

What is your Source? Please let us know about it, so that we can suggest the approach
RJ
AnnuKumari-MSFT 34,556 Reputation points Microsoft Employee Moderator

2023-09-14T06:44:38.2133333+00:00

Hi RJ ,

Just following up to see if the below answer helped. Please do consider clicking Accept Answer as accepted answers help community as well. Also, please click on Yes for the survey 'Was the answer helpful'

Answer 1

Hi RJ ,

Thankyou for using Microsoft Q&A platform and thanks for posting your question here.

I understand that you want to know how to process full load for few tables and incremental load for other tables in synapse or ADF. Please let me know if that is not the requirement.

You need to create a control/lookup table that will store all the metadata regarding which table needs to be loaded as full load, and which one as incremental load . Use watermark/benchmark column to determine the incremental records . For inc records , You need to use upsert logic to update if the record is already present in sink and insert if it's not present based on the keycolumn value.

You can use lookup activity to fetch the details from the control table and then use foreach activity to loop through all the records and inside foreach, use if block to check if it's inc load/full load . Inside both true / false conditions blocks, use copy activity , and in full load, use truncate table query in pre copy script and in inc load, get the benchmark value and load only those records which are greater than that benchmark value.

For entire implementation , kindly check out these below videos:

How to do full load from On Premise SQL Server till ADLS using Azure Synapse Pipelines

How to load latest and greatest data from ADLS to Dedicated SQL Pool using Synapse Pipelines

How to perform incremental load from OnPremise SQL server to Dedicated Sql pool

How to perform Upsert for Incremental records using Azure Synapse Pipelines

Hope it helps . Please accept the answer in case you find it helpful and take the survey as well. Thankyou

AnnuKumari-MSFT 34,556 Reputation points Microsoft Employee Moderator

2023-09-08T17:01:13.7766667+00:00

Hi RJ ,

Just checking in to see if the above answer helped. Please do consider clicking Accept Answer as accepted answers help community as well. Also, please click on Yes for the survey 'Was the answer helpful'

Share via

Pipeline for few incremental tables & few flush and load tables.

1 answer

Your answer