Incrementally load data from a source data store to a destination data store

10/03/2024

APPLIES TO: Azure Data Factory, Azure Synapse Analytics

In a data integration solution, loading data incrementally (delta loading) after an initial full data load is a widely used scenario. The tutorials in this section show you different ways of loading data incrementally by using Azure Data Factory.

Delta data loading from a database by using a watermark

In this approach, you define a watermark in your source database. A watermark is a column that holds the last-updated timestamp or an incrementing key. The delta loading solution loads the data that changed between the old watermark and the new watermark. The workflow for this approach is depicted in the following diagram:

[Diagram: Workflow for using a watermark]
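The watermark idea can be sketched outside ADF as follows. This is a minimal illustration, not the tutorial's pipeline: it uses Python's built-in sqlite3 as a stand-in for the source database, and the table names (orders, watermark), column names, and the load_delta and make_demo_db helpers are all hypothetical. Timestamps are stored as ISO 8601 strings so that string comparison matches chronological order.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory source database (hypothetical schema for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, last_modified TEXT);
        CREATE TABLE watermark (table_name TEXT PRIMARY KEY, watermark_value TEXT);
        INSERT INTO watermark VALUES ('orders', '1900-01-01T00:00:00');
        INSERT INTO orders VALUES (1, 10.0, '2024-01-01T08:00:00'),
                                  (2, 20.0, '2024-01-02T09:30:00');
    """)
    return conn


def load_delta(conn, sink):
    """Append rows changed since the stored watermark to `sink`, then advance the watermark."""
    cur = conn.cursor()
    # Old watermark: the value recorded by the previous run.
    old_wm = cur.execute(
        "SELECT watermark_value FROM watermark WHERE table_name = 'orders'"
    ).fetchone()[0]
    # New watermark: the largest watermark value currently in the source table.
    new_wm = cur.execute("SELECT MAX(last_modified) FROM orders").fetchone()[0]
    if new_wm is None:
        return 0  # empty source table, nothing to load
    # Delta: rows strictly after the old watermark, up to and including the new one.
    rows = cur.execute(
        "SELECT id, amount, last_modified FROM orders "
        "WHERE last_modified > ? AND last_modified <= ? ORDER BY id",
        (old_wm, new_wm),
    ).fetchall()
    sink.extend(rows)
    # Persist the new watermark so the next run starts where this one ended.
    cur.execute(
        "UPDATE watermark SET watermark_value = ? WHERE table_name = 'orders'",
        (new_wm,),
    )
    conn.commit()
    return len(rows)
```

In this sketch, the first call copies every row, and later calls copy only rows whose last_modified value moved past the stored watermark. Storing the new watermark only after the copy succeeds is a design choice that lets a failed run be retried without losing changes.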

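For step-by-step instructions, see the following tutorials:

Incrementally copy data from one table in Azure SQL Database to Azure Blob storage

Incrementally copy data from multiple tables in a SQL Server database to a database in Azure SQL Database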

For templates, see the following:

Delta copy with control table

Delta data loading from SQL DB by using the Change Tracking technology

Change Tracking technology is a lightweight solution in SQL Server and Azure SQL Database that provides an efficient change tracking mechanism for applications. It enables an application to easily identify data that was inserted, updated, or deleted.

The workflow for this approach is depicted in the following diagram:

[Diagram: Workflow for using Change Tracking]
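To make the idea concrete, here is a simplified in-memory model of syncing from a change log. It does not call SQL Server or Azure SQL Database. The Change record, the net_changes and sync helpers, and the rules for collapsing several changes to one key are assumptions made for illustration, not the product's documented behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    version: int    # monotonically increasing change version
    operation: str  # "I" insert, "U" update, "D" delete
    key: int        # primary key of the affected row


def net_changes(log, last_sync_version):
    """Return {key: operation} describing the net effect of changes after last_sync_version."""
    net = {}
    for change in sorted(log, key=lambda c: c.version):
        if change.version <= last_sync_version:
            continue  # already applied by an earlier sync
        previous = net.get(change.key)
        if previous == "I" and change.operation == "U":
            continue  # inserted then updated: still a net insert
        if previous == "I" and change.operation == "D":
            del net[change.key]  # inserted then deleted: nothing to sync
            continue
        if previous == "D" and change.operation == "I":
            net[change.key] = "U"  # deleted then re-inserted: treat as an update
            continue
        net[change.key] = change.operation
    return net


def sync(source, destination, log, last_sync_version):
    """Apply net changes to `destination` and return the version to store for the next run."""
    current_version = max((c.version for c in log), default=last_sync_version)
    for key, operation in net_changes(log, last_sync_version).items():
        if operation == "D":
            destination.pop(key, None)
        else:
            # Inserts and updates both copy the current source row.
            destination[key] = source[key]
    return current_version
```

Each run passes in the version returned by the previous run, so only rows that were inserted, updated, or deleted since then are touched.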

For step-by-step instructions, see the following tutorial:

Incrementally copy data from Azure SQL Database to Azure Blob storage by using Change Tracking technology

Loading new and changed files only by using LastModifiedDate

You can copy only the new and changed files to the destination store by using LastModifiedDate. ADF scans all the files in the source store, filters them by their LastModifiedDate, and copies only the files that are new or updated since the last run. Be aware that if ADF scans a huge number of files but copies only a few of them to the destination, the run still takes a long time because of the file scanning.
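The filtering step can be pictured with a short sketch. This is an illustration only, not ADF's implementation: the select_modified helper, the in-memory listing of file names, and the choice of a half-open window (start inclusive, end exclusive) are assumptions.

```python
from datetime import datetime, timezone


def select_modified(files, window_start, window_end):
    """Return names of files whose last-modified time is in [window_start, window_end).

    `files` maps a file name to its last-modified datetime. Every entry is
    inspected, which mirrors why scanning a large store is slow even when
    few files qualify.
    """
    return sorted(
        name
        for name, modified in files.items()
        if window_start <= modified < window_end
    )


# Hypothetical listing of a source folder.
listing = {
    "a.csv": datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc),
    "b.csv": datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc),
    "c.csv": datetime(2024, 1, 3, 0, 0, tzinfo=timezone.utc),
}
changed = select_modified(
    listing,
    datetime(2024, 1, 2, tzinfo=timezone.utc),
    datetime(2024, 1, 3, tzinfo=timezone.utc),
)
```

Using the end of one window as the start of the next keeps consecutive runs from copying the same file twice or skipping a file that changed exactly on the boundary.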

For step-by-step instructions, see the following tutorial:

Incrementally copy new and changed files based on LastModifiedDate from Azure Blob storage to Azure Blob storage

For templates, see the following:

Copy new files by LastModifiedDate

Loading new files only by using a time partitioned folder or file name

You can copy new files only, where the files or folders have already been time partitioned with time slice information as part of the file or folder name (for example, /yyyy/mm/dd/file.csv). This is the most performant approach for incrementally loading new files.
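As a rough sketch of why this works without scanning, the paths for a load window can be computed directly from the window's dates. The partition_paths helper, the daily step, and the default pattern (modeled on the /yyyy/mm/dd/file.csv example) are assumptions for illustration.

```python
from datetime import datetime, timedelta


def partition_paths(window_start, window_end, step=timedelta(days=1),
                    pattern="/%Y/%m/%d/file.csv"):
    """Return one path per time slice in [window_start, window_end)."""
    paths = []
    current = window_start
    while current < window_end:
        # The path is derived from the slice time alone; no file listing is needed.
        paths.append(current.strftime(pattern))
        current += step
    return paths


# Slices for 28 and 29 February 2024 (the end of the window is exclusive).
paths = partition_paths(datetime(2024, 2, 28), datetime(2024, 3, 1))
```

Because each run reads only the folders named by its own time slices, the amount of work depends on the size of the window rather than on the total number of files in the store.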

For step-by-step instructions, see the following tutorial:

Incrementally copy new files based on time partitioned folder or file name from Azure Blob storage to Azure Blob storage

Advance to the following tutorial:

Incrementally copy data from one table in Azure SQL Database to Azure Blob storage
