How to ignore the records in ADF Data Flows

venkat rao 65 Reputation points
2024-05-23T06:58:12.53+00:00

Hi All

I am building a data transformation using mapping data flows. I have a timestamp field, TimeStampUpdated, in the target table.
I want to look up the historical data during the incremental data transformation and ignore the incoming incremental records whose timestamp is less than or equal to TimeStampUpdated.
I would appreciate your assistance.

Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,565 questions
Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.
2,030 questions
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,929 questions
2 answers

  1. Jing Zhou 3,815 Reputation points Microsoft Vendor
    2024-05-23T08:25:36.4666667+00:00

    Hello,

    Thank you for posting in the Q&A forum.

    Based on your description, you are building a data transformation process that includes a timestamp field named TimeStampUpdated. You want to look up historical data during the incremental load and ignore the incoming records whose timestamp is less than or equal to TimeStampUpdated. This can usually be achieved with a conditional expression in the data flow: set a condition, for example in a Filter transformation, that drops the rows meeting it, so that only newer records pass through.

    Best regards,

    Jill Zhou


    If the Answer is helpful, please click "Accept Answer" and upvote it.


  2. ShaikMaheer-MSFT 38,311 Reputation points Microsoft Employee
    2024-05-23T16:36:44.7066667+00:00

    Hi venkat rao,

    Thank you for posting your query on the Microsoft Q&A platform.

    If I understand correctly, you want to use a mapping data flow to perform an incremental load by comparing values between source and sink based on the "TimeStampUpdated" column. Please correct me, with more details, if my understanding is wrong.

    Please follow the steps below to achieve it.

    Step 1: Add a source transformation to read the source data.

    Step 2: Add another source transformation to read the sink data and get the value of the TimeStampUpdated column.

    Step 3: On the Step 2 source, add a sink transformation and use a cached sink. The video below explains cached sinks.

    Cache Sink and Cached lookup in Mapping Data Flow in Azure Data Factory

    Step 4: On the Step 1 source, add a filter transformation and write a condition that keeps only the rows whose TimeStampUpdated value is greater than the value from the cached sink.

    Step 5: Add a sink transformation and load the filtered data into the sink.

    Hope this helps. Please let me know if you have any further queries.
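
The logic of the steps above can be sketched outside ADF. The following Python is only an illustration of the comparison that the cached sink and filter perform, not data flow code; the function name `filter_incremental` and the in-memory row dictionaries are hypothetical stand-ins for the source and sink streams.

```python
from datetime import datetime


def filter_incremental(source_rows, sink_rows, ts_col="TimeStampUpdated"):
    """Return the source rows that are strictly newer than the sink's latest timestamp."""
    # Steps 2-3: reduce the sink data to a single value, its latest timestamp.
    # This plays the role of the cached sink (assumption: one global watermark).
    watermark = max((row[ts_col] for row in sink_rows), default=None)

    # An empty sink means there is no history yet, so every source row is new.
    if watermark is None:
        return list(source_rows)

    # Step 4: drop rows whose timestamp is less than or equal to the watermark.
    return [row for row in source_rows if row[ts_col] > watermark]


# Hypothetical sample data: the sink already holds rows up to 2024-05-20.
sink = [
    {"id": 1, "TimeStampUpdated": datetime(2024, 5, 1)},
    {"id": 2, "TimeStampUpdated": datetime(2024, 5, 20)},
]
source = [
    {"id": 2, "TimeStampUpdated": datetime(2024, 5, 20)},  # equal: ignored
    {"id": 3, "TimeStampUpdated": datetime(2024, 5, 10)},  # older: ignored
    {"id": 4, "TimeStampUpdated": datetime(2024, 5, 22)},  # newer: kept
]

new_rows = filter_incremental(source, sink)  # Step 5 would load these rows
```

In the data flow itself, the Step 4 filter condition would take a form such as `TimeStampUpdated > cachedMax#outputs()[1].MaxTimeStampUpdated`, where `cachedMax` and `MaxTimeStampUpdated` are hypothetical names for the cached sink and its single column, assuming the Step 2 stream has been reduced to the maximum timestamp before it reaches the cached sink.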


    Please consider hitting the Accept Answer button. Accepted answers help the community as well.