Need optimized approach to validate the sink data type from source

Venkatesh Srinivasan 86 Reputation points

Hi, I would like to validate the data types of the source dataset against the types expected by the sink. I'm using the data flow Assert transformation to check the source data types by converting each column to its respective data type.
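
For readers who want the idea in concrete terms, the per-column check described above amounts to something like the following Python sketch. This is only an illustration of the logic, not Data Flow script; the column names and the `EXPECTED_TYPES` mapping are hypothetical placeholders for the real 65 columns.

```python
from datetime import datetime

# Hypothetical expected types (placeholders; the real flow has 65 columns).
EXPECTED_TYPES = {
    "order_id": int,
    "amount": float,
    "order_date": lambda s: datetime.strptime(s, "%Y-%m-%d"),
}

def failed_columns(row):
    """Return the names of columns whose raw value cannot be converted."""
    failed = []
    for column, convert in EXPECTED_TYPES.items():
        try:
            convert(row[column])
        except (ValueError, TypeError, KeyError):
            # Conversion failed or the column is missing: record the column name.
            failed.append(column)
    return failed

good = failed_columns({"order_id": "17", "amount": "9.50", "order_date": "2021-09-01"})
bad = failed_columns({"order_id": "17", "amount": "abc", "order_date": "01/09/2021"})
print(good)  # []
print(bad)   # ['amount', 'order_date']
```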

This involves 65 columns in total. Right now I'm using expressions to convert them, but my data flow activity has been running for more than 24 hours and still has not succeeded. I've attached a snapshot of the flow below for your reference.




Please help me find a better approach. I'm really stuck here.

Thanks in advance

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

2 answers

  1. MartinJaffer-MSFT 26,036 Reputation points

    Hello @Venkatesh Srinivasan ,
    Thanks for the question and for using the MS Q&A platform.

    As we understand it, the ask here is to ensure that the incoming data has the correct types?
    Instead of using assertions, why not use the Validate schema option in the source settings? This feature is available for Dataset sources, but not for Inline sources. Setting Validate schema to true causes the data flow to fail when there is a mismatch between the schema defined in the projection and the schema read in.

    You will also want to turn off Allow schema drift.


    Please do let me know if you have any queries.


    1 person found this answer helpful.

  2. Venkatesh Srinivasan 86 Reputation points

    Hi @MartinJaffer-MSFT, thanks for your reply. I want to capture the records whose data types do not match in a separate error file. Additionally, I don't want the pipeline to fail because of only a few bad records, since we load millions of records daily.

    Is there any other way to achieve this?
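
    To make the requirement concrete, here is a minimal Python sketch of the behavior I'm after: rows that convert cleanly go one way, rows that don't are set aside with the names of the failing columns, and nothing raises. It is only an illustration, not Data Flow script; the `CONVERTERS` mapping, the column names, and the `failed_columns` field are hypothetical.

```python
import csv
import io

# Hypothetical column-to-converter mapping (placeholder names).
CONVERTERS = {"id": int, "price": float}

def split_rows(rows):
    """Partition rows into (valid, rejected) without raising on bad data."""
    valid, rejected = [], []
    for row in rows:
        typed, bad = {}, []
        for column, convert in CONVERTERS.items():
            try:
                typed[column] = convert(row[column])
            except (ValueError, TypeError, KeyError):
                bad.append(column)
        if bad:
            # Keep the raw values and note which columns failed conversion.
            rejected.append({**row, "failed_columns": ";".join(bad)})
        else:
            valid.append(typed)
    return valid, rejected

# Small in-memory sample standing in for the source file.
raw = io.StringIO("id,price\n1,2.5\nx,3.0\n3,oops\n")
valid, rejected = split_rows(csv.DictReader(raw))
print(valid)     # one clean row, with typed values
print(rejected)  # two rows, each tagged with the column that failed
```

    The rejected rows would then be written to the separate error file while the valid rows continue on to the sink.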

    Please let me know. Thanks!
