How to Capture Row-Level Schema Failures and Redirect Bad Records in ADF During DB2 to ADLS Ingestion

Janice Chi 140 Reputation points
2025-06-09T14:59:03.35+00:00

We are ingesting data from an IBM DB2 source into ADLS Gen2 using the Azure Data Factory (ADF) Copy activity, with schema validation enabled. During ingestion, if a row violates the schema (e.g., a data type mismatch), the entire partition or file fails, and we cannot identify which row caused the failure.

We want to:

Capture row-level failure details instead of failing the full partition.

Re-ingest only the failed/bad records later.

Automatically move bad records to an error folder inside ADLS (similar to fault tolerance or error-handling mechanisms).

Questions:

Does ADF provide native support for row-level failure logging and redirecting bad records?

If not, what is the recommended design pattern to capture bad rows and continue with partial ingestion?

Are there any fault-tolerance settings or alternatives (like custom mapping data flows) that can help us isolate and move errored rows?
Azure Data Factory

1 answer

  1. J N S S Kasyap 3,625 Reputation points Microsoft External Staff Moderator
    2025-06-09T15:41:29.65+00:00

    Hi @Janice Chi

    Does ADF provide native support for row-level failure logging and redirecting bad records during Copy activity? 

    Yes, Azure Data Factory (ADF) provides fault tolerance support in the Copy activity, which allows you to capture row-level errors during data movement. You can enable this by configuring the "Fault Tolerance" settings in the Copy activity. 

    Specifically, you can: 

    • Skip incompatible rows (for example, a source value that cannot be converted to the sink column type, or a row whose column count does not match the sink) instead of failing the activity. 
    • Enable logging so that each skipped row is written to a log file together with the reason it was skipped. 
    • Point that log at a designated error folder in your ADLS Gen2 account. 

    https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-fault-tolerance

    How can I re-ingest only the failed/bad records later?

    When fault tolerance is enabled, ADF stores the bad records (along with error details) in text/CSV format under the specified error folder in your sink (e.g., ADLS Gen2). These files include: 

    • The row that failed. 
    • A column describing the error code and reason (e.g., data type conversion failure).
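
    As a rough illustration of working with such a log, the sketch below parses one in Python. It assumes the CSV layout described in the fault-tolerance documentation linked above (header columns Timestamp, Level, OperationName, OperationItem, Message, with the skipped row held in OperationItem and marked by the operation name TabularRowSkip). The sample text and the helper name read_skipped_rows are made up for illustration, so check the layout against a real log from your own runs.

```python
import csv
import io

# Hypothetical log content, shaped like the layout assumed above.
SAMPLE_LOG = '''Timestamp,Level,OperationName,OperationItem,Message
2025-06-09 10:00:00.0000000,Warning,TabularRowSkip,"""101"",""abc"",""2025-13-40""","Cannot convert '2025-13-40' to type 'DateTime'."
2025-06-09 10:00:01.0000000,Info,FileWrite,"out.parquet","Complete writing file."
'''


def read_skipped_rows(log_text):
    """Return (row_values, error_message) pairs for skipped rows."""
    skipped = []
    reader = csv.DictReader(io.StringIO(log_text), skipinitialspace=True)
    for entry in reader:
        if entry["OperationName"] != "TabularRowSkip":
            continue  # ignore entries that are not skipped rows
        # OperationItem is assumed to hold the original row as a quoted,
        # comma-separated string, so it is parsed as a one-line CSV.
        row_values = next(csv.reader([entry["OperationItem"]], skipinitialspace=True))
        skipped.append((row_values, entry["Message"]))
    return skipped
```

    Calling read_skipped_rows(SAMPLE_LOG) yields one entry per skipped row, which a follow-up pipeline or script can then cleanse or quarantine.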

    You can design another ADF pipeline to: 

    • Read from this error folder. 
    • Optionally cleanse/fix the data. 
    • Re-ingest only the bad records into the destination (or quarantine DB/table). 
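
    The cleanse-and-retry step can be sketched in Python. This only illustrates the split between rows that can be repaired and rows that should be quarantined; the three-column layout (id, name, order_date), the accepted date formats, and the helper names try_fix_row and triage are hypothetical and not taken from the DB2 source in the question.

```python
from datetime import datetime

# Hypothetical target schema: (id: int, name: str, order_date: ISO date).
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")


def try_fix_row(row):
    """Return a cleaned (id, name, order_date) tuple, or None if unfixable."""
    if len(row) != 3:
        return None
    raw_id, name, raw_date = (value.strip() for value in row)
    try:
        row_id = int(raw_id)
    except ValueError:
        return None
    # Try each accepted date format in turn and normalize to ISO 8601.
    for fmt in DATE_FORMATS:
        try:
            parsed = datetime.strptime(raw_date, fmt).date()
            return (row_id, name, parsed.isoformat())
        except ValueError:
            continue
    return None


def triage(bad_rows):
    """Split bad rows into (fixed rows to re-ingest, rows to quarantine)."""
    retry, quarantine = [], []
    for row in bad_rows:
        fixed = try_fix_row(row)
        if fixed is None:
            quarantine.append(row)
        else:
            retry.append(fixed)
    return retry, quarantine
```

    Rows returned in the first list can be written back out for a second Copy activity run, while the second list stays in the error folder or a quarantine table for manual review.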

    What configuration steps do we need in the Copy activity?

    In the Copy activity, open the Settings tab and find the fault tolerance options: 

    1. Set: 
        • "Skip incompatible rows": enabled 
        • "Enable logging": enabled 
        • Logging folder path: a folder in ADLS (e.g., adls-container/errorlogs/db2-badrecords/) 
    2. Enable retry settings on the activity to handle transient failures. 

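    Example JSON snippet from the pipeline definition (Copy activity typeProperties, with source and sink omitted; the linked service name is a placeholder):

    "typeProperties": {
        "enableSkipIncompatibleRow": true,
        "logSettings": {
            "enableCopyActivityLog": true,
            "copyActivityLogSettings": {
                "logLevel": "Warning",
                "enableReliableLogging": false
            },
            "logLocationSettings": {
                "linkedServiceName": {
                    "referenceName": "<your ADLS Gen2 linked service>",
                    "type": "LinkedServiceReference"
                },
                "path": "mycontainer/errorlogs/db2-badrecords"
            }
        }
    }
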

    https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-schema-and-type-mapping#schema-mismatch-and-error-row-handling

    Will enabling fault tolerance degrade performance?

    Yes, enabling row-level fault tolerance introduces additional overhead as ADF must: 

    • Check each row for schema conformance. 
    • Log any invalid rows separately. 

    Performance impact is usually acceptable for moderate data volumes but should be tested and monitored for large datasets.

     I hope this information helps. Please do let us know if you have any further queries. 

     Kindly consider upvoting the comment if the information provided is helpful. This can assist other community members in resolving similar issues.

