Inserting records into sink even when split condition criteria is not met

Subedi, Manish 40 Reputation points
2023-10-05T01:40:43.9033333+00:00

Hi,
In a Data Flow, we have a split condition on a column value that writes to two different sinks:

  1. If True, Update the existing records (Insert Sink)
  2. If False, Insert new records (Update Sink)

The issue is that all the records from the source file already exist in the target table.
So it should update every record, but it is inserting a few new rows, seemingly at random: out of 130K records, 10 are inserted as new.
These 10 records already exist in the target table.

Any suggestion and guidance will be highly appreciated.

Also, is there any 'Wait'-like activity or alternative in a data flow (as there is in a pipeline)?
If so, we could place it before the Update Sink so that it waits for a while, assuming this is some sort of memory leak.

Azure Data Factory

Accepted answer
  1. Suba Balaji 11,206 Reputation points
    2023-10-05T12:23:03.8933333+00:00

    Hello, Subedi, Manish

    Please check whether you can preview the data in the split transformation against the source. If your condition is evaluating correctly, you should see no records in the data preview of the true condition. Kindly check this and let us know for further steps.

    Please post the expression used in split, and also, see if you can use alter row transformation instead of split.

    Unfortunately, there is no wait activity for data flows.


1 additional answer

  1. Subedi, Manish 40 Reputation points
    2023-10-05T15:44:00.48+00:00

    Thank you Suba for the kind response!
    We went back and checked the flow, and figured out what the issue was.
    Interestingly, we found that the true() logic in the split transformation was not working properly when the column value was null. It works only when the column has a value.
    So we added a derived column transformation before the split to convert the TDLINX_MATCH column value to just 1 or 0, and changed the split condition to TDLINX_MATCH == 1.
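
    As a rough illustration of the behaviour described above, here is a small Python sketch (not Data Flow code). It assumes that a row whose split condition evaluates to null falls through to the insert stream, which matches the symptom we saw. The `route` and `normalize` helpers, the sample values, and the choice to map null to 1 are all hypothetical; the actual mapping depends on what a null TDLINX_MATCH means in your data.

```python
from typing import Optional


def route(condition: Optional[bool]) -> str:
    """Model of a two-way split: only a definite True reaches the update stream.

    Assumption: a null (None) condition is treated like "not true".
    """
    return "update" if condition is True else "insert"


def normalize(value: Optional[object], null_as: int) -> int:
    """Model of the derived column: collapse any value, including null, to 1 or 0.

    null_as is a hypothetical parameter; the post does not say how nulls were mapped.
    """
    if value is None:
        return null_as
    return 1 if value else 0


# Hypothetical TDLINX_MATCH values; the middle row is null.
rows = ["Y", None, "Y"]

# Before: the condition is computed straight from the nullable column,
# so the null propagates into the split.
before = [route(None if v is None else bool(v)) for v in rows]

# After: normalize to 1/0 first, then split on "== 1".
after = [route(normalize(v, null_as=1) == 1) for v in rows]

print(before)
print(after)
```

    In this model the null row lands in the insert stream before the change and in the update stream after it, because the split condition can no longer evaluate to null.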

    This post can be closed.
    Thanks for your advice.