Rows missing in dataflow output

Lepine-Mathieu, Guillaume 0 Reputation points
2024-06-04T13:15:44.6833333+00:00

I've noticed that around 3,000 rows go missing when running a dataflow that ingests data, performs tests and corrections, and sinks it into a new table. I use a filter to determine which rows need correction and which ones don't. However, when I tried replacing the filter with a conditional split to achieve the same purpose, the 3,000 missing rows appeared in the new table. Why did these two methods produce different results despite debug mode showing the same results?

Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.

1 answer

  1. phemanth 8,645 Reputation points Microsoft Vendor
    2024-06-04T13:57:54.67+00:00

    @Lepine-Mathieu, Guillaume

    Thank you for posting your question.

    The difference in results between the filter and the conditional split in your dataflow likely comes down to how each transformation handles rows that do not match its condition.

    Filter vs Conditional Split:

    • Filter: A filter keeps only the rows that meet a specific condition. Any rows that don't meet the criteria are discarded entirely.
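    • Conditional Split: A conditional split, on the other hand, evaluates each condition and routes matching rows to the corresponding output stream. Rows that match no condition go to the default stream, and depending on the "Split on" setting a row is sent either to the first stream it matches or to every stream it matches.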
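
    To make the difference concrete, here is a minimal Python sketch of the two behaviours. It is not Synapse data flow script, only a model of the semantics described above. The function names, the "amount" column, and the stream names are all hypothetical, and the split is assumed to use the "first matching condition" mode.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Predicate = Callable[[Row], bool]


def filter_rows(rows: List[Row], keep: Predicate) -> List[Row]:
    """Filter semantics: rows that fail the predicate leave the stream."""
    return [row for row in rows if keep(row)]


def conditional_split(
    rows: List[Row],
    conditions: List[Tuple[str, Predicate]],
    default_stream: str = "default",
) -> Dict[str, List[Row]]:
    """Split semantics (first match wins): every row lands in exactly one stream."""
    streams: Dict[str, List[Row]] = {name: [] for name, _ in conditions}
    streams[default_stream] = []
    for row in rows:
        for name, matches in conditions:
            if matches(row):
                streams[name].append(row)
                break
        else:
            # No condition matched, so the row goes to the default stream.
            streams[default_stream].append(row)
    return streams


def needs_correction(row: Row) -> bool:
    # Hypothetical rule: a negative amount needs correcting.
    amount = row["amount"]
    return amount is not None and amount < 0


# Hypothetical sample data; "amount" is an invented column.
rows: List[Row] = [
    {"id": 1, "amount": 10},
    {"id": 2, "amount": -5},
    {"id": 3, "amount": None},
]

kept = filter_rows(rows, needs_correction)
streams = conditional_split(rows, [("needsCorrection", needs_correction)])

print(len(kept))                                   # rows surviving the filter
print({name: len(s) for name, s in streams.items()})  # rows per split stream
```

    In this model the filter output holds only the matching rows, while the streams of the split together still hold every input row. Whether those rows reach the sink then depends on which streams are connected onward.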

    Understanding the Discrepancy:

    • Filter: In your scenario, the filter likely excludes the 3,000 rows that need correction from the data stream altogether. These rows never reach the correction logic and are thus missing in the final output.
    • Conditional Split: With a conditional split, these 3,000 rows are likely being sent down a separate output stream meant for rows requiring correction. If that stream is connected to the correction logic and the sink (the new table), the rows reach the final output, which would explain why they were missing with the filter and appear now.

    Troubleshooting Tips:

    1. Review Conditional Split Configuration: Double-check your conditional split configuration. Ensure the output stream containing the rows needing correction is directed towards the logic that performs the correction and ultimately reaches the destination table.
    2. Verify Data Flow Logic: In debug mode, trace the data flow for both the filter and conditional split scenarios. This can help identify where the discrepancy arises and ensure the corrected rows are being routed appropriately.
    3. Multiple Conditions in Split: If your conditional split has multiple conditions, make sure the rows you expect to be corrected are being directed to the correct output stream based on the condition.
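
    One way to support the tips above is to compare the keys read from the source with the keys written to the sink, so you can see exactly which rows are lost rather than only how many. The sketch below is a minimal illustration in Python. It assumes you have exported a unique key column from both tables, and the function name and sample IDs are hypothetical.

```python
from typing import Iterable, Set


def missing_keys(source_keys: Iterable[int], sink_keys: Iterable[int]) -> Set[int]:
    """Return keys that exist in the source but never reached the sink."""
    return set(source_keys) - set(sink_keys)


# Hypothetical key lists exported from the source and sink tables.
source_ids = [1, 2, 3, 4, 5]
sink_ids = [1, 2, 4]

lost = missing_keys(source_ids, sink_ids)
print(sorted(lost))  # prints [3, 5]
```

    Inspecting the lost rows against your filter or split condition usually shows which branch they should have taken.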

    By carefully examining the data flow logic and the configuration of the conditional split, you should be able to resolve the issue and ensure all the corrected rows are included in the final output table.

    Hope this helps. Do let us know if you have any further queries.
