RE: Dataflow Exists doesn't work properly

Bryan Gateley 0 Reputation points
2024-08-05T08:18:56.6333333+00:00

Following up on this question here.... which has a very clear example of how Exist and Not Exist are supposed to work.

https://learn.microsoft.com/en-us/answers/questions/1060665/dataflow-exists-doesnt-work-properly

I have a very simple example which I have even spot checked against logic from the documentation itself, specifically, in the second paragraph which states... "The exists transformation is similar to SQL WHERE EXISTS and SQL WHERE NOT EXISTS."

https://learn.microsoft.com/en-us/azure/data-factory/data-flow-exists

See screenshots below which show that the data in both sources are the same type and I am using NOT exists. Yet the results show records which DO exists and when running an equivalent (Where Exists) query in sql it shows those Records DO exists as well (meaning they should not be showing up in my dataflow). What am i missing?

Source2 Data Preview

azuredatafactory_dataflow_exists-source1

Source1 Data Previewazuredatafactory_dataflow_exists-source2

Exists Configuration Previewazuredatafactory_dataflow_exists-ordernumbernotexists

Exists Data Preview (See Order #DD1305)azuredatafactory_dataflow_exists-ordernumbernotexists-datapreview

Sql Equivalent - Order #DD1305 appears in Exists Sql, therefore should NOT appear in data flow. azuredatafactory_dataflow_exists_sql-equivalent

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
{count} votes

1 answer

Sort by: Most helpful
  1. AnnuKumari-MSFT 34,566 Reputation points Microsoft Employee Moderator
    2024-08-06T11:08:04.7066667+00:00

    Hi Bryan Gateley ,

    Welcome to Microsoft Q&A platform and thanks for posting your query here.

    As per my understanding , you are trying to get all the non-matching records between two datasets based on the conditions provided in the exists transformation configuration. Please let me know if that is not the case.

    I had a close look between the SQL query and the dataflow expression used in the Exists transformation . It seems in SQL the two conditions on which data matching is being performed are:

    1. shop.shop_ordernumber=ship.ship_ordernumber
    2. shop.storeid=ship.storeid

    However, in the dataflow, the conditions used are:

    1. shop.shop_ordernumber=ship.ship_ordernumber
    2. shop.shop_id=ship.shop_id

    Could you please cross check if it should be shop_id or storeid like SQL query.

    Hope it helps. Kindly let me know if the conditions are as expected, will dig more into it. Thanks

    0 comments No comments

Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.