Data Factory's Join Transform not matching same IDs, creating null rows instead

Jeff vG 86 Reputation points
2021-07-05T03:32:41.367+00:00

I am joining together three different sources - 1 new an Excel file and a join of two existing tables in a SQL DB. Each has a sha256 hash of some identifying fields. There is only a very small number of rows (20) being inserted to the downstream tables. Then I rerun the pipeline to test to make sure I am identifying existing customers; so this doesn't seem like a row sampling issue since I am well below the 1000 rows in the sampling parameter. The existing (SQL Tables) feature is called id and the new (Excel) feature is called new_id.

When I do this join:

111608-image.png

I get this cartesian, despite the IDs obviously matching. You can see the older file has the same ID and it simply won't match, regardless of the style of join I do, or type of equality (== or ===).

111704-image.png

Doing this with an Exists transformation and the equality of the new and original id also returns zero rows. Left and Right joins match nothing but merge the respective other datasets columns, full of NULL. Even if I do left(new_id,10) == left(id,10) I still get nothing.

Even stranger, when I full-outer join the two existing SQL Tables together on the same id field (in Data Factory), they join together no problem: the full-outer join returns exactly what I expect.

111620-image.png

Why would this do this?

Azure SQL Database
Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,329 questions
Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,462 questions
{count} votes

2 answers

Sort by: Most helpful
  1. ShaikMaheer-MSFT 37,656 Reputation points Microsoft Employee
    2021-07-05T16:12:01.367+00:00

    Hi @Jeff vG ,

    Thank you for posting query in Microsoft Q&A Platform.

    To get matching rows of id and newid you need to go with inner join with == condition as shown below.

    111913-join.png

    Hope this will help. Thank you.

    ------------------------------

    • Please accept an answer if correct. Original posters help the community find answers faster by identifying the correct answer. Here is how.
    • Want a reminder to come back and check responses? Here is how to subscribe to a notification
    1 person found this answer helpful.

  2. ShaikMaheer-MSFT 37,656 Reputation points Microsoft Employee
    2021-07-07T05:13:04.78+00:00

    Hi @Jeff vG ,

    Thank you for detailed response. Its very strange. I am suspecting is there any spaces getting appended when you are trying excel and SQL.

    Could you please try to trim id and new_id and see if that helps.

    sample expression: trim(toString(id)) == trim(toString(new_id)))