I am joining together three different sources - 1 new an Excel file and a join of two existing tables in a SQL DB. Each has a sha256 hash of some identifying fields. There is only a very small number of rows (20) being inserted to the downstream tables. Then I rerun the pipeline to test to make sure I am identifying existing customers; so this doesn't seem like a row sampling issue since I am well below the 1000 rows in the sampling parameter. The existing (SQL Tables) feature is called id
and the new (Excel) feature is called new_id
.
When I do this join:
I get this cartesian, despite the IDs obviously matching. You can see the older file has the same ID and it simply won't match, regardless of the style of join I do, or type of equality (== or ===).
Doing this with an Exists
transformation and the equality of the new and original id also returns zero rows. Left and Right joins match nothing but merge the respective other datasets columns, full of NULL. Even if I do left(new_id,10) == left(id,10)
I still get nothing.
Even stranger, when I full-outer join the two existing SQL Tables together on the same id
field (in Data Factory), they join together no problem: the full-outer join returns exactly what I expect.
Why would this do this?
Hi @Jeff vG ,
It would be great if you can accept answer. Accepting answer helps community. Thank you.