What is your join condition?
Add column to CSV File from another CSV File (Azure Data Factory)
For example:
Persons.csv
name, last_name
-----------------------
jack, jack_lastName
luc, luc_lastname
FileExample.csv
id
243
123
Result:
name, last_name, exampleId
-------------------------------
jack, jack_lastName, 243
luc, luc_lastname, 123
I want to add any number of columns from another data source and write the final result to a file or a database table.
I have tried many approaches, but I can't get it to work.
Azure Data Factory
3 answers
-
MarkKromer-MSFT 5,231 Reputation points Microsoft Employee Moderator 2022-04-05T05:00:33.497+00:00
-
MarkKromer-MSFT 5,231 Reputation points Microsoft Employee Moderator 2022-04-05T05:20:36.8+00:00
Here is one way to solve it:
- Create a new data flow
- Add 2 sources: 1 for Persons.csv and 1 for FileExample.csv
- Add a surrogate key transformation after each source, naming the keys sk1 and sk2 respectively
- Add a Join transformation and join on sk1 == sk2
- After the Join, add a Select transformation and remove the sk1 and sk2 columns
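The logic of these steps can be sketched outside Data Factory. The snippet below is only an illustration in plain Python, not ADF data flow script: it numbers the rows of each file in order (the surrogate keys sk1 and sk2), joins on sk1 == sk2, and then keeps only the original columns. The function names and the exampleId output column name are assumptions taken from the example in the question.

```python
import csv
import io


def read_csv(text):
    """Parse CSV text into a list of dicts, ignoring spaces after commas."""
    return list(csv.DictReader(io.StringIO(text), skipinitialspace=True))


def add_surrogate_key(rows, key_name):
    """Mimic a surrogate key step: number the rows 1, 2, 3, ... in file order."""
    return [{**row, key_name: i} for i, row in enumerate(rows, start=1)]


def join_by_position(persons_text, example_text):
    """Join two CSV inputs row by row using generated keys sk1 and sk2."""
    persons = add_surrogate_key(read_csv(persons_text), "sk1")
    examples = add_surrogate_key(read_csv(example_text), "sk2")

    # Inner join on sk1 == sk2: index one side by its key, then look up the other.
    examples_by_key = {row["sk2"]: row for row in examples}
    joined = []
    for person in persons:
        match = examples_by_key.get(person["sk1"])
        if match is None:
            continue  # no row at this position in the second file
        # Equivalent of the Select step: keep the data columns, drop sk1 and sk2.
        joined.append({
            "name": person["name"],
            "last_name": person["last_name"],
            "exampleId": match["id"],
        })
    return joined


persons_csv = "name, last_name\njack, jack_lastName\nluc, luc_lastname\n"
example_csv = "id\n243\n123\n"
result = join_by_position(persons_csv, example_csv)
```

Because the keys are generated from row order, this pairing is only meaningful when both files list their rows in the same order.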
-
MarkKromer-MSFT 5,231 Reputation points Microsoft Employee Moderator 2022-04-06T06:47:35.28+00:00
Your pattern will look something like this:
There are 2 Delimited Text sources that you join on the surrogate keys and then write to the SQLSink. A 3rd source is the same SQL table that you write to in the sink. Notice that I've set the sink ordering to ensure that the data is written first (SQLSink) and the auto-incremented IDs are read back only after the table write has been committed. The query I'm using in ReadFromSQL just reads the data from that table so that I can write the IDs to my OuputIDs CSV file.
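The write-first, read-back-second ordering can be illustrated with a small sketch. This is not Data Factory code: it assumes an in-memory SQLite database as a stand-in for the SQL table, and the table name persons, its columns, and the function name are hypothetical. It inserts the joined rows, commits, and only then reads the auto-incremented IDs and writes them out as CSV text.

```python
import csv
import io
import sqlite3


def write_then_read_ids(rows):
    """Insert rows, commit, then read back the generated IDs as CSV text."""
    conn = sqlite3.connect(":memory:")  # stand-in for the real SQL database
    conn.execute(
        "CREATE TABLE persons ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "name TEXT, last_name TEXT, exampleId TEXT)"
    )

    # Step 1 (the SQL sink): write the joined data and commit it.
    conn.executemany(
        "INSERT INTO persons (name, last_name, exampleId) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()

    # Step 2 (the read-back source): runs only after the write is committed.
    ids = [r[0] for r in conn.execute("SELECT id FROM persons ORDER BY id")]
    conn.close()

    # Step 3 (the second sink): write the IDs to a CSV output.
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id"])
    writer.writerows([i] for i in ids)
    return ids, out.getvalue()


ids, ids_csv = write_then_read_ids([
    ("jack", "jack_lastName", "243"),
    ("luc", "luc_lastname", "123"),
])
```

If the read ran before the commit, the IDs would not be available yet, which is why the sink ordering matters in the data flow.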