How to add an extra column to a CSV from another CSV using a Data Flow activity?

Question

Sudarshan Kumar 20 Reputation points

Hi, these are my two input CSV files.

CSV 1

f_ID,f_Ctb
71431454,5678

CSV 2

id,name,DownloadTextractOutput
1,sudarshan,Research Report:

The expected output is below

f_ID,f_Ctb,Textarct
71431454,5678,Research Report:

Subashri Vasudevan 11,226 Reputation points

2024-02-11T09:25:53.46+00:00

There's no common column between the two CSV files, right? Also, will there always be only one row in both files?
Sudarshan Kumar 20 Reputation points

2024-02-11T10:58:12.4566667+00:00

Correct, there is no common column, and each file will always contain exactly one row. I feel that adding an Id column to both would help, so that we can perform a join.
AnnuKumari-MSFT 34,556 Reputation points Microsoft Employee Moderator

2024-02-19T10:51:53.01+00:00

Hi Sudarshan Kumar, just checking whether you got a chance to try the suggestion mentioned below. Kindly accept the answer by clicking the Accept answer button if it helped to resolve your issue. Thank you.

Answer 1

Richard Swinbank 527 Reputation points MVP

If there will only ever be one row in CSV1, you can read it in and output it to a cache sink.

You can include cached values in your CSV2 stream with a Derived Column transformation that uses a cached lookup expression, e.g. YourCacheSink#outputs()[1].f_ID

Sudarshan Kumar 20 Reputation points

2024-02-11T11:59:12.2333333+00:00
sinkcache#outputs() does not return any values. It gives me an empty array.
Sudarshan Kumar 20 Reputation points

2024-02-11T12:11:00.2033333+00:00

This is my sink cache data preview.

I want to add the DownloadTextractOutput column as a new derived column named TextResponse.
Richard Swinbank 527 Reputation points MVP

2024-02-11T12:47:39.8866667+00:00
The sink output looks OK for the cache. Can you show what your data flow looks like? Here's one I've started that does what I described above:

You can see that:

the CSV2 stream sink is configured as a cache

the cache transformation is called "sinkcache"

the derived column transformation uses the expression sinkcache#outputs()[1].DownloadTextractOutput
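As a rough illustration of what the cached lookup is meant to do, here is a minimal Python sketch. It is not data flow code and does not call any Azure API: it only mimics the logic of caching the single CSV 2 row and copying one of its columns onto the CSV 1 stream. The helper names (read_rows, add_cached_column, to_csv) are made up for this example, and the sample data is copied from the question. Data flow arrays are 1-based, so outputs()[1] corresponds to index 0 in Python.

```python
import csv
import io

# Sample data copied from the question.
CSV1 = "f_ID,f_Ctb\n71431454,5678\n"
CSV2 = "id,name,DownloadTextractOutput\n1,sudarshan,Research Report:\n"


def read_rows(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def add_cached_column(main_rows, cached_rows, source_col, target_col):
    """Copy one column from the first cached row onto every main row.

    Loosely mirrors sinkcache#outputs()[1].<column>; index 0 here plays
    the role of the 1-based [1] in the data flow expression.
    """
    if not cached_rows:
        raise ValueError("cache is empty; nothing to look up")
    first = cached_rows[0]
    if source_col not in first:
        raise KeyError(f"no field named {source_col!r} in the cached row")
    return [{**row, target_col: first[source_col]} for row in main_rows]


def to_csv(rows):
    """Serialise a non-empty list of dicts back to CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


cache = read_rows(CSV2)    # plays the role of the cache sink
stream = read_rows(CSV1)   # plays the role of the main source stream
result = add_cached_column(
    stream, cache, "DownloadTextractOutput", "TextResponse"
)
output_text = to_csv(result)
print(output_text)
```

The two explicit checks in add_cached_column correspond to the two failure modes that come up in this thread: an empty cache, and a column name that the cached row does not contain.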
Sudarshan Kumar 20 Reputation points

2024-02-11T15:19:43.2466667+00:00

The data looks like this.

I am getting the error below: No field named DownloadTextractOutput in the hierarchical structure
Richard Swinbank 527 Reputation points MVP

2024-02-11T15:42:40.9833333+00:00

Can you share an image of your data flow? In particular, I'm interested in your sink configuration and in your derived column expression.
Sudarshan Kumar 20 Reputation points

2024-02-12T05:58:27.5366667+00:00

Data Flow
Sudarshan Kumar 20 Reputation points

2024-02-12T05:58:50.37+00:00
Richard Swinbank 527 Reputation points MVP

2024-02-12T07:24:17.2566667+00:00

Is your sink cache actually configured as a cache? The icon in your screenshot looks like a regular sink.

Its sink type must be set to “Cache” on its “Sink” tab:

(Image from the documentation).
Sudarshan Kumar 20 Reputation points

2024-02-12T07:29:55.8666667+00:00

Yes, I selected the Cache option, but when I try to view the mapping I do not see any data.
Richard Swinbank 527 Reputation points MVP

2024-02-12T12:10:16.91+00:00

I'm guessing that the TypeMismatch reported in the derived column transformation is the issue.

Are you able to share the JSON code for the dataflow? Please only do this if it contains no personal or sensitive information.

You can access the code by clicking the "{ }" icon in the top right of the design surface.

Answer 2

Deleted

Answer 3

Hi Sudarshan Kumar,

Thank you for using the Microsoft Q&A platform and for posting your query here.

As I understand it, you want to combine two CSV datasets into a single dataset by placing their columns side by side, even though the two datasets have no column in common. Please let me know if my understanding is incorrect.

This can be achieved using a mapping data flow.

Add two source transformations, one pointing to each of the input CSV files.
Add a 'Surrogate key' transformation after each source transformation to generate a sequence number for each row: say, col1 with values 1, 2 for the first dataset and col2 with values 1, 2 for the second dataset.
You now have a common column between the two datasets.
Use a 'Join' transformation to combine the datasets on the matching values of col1 and col2.
Use a 'Select' transformation to drop the columns you do not need.
Use a 'Sink' transformation to write the output to the sink file.
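The logic of the steps above can be sketched in plain Python. This is only an illustration of the surrogate key, join, and select sequence, not data flow code; the helper names (read_rows, add_surrogate_key, inner_join, select) are hypothetical, the sample data is copied from the question, and the output column name TextResponse is an assumption taken from the asker's comment.

```python
import csv
import io

# Sample data copied from the question.
CSV1 = "f_ID,f_Ctb\n71431454,5678\n"
CSV2 = "id,name,DownloadTextractOutput\n1,sudarshan,Research Report:\n"


def read_rows(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def add_surrogate_key(rows, key_name, start=1):
    """Number rows sequentially, similar to a surrogate key transformation."""
    return [{key_name: i, **row} for i, row in enumerate(rows, start=start)]


def inner_join(left, right, left_key, right_key):
    """Combine rows whose key values match (inner join)."""
    index = {row[right_key]: row for row in right}
    return [
        {**l_row, **index[l_row[left_key]]}
        for l_row in left
        if l_row[left_key] in index
    ]


def select(rows, mapping):
    """Keep only the columns in mapping, renaming old name -> new name."""
    return [{new: row[old] for old, new in mapping.items()} for row in rows]


left = add_surrogate_key(read_rows(CSV1), "col1")
right = add_surrogate_key(read_rows(CSV2), "col2")
joined = inner_join(left, right, "col1", "col2")
final_rows = select(
    joined,
    {
        "f_ID": "f_ID",
        "f_Ctb": "f_Ctb",
        "DownloadTextractOutput": "TextResponse",  # assumed output name
    },
)
print(final_rows)
```

Pairing rows by sequence number is only meaningful when both files have the same number of rows in a matching order. That holds here because each file is stated to contain exactly one row.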

Hope it helps. Kindly accept the answer by clicking the Accept answer button. Thank you.

AnnuKumari-MSFT 34,556 Reputation points Microsoft Employee Moderator

2024-02-13T06:32:48.63+00:00

Hi Sudarshan Kumar, just checking in to see whether the above answer helped. Please consider clicking Accept Answer, as accepted answers help the community as well. Also, please click Yes on the survey 'Was the answer helpful'.
