What is purpose of parameter, its default value & @item().name on dataset?


Justin Doh 920

Hi.

I am trying to replicate the pipeline from this video.

All About BI!

https://youtu.be/bkJwgEzos9k?t=61

But I am still stuck after replicating what the video shows.

What I am trying to do is:

Check the structure of several CSV files inside one folder.
Then, compare it with the structure of one CSV file (the reference file).

First, I understand these:

I need to somehow pass the child item info from the source files.
Then, I need to pass it using this expression: @activity('Get Metadata1').output.childItems

What I do not understand is the following:

Why do we need a parameter on the second dataset (in this case 'filename'), and what should I put as the Default value inside the box?

What is the purpose of "@item().name"?

I am trying to understand the logic of the parameter on the second dataset and also why we need @item().name.

Thanks.

Nandan Hegde 36,716 Reputation points MVP Volunteer Moderator

2023-08-04T02:44:51.5633333+00:00
Hey,

Based on the images provided, I can state the following:

Rather than a static, hardcoded dataset (hardcoded here means the filename is specified explicitly), the dataset is a dynamic one.

The process is probably as follows:

1. Via the Get Metadata activity, you get the list of all files within a folder.

2. Now the need is to get the structure of every file.

3. To get the structure of a file, as stated in one of our previous conversations, the Get Metadata activity needs a dataset that points to a file.

4. So the dataset is a dynamic one (in order to iterate across every file name you got in step 1).

5. To pass the filename dynamically at run time for every iteration, a parameter named filename is created at the dataset level.

Note: The default value can be blank as well. It is similar to how you define variables in SQL, where you can initialize a variable with a default value and override it afterwards.

Now for your second question, item().name: since we are validating every file within the ForEach activity, item().name provides the filename passed in each iteration, so that you get the structure of that particular file.
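
To make the role of the dataset parameter and its default value more concrete, here is a rough Python analogy (it is not ADF code). The function `resolve_dataset_path`, the folder `raw/folder1` and the default `placeholder.csv` are made up for illustration; in ADF the counterpart would be a dataset whose file name refers to the `filename` parameter.

```python
# Rough analogy only: a parameterized dataset behaves like a function
# whose argument has a default value that callers can override.

def resolve_dataset_path(filename: str = "placeholder.csv",
                         folder: str = "raw/folder1") -> str:
    """Return the path the dataset would point to for a given filename.

    'placeholder.csv' and 'raw/folder1' are hypothetical sample values.
    """
    return f"{folder}/{filename}"


# No value passed: the default is used.
default_path = resolve_dataset_path()

# A value passed at run time (what @item().name supplies inside the loop)
# overrides the default.
runtime_path = resolve_dataset_path("f1.csv")

print(default_path)
print(runtime_path)
```

Seen this way, the default only matters when nothing is passed, which is why it can be left blank when every iteration supplies a filename.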

Note: I have yet to go through the video. I will go through it once, confirm back on this and update this thread if need be.
Nandan Hegde 36,716 Reputation points MVP Volunteer Moderator

2023-08-04T16:24:59.17+00:00

Hey,

The All About BI video covers the structure comparison of only one file, whereas in your case you want to iterate over a bunch of files (the list you get via the initial Get Metadata activity). So you would use a ForEach activity, and to get the file name in every iteration (over the array of Get Metadata child items), you need to use item().name.

That filename is passed to the dataset dynamically at every iteration to get the file structure, which you can then compare with the intended header.

Hope this helps!
Justin Doh 920 Reputation points

2023-08-04T16:31:36.0033333+00:00
@Nandan Hegde

Thank you for following up on your previous help. Appreciated.

So, for my case, the All About BI! example is not a good fit? Basically I am comparing a bunch of files with one reference file (for structure). I am a little bit confused.

I know you already answered this, but during development it keeps telling me that the value is missing, so I am not sure whether I need to put something in or not (for the second dataset). I am wondering which object (parameter or dataset property) I should create first to prevent this (being asked for a Default value) from happening.

Thank you!
Justin Doh 920 Reputation points

2023-08-04T18:35:37.9033333+00:00

@Nandan Hegde Thank you. I think you deserve credit for answering my questions. How do I accept your comments as the answer?

Answer accepted by question author

Answer 1

Hey,
The All About BI part would be the logic that you incorporate within the ForEach activity.

So let's take an example for your case:

Let's say you have 4 files within the blob path:
Path : raw/folder1/
f1.csv
f2.csv
f3.csv
f4.csv

Via your initial Get Metadata activity, using the dataset mapped to the path raw/folder1/ and the child items field, you would get the list of file names as the output of Get Metadata activity 1:

Get Metadata activity child item list output:
f1.csv
f2.csv
f3.csv
f4.csv

Now you would provide this array as the input to the ForEach activity, so the loop would run 4 times, once per file.

Within the ForEach activity:
1st iteration:
the item().name value would be f1.csv

Get Metadata activity 2, within the ForEach activity, uses the dataset to which you need to pass the file name: item().name
So in the 1st iteration: f1.csv

So the dataset would resolve to:
raw/folder1/f1.csv, thereby pointing to a file and making the structure field of the Get Metadata activity available.

Let's assume the structure of the file f1.csv is

c1,c2,c3

Note: this is just a sample.

You can compare this header with your sample reference.

Based on the comparison, the next actions would be taken.

Once the 1st iteration is done, the ForEach activity moves to the 2nd iteration, where item().name would be f2.csv, and it follows the same process as for f1.csv.
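
The walkthrough above can be mimicked in a few lines of Python. This is only a simulation of the control flow, not ADF code: the `FILE_HEADERS` dictionary stands in for storage, `get_structure` stands in for the second Get Metadata activity, and the headers of f2.csv to f4.csv as well as the reference header are invented sample values.

```python
# Simulation of the ForEach walkthrough; all data below is sample data.

# Stand-in for the files under raw/folder1/ and their header rows.
FILE_HEADERS = {
    "raw/folder1/f1.csv": ["c1", "c2", "c3"],
    "raw/folder1/f2.csv": ["c1", "c2", "c3"],
    "raw/folder1/f3.csv": ["c1", "c2"],  # hypothetical mismatch
    "raw/folder1/f4.csv": ["c1", "c2", "c3"],
}

# Shape of the child item list returned by the first Get Metadata activity.
child_items = [
    {"name": "f1.csv", "type": "File"},
    {"name": "f2.csv", "type": "File"},
    {"name": "f3.csv", "type": "File"},
    {"name": "f4.csv", "type": "File"},
]

# Header of the reference file (assumed for this example).
REFERENCE_HEADER = ["c1", "c2", "c3"]


def get_structure(filename: str) -> list:
    """Stand-in for Get Metadata 2: the dataset receives the filename."""
    return FILE_HEADERS[f"raw/folder1/{filename}"]


results = {}
for item in child_items:   # the ForEach activity iterating over the array
    name = item["name"]    # what item().name evaluates to in this iteration
    results[name] = get_structure(name) == REFERENCE_HEADER

for name, matches in results.items():
    print(name, "matches" if matches else "differs")
```

In the real pipeline the comparison and the follow-up actions would be the activities placed inside the ForEach activity.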

So the YouTube video part would be the logic within the ForEach activity.

As for your default value, you can add any value: at run time the dataset takes the value that is passed, if any; otherwise it takes the default value.

Nandan Hegde 36,716 Reputation points MVP Volunteer Moderator

2023-08-05T05:45:07.02+00:00

Glad we could be of help @Justin Doh
Justin Doh 920 Reputation points

2023-08-05T13:41:24.61+00:00

@Nandan Hegde Thank you again!

Answer 2

Hi King Java,

Thank you for posting your query on the Microsoft Q&A platform.

Q. Why do we need a parameter on the second dataset (in this case 'filename'), and what should I put as the Default value inside the box? And what is the purpose of "@item().name"?

A. Your second dataset is inside the ForEach activity. The ForEach activity takes childItems, that is, the file names from the folder, and iterates over them. If you look at the output JSON of the first Get Metadata activity, each child item is a JSON object with the keys name and type. So inside the ForEach activity, to access the file name we write the expression @item().name. We pass this file name to the second dataset so that it can point to files dynamically while fetching the structure. Hence we parameterize the second dataset. There is no need to pass any default value to the second dataset parameter.
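
As an illustration of the output shape described above, the sketch below uses a Python dictionary in place of the first Get Metadata activity's output. The file and folder names are sample values, and the `Folder` entry is added only to show why the `type` key can matter; reading `item["name"]` corresponds to `@item().name` inside the ForEach activity.

```python
# Sample stand-in for the output of the first Get Metadata activity.
get_metadata_output = {
    "childItems": [
        {"name": "f1.csv", "type": "File"},
        {"name": "f2.csv", "type": "File"},
        {"name": "archive", "type": "Folder"},  # hypothetical subfolder
    ]
}

# Counterpart of @activity('Get Metadata1').output.childItems
child_items = get_metadata_output["childItems"]

# Counterpart of @item().name for each element, keeping files only.
file_names = [item["name"] for item in child_items if item["type"] == "File"]

print(file_names)
```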

Please check the videos below, which help to understand parameterization and how to pass the output of one activity to another.

Parameterize Datasets in Azure Data Factory

How to read JSON output of one Activity in to another Activity in Azure Data Factory

Hope this helps. Please let me know if you have any further queries.

Please consider hitting the Accept Answer button. Accepted answers help the community as well.

Justin Doh 920 Reputation points

2023-08-04T19:24:07.46+00:00

@ShaikMaheer-MSFT Thank you so much for your help again! I really appreciate your detailed feedback. I will review the videos again. I think parameterized datasets will be my next goal, but for now, I have to use parameterized file names :)
