Get file names from different folders and combine them into one variable in ADF

Zhu, Yueli YZ [NC] 280 Reputation points
2025-03-10T17:16:24.1966667+00:00

I have 100 folders, f1 through f100, and each folder contains many different CSV files. How can I collect the names of all the CSV files into a single list held in one variable? Thanks a lot.

The source is a blob storage container laid out like this:

 f1: test1.csv, test2.csv
 f2: test2.csv
 f3: test3.csv, test4.csv, test5.csv
 ...
 f100: test1001.csv, test1002.csv, test1003.csv

expected result below:

{
"name": "file_names",
"value": [
	"test1.csv",
	"test2.csv",
	...
	"test1001.csv",
	"test1002.csv",
	"test1003.csv"
]
}

Azure Data Factory

Accepted answer
  1. Chandra Boorla 10,080 Reputation points Microsoft External Staff
    2025-03-13T17:19:38.2633333+00:00

    @Zhu, Yueli YZ [NC]

    I'm glad you were able to resolve your issue, and thank you for posting your solution so that others who run into the same problem can easily reference it. Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others", I'll repost your solution in case you'd like to accept the answer.

    Issue:

    Get file names from different folders and combine them into one variable in ADF

    Solution:

    "I was able to resolve the issue by first converting the array variable into a string format. Then, for each folder, I iterated through the file names and appended them to the string, ensuring that all file names for each folder were included."

    If I missed anything please let me know and I'd be happy to add it to my answer, or feel free to comment below with any additional information.

    Hope this helps. Do let us know if you have any further queries.


    If this answers your query, please click Accept Answer and select Yes for "Was this answer helpful".


3 additional answers

  1. Nandan Hegde 34,426 Reputation points MVP
    2025-03-11T03:42:04+00:00

    Please refer to the blog post below for step-by-step details on how to achieve this:

    https://datasharkx.wordpress.com/2023/02/06/calculate-folder-size-of-an-azure-blob-storage-data-lake-storage-via-synapse-data-factory-pipeline/

    You need a combination of the ForEach activity and the Get Metadata activity to list all files within all folders recursively.

    Note: rather than reading each file's size and adding the sizes up, as the blog does, use an Append Variable activity to append the file names to an array variable.

    Then, after all iterations have finished, take a union of the variable with itself to remove duplicate values and get the unique file names.
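
The control flow described above can be sketched in plain Python. This is only a simulation of the logic, not ADF code: the `container` dictionary is hypothetical sample data, and `get_child_items` is a made-up stand-in for the Get Metadata activity's child items output. The final de-duplication step plays the role of `union(variable, variable)`.

```python
from typing import Dict, List

# Hypothetical sample data standing in for the blob container layout.
container: Dict[str, List[str]] = {
    "f1": ["test1.csv", "test2.csv"],
    "f2": ["test2.csv"],
    "f3": ["test3.csv", "test4.csv", "test5.csv"],
}


def get_child_items(folder: str) -> List[str]:
    """Stand-in for the Get Metadata activity's child items for one folder."""
    return container[folder]


def collect_file_names(folders: List[str]) -> List[str]:
    file_names: List[str] = []
    for folder in folders:                    # ForEach over the folders
        for name in get_child_items(folder):  # iterate over the child items
            file_names.append(name)           # Append Variable
    # Remove duplicates, similar in effect to union(file_names, file_names).
    return list(dict.fromkeys(file_names))


result = {"name": "file_names", "value": collect_file_names(["f1", "f2", "f3"])}
print(result)
```

In this sample, `test2.csv` appears in both `f1` and `f2` but ends up in the result only once.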


  2. Zhu, Yueli YZ [NC] 280 Reputation points
    2025-03-13T15:40:26.74+00:00

    I got it working by converting an array variable to a string, then appending the string of file names for each folder.


  3. Zhu, Yueli YZ [NC] 280 Reputation points
    2025-03-13T19:16:14.4466667+00:00

    Hi Chandra Boorla, Nandan Hegde, here are my steps. Hopefully they make things clearer and more helpful for others. Again, thanks for all your help!

    1. ForEach activity: in Settings, Items is the array ["f1", ..., "f100"].

       Inside the ForEach activity:

       a. Get Metadata activity: request the child items; this lists the items (file names) in the current folder.

       b. Execute Pipeline activity: because ADF does not support nested ForEach activities, I created a separate pipeline, sub_pipeline, as follows:

          (1) ForEach activity: in Settings, Items is the childItems output of the earlier Get Metadata activity.

              Inside this ForEach activity:

              Append Variable activity: the value is @item().name

          (2) Set Variable activity: in Settings, choose Pipeline return value and set a new return value.

       c. Set Variable activity: in Settings, the value is the return value of the Execute Pipeline (sub_pipeline) activity.

       d. Set Variable activity: in Settings, the value converts the array of file names from one folder to a string.

       e. Append Variable activity: in Settings, the value is the variable from the previous Set Variable activity.

    2. Set Variable activity: in Settings, the value is the variable from the previous Append Variable activity. This holds all the file names in one variable.
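
For readers who find it easier to follow as code, the steps above can be simulated roughly in plain Python. This is a sketch under assumptions, not the actual pipeline: `SAMPLE`, `sub_pipeline`, and `parent_pipeline` are hypothetical names, the comma separator is an assumption, and the final join-and-split used to flatten the result is one possible way to finish step 2 rather than something stated in the steps.

```python
from typing import Dict, List

# Hypothetical sample data standing in for the blob container layout.
SAMPLE: Dict[str, List[str]] = {
    "f1": ["test1.csv", "test2.csv"],
    "f2": ["test2.csv"],
    "f3": ["test3.csv", "test4.csv", "test5.csv"],
}


def sub_pipeline(child_items: List[dict]) -> List[str]:
    """Steps b(1) and b(2): append each item's name, then return the array."""
    names: List[str] = []
    for item in child_items:
        names.append(item["name"])  # Append Variable with @item().name
    return names                    # pipeline return value


def parent_pipeline(folders: List[str]) -> List[str]:
    per_folder: List[str] = []
    for folder in folders:  # step 1: ForEach over the folders
        # Step a: child items arrive as objects with a name and a type.
        child_items = [{"name": n, "type": "File"} for n in SAMPLE[folder]]
        names = sub_pipeline(child_items)  # steps b and c
        as_string = ",".join(names)        # step d: array to string (assumed separator)
        per_folder.append(as_string)       # step e: append the string
    # Step 2 (assumed): flatten the per-folder strings into one list,
    # skipping empty entries that an empty folder would produce.
    return [n for n in ",".join(per_folder).split(",") if n]


all_names = parent_pipeline(["f1", "f2", "f3"])
print(all_names)
```

Unlike the union-based approach, this sketch keeps duplicates such as `test2.csv`, which appears in both `f1` and `f2`.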
