Iterate through folder paths in ADX to merge Parquet files into one table

Yu, Hazel (APEX SYSTEMS LLC) 6 Reputation points
2022-10-05T15:39:42.853+00:00

Hi,

I have Parquet files per item in ADLS, each sitting in a subfolder named after the item. The folder structure looks like this: "abfss://storage_name@MetContainer_name.dfs.core.windows.net/UserData/folder1/folder2/folder3/folder4/"

Under this path I have subfolders named 'applev1', 'bananav2', 'grapev3', 'orangev1', 'applev2' (there are also some other folders with different names). For example: "abfss://storage_name@MetContainer_name.dfs.core.windows.net/UserData/folder1/folder2/folder3/folder4/applev2/"

Under each of these folders, I have Parquet files with the same schema.

What I want to do is merge the Parquet files under each of these folders (applev1, bananav2, and so on) into one dataframe.

I know I can read Parquet files from ADLS in ADX like this:

.create external table sample_table (column1: string, column2: string, column3: string)
kind = storage
dataformat = parquet
(
    'abfss://storage_name@MetContainer_name.dfs.core.windows.net/UserData/folder1/folder2/folder3/folder4/applev2/'
)
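For context, a table defined this way is read with the `external_table()` function. A minimal sketch, assuming the `sample_table` definition above was created successfully (it only covers the applev2 folder):

```kusto
// Read a few rows from the external table defined above.
// This returns data from the applev2 folder only.
external_table('sample_table')
| take 10
```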

But I need to merge all the Parquet files for the items (applev1, bananav2, grapev3) into one table.
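To make the goal concrete, here is an untested sketch of the kind of definition I am hoping is possible: a single external table rooted at folder4, with the subfolder name exposed as a virtual partition column. The table name `all_items` and the column name `item` are made up for illustration, the storage URI is the same placeholder as above (with no authentication suffix), and I am assuming every subfolder I want sits directly under folder4.

```kusto
// Hypothetical names: all_items (table) and item (virtual column).
// Assumption: each wanted subfolder sits directly under folder4,
// so the first path segment after the root URI is the item name.
.create external table all_items (column1: string, column2: string, column3: string)
kind = storage
partition by (item: string)
pathformat = (item)
dataformat = parquet
(
    'abfss://storage_name@MetContainer_name.dfs.core.windows.net/UserData/folder1/folder2/folder3/folder4/'
)
```

If something like this works, the folders I care about could presumably be selected by filtering on the virtual column, which would also skip the other, differently named folders:

```kusto
// Keep only the item folders of interest; item is the hypothetical virtual column.
external_table('all_items')
| where item in ('applev1', 'bananav2', 'grapev3')
```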

How can I do this in ADX? Can anyone help?

I also posted this on Stack Overflow; please see the discussion in the comments as well.

https://stackoverflow.com/questions/73954362/merge-multiple-parquet-files-in-different-folders-in-adls-through-adx/73959666?noredirect=1#comment130593505_73959666

Would really appreciate your help!

Azure Data Explorer