How to transform all files in a folder and export them as separate files in one notebook

Question

I have an ADLS Gen2 folder with multiple Parquet files that share the same structure. I want to transform each file separately with one script in a single notebook, convert each file to CSV, and write the results to another folder in ADLS.
How can I achieve this?

Let's say there are 10 files in ADLS. I want to do this:

ADLS Gen2 folder A ---> read and transform in one Databricks notebook --> write output to folder B in ADLS
10 Parquet files processed separately (no merging) and written in CSV format (10 CSV files)

@AmanpreetSingh-MSFT @PRADEEPCHEEKATLA @HarithaMaddi-MSFT

Accepted Answer

Hello @reddy,

Here are the steps to convert Parquet files to csv format in a notebook:

Parquet files in an Azure Data Lake Gen2 folder named azure:

Step 1: You can access the Azure Data Lake Gen2 storage account in Databricks using any one of the methods from this document.

I am accessing the ADLS Gen2 folder using the storage account access key.

 spark.conf.set("fs.azure.account.key.<storage-account-name>.dfs.core.windows.net", "<storage-account-access-key-name>")
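To make the placeholders in that setting concrete, here is a minimal sketch of how the configuration key and an abfss:// folder path can be assembled. The helper names `account_key_conf` and `abfss_uri`, and the account, container, scope, and key names in the comments, are hypothetical and only for illustration. The sketch assumes it runs in a Databricks notebook, where `spark` and `dbutils` are predefined.

```python
def account_key_conf(storage_account):
    """Build the Spark config key that holds the storage account access key."""
    return f"fs.azure.account.key.{storage_account}.dfs.core.windows.net"


def abfss_uri(container, storage_account, path=""):
    """Build an abfss:// URI for a folder inside an ADLS Gen2 container."""
    base = f"abfss://{container}@{storage_account}.dfs.core.windows.net"
    cleaned = path.strip("/")
    return f"{base}/{cleaned}" if cleaned else f"{base}/"


# Example usage in a notebook (all names below are hypothetical placeholders):
# key = dbutils.secrets.get(scope="my-scope", key="my-storage-key")
# spark.conf.set(account_key_conf("mystorageacct"), key)
# source_dir = abfss_uri("mycontainer", "mystorageacct", "folderA")
```

Reading the key from a secret scope instead of pasting it into the notebook is a design choice, not something the answer above requires.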

Step 2: Using Spark, you can read the Parquet files and write them out in CSV format.
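One way this step could look for the "no merging" case in the question is to list the source folder and process each Parquet file on its own. This is a sketch under assumptions, not necessarily the exact code used in the answer: the function names, the `<container>`, `<folder-A>`, and `<folder-B>` placeholders, and the no-op `transform` are all hypothetical. `spark` and `dbutils` are passed in as arguments; in a Databricks notebook both are already defined.

```python
import posixpath

# Hypothetical placeholders: replace with your own container, account, and folders.
SOURCE_DIR = "abfss://<container>@<storage-account-name>.dfs.core.windows.net/<folder-A>"
TARGET_DIR = "abfss://<container>@<storage-account-name>.dfs.core.windows.net/<folder-B>"


def csv_target_for(parquet_path, target_dir):
    """Map .../name.parquet to <target_dir>/name (one output location per input)."""
    name = posixpath.basename(parquet_path.rstrip("/"))
    stem = name[:-len(".parquet")] if name.endswith(".parquet") else name
    return target_dir.rstrip("/") + "/" + stem


def transform(df):
    """Placeholder for the per-file transformation; returns the DataFrame unchanged."""
    return df


def convert_folder(spark, dbutils, source_dir, target_dir):
    """Read each Parquet file separately and write it out as CSV, without merging."""
    written = []
    for info in dbutils.fs.ls(source_dir):
        if not info.path.rstrip("/").endswith(".parquet"):
            continue  # skip entries that are not Parquet
        df = transform(spark.read.parquet(info.path))
        out = csv_target_for(info.path, target_dir)
        # coalesce(1) makes Spark write a single part file for this input
        df.coalesce(1).write.mode("overwrite").option("header", "true").csv(out)
        written.append(out)
    return written


# In a Databricks notebook:
# convert_folder(spark, dbutils, SOURCE_DIR, TARGET_DIR)
```

Note that Spark writes each CSV output as a folder containing a single `part-*.csv` file rather than as a bare file, so 10 input files produce 10 output folders in folder B. If plain file names are needed, the part file can be moved or renamed afterwards, for example with `dbutils.fs.mv`.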

CSV files in an Azure Data Lake Gen2 folder named csv files:

Hope this helps. Do let us know if you have any further queries.

----------------------------------------------------------------------------------------

Do click on "Accept Answer" and upvote the post that helps you; this can be beneficial to other community members.
