Can we combine multiple Parquet files (each with almost 4 million rows) in ADF Power Query?

Amar Agnihotri 921 Reputation points
2022-11-22T13:02:05.113+00:00

Hi,
I have 11 Parquet files in a data lake.

263029-image.png

Now I want to perform some transformations on these files using ADF Power Query.

I created a source dataset that points to the folder containing these Parquet files and then used that dataset in ADF Power Query. I expected it to pull all the files, but it seems to pull only a single file, and only 91 rows from that file.

263074-image.png

I want to pull all of these files into Power Query and then perform some transformations, such as merging them into a single file, filtering the table, and more. Can anybody help me with this? This is the first time I am trying to use Power Query in ADF.

Azure Data Factory

Accepted answer
  1. ShaikMaheer-MSFT 38,326 Reputation points Microsoft Employee
    2022-11-24T05:09:24.367+00:00

    Hi @Amar Agnihotri ,

    Thank you for posting your query and sharing the details.

    I reproduced your case. It seems the Power Query editor shows only the first file's data in the preview. However, when you actually run the Power Query from a pipeline, the transformations are applied to all rows from all files, and the output is generated from all files.

    So, please go ahead and build your transformations against the preview data. When you actually run the pipeline, they are applied to all rows from all files, because your dataset points to the folder that contains all of the files.
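
    For reference, a folder-level Parquet dataset usually looks something like the sketch below. This is only an illustration, assuming the data lake is ADLS Gen2 (hence `AzureBlobFSLocation`). The dataset name, linked service name, file system, and folder path are placeholders, not values taken from the question. The relevant detail is that `location` has a `folderPath` but no `fileName`, so the dataset resolves to every file in that folder.

    ```json
    {
        "name": "<your-parquet-folder-dataset>",
        "properties": {
            "type": "Parquet",
            "linkedServiceName": {
                "referenceName": "<your-adls-gen2-linked-service>",
                "type": "LinkedServiceReference"
            },
            "typeProperties": {
                "location": {
                    "type": "AzureBlobFSLocation",
                    "fileSystem": "<your-container>",
                    "folderPath": "<path/to/folder-with-parquet-files>"
                },
                "compressionCodec": "snappy"
            },
            "schema": []
        }
    }
    ```

    If your dataset's `location` also contains a `fileName`, it points to a single file rather than the whole folder, so it is worth checking that first.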

    Hope this helps. Please let me know if you have any further queries.

    1 person found this answer helpful.
