rowcount from parquet file in synapse pipeline

rajendar erabathini 616 Reputation points

Hi - I need to get the row count of a parquet file in a Synapse pipeline. I am trying the Lookup activity, but it has a limitation on the rows it can keep in memory and throws an error if the data size exceeds the limit (1 MB?). Is there a better way to find the row count within a Synapse pipeline? Please note that we are not using Azure Databricks.


Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.

Accepted answer
  1. ShaikMaheer-MSFT 38,201 Reputation points Microsoft Employee

    Hi rajendar erabathini,

    Thank you for posting query in Microsoft Q&A Platform.

    Yes, the Lookup activity can read at most 5,000 rows or 4 MB of output, so it will not work in this case. Consider using a mapping data flow instead. In the source transformation point to your parquet file, then use an aggregate transformation (with no group-by columns) to count all the rows, and finally use a cache sink with the "Write to activity output" option enabled. This writes the row count into the output of the Data Flow activity, where subsequent pipeline activities can read it.
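    As an illustration, the data flow described above might look roughly like this in data flow script form (a sketch only; the transformation names `ParquetSource`, `CountRows`, and `CacheSink`, and the column name `rowCount`, are placeholders you would choose yourself):

    ```
    source(allowSchemaDrift: true,
        validateSchema: false,
        format: 'parquet') ~> ParquetSource
    ParquetSource aggregate(rowCount = count(1)) ~> CountRows
    CountRows sink(saveOrder: 1) ~> CacheSink
    ```

    With no group-by column on the aggregate, `count(1)` collapses the whole file into a single row holding the total row count, which the cache sink then surfaces in the activity output.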

    Kindly consider checking the videos below to understand each of the above transformations.

    Aggregate Transformation in Mapping Data Flow in Azure Data Factory

    Write Cache Sink to Activity Output in Azure Data factory
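    Once the data flow runs, a downstream activity (for example a Set Variable activity) can pull the count out of the Data Flow activity's output with an expression along these lines (the activity name `Data flow1` and the sink/column names `CacheSink`/`rowCount` are assumptions matching the sketch above; adjust them to your own names):

    ```
    @string(activity('Data flow1').output.runStatus.output.CacheSink.value[0].rowCount)
    ```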

    Hope this helps. Please let me know if any further queries.

    Please consider hitting Accept Answer button. Accepted answers help community as well.

    1 person found this answer helpful.
