Getting the size of Parquet files from Azure Blob Storage

KEERTHANA JAYADEVAN 66 Reputation points
2024-07-09T11:57:50.94+00:00

I have a blob container named abcd.

The folder structure is like below:

abcd/Folder1/Folder a, Folder b…..Folder z

Inside a particular folder, the path looks like: Folder a/v1/full/20230505/part12344.parquet

Similarly: Folder b/v1/full/20230505/part9385795.parquet

I need to get the size of each Parquet file in each of the folders a to z. As far as I can see, the Get Metadata activity in ADF no longer offers a data size option. What else can be done here using ADF or ADB (Azure Databricks) code?

Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
Azure Blob Storage
An Azure service that stores unstructured data in the cloud as blobs.
Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

1 answer

  1. Amrinder Singh 4,190 Reputation points Microsoft Employee
    2024-07-09T13:07:13.4+00:00

    Hi KEERTHANA JAYADEVAN - Thanks for reaching out.

    I would recommend enabling a blob inventory report for this scenario. Make sure the inventory rule includes the "Content-Length" field, which records the size of each blob in bytes.

    https://learn.microsoft.com/en-us/azure/storage/blobs/blob-inventory

    You can then use Synapse or Databricks to parse the report and extract the required details.

    https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blob-inventory-report-analytics

    https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blob-calculate-container-statistics-databricks
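    As a rough illustration of the parsing step, here is a minimal Python sketch that reads a CSV inventory report and pulls out the size of each Parquet file. It assumes the report has "Name" and "Content-Length" columns and that "Name" holds the blob path. The sample rows, the sizes, and the helper names parquet_sizes and size_per_top_folder are hypothetical and only for illustration; a real report has more columns and may prefix the path differently.

```python
import csv
import io
from collections import defaultdict

# Hypothetical inventory rows (made-up sizes); a real report has more columns.
SAMPLE_REPORT = """Name,Content-Length
Folder a/v1/full/20230505/part12344.parquet,1048576
Folder a/v1/full/20230505/_SUCCESS,0
Folder b/v1/full/20230505/part9385795.parquet,2097152
"""


def parquet_sizes(report_file):
    """Return {blob name: size in bytes} for every .parquet blob in the report."""
    reader = csv.DictReader(report_file)
    return {
        row["Name"]: int(row["Content-Length"])
        for row in reader
        if row["Name"].endswith(".parquet")
    }


def size_per_top_folder(sizes):
    """Sum the sizes by the first path segment (assumed to be the folder)."""
    totals = defaultdict(int)
    for name, size in sizes.items():
        top_folder = name.split("/", 1)[0]
        totals[top_folder] += size
    return dict(totals)


# In practice, open the downloaded report instead of the in-memory sample.
sizes = parquet_sizes(io.StringIO(SAMPLE_REPORT))
totals = size_per_top_folder(sizes)

for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

    For large reports, the same filter and group-by could be expressed on a Spark DataFrame in Databricks, as the linked tutorials describe.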

    Hope that helps!

    Let me know if you have any further queries or concerns; I will be glad to assist.


    Please do not forget to "Accept the answer" and "up-vote" wherever the information provided helps you; this can be beneficial to other community members.
