Hello KEERTHANA JAYADEVAN,
Greetings! Welcome to Microsoft Q&A Platform.
As described above, you want to get the size of each Parquet file in each of the folders a to z. You can achieve this using either Azure Data Factory (ADF) or Azure Databricks (ADB).
When using Azure Data Factory (ADF):
- Get Metadata activity: Configure it with the "Child items" field to list the files in each folder.
- ForEach activity: Use the childItems output of the Get Metadata activity to iterate over each file.
- Get Metadata activity: Inside the ForEach activity, use another Get Metadata activity with the "Size" field to get the size of each file.
- Store results: Use a Copy Data activity or another appropriate activity to store the file sizes.
This approach should help you get the size of each Parquet file in your blob container.
When using Azure Databricks (ADB), you can use PySpark to list the Parquet files and get the size of each one, following the same list-then-read-size pattern as above.
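As a rough illustration of the Databricks route, here is a minimal sketch. It relies on `dbutils.fs.ls` (the Databricks file system utility) rather than a Spark DataFrame API, and assumes the storage location is already accessible from the notebook. The function name `list_parquet_sizes` and the example path are hypothetical, chosen only for this example.

```python
from typing import Callable, Iterable, List, Tuple


def list_parquet_sizes(ls: Callable[[str], Iterable], root: str) -> List[Tuple[str, int]]:
    """Walk `root` and return sorted (path, size_in_bytes) pairs for .parquet files.

    `ls` is any callable returning entries that expose `.path`, `.size`
    and `.isDir()`, which is the shape of the entries returned by
    `dbutils.fs.ls` in a Databricks notebook.
    """
    results = []
    stack = [root]  # folders still to be listed
    while stack:
        current = stack.pop()
        for entry in ls(current):
            if entry.isDir():
                # Descend into sub-folders (for example a, b, ..., z).
                stack.append(entry.path)
            elif entry.path.endswith(".parquet"):
                results.append((entry.path, entry.size))
    return sorted(results)


# In a Databricks notebook (the path is a placeholder, not a real location):
# sizes = list_parquet_sizes(dbutils.fs.ls, "abfss://<container>@<account>.dfs.core.windows.net/")
# for path, size in sizes:
#     print(path, size)
```

Passing the listing function in as a parameter keeps the traversal logic independent of the notebook environment, so it can be checked locally with stand-in objects before running it against real storage.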
If you need to calculate the size/capacity of a storage account and its services (Blob/Table), see: How to get the total size allocated to a Storage account and for types like queues, tables, blobs and files.
Calculate blob count and total size per container using Azure Storage inventory: this article uses the Azure Blob Storage inventory feature and Azure Synapse to calculate the blob count and total size of blobs per container. These values are useful when optimizing blob usage per container.
You can also use the CLI command below and follow the Microsoft documentation listed after it:
az
Get report of file sizes from Azure Blob Storage
How to get Azure Blob file size
You can use Azure Storage Analytics to identify the largest files in your Blob storage. Storage Analytics provides detailed metrics and logs that you can use to monitor and troubleshoot your storage account. Here are the steps to enable Storage Analytics and view the metrics:
- Enable Storage Analytics: In the Azure portal, navigate to your storage account. Select the "Monitoring" tab, and then select "Storage Analytics". Click "Add policy" to create a new policy. Choose the metrics and logs you want to collect, and then click "Save".
- View the metrics: In the Azure portal, navigate to your storage account. Select the "Monitoring" tab, and then select "Metrics". Choose the metrics you want to view, and then select the time range you want to view. You can view the metrics for the entire storage account or for individual containers.
- Identify the largest files: In the metrics view, you can see the total size of your Blob storage and the number of blobs in each container. You can use this information to identify the containers that are using the most storage. To identify the largest files within a container, you can use a tool like Azure Storage Explorer or Azure CLI to sort the blobs by size.
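If you prefer a script over Storage Explorer or the CLI for the last step, sorting blobs by size could look roughly like the sketch below. It uses the `azure-storage-blob` Python SDK as an alternative to the tools named above, and assumes you have a connection string for the account. The function names `largest_blobs` and `largest_blobs_in_container` are hypothetical helpers written for this example.

```python
from typing import Iterable, List, Tuple


def largest_blobs(blobs: Iterable, top_n: int = 10) -> List[Tuple[str, int]]:
    """Return up to top_n (name, size_in_bytes) pairs, largest first.

    Each item only needs `.name` and `.size`, which is what the
    BlobProperties objects yielded by ContainerClient.list_blobs provide.
    """
    pairs = [(b.name, b.size) for b in blobs]
    pairs.sort(key=lambda pair: pair[1], reverse=True)
    return pairs[:top_n]


def largest_blobs_in_container(conn_str: str, container: str,
                               prefix: str = "", top_n: int = 10):
    """List blobs under `prefix` in `container` and return the largest ones."""
    # Imported here so largest_blobs() works without the SDK installed.
    from azure.storage.blob import ContainerClient

    client = ContainerClient.from_connection_string(conn_str, container_name=container)
    blobs = client.list_blobs(name_starts_with=prefix or None)
    return largest_blobs(blobs, top_n)


# Example call (the values are placeholders, not real credentials):
# for name, size in largest_blobs_in_container("<connection-string>", "<container>", prefix="a/"):
#     print(f"{size:>15,d}  {name}")
```

Note that this lists every blob under the prefix, so for very large containers the Blob Storage inventory approach mentioned earlier may be the better fit.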
Hope this answer helps! Please let us know if you have any further queries. I’m happy to assist you further.
Please “Accept the answer” and “up-vote” wherever the information provided helps you; this can be beneficial to other community members.