Hi @Dhanoordaran V
You're correct that Microsoft Purview shows file sizes for some file types (like CSV) under the Properties tab. Support for other formats, such as Parquet and certain Synapse Analytics assets, can be inconsistent depending on the source system and integration.
Here are some key points to consider:
- Parquet Files: For Parquet files stored in Azure Data Lake Storage (ADLS), file size metadata may not always be extracted by default. Ensure that the scan rule set for your data source includes the "Extract file-level metadata" option (enabled in advanced settings during scan configuration).
- Synapse Analytics: For Synapse SQL or dedicated pools, the Purview integration might not extract individual file sizes unless the files are part of linked datasets (e.g., external tables over ADLS). Currently, file size metadata is extracted more reliably from storage-backed data sources than from query-based systems.
- Using the Purview REST API: You can use the Purview Search API to query for assets and their metadata. However, file size is returned only if it was captured during the scan. Look for attributes like qualifiedName, name, and fileSize.
Example API filter:
    {
        "keywords": "parquet",
        "filter": {
            "and": [
                {
                    "attributeName": "fileSize",
                    "operator": "isNotNull"
                }
            ]
        }
    }
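If it helps, here is a minimal Python sketch of how the filter above could be sent and how the results could be checked for a captured file size. Please treat it as an illustration under assumptions rather than a confirmed implementation: the endpoint path and api-version, the helper names build_search_request and assets_with_size, and the idea that each result item carries a fileSize field are assumptions on my side, so verify them against your Purview account and the API version you use. The sketch only builds the request; it does not send it.

```python
import json
import urllib.request


def build_search_request(account_name, token, keywords="parquet",
                         api_version="2023-09-01"):
    """Build (but do not send) a POST request carrying the filter shown above.

    Assumption: the search endpoint path and api-version below match your
    Purview account; adjust them if your environment differs.
    """
    url = (f"https://{account_name}.purview.azure.com"
           f"/datamap/api/search/query?api-version={api_version}")
    body = {
        "keywords": keywords,
        "filter": {
            "and": [
                {"attributeName": "fileSize", "operator": "isNotNull"}
            ]
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def assets_with_size(response_json):
    """Return (qualifiedName, name, fileSize) for results that have a size.

    Assumption: results are listed under "value" and expose "fileSize"
    only when the scan captured it.
    """
    rows = []
    for item in response_json.get("value", []):
        size = item.get("fileSize")
        if size is not None:
            rows.append((item.get("qualifiedName"), item.get("name"), size))
    return rows
```

To run it against your account, obtain a bearer token for Purview (for example via Azure AD), pass the request to urllib.request.urlopen, parse the response body with json.load, and hand the result to assets_with_size. Assets that come back without a size were most likely scanned without that metadata being captured.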
I hope this information helps. Please do let us know if you have any further queries.
Kindly consider upvoting the comment if the information provided is helpful. This can assist other community members in resolving similar issues.