Estimate table size
Applies to: ✅ Microsoft Fabric ✅ Azure Data Explorer
Understanding the size of a table can be helpful for efficient resource management and optimized query performance. In this article, you'll learn different methods to estimate table sizes and how to use them effectively.
Original size of ingested data
Use the .show table details to estimate the original data size of a table. For an example, see Use .show table details.
This command provides an estimation of the uncompressed size of data ingested into your table based on the assumption that the data was transferred in CSV format. The estimation is based on approximate lengths of numeric values, such as integers, longs, datetimes, and guids, by considering their string representations.
Example use case: Track the size of incoming data over time to make informed decisions about capacity planning.
Table size in terms of access bytes
Use the estimate_data_size() along with the sum() aggregation function to estimate table size based on data types and their respective byte sizes. For an example, see Use estimate_data_size().
This method provides a more precise estimation by considering the byte sizes of numeric values without formatting them as strings. For example, integer values require 4 bytes whereas long and datetime values require 8 bytes. By using this approach, you can accurately estimate the data size that would fit in memory.
Example use case: Determine the cost of a query in terms of bytes to be scanned.
Combined size of multiple tables
You can use the union operator along with the estimate_data_size() and sum() functions to estimate the combined size of multiple tables in terms of access bytes. For an example, see Use union with estimate_data_size().
Example use case: Assess the memory requirements for consolidating data from multiple tables into a single dataset.
Note
This approach may inflate the estimated data size due to empty columns, as union
combines all columns from the specified tables and estimate_data_size()
takes into account empty columns when calculating the data size.
Examples
Use .show table details
The following query estimates the original data size of the StormEvents
table.
.show table StormEvents details
| project TotalOriginalSize
Output
TotalOriginalSize |
---|
60192011 |
Tip
To format the bytes result to MB
, GB
, or another unit, use format_bytes().
Use estimate_data_size()
The following query estimates the original data size of the StormEvents
table in bytes.
::: moniker-end
StormEvents
| extend sizeEstimateOfColumn = estimate_data_size(*)
| summarize totalSize=sum(sizeEstimateOfColumn)
Output
totalSize |
---|
58608932 |
Note
The output is smaller even though the calculation is done over the same table. This is because this method provides a more precise estimation by considering the byte sizes of numeric values without formatting them as strings.
Use union with estimate_data_size()
The following query estimates the data size based for all tables in the Samples
database.
union withsource=_TableName *
| extend sizeEstimateOfColumn = estimate_data_size(*)
| summarize totalSize=sum(sizeEstimateOfColumn)
| extend sizeGB = format_bytes(totalSize,2,"GB")
totalSize | sizeGB |
---|---|
1761782453926 | 1640.79 GB |