Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Applies to:
Databricks SQL
Databricks Runtime 18.3 and above
Returns optimizer statistics for a table as a JSON document, including table-level statistics and optionally per-column statistics.
Databricks recommends SHOW STATISTICS for reading optimizer statistics programmatically. It returns only statistics, without the schema, storage, properties, partitions, and other metadata included in DESCRIBE TABLE EXTENDED ... AS JSON. Use ANALYZE TABLE ... COMPUTE STATISTICS (or predictive optimization) to collect the statistics that this command reports.
Required permissions
To run SHOW STATISTICS on a Unity Catalog table, you need:
USE CATALOGon the parent catalog.USE SCHEMAon the parent schema.SELECTon the table, or ownership of the table.
Syntax
SHOW STATISTICS [ { FROM | IN } ] table_name
[ FOR COLUMNS column_name [, ...] | FOR ALL COLUMNS ]
AS JSON
The FROM and IN keywords are optional and interchangeable.
Parameters
-
Identifies the table to report statistics for. The name must not include a temporal specification or options specification.
If the table cannot be found, Azure Databricks raises a TABLE_OR_VIEW_NOT_FOUND error.
Views (temporary or persistent) are not supported. Invoking
SHOW STATISTICSon a view raises EXPECT_TABLE_NOT_VIEW.NO_ALTERNATIVE. UseDESCRIBE EXTENDEDfor view metadata. FOR COLUMNS column_name [, ...]
Adds an entry to
statistics.column_statisticsfor each named column.If any named column does not exist on the table, the command raises COLUMN_NOT_FOUND.
FOR ALL COLUMNS
Adds an entry to
statistics.column_statisticsfor every atomic-typed top-level column of the table.Columns whose data type is STRUCT, ARRAY, or MAP are excluded, because the optimizer does not collect statistics on complex types. Nested fields are not enumerated.
AS JSON
Required. The result is returned as a single JSON-formatted
STRING.
JSON-formatted output
The returned JSON document uses the following schema:
{
"table_name": "<table_name>",
"catalog_name": "<catalog_name>",
"namespace": ["<schema_name>"],
"schema_name": "<schema_name>",
"statistics": {
"size_in_bytes": <bytes>,
"num_rows": <count>,
"collection_source": "<collection_source>",
"created_at": "<created_at_timestamp>",
"column_statistics": {
"<column_name>": {
"min": "<min_value>",
"max": "<max_value>",
"distinct_count": <count>,
"num_nulls": <count>,
"avg_len": <bytes>,
"max_len": <bytes>,
"has_histogram": <boolean>,
"has_mcv": <boolean>,
"has_kll_sketch": <boolean>
}
}
}
}
statistics.size_in_bytes
Total size of the table data in bytes, as known to the optimizer. Omitted when no table-level statistics exist.
statistics.num_rows
Row count, as known to the optimizer.
nullif not collected.statistics.collection_source
How the table-level statistics were produced. One of:
- Manual Analyze — collected by
ANALYZE TABLE ... COMPUTE STATISTICS. - Auto Stats — collected automatically during a write.
- Predictive Analyze — collected by predictive optimization.
- Unknown — collected by an unrecognized source.
null— no table-level statistics are available.
- Manual Analyze — collected by
statistics.created_at
When the table-level statistics were collected, as an ISO 8601 UTC timestamp (for example,
2026-05-29T08:14:24.845Z).nullif not collected.statistics.column_statistics
Included only when
FOR COLUMNSorFOR ALL COLUMNSis specified. Maps each requested column to its per-column statistics.The following fields are always present, with
nullwhen not collected:- min, max — minimum and maximum values, rendered as JSON strings.
- distinct_count — number of distinct values (
NDV). - num_nulls — number of
NULLvalues. - avg_len, max_len — average and maximum value length in bytes.
- has_histogram —
trueif an equi-height histogram is available for the column.
The following fields appear only when the corresponding sketch collection is enabled, and are omitted otherwise:
- has_mcv —
trueif a most-common-values list is available for the column. - has_kll_sketch —
trueif a KLL sketch is available for the column.
Examples
-- Set up a table and collect column statistics.
> CREATE TABLE customer(cust_id INT, name STRING, state STRING) USING parquet;
> INSERT INTO customer VALUES (100, 'Mike', 'AR'), (200, 'Jane', 'CA');
> ANALYZE TABLE customer COMPUTE STATISTICS FOR ALL COLUMNS;
-- Table-level statistics only.
> SHOW STATISTICS FROM customer AS JSON;
{"table_name":"customer","catalog_name":"spark_catalog",
"namespace":["default"],"schema_name":"default",
"statistics":{"size_in_bytes":864,"num_rows":2,
"collection_source":"Manual Analyze",
"created_at":"2026-05-29T08:14:24.845Z"}}
-- Per-column statistics for a list of columns.
> SHOW STATISTICS FROM customer FOR COLUMNS cust_id, name AS JSON;
{"table_name":"customer","catalog_name":"spark_catalog",
"namespace":["default"],"schema_name":"default",
"statistics":{"size_in_bytes":864,"num_rows":2,
"collection_source":"Manual Analyze",
"created_at":"2026-05-29T08:14:24.845Z",
"column_statistics":{
"cust_id":{"min":"100","max":"200","distinct_count":2,"num_nulls":0,
"avg_len":4,"max_len":4,"has_histogram":false,"has_mcv":false},
"name": {"min":"Jane","max":"Mike","distinct_count":2,"num_nulls":0,
"avg_len":4,"max_len":4,"has_histogram":false,"has_mcv":false}}}}
-- All atomic top-level columns. STRUCT, ARRAY, and MAP columns are skipped.
> SHOW STATISTICS FROM customer FOR ALL COLUMNS AS JSON;