Query performance insights

Important

This feature is in Private Preview. To try it, reach out to your Azure Databricks contact.

When queries run, Databricks might return insights that identify opportunities to improve performance. This page lists the supported insights and their meaning.

For a broader overview of performance best practices, review the Comprehensive Guide to Optimize Databricks, Spark and Delta Lake Workloads.

CONCURRENT_WRITE

Concurrent writes to the table cause conflicts that are either resolved automatically or cause the write to fail.
Recommendation: Review the Delta table history to identify concurrent writes, and consider rescheduling the writers to avoid conflicts.
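
As a rough illustration of reviewing history for overlapping writers, the sketch below scans rows shaped like a subset of `DESCRIBE HISTORY` output and reports consecutive commits that landed within a short window. The sample rows, the `close_commits` helper, and the 30-second window are all assumptions made for this example, not part of the feature.

```python
from datetime import datetime, timedelta

# Hypothetical rows, shaped like a subset of DESCRIBE HISTORY output.
history = [
    {"version": 12, "timestamp": datetime(2026, 1, 5, 2, 0, 4), "operation": "MERGE"},
    {"version": 11, "timestamp": datetime(2026, 1, 5, 2, 0, 1), "operation": "WRITE"},
    {"version": 10, "timestamp": datetime(2026, 1, 4, 2, 0, 0), "operation": "WRITE"},
]

def close_commits(rows, window=timedelta(seconds=30)):
    """Return (earlier_version, later_version) pairs committed within `window`."""
    ordered = sorted(rows, key=lambda r: r["timestamp"])
    pairs = []
    for earlier, later in zip(ordered, ordered[1:]):
        if later["timestamp"] - earlier["timestamp"] <= window:
            pairs.append((earlier["version"], later["version"]))
    return pairs

overlapping = close_commits(history)
print(overlapping)  # [(11, 12)]
```

Commits that cluster this tightly are candidates for rescheduling; the right window depends on how long your writers typically run.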

COVERAGE_FILTER_KEYS_CLUSTERING

The table is clustered by one or more keys that aren't used in filtering during the table scan.
Recommendation: Determine which data subset you need for the desired outcome, then add filters on matching clustering keys to reduce bytes read.

COVERAGE_FILTER_KEYS_PARTITIONING

The table is partitioned by one or more keys that aren't used in filtering during the table scan.
Recommendation: Determine which data subset you need for the desired outcome, then add filters on matching partitioning keys to reduce bytes read.
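
To see why a filter on a partitioning key reduces bytes read, consider this toy model. It is only a sketch: the file listing, the `event_date` column, and the `bytes_read` helper are hypothetical, and a real engine prunes files in a more sophisticated way.

```python
# Hypothetical file listing for a table partitioned by `event_date`.
files = [
    {"event_date": "2026-01-01", "bytes": 400},
    {"event_date": "2026-01-02", "bytes": 350},
    {"event_date": "2026-01-03", "bytes": 250},
]

def bytes_read(files, partition_filter=None):
    """Sum the bytes of files whose partition value passes the filter."""
    return sum(
        f["bytes"]
        for f in files
        if partition_filter is None or partition_filter(f["event_date"])
    )

full_scan = bytes_read(files)                                   # no filter on the key
pruned_scan = bytes_read(files, lambda d: d == "2026-01-03")    # filter on the key
print(full_scan, pruned_scan)  # 1000 250
```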

COVERAGE_PHOTON

Photon can't accelerate the operation, so the standard runtime engine was used.
Recommendation: Review Photon limitations and consider adjusting the query to use a supported execution strategy for faster runtime.

COVERAGE_STATS_DELTA

Delta data skipping statistics are missing or incomplete for the table scan file filters, so the query uses in-file filtering. The following statistics statuses are possible:
- Full: Statistics are available for all filters.
- Partial: Statistics are available on a subset of filters.
- Unavailable: Statistics are not available on any filter.
- Unused: Statistics could not be used on a filter that converts the data type.
Recommendation: Collect Delta statistics to reduce the number of bytes read.
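
The effect of missing statistics can be sketched with a simplified model of min/max data skipping. The file list and the `files_to_read` helper below are assumptions for illustration only; they do not reflect the actual statistics format.

```python
# Hypothetical per-file min/max statistics for one filtered column.
files = [
    {"name": "f1", "min": 1, "max": 100},
    {"name": "f2", "min": 101, "max": 200},
    {"name": "f3", "min": None, "max": None},  # statistics missing
]

def files_to_read(files, value):
    """Return names of files that may contain rows where column == value."""
    selected = []
    for f in files:
        if f["min"] is None or f["max"] is None:
            selected.append(f["name"])  # no statistics: the file cannot be skipped
        elif f["min"] <= value <= f["max"]:
            selected.append(f["name"])  # value falls inside the file's range
    return selected

print(files_to_read(files, 150))  # ['f2', 'f3']
```

In this model `f1` is skipped because its range excludes the value, while `f3` must be read and filtered in-file because it has no statistics.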

COVERAGE_STATS_OPTIMIZER

Cost-based optimizer statistics are missing or incomplete, so standard heuristics were used to generate the query plan.
Recommendation: Collect statistics to enable the optimizer to produce a better plan.

DATA_SKEW

Data is processed unevenly by available computing resources.
Recommendation: Review the distribution of the data, then salt keys or pre-aggregate the data.
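
Key salting can be sketched in plain Python. This is a toy model under stated assumptions: the key list, the partition and salt counts, and the byte-sum hash are hypothetical stand-ins for a real shuffle. It shows a hot key being spread across partitions and a two-stage count that still yields the correct totals.

```python
from collections import Counter

NUM_PARTITIONS = 4
NUM_SALTS = 4

# Hypothetical skewed input: one hot key dominates.
keys = ["hot"] * 96 + ["a", "b", "c", "d"]

def partition_of(text):
    # Deterministic toy hash; Python's built-in hash() is randomized for strings.
    return sum(text.encode()) % NUM_PARTITIONS

# Rows per partition without and with a salt suffix on the key.
unsalted = Counter(partition_of(k) for k in keys)
salted = Counter(partition_of(f"{k}_{i % NUM_SALTS}") for i, k in enumerate(keys))

# Stage 1: partial counts per (key, salt). Stage 2: combine per original key.
partial = Counter((k, i % NUM_SALTS) for i, k in enumerate(keys))
final = Counter()
for (key, _salt), count in partial.items():
    final[key] += count

print(max(unsalted.values()), max(salted.values()))  # 97 26
```

The busiest partition drops from 97 rows to 26, at the cost of a second aggregation step to merge the salted partial results.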

DATA_SPILL

Data spilled to disk while executing an operator because it did not fit in memory.
Recommendation: Increase the warehouse size to increase available memory, or reduce the number of rows, the number of columns, or the size of large columns (strings, arrays, maps, structs) to reduce memory usage.

EXCESSIVE_QUEUE_TIME

The query spent excessive time waiting in the warehouse queue.
Recommendation: Increase the maximum number of clusters on the warehouse to reduce queue time.

EXPLODING_JOIN

The join generates significantly more rows than it reads.
Recommendation: Determine which result subset is required, then update the join or reduce the number of input rows from both relations.
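
The following sketch reproduces the pattern with Python's built-in `sqlite3` module as a stand-in engine. The `orders` and `contacts` tables and their data are hypothetical. Duplicate keys on both sides multiply the output, and reducing one side to one row per key before joining removes the blow-up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer_id INTEGER, amount INTEGER);
    CREATE TABLE contacts (customer_id INTEGER, email TEXT);
    INSERT INTO orders VALUES (1, 10), (1, 20), (1, 30);
    INSERT INTO contacts VALUES
        (1, 'a@example.com'), (1, 'b@example.com'), (1, 'c@example.com');
""")

# 3 rows joined with 3 rows on the same key: every pair matches.
exploded = conn.execute(
    "SELECT COUNT(*) FROM orders o JOIN contacts c ON o.customer_id = c.customer_id"
).fetchone()[0]

# Reduce one side to a single row per key before joining.
reduced = conn.execute("""
    SELECT COUNT(*)
    FROM orders o
    JOIN (SELECT customer_id, MIN(email) AS email
          FROM contacts GROUP BY customer_id) c
      ON o.customer_id = c.customer_id
""").fetchone()[0]

print(exploded, reduced)  # 9 3
```

Whether picking a single contact per customer is acceptable depends on which result subset you actually need.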

FLOW_FULL_RECOMPUTE

The flow is planned to run as a full recompute.
Recommendation: Rewrite the query for incremental support to reduce the number of bytes read.

IO_THROTTLING

A cloud storage request was throttled by your cloud provider.
Recommendation: Contact your administrator to increase your cloud storage request limits with your cloud provider.

REDUNDANT_AGGREGATION

The aggregate did not change the query result.
Recommendation: Remove the aggregate or apply primary and foreign key constraints.
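
A minimal sketch of a redundant aggregate, again using `sqlite3` as a stand-in engine with a hypothetical `customers` table: grouping by a primary key produces exactly one row per input row, so the aggregate adds work without changing the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Edsger');
""")

# Grouping by the primary key: each group holds exactly one row.
with_agg = conn.execute(
    "SELECT id, MAX(name) FROM customers GROUP BY id ORDER BY id"
).fetchall()

# The same result without the aggregate.
without_agg = conn.execute(
    "SELECT id, name FROM customers ORDER BY id"
).fetchall()

print(with_agg == without_agg)  # True
```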

SELECTIVE_JOIN

The join generates significantly fewer rows than it reads.
Recommendation: Determine which result subset is required, then add filters before the join to reduce the number of input rows.
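
To illustrate filtering before the join, the sketch below uses a small hand-written hash join over hypothetical `orders` and `customers` rows. The `hash_join` helper and the data are assumptions for this example. Both approaches return the same rows, but the early filter feeds far fewer rows into the join.

```python
# Hypothetical inputs: 1000 orders, 100 customers.
orders = [{"order_id": i, "customer_id": i % 100, "year": 2020 + i % 5}
          for i in range(1000)]
customers = [{"customer_id": c, "country": "IS" if c < 5 else "US"}
             for c in range(100)]

def hash_join(left, right, key):
    """Inner join: build a hash index on `right`, then probe it with `left`."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]

# Filter after the join: the join reads all 1000 + 100 rows.
joined = hash_join(orders, customers, "customer_id")
late = [r for r in joined if r["country"] == "IS" and r["year"] == 2024]

# Filter before the join: the join reads only the rows that can match.
small_orders = [o for o in orders if o["year"] == 2024]
small_customers = [c for c in customers if c["country"] == "IS"]
early = hash_join(small_orders, small_customers, "customer_id")

rows_in_late = len(orders) + len(customers)
rows_in_early = len(small_orders) + len(small_customers)
print(rows_in_late, rows_in_early, len(early))  # 1100 205 10
```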

WIDE_PROJECTION

The query projects all columns of the table.
Recommendation: Determine which result subset is required, then project only those columns to reduce the number of bytes read.

Last updated on 2026-05-11