Important
This feature is in Private Preview. To try it, reach out to your Azure Databricks contact.
When queries run, Databricks might return insights that identify opportunities to improve performance. This page lists the supported insights and their meaning.
For a broader overview of performance best practices, review the Comprehensive Guide to Optimize Databricks, Spark and Delta Lake Workloads.
CONCURRENT_WRITE
- Concurrent writes to the table cause conflicts that are either resolved automatically or cause the write to fail.
- Recommendation: Review the Delta table history to identify concurrent writes, and consider rescheduling the writers to avoid conflicts.
COVERAGE_FILTER_KEYS_CLUSTERING
- The table is clustered by one or more keys that aren't used in filtering during the table scan.
- Recommendation: Determine which data subset you need for the desired outcome, then add filters on matching clustering keys to reduce bytes read.
COVERAGE_FILTER_KEYS_PARTITIONING
- The table is partitioned by one or more keys that aren't used in filtering during the table scan.
- Recommendation: Determine which data subset you need for the desired outcome, then add filters on matching partitioning keys to reduce bytes read.
COVERAGE_PHOTON
- Photon can't accelerate the operation, so the standard runtime engine was used.
- Recommendation: Review Photon limitations and consider adjusting the query to use a supported execution strategy for faster runtime.
COVERAGE_STATS_DELTA
- Delta data skipping statistics are missing or incomplete for the table scan file filters, so the query uses in-file filtering. The following statistics statuses are possible:
  - Full: Statistics are available for all filters.
  - Partial: Statistics are available on a subset of filters.
  - Unavailable: Statistics are not available on any filter.
  - Unused: Statistics could not be used on a filter that converts the data type.
- Recommendation: Collect Delta statistics to reduce the number of bytes read.
COVERAGE_STATS_OPTIMIZER
- Cost-based optimizer statistics are missing or incomplete, so standard heuristics were used to generate the query plan.
- Recommendation: Collect statistics to enable the optimizer to produce a better plan.
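As a minimal sketch of collecting optimizer statistics, the helper below builds an `ANALYZE TABLE ... COMPUTE STATISTICS` statement. The helper name `analyze_statement`, the table name `main.sales.orders`, and the column names are hypothetical and used only for illustration. In a notebook, the resulting string would typically be passed to `spark.sql(...)`, which assumes an active Spark session.

```python
def analyze_statement(table, columns=None):
    """Build an ANALYZE TABLE statement that collects optimizer statistics.

    `table` and `columns` are caller-supplied identifiers; this sketch does
    not quote or validate them.
    """
    if columns:
        column_list = ", ".join(columns)
        return f"ANALYZE TABLE {table} COMPUTE STATISTICS FOR COLUMNS {column_list}"
    return f"ANALYZE TABLE {table} COMPUTE STATISTICS FOR ALL COLUMNS"


# Hypothetical table and columns, for illustration only.
stmt_all = analyze_statement("main.sales.orders")
stmt_some = analyze_statement("main.sales.orders", ["customer_id", "order_date"])

print(stmt_all)
print(stmt_some)
```

Limiting the statement to the columns used in joins and filters can keep statistics collection cheaper on wide tables.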
DATA_SKEW
- Data is processed unevenly by available computing resources.
- Recommendation: Review the distribution of the data, then salt keys or pre-aggregate the data.
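The effect of salting can be illustrated with a small simulation in plain Python. This is a sketch under stated assumptions, not how the engine assigns work: the partition count, the salt count, the CRC32-based partitioner, and the sample keys are all hypothetical. It shows one hot key overloading a single partition, and the same rows spreading out once a salt suffix is appended to the key.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 8  # hypothetical number of parallel tasks
NUM_SALTS = 8       # hypothetical number of salt values per key


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Assign a key to a partition with a stable hash (CRC32, for illustration)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def salted_key(key, row_index, num_salts=NUM_SALTS):
    """Append a salt so rows that share one key spread over several salted keys."""
    return f"{key}#{row_index % num_salts}"


def rows_per_partition(keys):
    """Count how many rows land on each partition."""
    return Counter(partition_for(k) for k in keys)


# Skewed input: one hot key holds 9,000 of 10,000 rows.
keys = ["hot"] * 9000 + [f"k{i % 100}" for i in range(1000)]

unsalted = rows_per_partition(keys)
salted = rows_per_partition(salted_key(k, i) for i, k in enumerate(keys))

print("largest partition without salt:", max(unsalted.values()))
print("largest partition with salt:   ", max(salted.values()))
```

In a real query, an aggregation over salted keys needs a second step that combines the partial results per original key, and a join needs the other side replicated once per salt value.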
DATA_SPILL
- Data spilled to disk while executing an operator because the data did not fit in memory.
- Recommendation: Increase the warehouse size to increase available memory. Reduce the number of rows, the number of columns, or the size of large columns (strings, arrays, maps, structs) to reduce memory usage.
EXCESSIVE_QUEUE_TIME
- The query waited in the queue on the warehouse.
- Recommendation: Increase the maximum number of clusters on the warehouse to reduce queue time.
EXPLODING_JOIN
- Join is generating significantly more rows than it has read.
- Recommendation: Determine which result subset is required, then update the join or reduce the number of input rows from both relations.
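To see why a join can produce far more rows than it reads, it helps to count key multiplicities on both sides. The sketch below assumes an inner equi-join on a single key. The function name `estimated_join_rows` and the sample data are hypothetical. Each key contributes (left count × right count) output rows, so a key that repeats on both sides multiplies quickly.

```python
from collections import Counter


def estimated_join_rows(left_keys, right_keys):
    """Return the row count of an inner equi-join, computed from key counts."""
    left = Counter(left_keys)
    right = Counter(right_keys)
    # A key that appears n times on the left and m times on the right
    # produces n * m joined rows; keys missing on one side produce none.
    return sum(n * right[key] for key, n in left.items())


# Hypothetical inputs: key "a" repeats heavily on both sides.
orders = ["a"] * 100 + ["b"]
events = ["a"] * 50 + ["b"]

rows_read = len(orders) + len(events)
rows_out = estimated_join_rows(orders, events)

print("rows read:", rows_read)
print("rows produced:", rows_out)
```

Running the same count on the real join keys, for example with a grouped count per key on each table, can show which keys drive the fan-out before you change the join condition or filter the inputs.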
FLOW_FULL_RECOMPUTE
- The flow is planned to run as a full recompute.
- Recommendation: Rewrite the query for incremental support to reduce the number of bytes read.
IO_THROTTLING
- Cloud storage request was throttled by your cloud provider.
- Recommendation: Contact your administrator to increase your cloud storage request limits with your cloud provider.
REDUNDANT_AGGREGATION
- Aggregate did not change the query result.
- Recommendation: Remove the aggregate or apply primary and foreign key constraints.
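One case where an aggregate cannot change the result is deduplicating or grouping on a column that is already unique, such as a primary key. The sketch below checks that condition on sample rows in plain Python. The helper `key_is_unique` and the `customers` data are hypothetical, and this check illustrates the idea only; it is not how the insight is detected.

```python
def key_is_unique(rows, key):
    """Return True if no two rows share the same value for `key`."""
    seen = set()
    for row in rows:
        value = row[key]
        if value in seen:
            return False
        seen.add(value)
    return True


# Hypothetical sample rows; customer_id acts as a primary key.
customers = [
    {"customer_id": 1, "region": "EU"},
    {"customer_id": 2, "region": "EU"},
    {"customer_id": 3, "region": "US"},
]

# Deduplicating a unique column returns the same values as a plain projection.
distinct_ids = sorted({row["customer_id"] for row in customers})
all_ids = sorted(row["customer_id"] for row in customers)

print(key_is_unique(customers, "customer_id"), distinct_ids == all_ids)
print(key_is_unique(customers, "region"))
```

When uniqueness is guaranteed by the data model, declaring it as a constraint gives the optimizer the information it needs to drop the aggregate on its own.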
SELECTIVE_JOIN
- Join is generating significantly fewer rows than it has read.
- Recommendation: Determine which result subset is required, then add filters before the join to reduce the number of input rows.
WIDE_PROJECTION
- The query projects all columns of the table.
- Recommendation: Determine which result subset is required, then project only those columns to reduce the number of bytes read.