Events
Mar 31, 11 PM - Apr 2, 11 PM
The ultimate Microsoft Fabric, Power BI, SQL, and AI community-led event. March 31 to April 2, 2025.
Register todayThis browser is no longer supported.
Upgrade to Microsoft Edge to take advantage of the latest features, security updates, and technical support.
Dynamic file pruning, can significantly improve the performance of many queries on Delta Lake tables. Dynamic file pruning triggers for queries that contain filter statements or WHERE
clauses. You must use Photon-enabled compute to use dynamic file pruning in MERGE
, UPDATE
, and DELETE
statements. Only SELECT
statements leverage dynamic file pruning when Photon is not used.
Dynamic file pruning is especially efficient for non-partitioned tables, or for joins on non-partitioned columns. The performance impact of dynamic file pruning is often correlated to the clustering of data so consider using Z-Ordering to maximize the benefit.
For background and use cases for dynamic file pruning, see Faster SQL queries on Delta Lake with dynamic file pruning.
Dynamic file pruning is controlled by the following Apache Spark configuration options:
spark.databricks.optimizer.dynamicFilePruning
(default is true
): The main flag that directs the optimizer to push down filters. When set to false
, dynamic file pruning will not be in effect.spark.databricks.optimizer.deltaTableSizeThreshold
(default is 10,000,000,000 bytes (10 GB)
): Represents the minimum size (in bytes) of the Delta table on the probe side of the join required to trigger dynamic file pruning. If the probe side is not very large, it is probably not worthwhile to push down the filters and we can just simply scan the whole table. You can find the size of a Delta table by running the DESCRIBE DETAIL table_name
command and then looking at the sizeInBytes
column.spark.databricks.optimizer.deltaTableFilesThreshold
(default is 10
): Represents the number of files of the Delta table on the probe side of the join required to trigger dynamic file pruning. When the probe side table contains fewer files than the threshold value, dynamic file pruning is not triggered. If a table has only a few files, it is probably not worthwhile to enable dynamic file pruning. You can find the size of a Delta table by running the DESCRIBE DETAIL table_name
command and then looking at the numFiles
column.Events
Mar 31, 11 PM - Apr 2, 11 PM
The ultimate Microsoft Fabric, Power BI, SQL, and AI community-led event. March 31 to April 2, 2025.
Register todayTraining
Module
Optimize performance with Spark and Delta Live Tables - Training
Optimize performance with Spark and Delta Live Tables in Azure Databricks.