Auto optimize on Azure Databricks
Auto optimize is an optional set of features that automatically compact small files during individual writes to a Delta table. Auto optimize adds latency overhead to write operations but accelerates read operations. Auto optimize is particularly useful in the following scenarios:
- Streaming use cases where latency on the order of minutes is acceptable
- `MERGE INTO` is the preferred method of writing into Delta Lake
- `CREATE TABLE AS SELECT` or `INSERT INTO` are commonly used operations
How auto optimize works
Auto optimize consists of two complementary features: optimized writes and auto compaction.
How optimized writes work
Azure Databricks dynamically optimizes Apache Spark partition sizes based on the actual data, and attempts to write out 128 MB files for each table partition. This is an approximate size and can vary depending on dataset characteristics.
How auto compaction works
After an individual write, Azure Databricks checks whether files can be further compacted, and runs an `OPTIMIZE` job (with 128 MB file sizes instead of the 1 GB file size used in the standard `OPTIMIZE`) to further compact files for the partitions that have the greatest number of small files.
Enable auto optimize
Optimized writes are enabled by default for the following operations in Databricks Runtime 9.1 LTS and above:
- `MERGE`
- `UPDATE` with subqueries
- `DELETE` with subqueries
For other operations, or for Databricks Runtime 7.3 LTS, you can explicitly enable optimized writes and auto compaction using one of the following methods:
- New table: Set the table properties `delta.autoOptimize.optimizeWrite = true` and `delta.autoOptimize.autoCompact = true` in the `CREATE TABLE` command.

  ```sql
  CREATE TABLE student (id INT, name STRING, age INT)
  TBLPROPERTIES (delta.autoOptimize.optimizeWrite = true, delta.autoOptimize.autoCompact = true)
  ```
- Existing tables: Set the table properties `delta.autoOptimize.optimizeWrite = true` and `delta.autoOptimize.autoCompact = true` in the `ALTER TABLE` command.

  ```sql
  ALTER TABLE [table_name | delta.`<table-path>`] SET TBLPROPERTIES (delta.autoOptimize.optimizeWrite = true, delta.autoOptimize.autoCompact = true)
  ```
- All new tables: Set the default table properties in the Spark session configuration.

  ```sql
  set spark.databricks.delta.properties.defaults.autoOptimize.optimizeWrite = true;
  set spark.databricks.delta.properties.defaults.autoOptimize.autoCompact = true;
  ```
In Databricks Runtime 10.1 and above, the table property `delta.autoOptimize.autoCompact` also accepts the values `auto` and `legacy` in addition to `true` and `false`. When set to `auto` (recommended), Azure Databricks tunes the target file size to be appropriate to the use case. When set to `legacy` or `true`, auto compaction uses 128 MB as the target file size.
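For example, the following sketch sets the property to `auto` on an existing table; the table name `events` is an assumption for illustration.

```python
# Sketch: on Databricks Runtime 10.1 and above, let Azure Databricks tune the
# auto compaction target file size by setting the property to 'auto'.
# The table name "events" is assumed for illustration.
spark.sql("ALTER TABLE events SET TBLPROPERTIES (delta.autoOptimize.autoCompact = 'auto')")
```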
In addition, you can enable and disable both of these features for Spark sessions with the configurations:

- `spark.databricks.delta.optimizeWrite.enabled`
- `spark.databricks.delta.autoCompact.enabled`
The session configurations take precedence over the table properties, allowing you to better control when to opt in or opt out of these features.
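For example, a minimal sketch that turns both features on for the current session only:

```python
# Enable optimized writes and auto compaction for the current Spark session;
# these session settings override the corresponding table properties.
spark.conf.set("spark.databricks.delta.optimizeWrite.enabled", "true")
spark.conf.set("spark.databricks.delta.autoCompact.enabled", "true")
```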
When to opt in and opt out
This section provides guidance on when to opt in and opt out of auto optimize features.
When to opt in to optimized writes
Optimized writes aim to maximize the throughput of data being written to a storage service. This can be achieved by reducing the number of files being written, without sacrificing too much parallelism.
Optimized writes require the shuffling of data according to the partitioning structure of the target table. This shuffle naturally incurs additional cost. However, the throughput gains during the write may pay off the cost of the shuffle. If not, the throughput gains when querying the data should still make this feature worthwhile.
The key part of optimized writes is that they use an adaptive shuffle. If you have a streaming ingest use case and input data rates change over time, the adaptive shuffle adjusts itself to the incoming data rates across micro-batches. If you have code snippets where you call `coalesce(n)` or `repartition(n)` just before you write out your stream, you can remove those lines.
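As a hedged sketch of that change (the streaming DataFrame `events_stream`, the checkpoint path, and the target table name are assumptions):

```python
# With optimized writes enabled, the manual repartition() before the Delta sink
# can be removed; the adaptive shuffle sizes output files automatically.
(events_stream                 # assumed streaming DataFrame
    # .repartition(16)         # previously used to control output file counts
    .writeStream
    .format("delta")
    .outputMode("append")
    .option("checkpointLocation", "/tmp/checkpoints/events")  # example path
    .toTable("events"))        # example target table
```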
When to opt in
- Streaming use cases where minutes of latency is acceptable
- When using SQL commands like `MERGE`, `UPDATE`, `DELETE`, `INSERT INTO`, or `CREATE TABLE AS SELECT`
When to opt out
- When the written data is on the order of terabytes and storage-optimized instances are unavailable.
When to opt in to auto compaction
Auto compaction occurs after a write to a table has succeeded and runs synchronously on the cluster that has performed the write. This means that if you have code patterns where you write to Delta Lake and then immediately call `OPTIMIZE`, you can remove the `OPTIMIZE` call if you enable auto compaction.
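A hedged illustration of that pattern (the DataFrame `updates_df` and the table name are assumptions):

```python
# Batch append to a Delta table followed by an explicit OPTIMIZE.
(updates_df
    .write
    .format("delta")
    .mode("append")
    .saveAsTable("events"))

# With auto compaction enabled on the table, this explicit call can be removed:
# spark.sql("OPTIMIZE events")
```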
Auto compaction uses different heuristics than `OPTIMIZE`. Since it runs synchronously after a write, we have tuned auto compaction to run with the following properties:
- Azure Databricks does not support Z-Ordering with auto compaction as Z-Ordering is significantly more expensive than just compaction.
- Auto compaction generates smaller files (128 MB) than `OPTIMIZE` (1 GB).
- Auto compaction greedily chooses a limited set of partitions that would best leverage compaction. The number of partitions selected will vary depending on the size of the cluster it is launched on. If your cluster has more CPUs, more partitions can be optimized.
- To control the output file size, set the Spark configuration `spark.databricks.delta.autoCompact.maxFileSize`. The default value is `134217728`, which sets the size to 128 MB. Specifying the value `104857600` sets the file size to 100 MB.
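For example, a minimal sketch that lowers the target size to 100 MB for the current session:

```python
# Set auto compaction's target output file size to 100 MB (value in bytes)
spark.conf.set("spark.databricks.delta.autoCompact.maxFileSize", "104857600")
```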
When to opt in
- Streaming use cases where minutes of latency is acceptable.
- When you don’t have regular `OPTIMIZE` calls on your table.
When to opt out
For Databricks Runtime 10.3 and below: when other writers perform operations like `DELETE`, `MERGE`, `UPDATE`, or `OPTIMIZE` concurrently, because auto compaction can cause a transaction conflict for those jobs.

If auto compaction fails due to a transaction conflict, Azure Databricks does not fail or retry the compaction. The corresponding write query (which triggered the auto compaction) will succeed even if the auto compaction does not succeed.
In Databricks Runtime 10.4 and above, this is not an issue: auto compaction does not cause transaction conflicts with other concurrent operations like `DELETE`, `MERGE`, or `UPDATE`. The other concurrent transactions are given higher priority and will not fail due to auto compaction.
Example workflow: Streaming ingest with concurrent deletes or updates
This workflow assumes that you have one cluster running a 24/7 streaming job ingesting data, and one cluster that runs on an hourly, daily, or ad-hoc basis to delete or update a batch of records. For this use case, Azure Databricks recommends that you:
1. Enable optimized writes at the table level:

   ```sql
   ALTER TABLE <table_name|delta.`table_path`> SET TBLPROPERTIES (delta.autoOptimize.optimizeWrite = true)
   ```

   This ensures that the files written by the stream and by the delete and update jobs are of optimal size.
2. Enable auto compaction at the session level, using the following setting on the job that performs the delete or update:

   ```python
   spark.sql("set spark.databricks.delta.autoCompact.enabled = true")
   ```

   This allows files to be compacted across your table. Since it happens after the delete or update, you mitigate the risk of a transaction conflict.
Frequently asked questions (FAQ)
- Does auto optimize Z-Order files?
- Does auto optimize corrupt Z-Ordered files?
- If I have auto optimize enabled on a table that I’m streaming into, and a concurrent transaction conflicts with the optimize, will my job fail?
- Do I need to schedule `OPTIMIZE` jobs if auto optimize is enabled on my table?
- I have many small files. Why is auto optimize not compacting them?
Does auto optimize Z-Order files?
Auto optimize performs compaction only on small files. It does not Z-Order files.
Does auto optimize corrupt Z-Ordered files?
Auto optimize ignores files that are Z-Ordered. It only compacts new files.
If I have auto optimize enabled on a table that I’m streaming into, and a concurrent transaction conflicts with the optimize, will my job fail?
No. Transaction conflicts that cause auto optimize to fail are ignored, and the stream will continue to operate normally.
Do I need to schedule `OPTIMIZE` jobs if auto optimize is enabled on my table?
For tables with size greater than 10 TB, we recommend that you keep `OPTIMIZE` running on a schedule to further consolidate files and reduce the metadata of your Delta table. Since auto optimize does not support Z-Ordering, you should still schedule `OPTIMIZE ... ZORDER BY` jobs to run periodically.
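A hedged sketch of such a periodic job (the table name and Z-Order columns are assumptions):

```python
# Periodically consolidate files and Z-Order the table, which auto optimize does not do
spark.sql("OPTIMIZE events ZORDER BY (event_type, event_date)")
```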
I have many small files. Why is auto optimize not compacting them?
By default, auto optimize does not begin compacting until it finds more than 50 small files in a directory. You can change this behavior by setting `spark.databricks.delta.autoCompact.minNumFiles`. Having many small files is not always a problem, since it can lead to better data skipping, and it can help minimize rewrites during merges and deletes. However, having too many small files might be a sign that your data is over-partitioned.
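For example, a minimal sketch that lowers the threshold for the current session:

```python
# Start auto compaction once a directory holds more than 10 small files
spark.conf.set("spark.databricks.delta.autoCompact.minNumFiles", "10")
```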