Ibahagi sa


Transactions

Important

Transactions that write to Unity Catalog managed Delta tables are in Public Preview.

Transactions that write to Unity Catalog managed Iceberg tables are in Private Preview. To join this preview, submit the managed Iceberg tables preview enrollment form.

Transactions let you coordinate operations across multiple SQL statements and tables. All changes succeed together or roll back together, ensuring data consistency across your operations and tables. Transactions provide ACID guarantees: atomicity, consistency, isolation, and durability. See What are ACID guarantees on Azure Databricks?. Transactions can be used with stored procedures and SQL Scripting to build mission-critical warehousing workloads.

The following example shows a transaction:

BEGIN ATOMIC
  UPDATE accounts SET balance = balance - 100 WHERE id = 1;
  UPDATE accounts SET balance = balance + 100 WHERE id = 2;
  INSERT INTO audit_log VALUES (1, 2, 100, current_timestamp());
END;

All three statements commit together. If any statement fails, all changes roll back and Databricks terminates the transaction without side effects.

For hands-on practice with transactions, see Tutorial: Coordinate transactions across tables.

Requirements

To run transactions that span multiple statements or multiple tables:

  • All tables written to must:
  • Use supported compute:
    • For non-interactive transactions, use any SQL warehouse, serverless compute, or cluster running Databricks Runtime 18.0 and above.
    • For interactive transactions, use any SQL warehouse.

Transaction modes

Azure Databricks supports two transaction modes:

Mode Syntax Commit Rollback Best for
Non-interactive ATOMIC compound statement Automatic on success Automatic on error Fixed sequences, scheduled jobs
Interactive BEGIN TRANSACTION; COMMIT; Manual Manual Conditional logic, validation and debugging, JDBC, ODBC

For detailed syntax, examples, and usage patterns for both modes, see Transaction modes.

Supported operations

You can use the following operations within transactions:

Operation Description
SELECT Query data and validate results
VALUES clause Generate test data or constant values
INSERT (including all variants) Add new rows
UPDATE Modify existing rows
COPY INTO Load data from a file into a Delta table
DELETE FROM Remove rows
MERGE INTO Upsert patterns combining insert, update, and delete

Read sources in transactions

Transactions can read from Unity Catalog tables (Delta and Iceberg), streaming tables, views, and materialized views. To read from non-transactional sources, use the allow_nontransactional_reads hint.

Read from non-transactional sources

To read from non-transactional sources such as Parquet files, Avro files, and federated tables using JDBC, use the allow_nontransactional_reads hint, as shown the following example:

BEGIN TRANSACTION;

-- Read from a non-transactional Parquet source
INSERT INTO transactional_table
SELECT col1, col2
FROM parquet.`/path/to/data`
WITH (allow_nontransactional_reads = true);

-- Read from a managed Delta table (no hint needed)
INSERT INTO another_table
SELECT * FROM managed_delta_table;

COMMIT;

Warning

Non-transactional reads are not repeatable. Concurrent changes to the source data during the transaction might result in inconsistent reads.

Transaction isolation

Transactions provide repeatable reads across all statements. When you access a table in a transaction, Azure Databricks captures a consistent snapshot of that table at first access. All subsequent reads of that table use this snapshot, so your reads remain consistent even if other users concurrently modify the same tables.

Example:

BEGIN TRANSACTION;

-- First access to products table captures snapshot
SELECT * FROM products WHERE product_id = 1001;

-- Another user updates product 1001

-- Still reads the same snapshot (repeatable read)
SELECT * FROM products WHERE product_id = 1001;

COMMIT;

Conflict detection and concurrency

Azure Databricks uses optimistic concurrency control. Transactions proceed without locking, and conflicts are detected at commit time. When you commit, Azure Databricks checks whether other transactions modified the same data after your transaction began. If conflicts exist, your transaction fails. For non-interactive transactions, the rollback also happens automatically. For interactive transactions, you must explicitly run ROLLBACK to clear the transaction state before beginning a new transaction.

Non-interactive transactions support row-level concurrency. Two transactions can modify different rows in the same data file without conflicting when Row-level concurrency is enabled on the target tables.

Interactive transactions support table-level concurrency.

Conflict scenarios

Scenario Description
Write-write conflicts Two transactions update or delete the same rows.
Write-read conflicts Another transaction modified rows that your transaction read. Applies to Serializable isolation only.
Phantom read conflicts Another transaction added new rows matching a predicate your transaction read. Applies to both WriteSerializable and Serializable isolation.
Metadata conflicts Another transaction changed table schema or properties.

For more details about isolation levels and conflict resolution for transactions, see Transaction modes. For information about isolation levels and write conflict behavior for Delta Lake tables on Azure Databricks, see Optimization recommendations on Azure Databricks.

How transactions appear in the Delta log

Each successful transaction appears as a single entry in the table's Delta log, regardless of how many individual statements ran within the transaction. This provides a clean audit trail and simplifies rollback operations.

Individual operations within a transaction are available as JSON metadata in the Delta log entry for the transaction.

Error handling and rollback

The following table describes how error rollbacks occur for both transaction types:

Scenario Behavior for non-interactive transactions Behavior for interactive transactions
Statement failure Any statement that raises an error causes immediate automatic rollback. You must explicitly run ROLLBACK to discard changes if the session is still active.
Failed validation logic or business rules Use SIGNAL to throw an exception and trigger automatic rollback. Run ROLLBACK to discard changes.
Session disconnect The transaction automatically rolls back. The transaction automatically rolls back.
Timeout Automatically rolls back after 48 hours total duration. Automatically rolls back after 10 minutes of inactivity or 48 hours total duration (see Limitations). The transaction is terminated without side effects, but you must explicitly run ROLLBACK to clear the transaction state if the session is still active.

For interactive transactions, you can explicitly roll back using the ROLLBACK statement. This is useful when you want to discard changes based on validation logic or business rules, or after a statement failure when the session remains active.

Best practices

Follow these practices to reduce conflicts and optimize transaction performance.

Avoid conflicts

  • Keep transactions short: Long-running transactions increase conflict likelihood and hold resources longer.
  • Validate early: Check preconditions at the beginning of a transaction to fail fast.
  • Databricks recommends non-interactive transactions for most use cases: Non-interactive transactions use row-level concurrency. See Non-interactive transactions.
  • Retry on conflicts: When conflicts occur, retry the transaction with fresh data.

Use transactions from different clients

Transactions work across various client interfaces:

Limitations

The following limitations apply to transactions that span multiple tables:

Limitation Description
Interactive transaction conflicts Interactive transactions (BEGIN TRANSACTION; ... COMMIT;) use more conservative conflict detection than non-interactive transactions and can conflict at the table level, except for INSERT operations that do not read from the target table. Use non-interactive transactions (ATOMIC compound statement) when row-level conflict detection is important. See Non-interactive transactions.
Write targets You can only write to Unity Catalog managed Delta or Iceberg tables that have the catalogManaged table feature enabled. See Catalog-managed commits.
DML operations only Transactions support SELECT, INSERT, UPDATE, DELETE, COPY INTO, and MERGE. Run DDL operations, such as CREATE TABLE, ALTER TABLE, or DROP TABLE, outside of transactions.
COPY INTO concurrency A transaction running a COPY INTO command fails if another COPY INTO command runs concurrently to write to the same table and commits first.
Streaming writes not supported Transactional writes to streaming tables are not supported.
RLS and CLM tables not supported Tables with Row filters and column masks cannot participate in transactions.
Table and view limits A transaction can read from or write to up to 100 tables combined, and can read from up to 100 views. Each table can have up to 100 intermediate commits within a transaction.
Time travel not supported You cannot use time travel within a transaction.
Idle timeout Interactive transactions roll back after 10 minutes of inactivity. The transaction is terminated without side effects, but you must explicitly run ROLLBACK to clear the transaction state if the session is still active.
Maximum duration All transactions automatically roll back after 48 hours total duration. For interactive transactions, the transaction is terminated without side effects, but you must explicitly run ROLLBACK to clear the transaction state if the session is still active.

Next steps