Tutorial: Coordinate transactions across tables

Important

Transactions that write to Unity Catalog managed Delta tables are in Public Preview.

Transactions that write to Unity Catalog managed Iceberg tables are in Private Preview. To join this preview, submit the managed Iceberg tables preview enrollment form.

This tutorial demonstrates how to use transactions to coordinate updates across multiple statements and tables. You learn both transaction modes: non-interactive transactions, which commit automatically, and interactive transactions, which give you explicit control. The tutorial also demonstrates using transactions with stored procedures and SQL Scripting to build mission-critical warehousing workloads on Azure Databricks.

Requirements

  • Environment: Access to an Azure Databricks workspace.
  • Compute: Supported compute types vary by transaction mode:
    • A classic or serverless SQL warehouse supports both transaction modes.
    • Serverless compute supports non-interactive transactions only.
    • Classic clusters running Databricks Runtime 18.0 or above support non-interactive transactions only.
  • Privileges: CREATE TABLE in a Unity Catalog schema.

Set up sample tables

All tables written to in a multi-statement, multi-table transaction must have the `delta.feature.catalogManaged` table property enabled. The sample tables below set this property at creation time.

Create two sample tables in the SQL Editor or a notebook:

-- Create a table for account data
CREATE TABLE IF NOT EXISTS sample_accounts (
  id INT,
  account_name STRING,
  balance DECIMAL(10,2)
) USING DELTA
TBLPROPERTIES (
  'delta.feature.catalogManaged' = 'supported'
);

-- Create a table for transaction records
CREATE TABLE IF NOT EXISTS sample_transactions (
  id INT,
  account_id INT,
  transaction_type STRING,
  amount DECIMAL(10,2)
) USING DELTA
TBLPROPERTIES (
  'delta.feature.catalogManaged' = 'supported'
);

-- To upgrade an existing table, use:
-- ALTER TABLE <table_name> SET TBLPROPERTIES ('delta.feature.catalogManaged' = 'supported');

-- Insert sample data
INSERT INTO sample_accounts VALUES
  (1, 'Alice', 1000.00),
  (2, 'Bob', 500.00);

INSERT INTO sample_transactions VALUES
  (1, 1, 'deposit', 100.00);

Verify the setup:

SELECT * FROM sample_accounts;
SELECT * FROM sample_transactions;

Output:

sample_accounts:

id	account_name	balance
1	Alice	1000.00
2	Bob	500.00

sample_transactions:

id	account_id	transaction_type	amount
1	1	deposit	100.00

Non-interactive transactions

Non-interactive transactions use BEGIN ATOMIC ... END; syntax. All statements run as a single atomic unit. If every statement succeeds, Azure Databricks commits automatically. If any statement fails, Azure Databricks rolls back all changes automatically. For detailed syntax and usage patterns, see non-interactive transactions.

Run a successful transaction

Update both tables atomically:

BEGIN ATOMIC
  -- Update Alice's account balance
  UPDATE sample_accounts
  SET balance = balance + 100.00
  WHERE id = 1;

  -- Record the deposit transaction
  INSERT INTO sample_transactions
  VALUES (2, 1, 'deposit', 100.00);
END;

Verify both operations succeeded:

-- Alice's balance should now be 1100.00
SELECT * FROM sample_accounts WHERE id = 1;

-- Should show two transaction records
SELECT * FROM sample_transactions;

Both the balance update and the transaction record were created together. If either statement had failed, neither change would have committed, and Databricks would have terminated the transaction without side effects.

Use SIGNAL to fail a transaction on a condition

You can use SIGNAL inside a BEGIN ATOMIC ... END; block to fail the transaction when a user-defined condition is not met. This is useful for data validation before committing:

BEGIN ATOMIC
  -- Insert new account
  INSERT INTO sample_accounts VALUES (3, 'Charlie', -50.00);

  -- Fail the transaction if balance is negative
  IF (SELECT balance FROM sample_accounts WHERE id = 3) < 0 THEN
    SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'Account balance cannot be negative';
  END IF;
END;

The SIGNAL raises an error, which causes the entire transaction to roll back automatically. Verify that the insert was rolled back:

-- Should return 0 rows (the transaction was rolled back by SIGNAL)
SELECT * FROM sample_accounts WHERE id = 3;

See automatic rollback on failure

Run a transaction with an invalid statement:

BEGIN ATOMIC
  -- This statement is valid
  INSERT INTO sample_accounts VALUES (4, 'David', 300.00);

  -- This statement will fail (table does not exist)
  INSERT INTO non_existent_table VALUES (1, 2, 3);
END;

The transaction fails with an error. Verify that the first statement was rolled back:

-- Should return 0 rows because the transaction was rolled back
SELECT * FROM sample_accounts WHERE id = 4;

Even though the first INSERT statement was valid, it was rolled back because the second statement failed. This demonstrates the all-or-nothing guarantee of transactions.

Interactive transactions

Interactive transactions give you explicit control over when to commit or roll back. Use BEGIN TRANSACTION to start, then COMMIT to save changes or ROLLBACK to discard them.

Commit changes

Start a transaction:

BEGIN TRANSACTION;

Make changes (not yet committed):

INSERT INTO sample_accounts VALUES (5, 'Eve', 850.00);
UPDATE sample_accounts SET balance = balance + 50.00 WHERE id = 2;

Commit to make changes permanent:

COMMIT;

Verify the changes:

-- Eve's account should now be visible
SELECT * FROM sample_accounts WHERE id = 5;

-- Bob's balance should be 550.00 (500 + 50)
SELECT * FROM sample_accounts WHERE id = 2;

Roll back changes

Start a new transaction:

BEGIN TRANSACTION;

Make a change:

INSERT INTO sample_accounts VALUES (6, 'Frank', 600.00);

Verify the change is visible in your session:

-- Should show Frank's account (visible in your session only)
SELECT * FROM sample_accounts WHERE id = 6;

Roll back to discard the change:

ROLLBACK;

Verify the change was discarded:

-- Should return 0 rows (the insert was rolled back)
SELECT * FROM sample_accounts WHERE id = 6;
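The commit and rollback flows above can be combined into a single check-then-decide pattern: make changes, inspect the result, and only then choose `COMMIT` or `ROLLBACK`. The following sketch reuses the `sample_accounts` table; the 200.00 transfer amount is an illustrative assumption:

```sql
BEGIN TRANSACTION;

-- Move 200.00 from Alice (id 1) to Bob (id 2); not yet committed
UPDATE sample_accounts SET balance = balance - 200.00 WHERE id = 1;
UPDATE sample_accounts SET balance = balance + 200.00 WHERE id = 2;

-- Inspect the pending result before deciding
SELECT id, balance FROM sample_accounts WHERE id IN (1, 2);

-- If either balance is negative, run ROLLBACK instead of COMMIT
COMMIT;
```

Because the updates stay invisible to other sessions until `COMMIT`, you can review intermediate state as many times as needed before deciding.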

Use with stored procedures and SQL Scripting

You can combine transactions with stored procedures to create reusable transaction logic. This pattern is useful for complex operations that you run frequently.

  1. Create the tables with catalog-managed commits enabled. These are the tables that the stored procedure and transaction below write to:

    
    CREATE TABLE main.retail.orders (order_id STRING, customer_id STRING, amount DECIMAL(18,2))
     TBLPROPERTIES ('delta.feature.catalogManaged' = 'supported');
    
    CREATE TABLE main.retail.orders_staging (order_id STRING, customer_id STRING, amount DECIMAL(18,2), batch_id STRING)
     TBLPROPERTIES ('delta.feature.catalogManaged' = 'supported');
    
    CREATE TABLE main.retail.total_sales (customer_id STRING, total_amount DECIMAL(18,2))
     TBLPROPERTIES ('delta.feature.catalogManaged' = 'supported');
    
    
  2. Define the stored procedure

    CREATE OR REPLACE PROCEDURE main.retail.apply_order(
        IN  p_order_id      STRING,
        IN  p_customer_id   STRING,
        IN  p_order_amount  DECIMAL(18,2)
    )
    LANGUAGE SQL
    SQL SECURITY INVOKER
    MODIFIES SQL DATA
    AS
    BEGIN
        -- Insert the order
        INSERT INTO main.retail.orders (order_id, customer_id, amount)
        VALUES (p_order_id, p_customer_id, p_order_amount);
    
        -- Update total sales per customer
        MERGE INTO main.retail.total_sales AS t
        USING (
            SELECT
              p_customer_id  AS customer_id,
              p_order_amount AS order_amount
        ) s
          ON t.customer_id = s.customer_id
        WHEN MATCHED THEN
          UPDATE SET t.total_amount = t.total_amount + s.order_amount
        WHEN NOT MATCHED THEN
          INSERT (customer_id, total_amount)
          VALUES (s.customer_id, s.order_amount);
    END;
    
    
  3. Define the transaction

    BEGIN ATOMIC
        -- Ids for this transaction: a new order id and a staging batch id
        DECLARE new_order_id STRING DEFAULT uuid();
        DECLARE v_batch_id STRING DEFAULT uuid();
    
        -- 1) Stage incoming customer and order rows
        INSERT INTO main.retail.orders_staging (order_id, customer_id, amount, batch_id)
        VALUES (new_order_id, 'CUST_123', 249.99, v_batch_id);
    
        -- 2) Drive final writes from staging to production via stored procedure
        FOR o AS
          SELECT
            order_id,
            customer_id,
            amount
          FROM main.retail.orders_staging
          WHERE batch_id = v_batch_id
        DO
            CALL main.retail.apply_order(
              o.order_id,
              o.customer_id,
              o.amount
            );
        END FOR;
    
        -- 3) Clean up processed staging rows
        DELETE FROM main.retail.orders_staging
        WHERE batch_id = v_batch_id;
    END; -- 4) Commit the transaction
    
    

If any part of the transaction fails, Databricks rolls back all changes automatically.
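To see this rollback guarantee in action, you could force a failure partway through the same pattern. The sketch below stages an order and then raises a `SIGNAL` (mirroring the earlier SIGNAL example); the negative-amount validation is an illustrative assumption, not part of the pattern above:

```sql
BEGIN ATOMIC
    DECLARE v_batch_id STRING DEFAULT uuid();

    -- Stage an order with an invalid (negative) amount
    INSERT INTO main.retail.orders_staging (order_id, customer_id, amount, batch_id)
    VALUES (uuid(), 'CUST_456', -10.00, v_batch_id);

    -- Fail the transaction before the staged row reaches production
    IF EXISTS (
        SELECT 1 FROM main.retail.orders_staging
        WHERE batch_id = v_batch_id AND amount < 0
    ) THEN
        SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'Order amount cannot be negative';
    END IF;
END;

-- The staged row was rolled back along with everything else
SELECT * FROM main.retail.orders_staging WHERE amount < 0;
```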

Clean up

Remove the sample tables:

DROP TABLE IF EXISTS sample_accounts;
DROP TABLE IF EXISTS sample_transactions;


DROP TABLE IF EXISTS main.retail.orders;
DROP TABLE IF EXISTS main.retail.orders_staging;
DROP TABLE IF EXISTS main.retail.total_sales;
DROP PROCEDURE IF EXISTS main.retail.apply_order;

Next steps