Share via


UPDATE

Applies to: check marked yes Databricks SQL check marked yes Databricks Runtime

Updates the column values for the rows that match a predicate. When no predicate is provided, update the column values for all rows.

This statement is only supported for Delta Lake tables.

Syntax

[ common_table_expression ]
  UPDATE table_name [table_alias]
     SET  { { column_name | field_name }  = [ expr | DEFAULT } [, ...]
     [WHERE clause]

Parameters

  • common table expression

    Common table expressions (CTE) are one or more named queries which can be reused multiple times within the main query block to avoid repeated computations or to improve readability of complex, nested queries.

  • table_name

    Identifies table to be updated. The table name must not use a temporal specification or options specification.

    table_name must not be a foreign table.

  • table_alias

    Define an alias for the table. The alias must not include a column list.

  • column_name

    A reference to a column in the table. You may reference each column at most once.

  • field_name

    A reference to field within a column of type STRUCT. You may reference each field at most once.

  • expr

    An arbitrary expression. If you reference table_name columns they represent the state of the row prior the update.

  • DEFAULT

    Applies to: check marked yes Databricks SQL check marked yes Databricks Runtime 11.3 LTS and above

    The DEFAULT expression for the column if one is defined, NULL otherwise.

  • WHERE

    Filter rows by predicate. The WHERE clause may include subqueries with the following exceptions:

    • Nested subqueries, that is, a subquery inside another subquery
    • A NOT IN subquery inside an OR, for example, a = 3 OR b NOT IN (SELECT c from t)

    In most cases, you can rewrite NOT IN subqueries using NOT EXISTS. You should use NOT EXISTS whenever possible, as UPDATE with NOT IN subqueries can be slow.

Updating from another table or join

Databricks SQL does not support the UPDATE ... FROM ... JOIN syntax used in some other SQL dialects (for example, updating columns in a table from a join with another table). To update a table based on a join with another table or subquery, use MERGE INTO instead.

Examples

> UPDATE events SET eventType = 'click' WHERE eventType = 'clk'

> UPDATE all_events
     SET session_time = 0, ignored = true
   WHERE session_time < (SELECT min(session_time) FROM good_events)

> UPDATE orders AS t1
     SET order_status = 'returned'
   WHERE EXISTS (SELECT oid FROM returned_orders WHERE t1.oid = oid)

> UPDATE events
     SET category = 'undefined'
   WHERE category NOT IN (SELECT category FROM events2 WHERE date > '2001-01-01')

> UPDATE events
     SET ignored = DEFAULT
  WHERE eventType = 'unknown'

The following example shows how to update a table from a join using MERGE instead of unsupported UPDATE ... FROM ... JOIN syntax:

-- Equivalent to (not supported in Databricks): UPDATE t1 SET t1.c1 = t2.c1 FROM t1 INNER JOIN t2 ON t1.c2 = t2.c2
> MERGE INTO t1
  USING t2 ON t1.c2 = t2.c2
  WHEN MATCHED THEN UPDATE SET t1.c1 = t2.c1;