Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Available in Databricks Runtime 14.1 and above, row tracking assigns stable row IDs and row commit versions to each row, enabling row-level lineage tracking. Some incremental updates for materialized views require this feature.
All Apache Iceberg v3 tables include row tracking. See Use Apache Iceberg v3 features. For Delta Lake tables, you must explicitly enable row tracking.
Note
Enabling row tracking upgrades the table writer protocol and might affect compatibility with external Delta Lake clients. See Delta Lake feature compatibility and protocols.
Enable row tracking on Delta Lake tables
To enable row tracking on a Delta Lake table, set the table property delta.enableRowTracking = true during table creation:
CREATE TABLE table_name
TBLPROPERTIES (delta.enableRowTracking = true)
AS SELECT * FROM source_table;
To enable row tracking on an existing Delta Lake table, use the following example:
ALTER TABLE table_name SET TBLPROPERTIES (delta.enableRowTracking = true);
Important
Enabling row tracking on existing tables automatically assigns row IDs and row commit versions to all existing rows in the table. This process might result in the creation of multiple new versions of the table and take a significant amount of time to complete.
Warning
If your table is a target for continuous writes, such as Structured Streaming workloads, pause write operations before enabling row tracking and resume after the operation completes. Using ALTER TABLE to enable row tracking updates table metadata, causing any concurrent write operations to fail with a MetadataChangedException for the duration of the operation. See Conflict exceptions.
Cloning a table creates a separate history, so the row IDs and row commit versions on cloned tables don't match those in the original table.
Metadata fields
Row tracking adds two hidden metadata fields to the table. You can explicitly add these fields to your query to return the values.
| Column name | Type | Values | Explanation |
|---|---|---|---|
_metadata.row_id |
Long | The unique identifier of the row. | A row keeps the same ID whenever it is modified using a MERGE or UPDATE statement. |
_metadata.row_commit_version |
Long | The Delta log or table version at which the row was last inserted or updated. | A row is assigned a new version whenever it is modified using a MERGE or UPDATE statement. |
Some operations store these metadata fields using the transaction log. Running OPTIMIZE or REORG operations on a table with row tracking enabled rewrites data files to store these fields.
Turn off row tracking on Delta Lake tables
To turn off row tracking on a Delta Lake table, set the table property to false.
ALTER TABLE table_name SET TBLPROPERTIES (delta.enableRowTracking = false);
Important
Disabling row tracking doesn't remove the corresponding table feature and doesn't downgrade the table protocol version. It also doesn't remove the metadata fields from the target table. To fully remove the table feature and downgrade the protocol, use DROP FEATURE. See Drop a Delta Lake table feature and downgrade table protocol.
After turning off row tracking, the generated row IDs are no longer reliable for tracking unique rows.
Limitations
The row IDs and row commit versions metadata fields can't be accessed while reading the change data feed. See Use change data feed on Azure Databricks.