Enrich Delta Lake tables with custom metadata

Databricks recommends always providing comments for tables and columns in tables. You can generate these comments using AI. See Add AI-generated comments to a table.

Unity Catalog also provides the ability to tag data. See Apply tags.

You can also log messages for individual commits to tables in a field in the Delta Lake transaction log.

Set user-defined commit metadata

You can specify user-defined strings as metadata in commits, either using the DataFrameWriter option userMetadata or the SparkSession configuration spark.databricks.delta.commitInfo.userMetadata. If both of them have been specified, then the option takes preference. This user-defined metadata is readable in the DESCRIBE HISTORY operation. See Work with Delta Lake table history.

SQL


SET spark.databricks.delta.commitInfo.userMetadata=overwritten-for-fixing-incorrect-data
INSERT OVERWRITE default.people10m SELECT * FROM morePeople

Python

df.write.format("delta") \
  .mode("overwrite") \
  .option("userMetadata", "overwritten-for-fixing-incorrect-data") \
  .save("/tmp/delta/people10m")

Scala

df.write.format("delta")
  .mode("overwrite")
  .option("userMetadata", "overwritten-for-fixing-incorrect-data")
  .save("/tmp/delta/people10m")