I have a delta table in Databricks named prod.silver.control_table
. It has a few columns including table_name
with string data type and transform_options
with the below structure:
|-- transform_options: map (nullable = true)
| |-- key: string
| |-- value: struct (valueContainsNull = true)
| | |-- col_name_mappings: map (nullable = true)
| | | |-- key: string
| | | |-- value: string (valueContainsNull = true)
| | |-- type_mappings: map (nullable = true)
| | | |-- key: string
| | | |-- value: string (valueContainsNull = true)
| | |-- partition_duplicates_by: array (nullable = true)
| | | |-- element: string (containsNull = true)
| | |-- order_duplicates_by: array (nullable = true)
| | | |-- element: string (containsNull = true)
For example, when table_name
is prod.silver.weather
, the transform_options
is:
{
"prod.bronze.weather_source_a":{"col_name_mappings":{"col_a_old":"col_a_new","col_b_old":"col_b_new"},"type_mappings":{"col_a_new":"INT","col_b_new":"TIMESTAMP"},"partition_duplicates_by":["col_a_new"],"order_duplicates_by":["_commit_version"]},
"prod.bronze.weather_source_b":{"col_name_mappings":{"col_c_old":"col_c_new","col_d_old":"col_d_new"},"type_mappings":{"col_c_new":"INT","col_d_new":"TIMESTAMP"},"partition_duplicates_by":["col_c_new"],"order_duplicates_by":["ingestion_timestamp","_commit_version"]}
}
I need to update values in order_duplicates_by
. I need to change _commit_version
into commit_version
by removing the initial underscore. In the above example, there are 2 key-value pairs in the transform_options
column. It is not always the case and there might be only one key-value pair like below:
For example, when table_name
is prod.silver.country
, the transform_options
is:
{
"prod.bronze.contry_source":{"col_name_mappings":{"col_a_old":"col_a_new","col_b_old":"col_b_new"},"type_mappings":{"col_a_new":"INT","col_b_new":"STRING"},"partition_duplicates_by":["col_a_new"],"order_duplicates_by":["_commit_version"]}
}
Any idea how to update table values?
Note that I want to update values in the control table. I prefer to use the SQL command like below however if there is a better way, please let me know:
UPDATE prod.silver.control_table
SET ...