Databricks: How to update the value of a MAP-type column in a Delta table using a Python dictionary and the SQL UPDATE command?

Mohammad Saber 591 Reputation points
2023-03-26T05:00:04.1533333+00:00

I have a delta table in Databricks created by:

%sql
 
CREATE TABLE IF NOT EXISTS dev.bronze.test_map (
  id INT,
  table_updates MAP<STRING, TIMESTAMP>,
  CONSTRAINT test_map_pk PRIMARY KEY(id)
) USING DELTA
LOCATION "abfss://bronze@Table Path"

With initial values:

INSERT INTO dev.bronze.test_map 
VALUES (1, null), 
       (2, null), 
       (3, null);

Note that there is no value in column "table_updates".


After processing other tables in our platform, I have table-update info as a Python dictionary like the one below:

table_updates_id1 = {
    'id1_table_1': datetime.datetime(2023, 3, 26, 4, 33, 22, 323000),
    'id1_table_2': datetime.datetime(2023, 3, 26, 4, 33, 22, 323000)}


Now, I want to update the value of column "table_updates" where id = 1 using the SQL UPDATE command (note that I want to update the table itself, not a dataframe).

I tried two approaches, and both failed.

1st trial:

spark.sql(f"""
UPDATE dev.bronze.test_map
SET
    table_updates = map({table_updates_id1})
WHERE
    id = 1
""")

Error: a SQL syntax error (screenshot omitted).

2nd trial:

spark.sql(f"""
UPDATE dev.bronze.test_map
SET
    table_updates = map('{','.join([f'{k},{v}' for k,v in table_updates_id1.items()])}')
WHERE
    id = 1
""")

Error: again a syntax error (screenshot omitted).

Any idea how to solve this issue? Thanks.

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

Accepted answer
  1. AnnuKumari-MSFT 32,816 Reputation points Microsoft Employee
    2023-03-28T11:11:49.91+00:00

    @Mohammad Saber ,

    Thank you for using the Microsoft Q&A platform and posting your question here. As I understand it, you are trying to update a column of a Delta table, but the statement fails with a syntax error. Please let me know if that is not the case.

    It looks like you have defined the variable as a Python dictionary and are then interpolating it into a SQL UPDATE command. SQL cannot parse the dictionary's Python repr ({'key': datetime.datetime(...)}), which is why it throws a syntax error.
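    To see the mismatch concretely, here is a minimal sketch that renders such a dictionary into a map(...) expression SQL can actually parse (the helper name to_sql_map_literal is illustrative, not part of the original post):

    ```python
    import datetime

    def to_sql_map_literal(d):
        """Render a {str: datetime} dict as a Spark SQL map(...) expression string."""
        entries = ", ".join(f"'{k}', TIMESTAMP '{v}'" for k, v in d.items())
        return f"map({entries})"

    table_updates_id1 = {
        'id1_table_1': datetime.datetime(2023, 3, 26, 4, 33, 22, 323000)}

    # The dict's Python repr is not valid SQL, but this rendering is:
    print(to_sql_map_literal(table_updates_id1))
    # map('id1_table_1', TIMESTAMP '2023-03-26 04:33:22.323000')
    ```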

    My recommendation would be to use Python for updating the Delta table as well, via the DeltaTable API. Note that the values in the set mapping must be SQL expression strings, so the dictionary has to be rendered into a SQL map expression first. For instance:

    import datetime
    from delta.tables import DeltaTable

    deltaTable = DeltaTable.forPath(spark, 'abfss://bronze@Table Path')

    table_updates_id1 = {
        'id1_table_1': datetime.datetime(2023, 3, 26, 4, 33, 22, 323000),
        'id1_table_2': datetime.datetime(2023, 3, 26, 4, 33, 22, 323000)}

    # Render the dictionary as a SQL map expression; plain Python objects
    # are not accepted as values of the `set` dict.
    keys = ", ".join(f"'{k}'" for k in table_updates_id1)
    values = ", ".join(f"TIMESTAMP '{v}'" for v in table_updates_id1.values())

    deltaTable.update(
        condition = "id = 1",
        set = {"table_updates": f"map_from_arrays(array({keys}), array({values}))"}
    )


    For more details, kindly check out this documentation: Table deletes, updates, and merges


    Hope it helps. Kindly accept the answer by clicking on the Accept answer button.

    1 person found this answer helpful.

1 additional answer

Sort by: Most helpful
  1. Sedat SALMAN 13,740 Reputation points
    2023-03-26T05:28:35.6066667+00:00

    Use the map_from_arrays function to update the table_updates column.

    You can use the following Python, or similar:

    
    keys = ', '.join([f"'{k}'" for k in table_updates_id1.keys()])
    values = ', '.join([f"CAST('{v}' AS TIMESTAMP)" for v in table_updates_id1.values()])
    
    update_query = f"""
    UPDATE dev.bronze.test_map
    SET
        table_updates = map_from_arrays(array({keys}), array({values}))
    WHERE
        id = 1
    """
    
    spark.sql(update_query)
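    To see what the two f-strings above expand to, the string-building part can be run on its own (a sketch using one entry of the question's dictionary; no Spark session needed). Note that in Spark SQL the array constructor is the array(...) function:

    ```python
    import datetime

    table_updates_id1 = {
        'id1_table_1': datetime.datetime(2023, 3, 26, 4, 33, 22, 323000)}

    keys = ', '.join([f"'{k}'" for k in table_updates_id1.keys()])
    values = ', '.join([f"CAST('{v}' AS TIMESTAMP)" for v in table_updates_id1.values()])

    print(keys)    # 'id1_table_1'
    print(values)  # CAST('2023-03-26 04:33:22.323000' AS TIMESTAMP)
    ```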
    
    
