Azure Databricks Automatic Schema Evolution

Harshit Chandani 40 Reputation points
2024-02-27T10:23:22.34+00:00

Definition: I've created a Delta Live Tables pipeline to handle real-time data.

Use Case: I want my Delta Live Tables pipeline to handle schema changes whenever data arrives with a different schema. Example:

Athletes.csv schema: Name, Country
Athletes-1.csv schema: Name, Country, Sport

At first, DLT receives the Athletes.csv file and then Athletes-1.csv, but Athletes-1.csv has a different schema than Athletes.csv. How can this situation be handled?
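To make concrete what schema evolution has to do in this example, here is a minimal plain-Python sketch (the column lists are taken from the Athletes.csv / Athletes-1.csv example above; the helper name is illustrative, not a Databricks API): the evolved schema is the union of the old and new columns, with rows written under the old schema reading as null for the added column.

```python
# Minimal sketch of schema merging: the evolved schema is the union of
# the old and new column lists, preserving first-seen order.

def merge_schemas(old_columns, new_columns):
    """Return the union of two column lists, keeping first-seen order."""
    merged = list(old_columns)
    for col in new_columns:
        if col not in merged:
            merged.append(col)
    return merged

athletes_schema = ["Name", "Country"]            # Athletes.csv
athletes1_schema = ["Name", "Country", "Sport"]  # Athletes-1.csv

print(merge_schemas(athletes_schema, athletes1_schema))
# ['Name', 'Country', 'Sport']
```

This is what Delta's automatic schema merging does at the metadata level when it accepts the wider schema of the incoming file.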

Implemented Solution: I've set the Spark configuration:

spark.conf.set("spark.databricks.delta.schema.autoMerge.enabled", "true")

Error:

com.databricks.sql.transaction.tahoe.DeltaAnalysisException: Unknown configuration was specified: delta.schema.autoMerge.enabled

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

1 answer

Sort by: Most helpful
  1. Vinodh247 17,941 Reputation points
    2024-02-27T14:21:52.1933333+00:00

    Hi Harshit Chandani,

    Thanks for reaching out to Microsoft Q&A.

    This error suggests that the delta.schema.autoMerge.enabled configuration setting is not recognized.

    • First, try spark.conf.set("spark.databricks.delta.schema.autoMerge.enabled", True) or spark.conf.set("spark.databricks.delta.schema.autoMerge.enabled", "True") instead of spark.conf.set("spark.databricks.delta.schema.autoMerge.enabled", "true").
    • Check for typos or extra spaces in the configuration parameter name.
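As an additional sketch (an assumption, not part of the answer above): in Delta Lake, schema merging can also be enabled per write with the mergeSchema option rather than a session-wide conf. The snippet below assumes a plain Databricks notebook (not inside a DLT pipeline definition) and a hypothetical path /mnt/athletes; it is not runnable outside a Spark-with-Delta environment.

```python
# Sketch only: assumes a Spark session with Delta Lake available and a
# hypothetical storage path /mnt/athletes. Paths are illustrative.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Option 1: enable automatic schema merging for the whole session.
spark.conf.set("spark.databricks.delta.schema.autoMerge.enabled", "true")

# Option 2: enable it for a single write with the mergeSchema option,
# so the extra 'Sport' column in Athletes-1.csv is added to the table.
df = spark.read.option("header", "true").csv("/mnt/athletes/Athletes-1.csv")
(df.write
   .format("delta")
   .option("mergeSchema", "true")
   .mode("append")
   .save("/mnt/athletes/delta"))
```

The per-write mergeSchema option is often the safer choice, since it widens the schema only for writes where evolution is expected instead of for every operation in the session.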


    Please 'Upvote'(Thumbs-up) and 'Accept' as an answer if the reply was helpful. This will benefit other community members who face the same issue.

