AnalysisException: Incompatible format detected in Azure Databricks

Rohit Kulkarni 691 Reputation points
2022-08-25T07:07:52.397+00:00

Hello Team,

I am trying to write files from Salesforce data to blob storage in Parquet format. The code is below:

spark.conf.set(
STORAGE_ACCOUNT_CONFIG,
STORAGE_ACCOUNT_KEY,
)

dbutils.fs.ls("abfss://xyz.dfs.core.windows.net/raw")

Set the data lake file location used to save the Delta files from the Spark DataFrame df:

file_location=STORAGE_PATH+SOURCE_SYSTEM_NAME+"/"+TABLE_NAME+"/"+CURRENT_MONTH_NAME
print(file_location)
(
df
.write
.format("delta")
.option("mergeSchema", "true")  # append new columns arriving from the source; fill data into existing empty columns
.mode("append")
.save(file_location)
)
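As an aside, building `file_location` by string concatenation with `+` silently produces double or missing slashes if a segment gains or loses a trailing slash. A hedged sketch (the placeholder values are hypothetical, not taken from the thread) using `posixpath.join`, which ABFSS URIs accept since they use forward slashes:

```python
import posixpath

# Hypothetical placeholder values; substitute your own storage path and names.
STORAGE_PATH = "abfss://xyz.dfs.core.windows.net/abc/"
SOURCE_SYSTEM_NAME = "Salesforce"
TABLE_NAME = "Account"
CURRENT_MONTH_NAME = "August"

# posixpath.join normalizes separators regardless of trailing slashes,
# so changing STORAGE_PATH to end without "/" does not break the path.
file_location = posixpath.join(
    STORAGE_PATH, SOURCE_SYSTEM_NAME, TABLE_NAME, CURRENT_MONTH_NAME
)
print(file_location)
```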

When I include CURRENT_MONTH_NAME in the folder structure, I get this error:

AnalysisException: Incompatible format detected.

AnalysisException Traceback (most recent call last)
<command-1746602509716500> in <module>
13 print(file_location)
14 (
---> 15 df
16 .write
17 .format("delta")

/databricks/spark/python/pyspark/sql/readwriter.py in save(self, path, format, mode, partitionBy, **options)
738 self._jwrite.save()
739 else:
--> 740 self._jwrite.save(path)
741
742 @since(1.4)

/databricks/spark/python/lib/py4j-0.10.9.1-src.zip/py4j/java_gateway.py in call(self, *args)
1302
1303 answer = self.gateway_client.send_command(command)
-> 1304 return_value = get_return_value(
1305 answer, self.gateway_client, self.target_id, self.name)
1306

/databricks/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
121 # Hide where the exception came from that shows a non-Pythonic
122 # JVM exception message.
--> 123 raise converted from None
124 else:
125 raise

AnalysisException: Incompatible format detected.

You are trying to write to abfss://xyz.dfs.core.windows.net/abc/Salesforce/Account/August using Databricks Delta, but there is no
transaction log present. Check the upstream job to make sure that it is writing
using format("delta") and that you are trying to write to the table base path.

To disable this check, SET spark.databricks.delta.formatCheck.enabled=false
To learn more about Delta, see https://learn.microsoft.com/azure/databricks/delta/index

Please advise.

Regards
Rohit

Azure Databricks
1 answer

  1. PRADEEPCHEEKATLA 90,231 Reputation points
    2022-08-26T05:51:19.79+00:00

    Hello @Rohit Kulkarni ,

    Thanks for the question and using MS Q&A platform.

    You are seeing this error because the target path already contains files that are not in Delta format, so you should either choose a new path or delete the existing files in that path.
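A Delta table is identified by a `_delta_log` directory at its root; if the target path already holds plain Parquet files without that log, `format("delta")` refuses the write. A minimal sketch of that check (a local temp directory stands in for the `abfss://` path, and the helper name is hypothetical):

```python
import os
import tempfile

def looks_like_delta_target(path: str) -> bool:
    """True if a Delta write to `path` should succeed: the path is empty or
    missing (Delta will initialize its own log), or it already has a
    _delta_log directory. False if it holds non-Delta files."""
    if not os.path.exists(path) or not os.listdir(path):
        return True
    return os.path.isdir(os.path.join(path, "_delta_log"))

# Demo with a temp directory standing in for the storage location
with tempfile.TemporaryDirectory() as root:
    print(looks_like_delta_target(root))   # empty path -> True

    # Simulate stray Parquet files left by an earlier non-Delta job
    open(os.path.join(root, "part-0000.parquet"), "w").close()
    print(looks_like_delta_target(root))   # non-Delta files present -> False

    # Once a _delta_log exists, the path is a Delta table root
    os.mkdir(os.path.join(root, "_delta_log"))
    print(looks_like_delta_target(root))   # True
```

On Databricks the equivalent inspection would use `dbutils.fs.ls(file_location)`, and the fix is either `dbutils.fs.rm(file_location, recurse=True)` to clear the stale files (if they are disposable) or choosing a fresh path for the Delta write.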

    For more details, refer to the SO thread addressing a similar issue: Trouble when writing the data to Delta Lake in Azure databricks (Incompatible format detected).

    Hope this will help. Please let us know if any further queries.

    ------------------------------

    • Please don't forget to click Accept Answer or Up-Vote whenever the information provided helps you. Original posters help the community find answers faster by identifying the correct answer. Here is how.
    • Want a reminder to come back and check responses? Here is how to subscribe to a notification
    • If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators
