AnalysisException: Incompatible format detected in Azure Databricks

Rohit Kulkarni 691 Reputation points
2022-08-25T07:07:52.397+00:00

Hello Team,

I am trying to write files from Salesforce data to blob storage in Parquet format. The code is below:

spark.conf.set(
STORAGE_ACCOUNT_CONFIG,
STORAGE_ACCOUNT_KEY,
)

dbutils.fs.ls("abfss://xyz.dfs.core.windows.net/raw")

Set the data lake file location used to save the Delta files from the Spark DataFrame df:

file_location=STORAGE_PATH+SOURCE_SYSTEM_NAME+"/"+TABLE_NAME+"/"+CURRENT_MONTH_NAME
print(file_location)
(
df
.write
.format("delta")
.option("mergeSchema", "true")  # append new columns arriving from the source; fill data into existing empty columns
.mode("append")
.save(file_location)
)
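As an aside, building `file_location` by string concatenation with `+` silently produces double or missing slashes if a segment gains or loses a trailing slash. A hedged sketch (the placeholder values are hypothetical, not taken from the thread) using `posixpath.join`, which ABFSS URIs accept since they use forward slashes:

```python
import posixpath

# Hypothetical placeholder values; substitute your own storage path and names.
STORAGE_PATH = "abfss://xyz.dfs.core.windows.net/abc/"
SOURCE_SYSTEM_NAME = "Salesforce"
TABLE_NAME = "Account"
CURRENT_MONTH_NAME = "August"

# posixpath.join normalizes separators regardless of trailing slashes,
# so changing STORAGE_PATH to end without "/" does not break the path.
file_location = posixpath.join(
    STORAGE_PATH, SOURCE_SYSTEM_NAME, TABLE_NAME, CURRENT_MONTH_NAME
)
print(file_location)
```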

When I include CURRENT_MONTH_NAME in the folder structure, I get this error:

AnalysisException: Incompatible format detected.

AnalysisException Traceback (most recent call last)
<command-1746602509716500> in <module>
13 print(file_location)
14 (
---> 15 df
16 .write
17 .format("delta")

/databricks/spark/python/pyspark/sql/readwriter.py in save(self, path, format, mode, partitionBy, **options)
738 self._jwrite.save()
739 else:
--> 740 self._jwrite.save(path)
741
742 @since(1.4)

/databricks/spark/python/lib/py4j-0.10.9.1-src.zip/py4j/java_gateway.py in call(self, *args)
1302
1303 answer = self.gateway_client.send_command(command)
-> 1304 return_value = get_return_value(
1305 answer, self.gateway_client, self.target_id, self.name)
1306

/databricks/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
121 # Hide where the exception came from that shows a non-Pythonic
122 # JVM exception message.
--> 123 raise converted from None
124 else:
125 raise

AnalysisException: Incompatible format detected.

You are trying to write to abfss://xyz.dfs.core.windows.net/abc/Salesforce/Account/August using Databricks Delta, but there is no
transaction log present. Check the upstream job to make sure that it is writing
using format("delta") and that you are trying to write to the table base path.

To disable this check, SET spark.databricks.delta.formatCheck.enabled=false
To learn more about Delta, see https://learn.microsoft.com/azure/databricks/delta/index

Please advise.

Regards
Rohit

Azure Databricks
1 answer

  1. PRADEEPCHEEKATLA 90,231 Reputation points
    2022-08-26T05:51:19.79+00:00

    Hello @Rohit Kulkarni ,

    Thanks for the question and using MS Q&A platform.

    You are seeing this error because the target path already contains files that are not in Delta format, so you should either choose a new path or delete the existing files in that path.
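A Delta table is identified by a `_delta_log` directory at its root; if the target path already holds plain Parquet files without that log, `format("delta")` refuses the write. A minimal sketch of that check (a local temp directory stands in for the `abfss://` path, and the helper name is hypothetical):

```python
import os
import tempfile

def looks_like_delta_target(path: str) -> bool:
    """True if a Delta write to `path` should succeed: the path is empty or
    missing (Delta will initialize its own log), or it already has a
    _delta_log directory. False if it holds non-Delta files."""
    if not os.path.exists(path) or not os.listdir(path):
        return True
    return os.path.isdir(os.path.join(path, "_delta_log"))

# Demo with a temp directory standing in for the storage location
with tempfile.TemporaryDirectory() as root:
    print(looks_like_delta_target(root))   # empty path -> True

    # Simulate stray Parquet files left by an earlier non-Delta job
    open(os.path.join(root, "part-0000.parquet"), "w").close()
    print(looks_like_delta_target(root))   # non-Delta files present -> False

    # Once a _delta_log exists, the path is a Delta table root
    os.mkdir(os.path.join(root, "_delta_log"))
    print(looks_like_delta_target(root))   # True
```

On Databricks the equivalent inspection would use `dbutils.fs.ls(file_location)`, and the fix is either `dbutils.fs.rm(file_location, recurse=True)` to clear the stale files (if they are disposable) or choosing a fresh path for the Delta write.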

    For more details, refer to the SO thread addressing a similar issue: Trouble when writing the data to Delta Lake in Azure databricks (Incompatible format detected).

    Hope this will help. Please let us know if any further queries.

    ------------------------------

    • Please don't forget to click Accept Answer or Up-Vote whenever the information provided helps you. Original posters help the community find answers faster by identifying the correct answer. Here is how.
    • Want a reminder to come back and check responses? Here is how to subscribe to a notification
    • If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators
