Error message "com.databricks.sql.cloudfiles.errors.CloudFilesIOException: Failed to write to the schema log at location". Databricks notebook is encountering an issue while writing to the schema log in Databricks Cloud Files.

Anna Louise Juul Willumsen 15 Reputation points
2023-02-09T14:13:58.14+00:00

Hello everyone and nice to meet you! :-)

Does anyone have a clue what could be causing the following error message? It concerns configuring schema inference and evolution in Auto Loader (see Configure schema inference and evolution in Auto Loader - Azure Databricks | Microsoft Learn).

Specifically, it occurs when running something similar to the following commands in a Python file deployed with Terraform:

(spark.readStream.format("cloudFiles")
  .option("cloudFiles.format", "parquet")
  # The schema location directory keeps track of your data schema over time
  .option("cloudFiles.schemaLocation", "<path_to_checkpoint>")
  .load("<path_to_source_data>")
)
2 answers

  1. Anna Louise Juul Willumsen 15 Reputation points
    2023-02-15T12:38:43.6933333+00:00

    This is my code:

    basePath = f"/mnt/raw/{system_name}/"
    baseCheckpointPath = f"{basePath}_____checkpoints/"
    baseSchemasPath = f"{basePath}_____autoloaderSchemas/"
    # COMMAND ----------
    def stream_csv_table_from_tablename(tableName):
        tableDf = (
            spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "csv")
            # The schema location directory keeps track of your data schema over time
            .option("cloudFiles.schemaLocation", f"{baseSchemasPath}{tableName}")
            .option("cloudFiles.inferColumnTypes", True)
            .option("header", True)
            .option("delimiter", ",")
            .load(f"{basePath}/{tableName}/")
        )
        return tableDf
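
    For reference, a write side consuming this helper might look like the sketch below, mirroring the documentation snippet further down; the table name and target path are hypothetical:

    # Sketch: persist the streaming DataFrame, tracking progress in a checkpoint
    # directory ("customers" and the target path are hypothetical examples).
    tableDf = stream_csv_table_from_tablename("customers")
    (tableDf.writeStream
        .option("checkpointLocation", f"{baseCheckpointPath}customers")
        .start(f"{basePath}customers_output/")
    )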
    

    This is the sample/documentation code:

    (spark.readStream.format("cloudFiles")
      .option("cloudFiles.format", "parquet")
      # The schema location directory keeps track of your data schema over time
      .option("cloudFiles.schemaLocation", "<path_to_checkpoint>")
      .load("<path_to_source_data>")
      .writeStream
      .option("checkpointLocation", "<path_to_checkpoint>")
      .start("<path_to_target")
    )
    

    The <path_to_target> (csv table directory) is an existing location, and I have the necessary permissions for this folder/directory. The <path_to_checkpoint> (schema information location) does not exist yet, though; as far as I have understood, it should be created automatically when the notebook runs. Could this be the cause of the error?
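
    One quick way to rule that out (a sketch; "my_table" is a hypothetical table name) is to create the schema directory yourself and write a small test file to it before starting the stream. Auto Loader normally creates this directory automatically, so a failure here points at permissions or the storage account rather than a missing folder:

    # Sketch: verify the schema location can be created and written to.
    # "my_table" is a hypothetical table name.
    schemaPath = f"{baseSchemasPath}my_table"
    dbutils.fs.mkdirs(schemaPath)                             # create the directory
    dbutils.fs.put(f"{schemaPath}/_write_test", "ok", True)   # write a small test file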

    Our runtime version is higher than the suggested one.


  2. Christian Binderkrantz 0 Reputation points
    2023-03-28T08:39:18.04+00:00

    I had a similar issue but solved it by enabling "Hierarchical namespace" on the storage account I used.
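
    If you want to confirm whether it is already enabled before changing anything, one option is the sketch below using the azure-mgmt-storage SDK; the subscription, resource group, and account names are placeholders:

    # Sketch: check whether Hierarchical Namespace (ADLS Gen2) is enabled on the account.
    # Assumes the azure-identity and azure-mgmt-storage packages are installed;
    # all angle-bracketed values are placeholders.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.storage import StorageManagementClient

    client = StorageManagementClient(DefaultAzureCredential(), "<subscription_id>")
    account = client.storage_accounts.get_properties("<resource_group>", "<storage_account>")
    print("Hierarchical namespace enabled:", account.is_hns_enabled)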
