Unity Catalog Table Creation Failure: Unsupported File Scheme

Akancha 0 Reputation points
2024-12-01T17:10:42.2233333+00:00

Hi,

I am facing an issue while creating a table in Unity Catalog using a DBFS path, and I am getting the error below.

Unity Catalog Table Creation Fails with "UC FILE_SCHEME_FOR_TABLE_CREATION_NOT_SUPPORTED"

I want to repartition the data, which works fine, but when I try to save the result as a table I get the error below. I want the table to hold the repartitioned data because I want to run Z-ordering on it.

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

1 answer

  1. Amira Bedhiafi 29,866 Reputation points
    2024-12-01T18:40:46.8133333+00:00

    I think you are hitting this issue because Unity Catalog requires table storage paths to be on external storage registered with Unity Catalog (for example Azure Data Lake Storage Gen2 or AWS S3) and does not support dbfs:/ paths for table creation.

    Unity Catalog tables created at an explicit path must use an external location registered with Unity Catalog. You can list the registered external locations with:

    SHOW EXTERNAL LOCATIONS;
    
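    If the path you plan to use is not listed, a metastore admin can register it first. A minimal sketch, assuming a storage credential named my_credential already exists and all other names are placeholders:

    ```sql
    -- Register the container path as an external location in Unity Catalog
    CREATE EXTERNAL LOCATION IF NOT EXISTS my_location
    URL 'abfss://my-container@your-storage-account.dfs.core.windows.net/your-folder'
    WITH (STORAGE CREDENTIAL my_credential);
    ```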

    To fix this, save the repartitioned data to an external storage location that is registered with Unity Catalog:

    # Read the source data from DBFS
    df = spark.read.format("parquet").load("dbfs:/path-to-your-data")
    # Repartition by the column you plan to Z-order on
    df = df.repartition("column_to_partition")
    # Write the Delta files to a Unity Catalog-registered storage path
    df.write.format("delta").mode("overwrite").save("abfss://******@your-storage-account.dfs.core.windows.net/your-folder")
    
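    Alternatively, if you don't need the files at a specific path, you can skip the external location entirely and write the repartitioned DataFrame straight into a Unity Catalog managed table. A sketch that assumes a Unity Catalog-enabled cluster and that catalog_name.schema_name already exists:

    ```python
    # Create (or overwrite) a managed Delta table; Unity Catalog chooses
    # the storage location itself, so no abfss:// or dbfs:/ path is needed.
    df.write.format("delta").mode("overwrite").saveAsTable(
        "catalog_name.schema_name.table_name"
    )
    ```

    With a managed table you can then run the OPTIMIZE/ZORDER step directly, without the separate CREATE TABLE ... LOCATION statement.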

    Once the data is saved to a Unity Catalog-compatible location, create the table:

    CREATE TABLE catalog_name.schema_name.table_name
    USING DELTA
    LOCATION 'abfss://******@your-storage-account.dfs.core.windows.net/your-folder';
    

    After creating the table, you can run Z-Ordering for performance optimization:

    OPTIMIZE catalog_name.schema_name.table_name
    ZORDER BY (column_name);
    
    

    Don't forget that the Azure Data Lake Storage account and container must grant the appropriate permissions to Unity Catalog (through its storage credential) and to your Databricks workspace.
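
    Concretely, that usually means two layers of access. On the Azure side, the access connector's managed identity typically needs the Storage Blob Data Contributor role on the storage account; on the Unity Catalog side, your principal needs privileges on the external location. A sketch with my_location and the principal as placeholders:

    ```sql
    -- Allow a principal to create external tables and read/write files
    -- at the registered external location
    GRANT CREATE EXTERNAL TABLE, READ FILES, WRITE FILES
    ON EXTERNAL LOCATION my_location
    TO `user@example.com`;
    ```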

