I believe this is as expected. The Databricks documentation for REPLACE (https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-create-table-using.html) says "If specified replaces the table and its content if it already exists. This clause is only supported for Delta Lake tables." CREATE OR REPLACE TABLE fails if the specified location is already occupied by anything other than a Delta table. You can't use this overwrite a Parquet table, and a replaced Delta table will also be a Delta table. CREATE TABLE fails if the specified location is already occupied by anything, as you have experienced.
CREATE TABLE does not overwrite location whereas CREATE OR REPLACE TABLE does
I am working on Azure Databricks, with Databricks Runtime version being - 14.3 LTS (includes Apache Spark 3.5.0, Scala 2.12)
. I am facing the following issue.
Suppose I have a view named v1
and a database f1_processed
created from the following command.
CREATE DATABASE IF NOT EXISTS f1_processed
LOCATION "abfss://processed@formula1dl679student.dfs.core.windows.net/"
Then if I try to create a table at a location that already exists using the command below.
CREATE TABLE f1_processed.circuits
AS
SELECT * FROM v1;
I get the following error
[DELTA_CREATE_TABLE_WITH_NON_EMPTY_LOCATION] Cannot create table ('`spark_catalog`.`f1_processed`.`circuits`').
The associated location ('abfss://processed@formula1dl679student.dfs.core.windows.net/circuits')
is not empty and also not a Delta table. SQLSTATE: 42601
However, if I replace CREATE with CREATE OR REPLACE, then the command runs fine. So the following code runs fine.
CREATE OR REPLACE TABLE f1_processed.circuits
AS
SELECT * FROM v1;
Also, the table creation is only happening for delta format. If I specify the format to be any other like parquet, then it fails. Checkout this question Is it a bug? Any help is appreciated