Create and manage catalogs

This article shows how to create and manage catalogs in Unity Catalog. A catalog contains schemas (databases), and a schema contains tables and views.

Requirements

  • You must be an Azure Databricks metastore admin or have been granted the CREATE CATALOG privilege on the metastore
  • Your Azure Databricks account must be on the Premium Plan.
  • You must have a Unity Catalog metastore linked to the workspace where you perform the catalog creation.
  • The compute resource that you use to run the notebook or Databricks SQL to create the catalog must be using a Unity Catalog compliant access mode.

Create a catalog

To create a catalog, you can use Data Explorer or a SQL command.

Data explorer

  1. Log in to a workspace that is linked to the metastore.

  2. Click Data Icon Data.

  3. Click the Create Catalog button.

  4. (Optional) Specify the location where data for managed tables in the catalog will be stored.

    Specify a location here only if you do not want managed tables in this catalog to be stored in the default root storage location that was configured for the metastore. See Create a Unity Catalog metastore.

    The path that you specify must be defined in an external location configuration, and you must have the CREATE MANAGED STORAGE privilege on that external location. You can also use a subpath of that path. See Manage external locations and storage credentials.

  5. Assign permissions for your catalog. See Unity Catalog privileges and securable objects.

  6. Click Save.

Sql

  1. Run the following SQL command in a notebook or Databricks SQL editor. Items in brackets are optional. Replace the placeholder values:

    • <catalog_name>: A name for the catalog.
    • <location_path>: Optional. Provide a storage location path if you want managed tables in this catalog to be stored in a location that is different than the default that was configured for the metastore. This path must be defined in an external location configuration, and you must have the CREATE MANAGED STORAGE privilege on the external location configuration. You can use the path that is defined in the external location configuration or a subpath (in other words, 'abfss://us-east-1/finance' or 'abfss://us-east-1/finance/product'). Requires Databricks Runtime 11.3 and above.
    • <comment>: Optional description or other comment.
    CREATE CATALOG [ IF NOT EXISTS ] <catalog_name>
       [ MANAGED LOCATION '<location_path>' ]
       [ COMMENT <comment> ];
    

    For example, to create a catalog named example:

    CREATE CATALOG IF NOT EXISTS example;
    
  2. Assign privileges to the catalog. See Unity Catalog privileges and securable objects.

Python

  1. Run the following SQL command in a notebook. Items in brackets are optional. Replace the placeholder values:

    • <catalog_name>: A name for the catalog.
    • <location_path>: Optional. Provide a storage location path if you want managed tables in this catalog to be stored in a location that is different than the default that was configured for the metastore. This path must be defined in an external location configuration, and you must have the CREATE MANAGED STORAGE privilege on the external location configuration. You can use the path that is defined in the external location configuration or a subpath (in other words, 'abfss://us-east-1/finance' or 'abfss://us-east-1/finance/product'). Requires Databricks Runtime 11.3 and above.
    • <comment>: An optional comment.
    spark.sql("CREATE CATALOG [ IF NOT EXISTS ] <catalog_name> [ MANAGED LOCATION '<location_path>' ] [ COMMENT <comment> ]")
    

    For example, to create a catalog named example:

    spark.sql("CREATE CATALOG IF NOT EXISTS example")
    

    For parameter descriptions, see CREATE CATALOG.

  2. Assign privileges to the catalog. See Unity Catalog privileges and securable objects.

R

  1. Run the following SQL command in a notebook. Items in brackets are optional. Replace the placeholder values:

    • <catalog_name>: A name for the catalog.
    • <location_path>: Optional. Provide a storage location path if you want managed tables in this catalog to be stored in a location that is different than the default that was configured for the metastore. This path must be defined in an external location configuration, and you must have the CREATE MANAGED STORAGE privilege on the external location configuration. You can use the path that is defined in the external location configuration or a subpath (in other words, 'abfss://us-east-1/finance' or 'abfss://us-east-1/finance/product'). Requires Databricks Runtime 11.3 and above.
    • <comment>: An optional comment.
    library(SparkR)
    
    sql("CREATE CATALOG [ IF NOT EXISTS ] <catalog_name> [ MANAGED LOCATION '<location_path>' ] [ COMMENT <comment> ]")
    

    For example, to create a catalog named example:

    library(SparkR)
    
    sql("CREATE CATALOG IF NOT EXISTS example")
    
  2. Assign privileges to the catalog. See Unity Catalog privileges and securable objects.

Scala

  1. Run the following SQL command in a notebook. Items in brackets are optional. Replace the placeholder values:

    • <catalog_name>: A name for the catalog.
    • <location_path>: Optional. Provide a storage location path if you want managed tables in this catalog to be stored in a location that is different than the default that was configured for the metastore. This path must be defined in an external location configuration, and you must have the CREATE MANAGED STORAGE privilege on the external location configuration. You can use the path that is defined in the external location configuration or a subpath (in other words, 'abfss://us-east-1/finance' or 'abfss://us-east-1/finance/product'). Requires Databricks Runtime 11.3 and above.
    • <comment>: An optional comment.
    spark.sql("CREATE CATALOG [ IF NOT EXISTS ] <catalog_name> [ MANAGED LOCATION '<location_path>' ] [ COMMENT <comment> ]")
    

    For example, to create a catalog named example:

    spark.sql("CREATE CATALOG IF NOT EXISTS example")
    
  2. Assign privileges to the catalog. See Unity Catalog privileges and securable objects.

When you create a catalog, two schemas (databases) are automatically created: default and information_schema.

You can also create a catalog by using the Databricks Terraform provider and databricks_catalog. You can retrieve information about catalogs by using databricks_catalogs.

Next steps

Now you can add schemas (databases) to your catalog.

Delete a catalog

To delete (or drop) a catalog, you can use Data Explorer or a SQL command.

Data explorer

You must delete all schemas in the catalog except information_schema before you can delete a catalog. This includes the auto-created default schema.

  1. Log in to a workspace that is linked to the metastore.
  2. Click Data Icon Data.
  3. In the Data pane, on the left, click the catalog you want to delete.
  4. In the detail pane, click the three-dot menu to the left of the Create database button and select Delete.
  5. On the Delete catalog dialog, click Delete.

Sql

Run the following SQL command in a notebook or Databricks SQL editor. Items in brackets are optional. Replace the placeholder <catalog_name>.

For parameter descriptions, see DROP CATALOG.

If you use DROP CATALOG without the CASCADE option, you must delete all schemas in the catalog except information_schema before you can delete the catalog. This includes the auto-created default schema.

DROP CATALOG [ IF EXISTS ] <catalog_name> [ RESTRICT | CASCADE ]

For example, to delete a catalog named vaccine and its schemas:

DROP CATALOG vaccine CASCADE

Python

Run the following SQL command in a notebook. Items in brackets are optional. Replace the placeholder <catalog_name>.

For parameter descriptions, see DROP CATALOG.

If you use DROP CATALOG without the CASCADE option, you must delete all schemas in the catalog except information_schema before you can delete the catalog. This includes the auto-created default schema.

spark.sql("DROP CATALOG [ IF EXISTS ] <catalog_name> [ RESTRICT | CASCADE ]")

For example, to delete a catalog named vaccine and its schemas:

spark.sql("DROP CATALOG vaccine CASCADE")

R

Run the following SQL command in a notebook. Items in brackets are optional. Replace the placeholder <catalog_name>.

For parameter descriptions, see DROP CATALOG.

If you use DROP CATALOG without the CASCADE option, you must delete all schemas in the catalog except information_schema before you can delete the catalog. This includes the auto-created default schema.

library(SparkR)

rsql("DROP CATALOG [ IF EXISTS ] <catalog_name> [ RESTRICT | CASCADE ]")

For example, to delete a catalog named vaccine and its schemas:

library(SparkR)

sql("DROP CATALOG vaccine CASCADE")

Scala

Run the following SQL command in a notebook. Items in brackets are optional. Replace the placeholder <catalog_name>.

For parameter descriptions, see DROP CATALOG.

If you use DROP CATALOG without the CASCADE option, you must delete all schemas in the catalog except information_schema before you can delete the catalog. This includes the auto-created default schema.

spark.sql("DROP CATALOG [ IF EXISTS ] <catalog_name> [ RESTRICT | CASCADE ]")

For example, to delete a catalog named vaccine and its schemas:

spark.sql("DROP CATALOG vaccine CASCADE")