Create tables in Unity Catalog

This article introduces the concept of managed and external tables in Unity Catalog and describes how to create tables in Unity Catalog.

Note

When you create a table, be sure to reference a catalog that is governed by Unity Catalog or set the default catalog to a catalog that is governed by Unity Catalog. See Manage the default catalog.

The catalog hive_metastore appears in Catalog Explorer but is not considered governed by Unity Catalog. It is managed by your Azure Databricks workspace’s Hive metastore. All other catalogs listed are governed by Unity Catalog.

You can use the Unity Catalog table upgrade interface to upgrade existing tables registered in the Hive metastore to Unity Catalog. See Upgrade Hive tables and views to Unity Catalog.
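For example, assuming your workspace has a Unity Catalog catalog named main, you can set it as the current catalog for a session so that unqualified table names resolve against it:

```sql
-- Set the current catalog for the session (assumes a Unity Catalog
-- catalog named `main` exists in your metastore).
USE CATALOG main;

-- Confirm which catalog is current.
SELECT current_catalog();
```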

Managed tables

Managed tables are the default way to create tables in Unity Catalog. Unity Catalog manages the lifecycle and file layout for these tables. You should not use tools outside of Azure Databricks to manipulate files in these tables directly.

Managed tables are stored in managed storage, either at the metastore, catalog, or schema level, depending on how the schema and catalog are configured. See Specify a managed storage location in Unity Catalog.

Managed tables always use the Delta table format.

When a managed table is dropped, its underlying data is deleted from your cloud tenant within 30 days.

External tables

External tables are tables whose data is stored outside of the managed storage location specified for the metastore, catalog, or schema. Use external tables only when you require direct access to the data outside of Azure Databricks clusters or Databricks SQL warehouses.

When you run DROP TABLE on an external table, Unity Catalog does not delete the underlying data. To drop a table you must be its owner. You can manage privileges on external tables and use them in queries in the same way as managed tables. To create an external table with SQL, specify a LOCATION path in your CREATE TABLE statement. External tables can use the following file formats:

  • DELTA
  • CSV
  • JSON
  • AVRO
  • PARQUET
  • ORC
  • TEXT

To manage access to the underlying cloud storage for an external table, you must set up storage credentials and external locations.

To learn more, see Create an external table.

Requirements

You must have the CREATE TABLE privilege on the schema in which you want to create the table, as well as the USE SCHEMA privilege on the schema and the USE CATALOG privilege on the parent catalog.
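A privileged user (such as the catalog owner) can grant these privileges ahead of time. The following sketch assumes a catalog named main, a schema named default, and a group named data-engineers:

```sql
-- Grant the minimum privileges needed to create tables in main.default.
GRANT USE CATALOG ON CATALOG main TO `data-engineers`;
GRANT USE SCHEMA ON SCHEMA main.default TO `data-engineers`;
GRANT CREATE TABLE ON SCHEMA main.default TO `data-engineers`;
```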

If you are creating an external table, see Create an external table for additional requirements.

Create a managed table

To create a managed table, run the following SQL command, replacing the placeholder values:

  • <catalog-name>: The name of the catalog that will contain the table.

    This cannot be the hive_metastore catalog that is created automatically for the Hive metastore associated with your Azure Databricks workspace. You can omit the catalog name if you are creating the table in the workspace’s default catalog.

  • <schema-name>: The name of the schema that will contain the table.

  • <table-name>: A name for the table.

  • <column-specification>: The name and data type for each column.

SQL

CREATE TABLE <catalog-name>.<schema-name>.<table-name>
(
  <column-specification>
);

Python

spark.sql("CREATE TABLE <catalog-name>.<schema-name>.<table-name> "
  "("
  "  <column-specification>"
  ")")

R

library(SparkR)

sql(paste("CREATE TABLE <catalog-name>.<schema-name>.<table-name> ",
  "(",
  "  <column-specification>",
  ")",
  sep = ""))

Scala

spark.sql("CREATE TABLE <catalog-name>.<schema-name>.<table-name> " +
  "(" +
  "  <column-specification>" +
  ")")

You can also create a managed table by using the Databricks Terraform provider and databricks_table. You can retrieve a list of table full names by using databricks_tables.

For example, to create the table main.default.department and insert five rows into it:

SQL

CREATE TABLE main.default.department
(
  deptcode  INT,
  deptname  STRING,
  location  STRING
);

INSERT INTO main.default.department VALUES
  (10, 'FINANCE', 'EDINBURGH'),
  (20, 'SOFTWARE', 'PADDINGTON'),
  (30, 'SALES', 'MAIDSTONE'),
  (40, 'MARKETING', 'DARLINGTON'),
  (50, 'ADMIN', 'BIRMINGHAM');

Python

spark.sql("CREATE TABLE main.default.department "
  "("
  "  deptcode  INT,"
  "  deptname  STRING,"
  "  location  STRING"
  ")")

spark.sql("INSERT INTO main.default.department VALUES "
  "  (10, 'FINANCE', 'EDINBURGH'),"
  "  (20, 'SOFTWARE', 'PADDINGTON'),"
  "  (30, 'SALES', 'MAIDSTONE'),"
  "  (40, 'MARKETING', 'DARLINGTON'),"
  "  (50, 'ADMIN', 'BIRMINGHAM')")

R

library(SparkR)

sql(paste("CREATE TABLE main.default.department ",
  "(",
  "  deptcode  INT,",
  "  deptname  STRING,",
  "  location  STRING",
  ")",
  sep = ""))

sql(paste("INSERT INTO main.default.department VALUES ",
  "  (10, 'FINANCE', 'EDINBURGH'),",
  "  (20, 'SOFTWARE', 'PADDINGTON'),",
  "  (30, 'SALES', 'MAIDSTONE'),",
  "  (40, 'MARKETING', 'DARLINGTON'),",
  "  (50, 'ADMIN', 'BIRMINGHAM')",
  sep = ""))

Scala

spark.sql("CREATE TABLE main.default.department " +
  "(" +
  "  deptcode  INT," +
  "  deptname  STRING," +
  "  location  STRING" +
  ")")

spark.sql("INSERT INTO main.default.department VALUES " +
  "  (10, 'FINANCE', 'EDINBURGH')," +
  "  (20, 'SOFTWARE', 'PADDINGTON')," +
  "  (30, 'SALES', 'MAIDSTONE')," +
  "  (40, 'MARKETING', 'DARLINGTON')," +
  "  (50, 'ADMIN', 'BIRMINGHAM')")

Drop a managed table

You must be the table’s owner to drop a table. To drop a managed table, run the following SQL command:

DROP TABLE IF EXISTS catalog_name.schema_name.table_name;

When a managed table is dropped, its underlying data is deleted from your cloud tenant within 30 days.

Create an external table

The data in an external table is stored in a path on your cloud tenant. To govern access to external cloud storage, Unity Catalog introduces two securable objects:

  • A storage credential contains an authentication method for accessing a cloud storage location. The storage credential does not contain a mapping to the path to which it grants access. Storage credentials are access-controlled to determine which users can use the credential.
  • An external location maps a storage credential with a cloud storage path to which it grants access. The external location grants access only to that cloud storage path and its contents. External locations are access-controlled to determine which users can use them. An external location is used automatically when your SQL command contains a LOCATION clause.

Requirements

To create an external table, you must have:

  • The CREATE EXTERNAL TABLE privilege on an external location that grants access to the LOCATION accessed by the external table.
  • The USE SCHEMA permission on the table’s parent schema.
  • The USE CATALOG permission on the table’s parent catalog.
  • The CREATE TABLE permission on the table’s parent schema.

External locations and storage credentials are stored at the metastore level, rather than in a catalog. To create a storage credential, you must be an account admin or have the CREATE STORAGE CREDENTIAL privilege. To create an external location, you must be the metastore admin or have the CREATE EXTERNAL LOCATION privilege. See Connect to cloud object storage using Unity Catalog.
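As a sketch, assuming a storage credential named my_credential already exists, a metastore admin could define an external location over a container path and grant table-creation rights on it (the names and path here are illustrative):

```sql
-- Map an existing storage credential to a cloud storage path.
CREATE EXTERNAL LOCATION IF NOT EXISTS my_location
URL 'abfss://container@storageaccount.dfs.core.windows.net/external-tables'
WITH (STORAGE CREDENTIAL my_credential);

-- Allow a group to create external tables under this location.
GRANT CREATE EXTERNAL TABLE ON EXTERNAL LOCATION my_location TO `data-engineers`;
```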

Create a table

Use one of the following command examples in a notebook or the SQL query editor to create an external table.

You can also use an example notebook to create the storage credential, external location, and external table, and to manage permissions for them.

In the following examples, replace the placeholder values:

  • <catalog>: The name of the catalog that will contain the table.

    This cannot be the hive_metastore catalog that is created automatically for the Hive metastore associated with your Azure Databricks workspace. You can omit the catalog name if you are creating the table in the workspace’s default catalog.

  • <schema>: The name of the schema that will contain the table.

  • <table-name>: A name for the table.

  • <column-specification>: The name and data type for each column.

  • <bucket-path>: The path to the cloud storage bucket where the table will be created.

  • <table-directory>: A directory where the table will be created. Use a unique directory for each table.

Important

Once a table is created in a path, users can no longer directly access the files in that path from Azure Databricks even if they have been given privileges on an external location or storage credential to do so. This is to ensure that users cannot circumvent access controls applied to tables by reading files from your cloud tenant directly.

SQL

CREATE TABLE <catalog>.<schema>.<table-name>
(
  <column-specification>
)
LOCATION 'abfss://<bucket-path>/<table-directory>';

Python

spark.sql("CREATE TABLE <catalog>.<schema>.<table-name> "
  "("
  "  <column-specification>"
  ") "
  "LOCATION 'abfss://<bucket-path>/<table-directory>'")

R

library(SparkR)

sql(paste("CREATE TABLE <catalog>.<schema>.<table-name> ",
  "(",
  "  <column-specification>",
  ") ",
  "LOCATION 'abfss://<bucket-path>/<table-directory>'",
  sep = ""))

Scala

spark.sql("CREATE TABLE <catalog>.<schema>.<table-name> " +
  "(" +
  "  <column-specification>" +
  ") " +
  "LOCATION 'abfss://<bucket-path>/<table-directory>'")

Unity Catalog checks that you have the following permissions:

  • CREATE EXTERNAL TABLE on the external location that references the cloud storage path you specify.
  • CREATE TABLE on the parent schema.
  • USE SCHEMA on the parent schema.
  • USE CATALOG on the parent catalog.

If you have these permissions, the external table is created. Otherwise, an error occurs and the external table is not created.

Note

You can instead migrate an existing external table in the Hive metastore to Unity Catalog without duplicating its data. See Upgrade a single Hive table to a Unity Catalog external table using the upgrade wizard.

You can also create an external table by using the Databricks Terraform provider and databricks_table. You can retrieve a list of table full names by using databricks_tables.

Example notebook: Create external tables

You can use the following example notebook to create a catalog, schema, and external table, and to manage permissions on them.

Create and manage an external table in Unity Catalog notebook

Create a table from files stored in your cloud tenant

You can populate a managed or external table with records from files stored in your cloud tenant. Unity Catalog reads the files at that location and inserts their contents into the table. In Unity Catalog, this is called path-based access.

You can follow the examples in this section or use the add data UI.

Explore the contents of the files

To explore data stored in an external location before you create tables from that data, you can use Catalog Explorer or the following commands.

Permissions required: You must have the READ FILES permission on the external location associated with the cloud storage path to return a list of data files in that location.

SQL

  1. List the files in a cloud storage path:

    LIST 'abfss://<path-to-files>';
    
  2. Query the data in the files in a given path:

    SELECT * FROM <format>.`abfss://<path-to-files>`;
    

Python

  1. List the files in a cloud storage path:

    display(spark.sql("LIST 'abfss://<path-to-files>'"))
    
  2. Query the data in the files in a given path:

    display(spark.read.load("abfss://<path-to-files>"))
    

R

  1. List the files in a cloud storage path:

    library(SparkR)
    
    display(sql("LIST 'abfss://<path-to-files>'"))
    
  2. Query the data in the files in a given path:

    library(SparkR)
    
    display(loadDF("abfss://<path-to-files>"))
    

Scala

  1. List the files in a cloud storage path:

    display(spark.sql("LIST 'abfss://<path-to-files>'"))
    
  2. Query the data in the files in a given path:

    display(spark.read.load("abfss://<path-to-files>"))
    

Create a table from the files

Follow the examples in this section to create a new table and populate it with data files on your cloud tenant.

Note

You can instead migrate an existing external table in the Hive metastore to Unity Catalog without duplicating its data. See Upgrade a single Hive table to a Unity Catalog external table using the upgrade wizard.

Important

  • When you create a table using this method, the storage path is read only once, to prevent duplication of records. If you want to re-read the contents of the directory, you must drop and re-create the table. For an existing table, you can insert records from a storage path.
  • The bucket path where you create a table cannot also be used to read or write data files.
  • Only the files in the exact directory are read; the read is not recursive.
  • You must have the following permissions:
    • USE CATALOG on the parent catalog and USE SCHEMA on the schema.
    • CREATE TABLE on the parent schema.
    • READ FILES on the external location associated with the bucket path where the files are located, or directly on the storage credential if you are not using an external location.
    • If you are creating an external table, you need CREATE EXTERNAL TABLE on the bucket path where the table will be created.

To create a new managed table and populate it with data in your cloud storage, use the following examples.

SQL

CREATE TABLE <catalog>.<schema>.<table-name>
(
  <column-specification>
)
SELECT * FROM <format>.`abfss://<path-to-files>`;

Python

spark.sql("CREATE TABLE <catalog>.<schema>.<table-name> "
  "( "
  "  <column-specification> "
  ") "
  "SELECT * FROM <format>.`abfss://<path-to-files>`")

R

library(SparkR)

sql(paste("CREATE TABLE <catalog>.<schema>.<table-name> ",
  "( ",
  "  <column-specification> ",
  ") ",
  "SELECT * FROM <format>.`abfss://<path-to-files>`",
  sep = ""))

Scala

spark.sql("CREATE TABLE <catalog>.<schema>.<table-name> " +
  "( " +
  "  <column-specification> " +
  ") " +
  "SELECT * FROM <format>.`abfss://<path-to-files>`")

To create an external table and populate it with data in your cloud storage, add a LOCATION clause:

SQL

CREATE TABLE <catalog>.<schema>.<table-name>
(
    <column-specification>
)
USING <format>
LOCATION 'abfss://<table-location>'
SELECT * FROM <format>.`abfss://<path-to-files>`;

Python

spark.sql("CREATE TABLE <catalog>.<schema>.<table-name> "
  "( "
  "  <column-specification> "
  ") "
  "USING <format> "
  "LOCATION 'abfss://<table-location>' "
  "SELECT * FROM <format>.`abfss://<path-to-files>`")

R

library(SparkR)

sql(paste("CREATE TABLE <catalog>.<schema>.<table-name> ",
  "( ",
  "  <column-specification> ",
  ") ",
  "USING <format> ",
  "LOCATION 'abfss://<table-location>' ",
  "SELECT * FROM <format>.`abfss://<path-to-files>`",
  sep = ""))

Scala

spark.sql("CREATE TABLE <catalog>.<schema>.<table-name> " +
  "( " +
  "  <column-specification> " +
  ") " +
  "USING <format> " +
  "LOCATION 'abfss://<table-location>' " +
  "SELECT * FROM <format>.`abfss://<path-to-files>`")

Insert records from a path into an existing table

To insert records from a bucket path into an existing table, use the COPY INTO command. In the following examples, replace the placeholder values:

  • <catalog>: The name of the table’s parent catalog.
  • <schema>: The name of the table’s parent schema.
  • <path-to-files>: The bucket path that contains the data files.
  • <format>: The format of the files, for example delta.
  • <table-location>: The bucket path where the table is located.

Important

  • When you insert records into a table using this method, the bucket path you provide is read only once, to prevent duplication of records.
  • The bucket path where you create a table cannot also be used to read or write data files.
  • Only the files in the exact directory are read; the read is not recursive.
  • You must have the following permissions:
    • USE CATALOG on the parent catalog and USE SCHEMA on the schema.
    • MODIFY on the table.
    • READ FILES on the external location associated with the bucket path where the files are located, or directly on the storage credential if you are not using an external location.
    • To insert records into an external table, you need CREATE EXTERNAL TABLE on the bucket path where the table is located.

To insert records from files in a bucket path into a managed table, using an external location to read from the bucket path:

SQL

COPY INTO <catalog>.<schema>.<table>
FROM (
  SELECT *
  FROM 'abfss://<path-to-files>'
)
FILEFORMAT = <format>;

Python

spark.sql("COPY INTO <catalog>.<schema>.<table> "
  "FROM ( "
  "  SELECT * "
  "  FROM 'abfss://<path-to-files>' "
  ") "
  "FILEFORMAT = <format>")

R

library(SparkR)

sql(paste("COPY INTO <catalog>.<schema>.<table> ",
  "FROM ( ",
  "  SELECT * ",
  "  FROM 'abfss://<path-to-files>' ",
  ") ",
  "FILEFORMAT = <format>",
  sep = ""))

Scala

spark.sql("COPY INTO <catalog>.<schema>.<table> " +
  "FROM ( " +
  "  SELECT * " +
  "  FROM 'abfss://<path-to-files>' " +
  ") " +
  "FILEFORMAT = <format>")

To insert into an external table, add a LOCATION clause:

SQL

COPY INTO <catalog>.<schema>.<table>
LOCATION 'abfss://<table-location>'
FROM (
  SELECT *
  FROM 'abfss://<path-to-files>'
)
FILEFORMAT = <format>;

Python

spark.sql("COPY INTO <catalog>.<schema>.<table> "
  "LOCATION 'abfss://<table-location>' "
  "FROM ( "
  "  SELECT * "
  "  FROM 'abfss://<path-to-files>' "
  ") "
  "FILEFORMAT = <format>")

R

library(SparkR)

sql(paste("COPY INTO <catalog>.<schema>.<table> ",
  "LOCATION 'abfss://<table-location>' ",
  "FROM ( ",
  "  SELECT * ",
  "  FROM 'abfss://<path-to-files>' ",
  ") ",
  "FILEFORMAT = <format>",
  sep = ""))

Scala

spark.sql("COPY INTO <catalog>.<schema>.<table> " +
  "LOCATION 'abfss://<table-location>' " +
  "FROM ( " +
  "  SELECT * " +
  "  FROM 'abfss://<path-to-files>' " +
  ") " +
  "FILEFORMAT = <format>")

Add comments to a table

As a table owner or a user with the MODIFY privilege on a table, you can add comments to a table and its columns. You can add comments using the following functionality:

  • The COMMENT ON command. This option does not support column comments.
  • The COMMENT option when you use the CREATE TABLE and ALTER TABLE commands. This option supports column comments.
  • The “manual” comment field in Catalog Explorer. This option supports column comments. See Document data in Catalog Explorer using markdown comments.
  • AI-generated comments (also known as AI-generated documentation) in Catalog Explorer. You can view a comment suggested by a large language model (LLM) that takes into account the table metadata, such as the table schema and column names, and edit or accept the comment as-is to add it. See Add AI-generated comments to a table.
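For example, using the main.default.department table created earlier, the first two options look like this in SQL:

```sql
-- Table-level comment (COMMENT ON does not support column comments).
COMMENT ON TABLE main.default.department IS 'Department reference data';

-- Column-level comment using ALTER TABLE.
ALTER TABLE main.default.department
  ALTER COLUMN deptname COMMENT 'Human-readable department name';
```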

Next steps