
SQL language reference

This is a SQL command reference for Databricks SQL and Databricks Runtime.

For information about how to understand and use the syntax notation and symbols in this reference, see How to use the SQL reference.

For information about using SQL with Lakeflow Spark Declarative Pipelines, see Pipeline SQL language reference.

Note

Databricks SQL Serverless is not available in Azure China. Databricks SQL is not available in Azure Government regions.

General reference

This general reference describes data types, functions, identifiers, literals, and semantics.

Language fundamentals

| Reference | Description |
| --- | --- |
| Adding comments to SQL statements | Adding comments to your SQL code |
| Reserved words and schemas | Keywords reserved by the SQL parser |
| Identifiers | Rules for naming database objects |
| Names | Object naming conventions |
| IDENTIFIER clause | Dynamic identifier resolution |
| SQL expression | SQL expression syntax and evaluation |
| NULL semantics | How NULL values are handled |
| Parameter markers | Parameterized query placeholders |
| Variables | Session and temporary variables |
| Name resolution | How object names are resolved |
| JSON path expression | Navigating JSON data structures |
| Collation | String comparison and sorting rules |
| Partitions | Table partitioning concepts |
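Several of these fundamentals compose naturally. A minimal sketch combining a session variable, the IDENTIFIER clause, and NULL semantics (the table name `my_table` and column `id` are illustrative):

```sql
-- Declare a session variable holding a table name, then resolve it
-- dynamically with the IDENTIFIER clause.
DECLARE VARIABLE tbl STRING DEFAULT 'my_table';

-- NULL semantics: filter with IS NOT NULL, since `col <> NULL`
-- never evaluates to true.
SELECT * FROM IDENTIFIER(tbl) WHERE id IS NOT NULL;
```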

Data types

| Reference | Description |
| --- | --- |
| Data types | Overview of all supported data types |
| SQL data type rules | Type coercion and conversion rules |
| Datetime patterns | Format patterns for dates and times |
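As a brief sketch of how casts and datetime patterns interact (the literal values are illustrative):

```sql
SELECT
  CAST('2024-07-04' AS DATE)                                AS d,          -- string-to-date cast
  date_format(TIMESTAMP'2024-07-04 12:30:00', 'yyyy-MM-dd') AS formatted,  -- datetime pattern
  CAST(3.9 AS INT)                                          AS truncated;  -- numeric cast truncates toward zero
```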

Configuration parameters

| Reference | Description |
| --- | --- |
| Configuration parameters | Overview of SQL configuration |
| ANSI_MODE | ANSI SQL compliance mode |
| LEGACY_TIME_PARSER_POLICY | Time parsing behavior |
| MAX_FILE_PARTITION_BYTES | File partition size limit |
| READ_ONLY_EXTERNAL_METASTORE | External metastore access mode |
| STATEMENT_TIMEOUT | Query timeout setting |
| TIMEZONE | Session timezone |
| USE_CACHED_RESULT | Query result caching |
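These parameters are set per session with SET; a sketch with illustrative values (availability of individual parameters varies between Databricks SQL and Databricks Runtime):

```sql
SET TIME ZONE 'America/Los_Angeles';  -- session timezone
SET STATEMENT_TIMEOUT = 3600;         -- query timeout, in seconds
SET USE_CACHED_RESULT = false;        -- bypass cached query results
```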

Functions

| Reference | Description |
| --- | --- |
| Functions | Overview of SQL functions |
| Built-in functions | All built-in functions |
| Alphabetical list of built-in functions | Complete A-Z function reference |
| Window functions | Analytic and ranking functions |
| Lambda functions | Anonymous function expressions |
| H3 geospatial functions | Hexagonal hierarchical spatial index |
| ST geospatial functions | Standard geospatial functions |
| User-defined aggregate functions (UDAFs) | Custom aggregation functions |
| External user-defined scalar functions (UDFs) | Custom scalar functions |
| Integration with Hive UDFs, UDAFs, and UDTFs | Hive function compatibility |
| Function invocation | How to call functions |
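Two of the categories above in a minimal sketch (the `employees` table is illustrative):

```sql
-- Lambda function passed to a higher-order built-in:
SELECT transform(array(1, 2, 3), x -> x * x) AS squares;  -- [1, 4, 9]

-- Window (ranking) function over partitions:
SELECT name, dept,
       rank() OVER (PARTITION BY dept ORDER BY salary DESC) AS salary_rank
FROM employees;
```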

Unity Catalog concepts

| Reference | Description |
| --- | --- |
| Principal | Users, groups, and service principals |
| Privileges and securable objects in Unity Catalog | Access control in Unity Catalog |
| Privileges and securable objects in the Hive metastore | Legacy metastore access control |
| External locations | Cloud storage locations |
| External tables | Tables backed by external data |
| Credentials | Storage access credentials |
| Volumes | Managed and external file storage |

Special topics

| Reference | Description |
| --- | --- |
| ANSI compliance in Databricks Runtime | ANSI SQL standard compatibility |
| Apache Hive compatibility | Hive SQL compatibility |
| SQL scripting | Procedural SQL logic |
| Delta Sharing | Cross-organization data sharing |
| Federated queries (Lakehouse Federation) | Querying external databases |
| Information schema | Metadata system tables |

DDL statements

Data definition statements create or modify the structure of database objects.

ALTER statements

| Statement | Description |
| --- | --- |
| ALTER CATALOG | Modify catalog properties |
| ALTER CONNECTION | Modify connection settings |
| ALTER CREDENTIAL | Modify credential properties |
| ALTER DATABASE | Modify database properties |
| ALTER EXTERNAL LOCATION | Modify external location |
| ALTER MATERIALIZED VIEW | Modify materialized view |
| ALTER PROVIDER | Modify data provider |
| ALTER RECIPIENT | Modify share recipient |
| ALTER SCHEMA | Modify schema properties |
| ALTER SHARE | Modify share configuration |
| ALTER STREAMING TABLE | Modify streaming table |
| ALTER TABLE | Modify table structure |
| ALTER VIEW | Modify view definition |
| ALTER VOLUME | Modify volume properties |
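A short ALTER TABLE sketch (the `sales` table and its properties are illustrative):

```sql
ALTER TABLE sales ADD COLUMN region STRING COMMENT 'Sales region';
ALTER TABLE sales SET TBLPROPERTIES ('delta.appendOnly' = 'false');
ALTER TABLE sales RENAME TO sales_archive;
```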

CREATE statements

| Statement | Description |
| --- | --- |
| CREATE BLOOM FILTER INDEX | Create bloom filter index |
| CREATE CATALOG | Create a new catalog |
| CREATE CONNECTION | Create external connection |
| CREATE DATABASE | Create a new database |
| CREATE FUNCTION (SQL and Python) | Create a SQL or Python function |
| CREATE FUNCTION (External) | Create external function |
| CREATE EXTERNAL LOCATION | Create external location |
| CREATE MATERIALIZED VIEW | Create materialized view |
| CREATE PROCEDURE | Create stored procedure |
| CREATE RECIPIENT | Create share recipient |
| CREATE SCHEMA | Create a new schema |
| CREATE SERVER | Create foreign server |
| CREATE SHARE | Create data share |
| CREATE STREAMING TABLE | Create streaming table |
| CREATE TABLE | Create a new table |
| CREATE VIEW | Create a view |
| CREATE VOLUME | Create a volume |
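A minimal CREATE sketch (catalog, schema, and table names are illustrative; Delta is the default table format):

```sql
CREATE TABLE IF NOT EXISTS main.default.events (
  id      BIGINT    COMMENT 'Event identifier',
  ts      TIMESTAMP,
  payload STRING
)
COMMENT 'Illustrative Delta table';

CREATE OR REPLACE VIEW main.default.recent_events AS
SELECT * FROM main.default.events
WHERE ts > current_timestamp() - INTERVAL 7 DAYS;
```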

DROP statements

| Statement | Description |
| --- | --- |
| DROP BLOOM FILTER INDEX | Delete bloom filter index |
| DROP CATALOG | Delete a catalog |
| DROP CONNECTION | Delete a connection |
| DROP CREDENTIAL | Delete a credential |
| DROP DATABASE | Delete a database |
| DROP EXTERNAL LOCATION | Delete an external location |
| DROP FUNCTION | Delete a function |
| DROP MATERIALIZED VIEW | Delete a materialized view |
| DROP PROCEDURE | Delete a procedure |
| DROP PROVIDER | Delete a provider |
| DROP RECIPIENT | Delete a recipient |
| DROP SCHEMA | Delete a schema |
| DROP SHARE | Delete a share |
| DROP TABLE | Delete a table |
| DROP VARIABLE | Delete a session variable |
| DROP VIEW | Delete a view |
| DROP VOLUME | Delete a volume |

Other DDL statements

| Statement | Description |
| --- | --- |
| COMMENT ON | Add comments to objects |
| DECLARE VARIABLE | Declare session variable |
| REPAIR TABLE | Recover table partitions |
| REFRESH FOREIGN (CATALOG, SCHEMA, and TABLE) | Refresh foreign metadata |
| REFRESH (MATERIALIZED VIEW or STREAMING TABLE) | Refresh materialized data |
| SET TAG | Add tags to objects |
| SYNC | Synchronize metadata |
| TRUNCATE TABLE | Remove all table data |
| UNDROP | Restore dropped table |
| UNSET TAG | Remove tags from objects |

DML statements

Data manipulation statements add, change, or delete data in Delta Lake tables.

| Statement | Description |
| --- | --- |
| COPY INTO | Load data from files into a table |
| DELETE FROM | Delete rows from a table |
| INSERT | Insert rows into a table |
| INSERT OVERWRITE DIRECTORY | Write query results to directory |
| INSERT OVERWRITE DIRECTORY with Hive format | Write results in Hive format |
| LOAD DATA | Load data from files |
| MERGE INTO | Upsert data into a table |
| UPDATE | Update rows in a table |
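A MERGE INTO sketch showing the typical upsert shape (the target and `staged_updates` source are illustrative):

```sql
MERGE INTO main.default.events AS t
USING staged_updates AS s
  ON t.id = s.id
WHEN MATCHED THEN
  UPDATE SET t.payload = s.payload
WHEN NOT MATCHED THEN
  INSERT (id, ts, payload) VALUES (s.id, s.ts, s.payload);
```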

Data retrieval statements

Retrieve rows from tables according to specified clauses. Azure Databricks supports both standard SELECT syntax and SQL Pipeline Syntax.

Query statements

| Statement | Description |
| --- | --- |
| Query | Full query syntax reference |
| SELECT (subselect) | Select data from tables |
| VALUES clause | Construct inline table |
| SQL Pipeline Syntax | Pipe-based query composition |
| EXPLAIN | Show query execution plan |

Query clauses

| Clause | Description |
| --- | --- |
| SELECT clause | Column selection |
| * (star) clause | Select all columns |
| table reference | Table and subquery sources |
| JOIN | Combine rows from tables |
| WHERE clause | Filter rows |
| GROUP BY clause | Group rows for aggregation |
| HAVING clause | Filter grouped rows |
| QUALIFY clause | Filter window function results |
| ORDER BY clause | Sort result rows |
| SORT BY clause | Sort within partitions |
| CLUSTER BY clause (SELECT) | Distribute and sort data |
| DISTRIBUTE BY clause | Distribute data across partitions |
| LIMIT clause | Limit result rows |
| OFFSET clause | Skip result rows |
| PIVOT clause | Rotate rows to columns |
| UNPIVOT clause | Rotate columns to rows |
| LATERAL VIEW clause | Apply table-generating function |
| TABLESAMPLE clause | Sample table rows |
| Table-valued function (TVF) invocation | Functions returning tables |
| Common table expression (CTE) | Named subqueries |
| Set operators | UNION, INTERSECT, EXCEPT |
| WINDOW clause | Named window definitions |
| Hints | Query optimizer hints |
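Many of these clauses combine in a single query; a sketch (the `employees` table is illustrative):

```sql
SELECT dept, count(*) AS headcount
FROM employees
WHERE active
GROUP BY dept
HAVING count(*) > 10
ORDER BY headcount DESC
LIMIT 5;

-- QUALIFY filters on a window function without needing a subquery:
SELECT name, dept, salary
FROM employees
QUALIFY row_number() OVER (PARTITION BY dept ORDER BY salary DESC) = 1;
```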

Delta Lake statements

Manage tables stored in Delta Lake format. For details, see What is Delta Lake in Azure Databricks?

| Statement | Description |
| --- | --- |
| CACHE SELECT | Cache query results |
| CONVERT TO DELTA | Convert Parquet to Delta |
| DESCRIBE HISTORY | Show table history |
| FSCK REPAIR TABLE | Repair table metadata |
| GENERATE | Generate table manifest |
| OPTIMIZE | Compact table files |
| REORG TABLE | Reorganize table data |
| RESTORE | Restore table to previous version |
| VACUUM | Remove old table files |
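A typical Delta maintenance sequence, as a sketch (table name and version number are illustrative):

```sql
DESCRIBE HISTORY main.default.events;                  -- inspect versions for time travel
OPTIMIZE main.default.events;                          -- compact small files
RESTORE TABLE main.default.events TO VERSION AS OF 5;  -- roll back to an earlier version
VACUUM main.default.events RETAIN 168 HOURS;           -- remove unreferenced files (default window)
```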

SQL scripting statements

Execute procedural logic in SQL. See SQL scripting for an overview.

| Statement | Description |
| --- | --- |
| BEGIN END compound statement | Group statements into a block |
| CASE statement | Conditional branching |
| FOR statement | Loop over a range or cursor |
| GET DIAGNOSTICS statement | Retrieve execution information |
| IF THEN ELSE statement | Conditional execution |
| ITERATE statement | Continue to next loop iteration |
| LEAVE statement | Exit a loop or block |
| LOOP statement | Unconditional loop |
| REPEAT statement | Loop until condition is true |
| RESIGNAL statement | Re-raise an exception |
| SIGNAL statement | Raise an exception |
| WHILE statement | Loop while condition is true |
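A minimal scripting sketch combining a compound block, a variable, and a WHILE loop:

```sql
BEGIN
  DECLARE counter INT DEFAULT 0;
  WHILE counter < 3 DO
    SET counter = counter + 1;
  END WHILE;
  -- The result of the final statement is returned to the caller.
  VALUES (counter);
END;
```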

Auxiliary statements

Collect statistics, manage caching, explore metadata, set configurations, and manage resources.

ANALYZE statements

| Statement | Description |
| --- | --- |
| ANALYZE TABLE … COMPUTE STATISTICS | Compute table column statistics |
| ANALYZE TABLE … COMPUTE STORAGE METRICS | Compute table storage metrics |

Cache statements

Applies to: Databricks Runtime

| Statement | Description |
| --- | --- |
| CACHE TABLE | Cache table in memory |
| CLEAR CACHE | Remove all cached data |
| REFRESH CACHE | Refresh cached data |
| REFRESH FUNCTION | Refresh function metadata |
| REFRESH TABLE | Refresh table metadata |
| UNCACHE TABLE | Remove table from cache |

DESCRIBE statements

| Statement | Description |
| --- | --- |
| DESCRIBE CATALOG | Show catalog details |
| DESCRIBE CONNECTION | Show connection details |
| DESCRIBE CREDENTIAL | Show credential details |
| DESCRIBE DATABASE | Show database details |
| DESCRIBE FUNCTION | Show function details |
| DESCRIBE EXTERNAL LOCATION | Show location details |
| DESCRIBE PROCEDURE | Show procedure details |
| DESCRIBE PROVIDER | Show provider details |
| DESCRIBE QUERY | Show query output schema |
| DESCRIBE RECIPIENT | Show recipient details |
| DESCRIBE SCHEMA | Show schema details |
| DESCRIBE SHARE | Show share details |
| DESCRIBE TABLE | Show table details |
| DESCRIBE VOLUME | Show volume details |

SHOW statements

| Statement | Description |
| --- | --- |
| LIST | List files in a path |
| SHOW ALL IN SHARE | Show all objects in share |
| SHOW CATALOGS | List catalogs |
| SHOW COLUMNS | List table columns |
| SHOW CONNECTIONS | List connections |
| SHOW CREATE TABLE | Show table DDL |
| SHOW CREDENTIALS | List credentials |
| SHOW DATABASES | List databases |
| SHOW FUNCTIONS | List functions |
| SHOW GROUPS | List groups |
| SHOW EXTERNAL LOCATIONS | List external locations |
| SHOW PARTITIONS | List table partitions |
| SHOW PROCEDURES | List procedures |
| SHOW PROVIDERS | List providers |
| SHOW RECIPIENTS | List recipients |
| SHOW SCHEMAS | List schemas |
| SHOW SHARES | List shares |
| SHOW SHARES IN PROVIDER | List shares from provider |
| SHOW TABLE EXTENDED | Show table information |
| SHOW TABLES | List tables |
| SHOW TABLES DROPPED | List dropped tables |
| SHOW TBLPROPERTIES | Show table properties |
| SHOW USERS | List users |
| SHOW VIEWS | List views |
| SHOW VOLUMES | List volumes |
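A short metadata-exploration sketch (catalog and schema names are illustrative):

```sql
SHOW SCHEMAS IN main;                       -- list schemas in a catalog
SHOW TABLES IN main.default;                -- list tables in a schema
SHOW CREATE TABLE main.default.events;      -- reproduce a table's DDL
```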

Configuration and variable management

| Statement | Description |
| --- | --- |
| CALL | Call a stored procedure |
| EXECUTE IMMEDIATE | Execute dynamic SQL |
| RESET | Reset configuration to default |
| SET | Set configuration parameter |
| SET RECIPIENT | Set share recipient context |
| SET TIME ZONE | Set session timezone |
| SET variable | Set variable value |
| USE CATALOG | Set current catalog |
| USE DATABASE | Set current database |
| USE SCHEMA | Set current schema |
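A sketch combining current-context statements with dynamic SQL (catalog, schema, and table names are illustrative):

```sql
USE CATALOG main;
USE SCHEMA default;

-- Build and run a statement dynamically from a session variable:
DECLARE VARIABLE tbl STRING DEFAULT 'events';
EXECUTE IMMEDIATE 'SELECT count(*) FROM ' || tbl;
```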

Resource management

Applies to: Databricks Runtime

| Statement | Description |
| --- | --- |
| ADD ARCHIVE | Add archive to session |
| ADD FILE | Add file to session |
| ADD JAR | Add JAR to session |
| LIST ARCHIVE | List session archives |
| LIST FILE | List session files |
| LIST JAR | List session JARs |

Applies to: Databricks SQL Connector

| Statement | Description |
| --- | --- |
| GET | Download file from volume |
| PUT INTO | Upload file to volume |
| REMOVE | Delete file from volume |

Security statements

Manage access to data. For details, see Hive metastore privileges and securable objects (legacy).

| Statement | Description |
| --- | --- |
| ALTER GROUP | Modify group membership |
| CREATE GROUP | Create a new group |
| DENY | Deny privileges |
| DROP GROUP | Delete a group |
| GRANT | Grant privileges |
| GRANT ON SHARE | Grant share access |
| MSCK REPAIR PRIVILEGES | Repair privilege metadata |
| REVOKE | Revoke privileges |
| REVOKE ON SHARE | Revoke share access |
| SHOW GRANTS | Show granted privileges |
| SHOW GRANTS ON SHARE | Show share privileges |
| SHOW GRANTS TO RECIPIENT | Show recipient privileges |
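A grant-and-verify sketch (the table name and the `analysts` group are illustrative):

```sql
GRANT SELECT ON TABLE main.default.events TO `analysts`;
SHOW GRANTS ON TABLE main.default.events;     -- verify the grant
REVOKE SELECT ON TABLE main.default.events FROM `analysts`;
```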