Create a SQL warehouse

This article explains how to configure and manage SQL warehouses using the Azure Databricks UI. This includes how to create, edit, and monitor existing SQL warehouses. You will also learn how to convert existing classic warehouses into serverless warehouses. You can also create SQL warehouses using the SQL warehouse API, or Terraform.

What is a SQL warehouse?

A SQL warehouse is a compute resource that lets you run SQL commands on data objects within Databricks SQL. Compute resources are infrastructure resources that provide processing capabilities in the cloud.

Important

You can also attach a notebook to a SQL warehouse. See Notebooks and SQL warehouses for more information and limitations.

To navigate to the SQL warehouse dashboard, click Endpoints Icon SQL Warehouses in the sidebar. By default, warehouses are sorted by state (running warehouses first), then in alphabetical order.

To help you get started, Azure Databricks creates a small SQL warehouse called Starter Warehouse automatically. You can edit or delete this SQL warehouse.

Requirements

SQL warehouses have the following requirements:

  • To create a SQL warehouse you must be a workspace admin or a user with unrestricted cluster creation permissions.

  • To manage a SQL warehouse you must be a workspace admin or have the CAN MANAGE permission on the SQL warehouse.

  • For classic or pro SQL warehouses, your Azure account must have adequate vCPU quota. The default vCPU quota is usually adequate to create a serverless SQL warehouse but might not be enough to scale the SQL warehouse or to create additional warehouses. See Required Azure vCPU quota for classic and pro SQL warehouses. You can request additional vCPU quota. Your Azure account may have limitations on how much vCPU quota you can request. Contact your Azure account team for more information.

  • Before you can create a serverless SQL warehouse in a region that supports the feature, there may be required steps. See Use serverless SQL warehouses.

Create a SQL warehouse

Create warehouses using the SQL Warehouses page in the web UI, the SQL Warehouse API, or Terraform. The default warehouse settings create an efficient and high-performing SQL warehouse. You can edit some of the settings to fit your workload needs.

Default SQL warehouse config

To create a SQL warehouse using the web UI:

  1. Click SQL Warehouses in the sidebar.
  2. Click Create SQL Warehouse.
  3. Enter a Name for the warehouse.
  4. Accept the default warehouse settings or edit them. See warehouse settings.
  5. (Optional) Configure advanced options. See Advanced options.
  6. Click Create.
  7. You can then configure warehouse permissions if you’d like.

Your SQL warehouse is now created and started.

Warehouse settings

Creating a SQL warehouse in the UI allows you to update the following settings:

  • Cluster Size represents the size of the driver node and number of worker nodes associated with the cluster. The default is X-Large. To reduce query latency, increase the size.

  • Auto Stop determines whether the warehouse stops if it’s idle for the specified number of minutes. Idle SQL warehouses continue to accumulate DBU and cloud instance charges until they are stopped.

    • Pro and classic SQL warehouses: The default is 45 minutes, which is recommended for typical use. The minimum is 10 minutes.
    • Serverless SQL warehouses: The default is 10 minutes, which is recommended for typical use. The minimum is 5 minutes when you use the UI. Note that you can create a serverless SQL warehouse using the SQL warehouses API, in which case you can set the Auto Stop value as low as 1 minute.
  • Scaling sets the minimum and maximum number of clusters that will be used for a query. The default is a minimum and a maximum of one cluster. You can increase the maximum clusters if you want to handle more concurrent users for a given query. Azure Databricks recommends a cluster for every 10 concurrent queries.

    To maintain optimal performance, Databricks periodically recycles clusters. During a recycle period, you may temporarily see a cluster count that exceeds the maximum as Databricks transitions new workloads to the new cluster and waits to recycle the old cluster until all open workloads have completed.

  • Type determines the type of warehouse. If serverless is enabled in your account, serverless will be the default. See What are the available warehouse types and features? for the list.

Advanced options

Configure the following advanced options by expanding the Advanced options area when you create a new SQL warehouse or edit an existing SQL warehouse. You can also configure these options using the SQL Warehouse API.

  • Tags: Tags allow you to monitor the cost of cloud resources used by users and groups in your organization. You specify tags as key-value pairs.

  • Unity Catalog: If Unity Catalog is enabled for the workspace, it is the default for all new warehouses in the workspace. If Unity Catalog is not enabled for your workspace, you do not see this option. For more information about Unity Catalog, see Unity Catalog.

  • Channel: Use the Preview channel to test new functionality, including your queries and dashboards, before it becomes the Databricks SQL standard.

    The release notes list what’s in the latest preview version.

Important

Databricks recommends against using a preview version for production workloads. Because only workspace admins can view a warehouse’s properties, including its channel, consider indicating that a Databricks SQL warehouse uses a preview version in that warehouse’s name to prevent users from using it for production workloads.

Start a warehouse

To manually start a stopped SQL warehouse, click Endpoints Icon SQL Warehouses in the sidebar then click the start icon next to the warehouse.

Warehouse auto-restart

A warehouse auto-restarts in the following conditions:

  • A warehouse is stopped and you attempt to run a query.
  • A job assigned to a stopped warehouse is scheduled to run.
  • A connection is established to a stopped warehouse from a JDBC/ODBC interface.
  • A dashboard associated with a dashboard-level warehouse is opened.

Auto-restart works for all types of SQL warehouses, including serverless SQL warehouses, which start very quickly.

Manage SQL warehouses

Manage SQL warehouses using the web UI or the SQL Warehouse API.

  • To stop a running warehouse, click the stop icon next to the warehouse.
  • To start a stopped warehouse, click the start icon next to the warehouse.
  • To delete a warehouse, click the kebab menu Vertical Ellipsis, then click Delete. Note: Contact Support to restore warehouses deleted within 14 days.
  • To edit a warehouse, click the kebab menu Vertical Ellipsis then click Edit.
  • To add and edit permissions, click the kebab menu Vertical Ellipsis then click Permissions. To learn about permission levels, see SQL warehouse access control.

Upgrade a pro or classic SQL warehouse to serverless

Serverless SQL warehouses give users in your account instant access to fully managed and elastic compute resources. To upgrade your existing SQL warehouses to serverless:

  1. In the sidebar, click SQL Warehouses.
  2. Click the vertical ellipsis Vertical Ellipsis next to the warehouse you want to upgrade, then click Upgrade to Serverless.

To learn more about serverless computing in Databricks, see Serverless compute.

Monitor a SQL warehouse

To monitor a SQL warehouse, click the name of a SQL warehouse and then the Monitoring tab. On the Monitoring tab, you see the following monitoring elements:

  • Live statistics: Live statistics show the currently running and queued queries, active SQL sessions, the warehouse status, and the current cluster count.
  • Time scale filter: The monitoring time scale filter sets the time range for the query count chart, running cluster chart, and the query history and event log table. The default time range is 8 hours, but you can specify 24 hours, 7 days, or 14 days. You can also click and drag on the bar chart to change the time range.
  • Peak query count chart: The peak query count chart shows the maximum number of concurrent queries, either running or queued, on the warehouse during the selected time frame. The data that supplies this chart does not include metadata queries. Each data point in the chart is the peak within a 5-minute window.
  • Running clusters chart: The running clusters chart shows the number of clusters allocated to the warehouse during the selected time frame. During a cluster recycle, this count may temporarily exceed configured maximum.
  • Query history table: The query history table shows all of the queries active during the selected time frame, their start time and duration, and the user that executed the query. You can filter the queries by user, query duration, query status, and query type.

Note

The cluster count can be greater than one only if scaling is enabled and configured.

Warehouse sizing and autoscaling behavior

For information on how classic and pro SQL warehouses are sized and how autoscaling works, see SQL warehouse sizing, scaling, and queuing behavior.