Manage serverless workspace base environments

Important

This feature is in Public Preview.

This page explains how to create and manage serverless base environments across a workspace.

Permissions

  • Only workspace admins can create and manage a workspace's base environments.
  • All workspace users have access to a workspace's base environments.
  • All workspace users can create custom serverless environment specifications.

How base environments work in Azure Databricks

In Azure Databricks, a base environment is a shareable YAML specification that defines a serverless environment version and a set of additional Python dependencies for serverless notebooks. Workspace admins create and manage base environments so users can quickly start from a consistent, cached environment and optionally add their own libraries.

Use workspace base environments

Users can select a workspace base environment for a notebook from the Base environment dropdown in the notebook's Environment side panel. Workspace base environments appear in the dropdown alongside other options like Standard, AI, and Custom.

When a workspace base environment is selected, the pre-built cached environment loads quickly, reducing startup time for notebooks and jobs. For notebook tasks in jobs, using workspace base environments improves performance because the dependencies are already cached.

For instructions on configuring base environments in a notebook, see Select a base environment.

Create and export an environment specification

The simplest way to create a valid YAML specification is to build the environment in the Environment side panel and then use the Export environment button to download the YAML file.

  1. Open a notebook and connect to serverless compute.
  2. Click the Environment button in the notebook's side panel.
  3. Under Base environment, select Standard or use More to choose a specific environment version. Databricks recommends using the latest serverless environment version supported by your workspace.
  4. In the Dependencies field, add whatever dependencies you would like the base environment to have. Click Add dependency after you enter each dependency. For more instructions on adding dependencies, see Add dependencies to the notebook.
  5. Click Apply at the bottom of the environment panel to ensure the specification is valid.
  6. Click the kebab menu icon at the bottom of the environment panel, then click Export environment.
  7. Give the YAML file a name and save it to a Workspace folder or Unity Catalog volume.

Example environment specification

The following example YAML is based on the MLflow projects environment specification. It defines a base environment with a few library dependencies:

environment_version: '4'
dependencies:
  - --index-url https://pypi.org/simple
  - -r "/Workspace/Shared/requirements.txt"
  - my-library==6.1
  - /Workspace/Shared/Path/To/simplejson-3.19.3-py3-none-any.whl
  - git+https://github.com/databricks/databricks-cli
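
If you prefer to generate the specification file programmatically rather than export it from the UI, the layout above can be sketched in a few lines of Python. This is a minimal sketch, not an official tool: the helper name render_environment_spec is hypothetical, and it assumes the file needs only the two keys shown in the example (environment_version and dependencies).

```python
def render_environment_spec(environment_version, dependencies):
    """Return YAML text for a base environment specification.

    Hypothetical helper: it writes each dependency verbatim as a list item,
    without quoting or escaping, so it only suits simple entries like the
    ones in the example above.
    """
    if not str(environment_version).strip():
        raise ValueError("environment_version must not be empty")
    # The example spec quotes the version as a string, so do the same here.
    lines = [f"environment_version: '{environment_version}'", "dependencies:"]
    for dep in dependencies:
        dep = dep.strip()
        if not dep:
            raise ValueError("dependency entries must not be empty")
        lines.append(f"  - {dep}")
    return "\n".join(lines) + "\n"


spec_text = render_environment_spec(
    "4",
    [
        "--index-url https://pypi.org/simple",
        "my-library==6.1",
    ],
)
print(spec_text)
```

The resulting text can be saved as a YAML file in a Workspace folder or Unity Catalog volume. Exporting from the Environment side panel remains the simplest way to get a specification that has already been validated.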

Add a base environment to your workspace

To add the environment specification as a base environment to the workspace:

  1. In the workspace, go to Settings.
  2. Under Workspace admin, select Compute.
  3. Next to Base environments for serverless compute, click Manage.
  4. Click Create new environment.
  5. Give your base environment a name. This is the name that users will see in the Base environment dropdown menu.
  6. Select the environment specification YAML file using the file picker. You can browse workspace files or Unity Catalog volumes.
  7. Click Create.

The base environment starts building. Check the Status column in the list of base environments. The status changes to Ready to use when the build completes.

Build for serverless GPU compute

Important

This feature is in Beta. Workspace admins can control access to this feature from the Previews page. See Manage Azure Databricks previews.

When creating a base environment, you can optionally enable the Build for Serverless GPU Compute checkbox to build the environment for GPU workloads. This creates a GPU-compatible version of the base environment that appears in the GPU tab.

The base environments management page has two tabs:

  • CPU: Lists base environments for serverless compute (non-GPU workloads).
  • GPU: Lists base environments for serverless GPU compute. This tab also shows AI environment rows that correspond to AI base environments. For more information, see AI environment.

Standard Latest refers to the latest stable standard base environment version provided by Databricks.

Note

Usage records associated with building and refreshing base environments have the billing_origin_product column set to BASE_ENVIRONMENTS. Additionally, the specific base environment ID is populated in the usage_metadata.base_environment_id column.
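
To see how the two columns in this note can be used together, the following Python sketch totals usage per base environment from a list of usage records. It is illustrative only: the record layout (plain dictionaries) and the usage_quantity field are assumptions, the sample rows are made up, and the function name usage_by_base_environment is hypothetical. Only billing_origin_product and usage_metadata.base_environment_id come from the note above.

```python
from collections import defaultdict


def usage_by_base_environment(records):
    """Sum usage_quantity per base environment ID for build/refresh records."""
    totals = defaultdict(float)
    for record in records:
        # Keep only records produced by building or refreshing base environments.
        if record.get("billing_origin_product") != "BASE_ENVIRONMENTS":
            continue
        env_id = (record.get("usage_metadata") or {}).get("base_environment_id")
        if env_id is None:
            continue
        # usage_quantity is an assumed field name for the amount consumed.
        totals[env_id] += float(record.get("usage_quantity", 0))
    return dict(totals)


# Made-up sample rows for illustration only.
sample_records = [
    {"billing_origin_product": "BASE_ENVIRONMENTS",
     "usage_metadata": {"base_environment_id": "env-a"}, "usage_quantity": 0.5},
    {"billing_origin_product": "BASE_ENVIRONMENTS",
     "usage_metadata": {"base_environment_id": "env-a"}, "usage_quantity": 0.25},
    {"billing_origin_product": "JOBS",
     "usage_metadata": {"base_environment_id": None}, "usage_quantity": 3.0},
]
totals = usage_by_base_environment(sample_records)
print(totals)
```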

Set the workspace's default base environment

By default, serverless notebooks in a workspace don't use a base environment. Workspace admins can select a base environment to apply to all new notebooks by default.

  1. In the workspace, go to Settings.
  2. Under Workspace admin, select Compute.
  3. Next to Base environments for serverless compute, click Manage.
  4. Click the star icon next to the base environment to set it as the default.

All new serverless notebooks will now default to the selected base environment.

Update a base environment

You might want to edit the base environment file to update version numbers or add or remove dependencies.

In the list of base environments, click the YAML file path of the base environment you want to update. This opens up the file in a new tab. You can review or update the file contents there. Changes are saved automatically.

After you make an update to the YAML specification, you must refresh the base environment so notebooks and jobs pick up the latest configuration.

  1. Next to the base environment you want to refresh, click the kebab menu icon, then select Refresh.
  2. Click Confirm.

New sessions now use the updated base environment. Existing notebook sessions must be restarted to get the updates.

Limitations

Base environments have the following limitations:

  • Custom base environments are supported only for serverless Python, Python wheel, and notebook task types. Other task types are not supported.
  • Workspace base environments are not supported in jobs. The only exception is notebook tasks, which can use workspace base environments only when the environment is configured directly in the notebook’s environment settings.
  • Lakeflow Spark Declarative Pipelines do not support base environments.
  • Only the relevant dependencies are installed at runtime.
  • Serverless environment version 1 is not supported. Use version 2 or higher.
  • Base environments are available to all workspace users.
  • Workspaces are limited to 10 base environments.
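
Two of these limitations are easy to check before you submit a new specification: the minimum environment version and the per-workspace cap. The following Python sketch shows one way to do that. The function name check_limits is hypothetical, and the sketch assumes environment versions are whole numbers written as strings, as in the example specification. It does not call any Azure Databricks API.

```python
MAX_BASE_ENVIRONMENTS = 10   # workspace limit stated in the list above
MIN_ENVIRONMENT_VERSION = 2  # serverless environment version 1 is not supported


def check_limits(environment_version, existing_count):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    try:
        version = int(str(environment_version).strip())
    except ValueError:
        problems.append(
            f"environment_version {environment_version!r} is not a whole number"
        )
    else:
        if version < MIN_ENVIRONMENT_VERSION:
            problems.append(
                f"environment_version {version} is not supported; "
                f"use {MIN_ENVIRONMENT_VERSION} or higher"
            )
    # Adding one more environment must not exceed the workspace cap.
    if existing_count >= MAX_BASE_ENVIRONMENTS:
        problems.append(
            f"workspace already has {existing_count} base environments "
            f"(limit is {MAX_BASE_ENVIRONMENTS})"
        )
    return problems


print(check_limits("4", existing_count=3))
print(check_limits("1", existing_count=10))
```

Here existing_count is the number of base environments already listed on the management page, which you would supply yourself.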