Manage dependencies for a Databricks app

A Databricks app can include Python dependencies, Node.js dependencies, or both. You declare dependencies in language-specific files, and Azure Databricks installs them when you deploy the app.

Define Python dependencies with pip

Apps that use pip come with a set of pre-installed Python libraries. To define additional Python libraries, use a requirements.txt file. If any listed packages match pre-installed ones, the versions in your file override the defaults.

Horizontally scaled apps (Beta) converted from standard apps can opt out of pre-installed libraries and run on a clean base OS image instead. See Opt out of pre-installed Python libraries for Databricks apps.

For example, a requirements.txt file might look like this:

# Override default version of dash
dash==2.10.0

# Add additional libraries not pre-installed
requests==2.31.0
numpy==1.24.3

# Specify a compatible version range
scikit-learn>=1.2.0,<1.3.0

Pre-installed Python libraries

pip-based apps include the following pre-installed Python libraries. You don't need to add them to your requirements.txt unless you require a different version.

Library                      Version
databricks-sql-connector     3.4.0
databricks-sdk               0.33.0
mlflow-skinny                2.16.2
gradio                       4.44.0
streamlit                    1.38.0
shiny                        1.1.0
dash                         2.18.1
flask                        3.0.3
fastapi                      0.115.0
uvicorn[standard]            0.30.6
gunicorn                     23.0.0
huggingface-hub              0.35.3
dash-ag-grid                 31.2.0
dash-mantine-components      0.14.4
dash-bootstrap-components    1.6.0
plotly                       5.24.1
plotly-resampler             0.10.0

Define Python dependencies with uv

If your app uses uv for dependency management, define Python dependencies in a pyproject.toml file instead of requirements.txt. uv-based apps don't include pre-installed libraries, so you must declare all dependencies in your pyproject.toml. Unlike pip-based apps, which use Python 3.11, uv-based apps can specify any Python version with the requires-python field.

Horizontally scaled apps that have opted out of pre-installed libraries must likewise declare all of their dependencies. See Opt out of pre-installed Python libraries for Databricks apps.

During deployment, Databricks Apps selects an install strategy based on which files are present:

  • If requirements.txt exists, the app uses pip to install dependencies, regardless of whether pyproject.toml is also present. requirements.txt always takes precedence.
  • If requirements.txt doesn't exist and both pyproject.toml and uv.lock exist, the app uses uv to install dependencies from the lock file.
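The selection rules above can be sketched as a small Python function. This is an illustration only, not the actual Databricks Apps implementation: the function name select_python_installer is hypothetical, and returning None for combinations the rules don't cover is an assumption.

```python
from typing import Iterable, Optional


def select_python_installer(filenames: Iterable[str]) -> Optional[str]:
    """Return "pip", "uv", or None based on the files in the app directory.

    Hypothetical helper that mirrors the documented rules; it is not part
    of any Databricks API.
    """
    present = set(filenames)
    # requirements.txt always takes precedence, even if pyproject.toml exists.
    if "requirements.txt" in present:
        return "pip"
    # uv is selected only when both pyproject.toml and uv.lock are present.
    if {"pyproject.toml", "uv.lock"} <= present:
        return "uv"
    # Assumption: the documented rules don't cover any other combination.
    return None


print(select_python_installer(["app.py", "requirements.txt", "pyproject.toml"]))  # pip
print(select_python_installer(["app.py", "pyproject.toml", "uv.lock"]))  # uv
```

A check like this can run locally before deployment to confirm which installer your app directory will trigger.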

The uv installer creates and manages its own virtual environment, so you don't need to create a .venv directory.

The following example shows a minimal pyproject.toml for a Databricks app:

[project]
name = "my-app"
requires-python = ">=3.11"
dependencies = [
    "dash==2.10.0",
    "requests==2.31.0",
]

To use uv, you must include a uv.lock file alongside your pyproject.toml. Generate it by running uv lock locally and include it in your app directory.

Define Node.js dependencies

To define Node.js libraries, include a package.json file in the root of your app. Azure Databricks supports both npm and pnpm, and selects the package manager based on the lock file you include:

  • If pnpm-lock.yaml is present, the app uses pnpm, even if package-lock.json is also present. See Use pnpm.
  • Otherwise, the app uses npm.
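As a rough sketch, the package manager selection described above could be expressed in Python as follows. The function name select_node_package_manager is hypothetical, and returning None when package.json is missing is an assumption made for the illustration.

```python
from typing import Iterable, Optional


def select_node_package_manager(filenames: Iterable[str]) -> Optional[str]:
    """Return "pnpm", "npm", or None based on the files in the app root.

    Hypothetical helper that mirrors the documented rules; it is not part
    of any Databricks API.
    """
    present = set(filenames)
    # Assumption: without package.json there are no Node.js dependencies.
    if "package.json" not in present:
        return None
    # pnpm-lock.yaml selects pnpm, even when package-lock.json is also present.
    if "pnpm-lock.yaml" in present:
        return "pnpm"
    # Every other case falls back to npm.
    return "npm"


print(select_node_package_manager(["package.json", "pnpm-lock.yaml", "package-lock.json"]))  # pnpm
print(select_node_package_manager(["package.json", "package-lock.json"]))  # npm
```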

For example, a package.json file for a React app using Vite might look like this:

{
  "name": "react-fastapi-app",
  "version": "1.0.0",
  "private": true,
  "type": "module",
  "scripts": {
    "build": "npm run build:frontend",
    "build:frontend": "vite build frontend"
  },
  "dependencies": {
    "react": "^18.2.0",
    "react-dom": "^18.2.0",
    "typescript": "^5.0.0",
    "vite": "^5.0.0",
    "@vitejs/plugin-react": "^4.2.0",
    "@types/react": "^18.2.0",
    "@types/react-dom": "^18.2.0"
  }
}

Note

List all packages required for the build step under dependencies, not devDependencies. If you set NODE_ENV=production in your environment variables, the deployment process skips installing devDependencies.

Use pnpm

To build with pnpm, include a pnpm-lock.yaml file alongside your package.json. Generate it by running pnpm install locally and include it in your app directory. Azure Databricks provides pnpm through Corepack.

Note the following requirements for pnpm apps:

  • Dependencies install with pnpm install --frozen-lockfile, so pnpm-lock.yaml must stay in sync with package.json. If they drift, the build fails instead of updating the lock file. Regenerate the lock file with pnpm install after you change dependencies.
  • You must specify the start command in app.yaml. Unlike npm apps, pnpm apps don't fall back to a default start script. See Configure Databricks app execution with app.yaml.

For pnpm workspace projects (where a pnpm-workspace.yaml file is present), some app.yaml commands run pnpm recursively. For example, a build or start step might run pnpm -r run build. These commands must call corepack pnpm instead of pnpm (for example, corepack pnpm -r run build) so that the nested commands resolve correctly.

Avoid version conflicts

Follow these guidelines to avoid version conflicts:

  • For pip-based apps, keep overrides of pre-installed packages close to the pre-installed version. A version that differs significantly from the pre-installed one can cause compatibility issues.
  • Always test your app to verify that package version changes don't introduce errors.
  • Pin explicit versions in requirements.txt to keep app behavior consistent across deployments.
  • When using uv, include a uv.lock file for fully reproducible installs across deployments.
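To review overrides before you deploy a pip-based app, a short script can compare the pins in requirements.txt against the pre-installed versions. The following is a minimal sketch under several assumptions: the helper name find_overrides is hypothetical, the PREINSTALLED dictionary copies only a few rows from the pre-installed library table, only exact name==version pins are parsed, and a different major version is used as a stand-in for "differs significantly".

```python
import re

# Subset of the pre-installed library table in this document.
PREINSTALLED = {
    "dash": "2.18.1",
    "flask": "3.0.3",
    "streamlit": "1.38.0",
    "plotly": "5.24.1",
}

# Matches exact pins such as "dash==2.10.0"; ranges and paths are ignored.
PIN = re.compile(r"^\s*([A-Za-z0-9._-]+)\s*==\s*([0-9][^\s;#]*)")


def find_overrides(requirements_text: str) -> list[tuple[str, str, str, bool]]:
    """Return (name, pinned, preinstalled, major_differs) for each override."""
    results = []
    for line in requirements_text.splitlines():
        match = PIN.match(line)
        if not match:
            continue  # comments, blank lines, version ranges, wheel paths
        name, pinned = match.group(1).lower(), match.group(2)
        default = PREINSTALLED.get(name)
        if default is None or default == pinned:
            continue  # not pre-installed, or same version as the default
        # Assumption: a major-version change is what counts as "significant".
        major_differs = pinned.split(".")[0] != default.split(".")[0]
        results.append((name, pinned, default, major_differs))
    return results


sample = """# Override default version of dash
dash==2.10.0
requests==2.31.0
flask==2.3.0
scikit-learn>=1.2.0,<1.3.0
"""
for name, pinned, default, major in find_overrides(sample):
    note = "major version differs" if major else "same major version"
    print(f"{name}: {pinned} overrides {default} ({note})")
```

The sketch compares names after lowercasing only and does not normalize hyphens and underscores, so treat its output as a prompt for manual review rather than a complete check.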

Dependency installation and management

Azure Databricks installs libraries defined in requirements.txt, pyproject.toml, and package.json directly on the container running on your dedicated compute. You're responsible for managing and patching these dependencies.

You can specify libraries from multiple sources in your dependency files:

  • Public repositories like PyPI and npm
  • Private repositories that authenticate using credentials stored in Azure Databricks secrets
  • Unity Catalog volumes under /Volumes/ (for example, /Volumes/<catalog>/<schema>/<volume>/<path>)

Install from private repositories

To install packages from a private repository, configure environment variables for authentication. For example, set PIP_INDEX_URL to point to your private repository:

env:
  - name: PIP_INDEX_URL
    valueFrom: my-pypi-secret

Your workspace network configuration must allow access to the private repository. See Configure networking for Databricks Apps.

Install wheel files from Unity Catalog volumes

To install Python packages from wheel files stored in Unity Catalog volumes:

  1. Add the Unity Catalog volume as a resource to your app. See Unity Catalog volume.
  2. Reference the full wheel file path directly in your requirements.txt:

     /Volumes/<catalog>/<schema>/<volume>/my_package-1.0.0-py3-none-any.whl

Note

Environment variable references are not supported in requirements.txt. You must hard-code the full wheel file path.
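Because the path must be hard-coded, it can help to scan requirements.txt for wheel lines that still contain a variable reference or an incomplete volume path. The following is a minimal sketch, not an official validator: the helper name check_wheel_lines is hypothetical, any "$" in a wheel line is treated as an environment variable reference, and a volume path is assumed to need at least a catalog, schema, volume, and file name.

```python
def check_wheel_lines(requirements_text: str) -> list[str]:
    """Return a list of problems found in wheel references.

    Hypothetical helper; it only inspects lines that end in ".whl".
    """
    problems = []
    for number, raw in enumerate(requirements_text.splitlines(), start=1):
        line = raw.strip()
        if not line.endswith(".whl"):
            continue  # not a wheel reference
        if "$" in line:
            # Environment variable references are not supported here.
            problems.append(f"line {number}: environment variable reference in wheel path")
        elif line.startswith("/Volumes/") and len(line.split("/")) < 6:
            # Expect /Volumes/<catalog>/<schema>/<volume>/<file>.whl at minimum.
            problems.append(f"line {number}: incomplete volume path")
    return problems


sample = """/Volumes/main/default/libs/my_package-1.0.0-py3-none-any.whl
${WHEEL_DIR}/my_package-1.0.0-py3-none-any.whl
/Volumes/main/my_package-1.0.0-py3-none-any.whl
requests==2.31.0
"""
for problem in check_wheel_lines(sample):
    print(problem)
```

In this sample, the catalog, schema, and volume names (main, default, libs) and the WHEEL_DIR variable are placeholders chosen for the illustration.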

To enhance security when accessing external package repositories, use serverless egress controls to restrict access to public repositories and configure private networking. See Configure networking for Databricks Apps.