This article describes the syntax for declaring Databricks Asset Bundles library dependencies. Bundles enable programmatic management of Azure Databricks workflows. See What are Databricks Asset Bundles?.
In addition to notebooks, your Azure Databricks jobs likely depend on libraries to work as expected. Databricks Asset Bundles dependencies for local development are specified in the requirements*.txt file at the root of the bundle project, but job task library dependencies are declared in your bundle configuration files and are often required as part of the job task type specification.
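As a minimal sketch of the local-development side, a requirements*.txt at the bundle project root might pin the same packages that the job installs at run time (the file name and version pins below are illustrative):

```text
# requirements-dev.txt (illustrative): local development dependencies,
# kept at the root of the bundle project alongside the bundle configuration
wheel==0.41.2
numpy==1.25.2
```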
Bundles provide support for the following library dependencies for Azure Databricks jobs:

- Python wheel files
- JAR files
- PyPI packages
- Maven packages
Note
Whether a library is supported depends on the cluster configuration for the job and the library source. For complete library support information, see Libraries.
To add a Python wheel file to a job task, in libraries, specify a whl mapping for each library to be installed. You can install a wheel file from workspace files, Unity Catalog volumes, cloud object storage, or a local file path.
Important
Libraries can be installed from DBFS when using Databricks Runtime 14.3 LTS and below. However, any workspace user can modify library files stored in DBFS. To improve the security of libraries in an Azure Databricks workspace, storing library files in the DBFS root is deprecated and disabled by default in Databricks Runtime 15.1 and above. See Storing libraries in DBFS root is deprecated and disabled by default.
Instead, Databricks recommends uploading all libraries, including Python libraries, JAR files, and Spark connectors, to workspace files or Unity Catalog volumes, or using library package repositories. If your workload does not support these patterns, you can also use libraries stored in cloud object storage.
The following example shows how to install three Python wheel files for a job task:

- The first Python wheel file was either previously uploaded to the Azure Databricks workspace or added as an include item in the sync mapping, and is in the same local folder as the bundle configuration file.
- The second Python wheel file is in the specified workspace files location in the Azure Databricks workspace.
- The third Python wheel file was previously uploaded to the volume named my-volume in the Azure Databricks workspace.

```yaml
resources:
  jobs:
    my_job:
      # ...
      tasks:
        - task_key: my_task
          # ...
          libraries:
            - whl: ./my-wheel-0.1.0.whl
            - whl: /Workspace/Shared/Libraries/my-wheel-0.0.1-py3-none-any.whl
            - whl: /Volumes/main/default/my-volume/my-wheel-0.1.0.whl
```
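If the local wheel file is built from source in the bundle project, an artifacts mapping can build it during deployment. A minimal sketch, assuming the project root contains a setup.py or pyproject.toml for the wheel (the artifact key my_wheel is an arbitrary illustrative name):

```yaml
# Illustrative: build the wheel from the bundle project root on deploy,
# so a local "- whl: ./..." path in libraries resolves to a fresh build.
artifacts:
  my_wheel:
    type: whl
    path: .
```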
To add a JAR file to a job task, in libraries, specify a jar mapping for each library to be installed. You can install a JAR from workspace files, Unity Catalog volumes, cloud object storage, or a local file path.
Important
Libraries can be installed from DBFS when using Databricks Runtime 14.3 LTS and below. However, any workspace user can modify library files stored in DBFS. To improve the security of libraries in an Azure Databricks workspace, storing library files in the DBFS root is deprecated and disabled by default in Databricks Runtime 15.1 and above. See Storing libraries in DBFS root is deprecated and disabled by default.
Instead, Databricks recommends uploading all libraries, including Python libraries, JAR files, and Spark connectors, to workspace files or Unity Catalog volumes, or using library package repositories. If your workload does not support these patterns, you can also use libraries stored in cloud object storage.
The following example shows how to install a JAR file that was previously uploaded to the volume named my-volume in the Azure Databricks workspace:

```yaml
resources:
  jobs:
    my_job:
      # ...
      tasks:
        - task_key: my_task
          # ...
          libraries:
            - jar: /Volumes/main/default/my-volume/my-java-library-1.0.jar
```
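Because JARs can also be installed from workspace files or cloud object storage, the jar mapping accepts those paths as well. A sketch with illustrative locations (the workspace and abfss paths below are placeholder examples, not from this article):

```yaml
# Illustrative alternative JAR sources for the libraries mapping:
libraries:
  - jar: /Workspace/Shared/Libraries/my-java-library-1.0.jar              # workspace files
  - jar: abfss://mycontainer@mystorage.dfs.core.windows.net/libs/my-java-library-1.0.jar  # Azure cloud object storage
```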
To add a PyPI package to a job task definition, in libraries, specify a pypi mapping for each PyPI package to be installed. For each mapping, specify the following:

- For package, specify the name of the PyPI package to install. An optional exact version specification is also supported.
- For repo, specify the repository where the PyPI package can be found. If not specified, the default pip index is used (https://pypi.org/simple/).

The following example shows how to install two PyPI packages:

- The first PyPI package mapping installs the specified package version from the default pip index.
- The second PyPI package mapping installs the specified package version from the specified pip index.

```yaml
resources:
  jobs:
    my_job:
      # ...
      tasks:
        - task_key: my_task
          # ...
          libraries:
            - pypi:
                package: wheel==0.41.2
            - pypi:
                package: numpy==1.25.2
                repo: https://pypi.org/simple/
```
To add a Maven package to a job task definition, in libraries, specify a maven mapping for each Maven package to be installed. For each mapping, specify the following:

- For coordinates, specify the Gradle-style Maven coordinates for the package.
- For repo, specify the Maven repo to install the Maven package from. If omitted, both the Maven Central Repository and the Spark Packages Repository are searched.
- For exclusions, specify any dependencies to explicitly exclude. See Maven dependency exclusions.

The following example shows how to install two Maven packages:

```yaml
resources:
  jobs:
    my_job:
      # ...
      tasks:
        - task_key: my_task
          # ...
          libraries:
            - maven:
                coordinates: com.databricks:databricks-sdk-java:0.8.1
            - maven:
                coordinates: com.databricks:databricks-dbutils-scala_2.13:0.1.4
                repo: https://mvnrepository.com/
                exclusions:
                  - org.scala-lang:scala-library:2.13.0-RC*
```
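Declared libraries are installed on the task's cluster; the job task type then determines how they are used. As one sketch (all names below are illustrative, not from this article), a python_wheel_task can run an entry point from a wheel declared in the same task's libraries mapping:

```yaml
# Illustrative: a task that runs an entry point from a declared wheel.
resources:
  jobs:
    my_job:
      tasks:
        - task_key: my_task
          python_wheel_task:
            package_name: my_wheel   # distribution name of the wheel (assumed)
            entry_point: main        # entry point defined by the wheel (assumed)
          libraries:
            - whl: ./my-wheel-0.1.0.whl
```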