Cluster libraries
Cluster libraries can be used by all notebooks running on a cluster. You can install a cluster library directly from a public repository such as PyPI or Maven, using a previously installed workspace library, or using an init script.
Install a library on a cluster
There are two primary ways to install a library on a cluster:
- Install a workspace library that has already been uploaded to the workspace.
- Install a library for use with a specific cluster only.
In addition, if your library requires custom configuration, you may not be able to install it using the methods listed above. Instead, you can install the library using an init script that runs at cluster creation time.
Note
When you install a library on a cluster, a notebook already attached to that cluster will not immediately see the new library. You must first detach and then reattach the notebook to the cluster.
In addition to the approaches in this article, you can also install a library on a cluster by using the Databricks Terraform provider and databricks_library.
Workspace library
Note
Azure Databricks processes all workspace libraries in the order that they were installed on the cluster. You might need to pay attention to the order of installation on the cluster if there are dependencies between libraries.
To install a library that already exists in the workspace, you can start from the cluster UI or the library UI:
Cluster
- Click Compute in the sidebar.
- Click a cluster name.
- Click the Libraries tab.
- Click Install New.
- In the Library Source button list, select Workspace.
- Select a workspace library.
- Click Install.
- To configure the library to be installed on all clusters:
  - Click the library.
  - Select the Install automatically on all clusters checkbox.
  - Click Confirm.
Library
- Go to the folder containing the library.
- Click the library name.
- Do one of the following:
  - To configure the library to be installed on all clusters, select the Install automatically on all clusters checkbox and click Confirm.
    Important
    This option does not install the library on clusters running Databricks Runtime 7.0 and above.
  - Select the checkbox next to the cluster that you want to install the library on and click Install.
    The library is installed on the cluster.
Cluster-installed library
You can install a library on a specific cluster without making it available as a workspace library.
To install a library on a cluster:
- Click Compute in the sidebar.
- Click a cluster name.
- Click the Libraries tab.
- Click Install New.
- Follow one of the methods for creating a workspace library. After you click Create, the library is installed on the cluster.
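The same per-cluster installation can also be scripted. The following is a minimal sketch, assuming the Databricks Libraries REST API endpoint `/api/2.0/libraries/install`, which this article does not describe. The function names and the `DATABRICKS_HOST` and `DATABRICKS_TOKEN` environment variables are illustrative assumptions, not something this article prescribes.

```bash
#!/bin/bash
# Sketch only: assumes the Databricks Libraries REST API and a personal
# access token in DATABRICKS_TOKEN. Function names are hypothetical.

# Print the JSON request body for installing one PyPI package on one cluster.
# Arguments are inserted verbatim, so they must not contain quotes or backslashes.
install_payload() {
  local cluster_id="$1" package="$2"
  printf '{"cluster_id": "%s", "libraries": [{"pypi": {"package": "%s"}}]}\n' \
    "$cluster_id" "$package"
}

# Send the install request. DATABRICKS_HOST is assumed to hold the workspace URL
# without a trailing slash.
install_pypi_library() {
  curl -sS -f -X POST \
    -H "Authorization: Bearer ${DATABRICKS_TOKEN}" \
    -H "Content-Type: application/json" \
    -d "$(install_payload "$1" "$2")" \
    "${DATABRICKS_HOST}/api/2.0/libraries/install"
}
```

With both environment variables set, calling `install_pypi_library <cluster-id> astropy` would request the installation. As with the UI, notebooks already attached to the cluster must be detached and reattached before they see the library.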
Init script
If your library requires custom configuration, you may not be able to install it using the workspace or cluster library interface. Instead, you can install the library using an init script.
Here is an example of an init script that uses pip to install Python libraries on a Databricks Runtime cluster at cluster initialization.
#!/bin/bash
# Install astropy using the pip that belongs to the cluster's Python environment.
/databricks/python/bin/pip install astropy
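If several libraries are needed, an init script like the one above can be generated rather than written by hand. The following is a minimal sketch: `render_init_script` is a hypothetical helper, not part of any Databricks tooling, and it assumes the same `/databricks/python/bin/pip` path used in the example above.

```bash
#!/bin/bash
# Hypothetical helper: prints an init script that installs each given
# package with the cluster's pip, one command per package.
# Package specifiers are wrapped in single quotes in the generated script,
# so they must not themselves contain single quotes.
render_init_script() {
  printf '#!/bin/bash\n'
  # Stop the generated script at the first failed install.
  printf 'set -e\n'
  local pkg
  for pkg in "$@"; do
    printf "/databricks/python/bin/pip install '%s'\n" "$pkg"
  done
}
```

For example, `render_init_script astropy > install-libs.sh` writes a script equivalent to the one above, with `set -e` added so that a failed install stops the script. The file name `install-libs.sh` is only illustrative.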
Uninstall a library from a cluster
Note
When you uninstall a library from a cluster, the library is removed only when you restart the cluster. Until you restart the cluster, the status of the uninstalled library appears as Uninstall pending restart.
To uninstall a library, you can start from a cluster or a library:
Cluster
- Click Compute in the sidebar.
- Click a cluster name.
- Click the Libraries tab.
- Select the checkbox next to the library you want to uninstall, click Uninstall, then Confirm. The Status changes to Uninstall pending restart.
Library
- Go to the folder containing the library.
- Click the library name.
- Select the checkbox next to the cluster you want to uninstall the library from, click Uninstall, then Confirm. The Status changes to Uninstall pending restart.
- Click the cluster name to go to the cluster detail page.
- Click Restart and Confirm to uninstall the library. The library is removed from the cluster’s Libraries tab.
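The uninstall-then-restart sequence can be sketched as a script. This assumes the REST endpoints `/api/2.0/libraries/uninstall` and `/api/2.0/clusters/restart`, which this article does not describe. The function names and the `DATABRICKS_HOST` and `DATABRICKS_TOKEN` environment variables are illustrative assumptions.

```bash
#!/bin/bash
# Sketch only: assumes the Databricks REST API and a personal access token
# in DATABRICKS_TOKEN. All function names are hypothetical.

# JSON body that marks one PyPI package for uninstallation on a cluster.
# Arguments are inserted verbatim, so they must not contain quotes or backslashes.
uninstall_payload() {
  printf '{"cluster_id": "%s", "libraries": [{"pypi": {"package": "%s"}}]}\n' "$1" "$2"
}

# JSON body that identifies the cluster to restart.
restart_payload() {
  printf '{"cluster_id": "%s"}\n' "$1"
}

# POST a JSON body to an API path on the workspace in DATABRICKS_HOST.
api_post() {
  curl -sS -f -X POST \
    -H "Authorization: Bearer ${DATABRICKS_TOKEN}" \
    -H "Content-Type: application/json" \
    -d "$2" "${DATABRICKS_HOST}$1"
}

# Mark the library for removal, then restart so the removal takes effect.
uninstall_and_restart() {
  api_post /api/2.0/libraries/uninstall "$(uninstall_payload "$1" "$2")" &&
    api_post /api/2.0/clusters/restart "$(restart_payload "$1")"
}
```

The restart is chained with `&&` so that the cluster is restarted only if the uninstall request succeeded. The library specification passed to the uninstall call should match the one used at install time.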
View the libraries installed on a cluster
- Click Compute in the sidebar.
- Click the cluster name.
- Click the Libraries tab. For each library, the tab displays the name and version, type, install status, and, if uploaded, the source file.
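The information shown on the Libraries tab can also be fetched from a script. The following is a minimal sketch, assuming the Libraries REST API endpoint `/api/2.0/libraries/cluster-status`, which this article does not describe. The function names and the `DATABRICKS_HOST` and `DATABRICKS_TOKEN` environment variables are illustrative assumptions.

```bash
#!/bin/bash
# Sketch only: assumes the Databricks Libraries REST API and a personal
# access token in DATABRICKS_TOKEN. Function names are hypothetical.

# Build the status URL for one cluster, tolerating a trailing slash on the host.
cluster_status_url() {
  local host="${1%/}" cluster_id="$2"
  printf '%s/api/2.0/libraries/cluster-status?cluster_id=%s\n' "$host" "$cluster_id"
}

# Fetch the library statuses for a cluster as JSON.
get_cluster_libraries() {
  curl -sS -f \
    -H "Authorization: Bearer ${DATABRICKS_TOKEN}" \
    "$(cluster_status_url "$DATABRICKS_HOST" "$1")"
}
```

Calling `get_cluster_libraries <cluster-id>` should return a JSON document listing each library with its install status, which can then be piped to a JSON tool of your choice.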
Update a cluster-installed library
To update a cluster-installed library, uninstall the old version of the library and install the new version. The old version is removed only when you restart the cluster.