Connect to Labelbox
Labelbox is a training data platform used to create training data from images, video, audio, text, and tiled imagery. Using Labelbox, AI teams can customize a workflow to operate, manage and improve data labeling, data cataloging, and model debugging in a single, unified platform. Labelbox is designed to help AI teams build and operate production-grade machine learning systems.
You can connect Azure Databricks clusters that run Databricks Runtime for Machine Learning to Labelbox.
This section describes how to connect a cluster in your Azure Databricks workspace to Labelbox using Partner Connect.
To connect to Labelbox using Partner Connect, you follow the steps in Connect to ML partners using Partner Connect. The Labelbox connection is different from standard machine learning connections in the following ways:
- In addition to a cluster, a service principal, and a personal access token, Partner Connect creates a notebook named labelbox_databricks_example.ipynb in the Workspace/Shared/labelbox_demo folder in your workspace, if it doesn’t already exist.
To connect to Labelbox using Partner Connect, do the following:
- Connect to ML partners using Partner Connect.
- Create a Labelbox API key for your Labelbox account, if you do not have one. Copy the API key and save it in a secure location, as the key will eventually be hidden from view, and you will need this key later.
- Set up the ML cluster and Labelbox starter notebook.
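The API key saved in the steps above has to reach the notebook without being pasted into a cell. The following is a minimal sketch of one way to do that, assuming the key is exposed through an environment variable; the variable name LABELBOX_API_KEY and the helper load_labelbox_api_key are hypothetical and not part of Labelbox or Azure Databricks.

```python
import os

# Hypothetical variable name (an assumption); use whatever your setup defines.
API_KEY_ENV_VAR = "LABELBOX_API_KEY"


def load_labelbox_api_key(env=os.environ):
    """Return the Labelbox API key from the given mapping, or raise a clear error."""
    key = env.get(API_KEY_ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(
            f"{API_KEY_ENV_VAR} is not set; create a Labelbox API key and export it first."
        )
    return key
```

In a notebook, the returned value could then be passed to the Labelbox Python SDK, for example labelbox.Client(api_key=load_labelbox_api_key()). A secret store is usually a better home for the key than a plain environment variable.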
The steps in this section describe how to connect Labelbox to an Azure Databricks cluster manually.
Note
To connect faster, use Partner Connect.
You must have an available cluster running Databricks Runtime for Machine Learning. To check this for an existing cluster, look for ML in the Runtime column when you display the cluster in your workspace. If you do not have an available Databricks Runtime ML cluster, create a cluster and for Databricks Runtime Version, choose a version from the ML list.
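The same check can be sketched in code if you already have a cluster's runtime version string, for example from the spark_version field returned by the Clusters API. This sketch assumes that ML runtime version strings contain the segment "-ml-" (as in 13.3.x-cpu-ml-scala2.12); the helper name is_ml_runtime is hypothetical, and the Runtime column in the UI remains the authoritative check.

```python
def is_ml_runtime(spark_version: str) -> bool:
    """Guess whether a runtime version string refers to Databricks Runtime ML.

    Assumption: ML runtime version strings contain "-ml-",
    for example "13.3.x-cpu-ml-scala2.12".
    """
    return "-ml-" in spark_version.lower()


# Example version strings, used for illustration only.
for version in ("13.3.x-cpu-ml-scala2.12", "13.3.x-scala2.12"):
    label = "ML runtime" if is_ml_runtime(version) else "not an ML runtime"
    print(f"{version}: {label}")
```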
To connect to Labelbox manually, do the following:
- Go to the Labelbox page to Sign Up for a new Labelbox account or to Log In to your existing Labelbox account.
- Create a Labelbox API key for your Labelbox account, if you do not have one. Copy the API key and save it in a secure location, as the key will eventually be hidden from view, and you will need this key later.
- Check for a Labelbox starter notebook in your workspace:
- In the sidebar, click Workspace > Shared.
- If a folder named labelbox_demo does not already exist, create it:
i. Click the down arrow next to Shared.
ii. Click Create > Folder.
iii. Enter labelbox_demo.
iv. Click Create Folder.
- Click the labelbox_demo folder. If a starter notebook named labelbox_databricks_example.ipynb does not exist in the folder, import it:
i. Click the down arrow next to labelbox_demo.
ii. Click Import.
iii. Click URL.
iv. Enter https://github.com/Labelbox/labelbox-python/blob/develop/examples/integrations/databricks/labelbox_databricks_example.ipynb and click Import.
- Continue to set up the ML cluster and Labelbox starter notebook.
- Check that the required Labelbox libraries are installed in your ML cluster:
- In the sidebar, click Compute.
- Click your ML cluster. Use the Filter box to find it, if necessary.
Note
If you used Partner Connect to connect to Labelbox, the ML cluster’s name should be LABELBOX_CLUSTER.
- Click the Libraries tab.
- If the labelbox package is not listed, install it:
i. Click Install New.
ii. Click PyPI.
iii. For Package, enter labelbox.
iv. Click Install.
- If the labelspark package is not listed, install it:
i. Click Install New.
ii. Click PyPI.
iii. For Package, enter labelspark.
iv. Click Install.
- Attach your ML cluster to the starter notebook:
- In the sidebar, click Workspace > Shared > labelbox_demo > labelbox_databricks_example.ipynb.
- Attach your ML cluster to the notebook.
- Browse through the notebook to learn how to automate Labelbox.
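Once the cluster is attached, the library check from the steps above can also be run from a notebook cell. The following is a minimal sketch using only the Python standard library; the helper name missing_packages is hypothetical, and the Libraries tab is still the place to install anything it reports.

```python
from importlib import metadata


def missing_packages(names):
    """Return the subset of distribution names that are not installed."""
    missing = []
    for name in names:
        try:
            metadata.version(name)  # raises PackageNotFoundError if absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# The two packages the starter notebook depends on, per the steps above.
required = ["labelbox", "labelspark"]
not_installed = missing_packages(required)
if not_installed:
    print("Install from the Libraries tab:", ", ".join(not_installed))
else:
    print("All required Labelbox libraries are installed.")
```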
- README in GitHub for the starter notebook
- Labelbox Docs
- Support