Use Azure Machine Learning labeling in Language Studio
Labeling data is an important part of preparing your dataset. The labeling experience in Azure Machine Learning offers easier collaboration, more flexibility, and the ability to outsource labeling tasks to external labeling vendors from the Azure Marketplace. You can use Azure Machine Learning labeling for custom text classification and custom named entity recognition projects.
Connecting your labeling project to Azure Machine Learning is a one-to-one connection. If you disconnect your project, you can't reconnect it to the same Azure Machine Learning project.
You can't label in Language Studio and Azure Machine Learning simultaneously. The labeling experience is enabled in one studio at a time.
The testing and training files in the labeling experience you switch away from will be ignored when training your model.
Only Azure Machine Learning's JSONL file format can be imported into Language Studio.
Projects with the multi-lingual option enabled can't be connected to Azure Machine Learning, and not all languages are supported.
The Azure Machine Learning workspace you're connecting to must be linked to the same Azure Storage account that Language Studio is connected to, and that link must have been made when the workspace was created in the Azure portal. Be sure that the Azure Machine Learning workspace has the Storage Blob Data Reader permission on the storage account.
Switching between the two labeling experiences isn't instantaneous. It may take time to successfully complete the operation.
Import your Azure Machine Learning labels into Language Studio
Language Studio supports the JSONL file format used by Azure Machine Learning. If you’ve been labeling data in Azure Machine Learning, you can import your up-to-date labels into a new custom project to use the features of both studios.
Start by creating a new project for custom text classification or custom named entity recognition.
In the Create a project screen that appears, follow the prompts to connect your storage account, and enter the basic information about your project. Be sure that the Azure resource you're using doesn't have another storage account already connected.
In the Choose container section, choose the option indicating that you already have a correctly formatted file. Then select your most recent Azure Machine Learning labels file.
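Before selecting a labels file for import, it can help to confirm that it is well-formed JSONL, that is, one JSON object per line. The following is a minimal sketch using only the Python standard library. The function name `check_jsonl` is hypothetical, and the check covers syntax only: it does not validate the label schema that Azure Machine Learning or Language Studio expects.

```python
import json
from pathlib import Path


def check_jsonl(path):
    """Return (record_count, errors) for a JSON Lines file.

    Each non-empty line must parse as a standalone JSON object.
    This is a syntax check only; it knows nothing about the
    Azure Machine Learning label schema.
    """
    count = 0
    errors = []
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                errors.append(f"line {line_number}: {exc.msg}")
                continue
            if not isinstance(record, dict):
                errors.append(f"line {line_number}: expected a JSON object")
                continue
            count += 1
    return count, errors
```

Calling `check_jsonl("labels.jsonl")` (the file name is a placeholder) returns the number of valid records and a list of per-line problems; an empty list suggests the file is at least syntactically ready to import.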
Connect to Azure Machine Learning
Before you connect to Azure Machine Learning, you need an Azure Machine Learning account with a pricing plan that can accommodate the compute needs of your project. See the prerequisites section to make sure that you have successfully completed all the requirements to start connecting your Language Studio project to Azure Machine Learning.
Use the Azure portal to navigate to the Azure Blob Storage account connected to your language resource.
Ensure that the Storage Blob Data Contributor role is assigned to your Azure Machine Learning workspace within the role assignments for your Azure Blob Storage account.
Navigate to your project in Language Studio. From the left navigation menu of your project, select Data labeling.
Select Use Azure Machine Learning to label in either the Data labeling description or the Activity pane.
Select Connect to Azure Machine Learning to start the connection process.
In the window that appears, follow the prompts. Select the Azure Machine Learning workspace you’ve created previously under the same Azure subscription. Enter a name for the new Azure Machine Learning project that will be created to enable labeling in Azure Machine Learning.
Tip
Make sure your workspace is linked to the same Azure Blob Storage account and Language resource before continuing. If it isn't, you can create a new workspace and link it to your storage account using the Azure portal.
(Optional) Turn on the vendor labeling toggle to use labeling vendor companies. Before choosing vendors, contact them on the Azure Marketplace to finalize a contract. For more information about working with vendor companies, see How to outsource data labeling.
You can also leave labeling instructions for the human labelers. Clear definitions of each label, together with examples, help labelers understand the task and produce better results.
Review the settings for your connection to Azure Machine Learning and make changes if needed.
Important
Finalizing the connection is permanent. If you disconnect the established connection at any point, your Language Studio project can never reconnect to the same Azure Machine Learning project.
After the connection has been initiated, your ability to label data in Language Studio will be disabled for a few minutes to prepare the new connection.
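The role assignment in the steps above can also be made from a command line instead of the Azure portal. The sketch below only builds the Azure CLI command (`az role assignment create`) as an argument list and prints it; it does not run anything. It assumes the workspace is represented by a managed identity whose object ID you already know, which the article does not state, and the helper names `storage_scope` and `role_assignment_command` are hypothetical.

```python
import shlex

ROLE_NAME = "Storage Blob Data Contributor"  # role named in the steps above


def storage_scope(subscription_id, resource_group, storage_account):
    """Build the Azure resource ID of a storage account."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{storage_account}"
    )


def role_assignment_command(principal_object_id, scope):
    """Return the Azure CLI argument list that grants ROLE_NAME on scope.

    Assumption: the workspace identity is a managed identity, so the
    principal type is ServicePrincipal.
    """
    return [
        "az", "role", "assignment", "create",
        "--assignee-object-id", principal_object_id,
        "--assignee-principal-type", "ServicePrincipal",
        "--role", ROLE_NAME,
        "--scope", scope,
    ]


if __name__ == "__main__":
    # Placeholder values; replace them with your own before running the command.
    scope = storage_scope("<subscription-id>", "<resource-group>", "<storage-account>")
    command = role_assignment_command("<workspace-principal-object-id>", scope)
    print(shlex.join(command))
```

Keeping the command as a list makes it safe to pass to `subprocess.run` later without shell quoting problems, once the placeholders are filled in.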
Switch to labeling with Azure Machine Learning from Language Studio
Once the connection has been established, you can switch to Azure Machine Learning through the Activity pane in Language Studio at any time.
When you switch, your ability to label data in Language Studio will be disabled, and you will be able to label data in Azure Machine Learning. You can switch back to labeling in Language Studio at any time through Azure Machine Learning.
Train your model using labels from Azure Machine Learning
When you switch to labeling using Azure Machine Learning, you can still train, evaluate, and deploy your model in Language Studio. To train your model using updated labels from Azure Machine Learning:
Select Training jobs from the navigation menu on the left of the Language Studio screen for your project.
Select Import latest labels from Azure Machine Learning from the Choose label origin section in the training page. This synchronizes the labels from Azure Machine Learning before starting the training job.
Switch to labeling with Language Studio from Azure Machine Learning
After you've switched to labeling with Azure Machine Learning, you can switch back to labeling with Language Studio at any time.
Note
Only users with the correct roles in Azure Machine Learning have the ability to switch labeling.
When you switch to using Language Studio, labeling in Azure Machine Learning will be disabled.
To switch back to labeling with Language Studio:
Navigate to your project in Azure Machine Learning and select Data labeling from the left navigation menu.
Select the Language Studio tab and select Switch to Language Studio.
The process takes a few minutes to complete, and your ability to label in Azure Machine Learning will be disabled until it's switched back from Language Studio.
Disconnecting from Azure Machine Learning
Disconnecting your project from Azure Machine Learning is permanent and can't be undone. You will no longer be able to access your labels in Azure Machine Learning, and you won’t be able to reconnect the Azure Machine Learning project to any Language Studio project in the future. To disconnect from Azure Machine Learning:
Ensure that any updated labels you want to keep are synchronized from Azure Machine Learning by switching the labeling experience back to Language Studio.
Select Project settings from the navigation menu on the left in Language Studio.
Select the Disconnect from Azure Machine Learning button from the Manage Azure Machine Learning connections section.
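The connection rules described in this article (one active labeling studio at a time, switching in both directions once connected, and a permanent disconnect) can be summarized as a small state model. The following is an illustrative toy only: `LabelingProject` and `Studio` are hypothetical names and not part of any Azure SDK. The sketch also enforces switching back to Language Studio before disconnecting, which the article recommends rather than requires.

```python
from enum import Enum


class Studio(Enum):
    LANGUAGE_STUDIO = "Language Studio"
    AZURE_ML = "Azure Machine Learning"


class LabelingProject:
    """Toy model of the connection rules; not an Azure API."""

    def __init__(self):
        self.connected = False
        self.permanently_disconnected = False
        self.active = Studio.LANGUAGE_STUDIO

    def connect(self):
        # A disconnected project can't reconnect to the same AML project.
        if self.permanently_disconnected:
            raise RuntimeError("connection was permanently removed")
        self.connected = True

    def switch_to(self, studio):
        if studio is Studio.AZURE_ML and not self.connected:
            raise RuntimeError("connect to Azure Machine Learning first")
        # Labeling is enabled in exactly one studio at a time.
        self.active = studio

    def can_label_in(self, studio):
        return self.active is studio

    def disconnect(self):
        if not self.connected:
            raise RuntimeError("project is not connected")
        # Simplification: the article recommends switching back first so
        # that updated labels are kept; this sketch makes it mandatory.
        if self.active is not Studio.LANGUAGE_STUDIO:
            raise RuntimeError("switch back to Language Studio first")
        self.connected = False
        self.permanently_disconnected = True
```

For example, after `connect()` and `switch_to(Studio.AZURE_ML)`, `can_label_in(Studio.LANGUAGE_STUDIO)` is false, and once `disconnect()` succeeds any later `connect()` call raises an error.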