Once you are satisfied with how your model performs, it's ready to be deployed and queried for predictions from utterances. Deploying a model makes it available for use through the prediction API.
After you have reviewed the model's performance and decided it's fit to be used in your environment, you need to assign it to a deployment to be able to query it. Assigning the model to a deployment makes it available for use through the prediction API. We recommend creating a deployment named production, to which you assign the best model you have built so far, and using it in your system. You can create another deployment called staging, to which you assign the model you're currently working on so that you can test it. You can have a maximum of 10 deployments in your project.
Select Add deployment to start a new deployment job.
Select Create new deployment to create a new deployment and assign a trained model from the dropdown below. Alternatively, select Overwrite an existing deployment and choose the trained model you want to assign to it from the dropdown below.
Note
Overwriting an existing deployment doesn't require changes to your prediction API call, but the results you get will be based on the newly assigned model.
No configurations are required for custom question answering or unlinked intents.
LUIS projects must be published to the slot configured during the Orchestration deployment, and custom question answering KBs must also be published to their Production slots.
Select Deploy to submit your deployment job.
After the deployment succeeds, an expiration date appears next to it. Deployment expiration is the date after which your deployed model is no longer available for prediction, which typically happens twelve months after a training configuration expires.
Submit deployment job
Create a PUT request using the following URL, headers, and JSON body to start deploying an orchestration workflow model.
Use the following header to authenticate your request.
| Key | Value |
|---|---|
| Ocp-Apim-Subscription-Key | The key to your resource. Used for authenticating your API requests. |
Request Body
{
  "trainedModelLabel": "{MODEL-NAME}"
}
| Key | Placeholder | Value | Example |
|---|---|---|---|
| trainedModelLabel | {MODEL-NAME} | The model name that will be assigned to your deployment. You can only assign successfully trained models. This value is case-sensitive. | myModel |
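The PUT request described above can be sketched in Python using only the standard library. This is a minimal illustration, not the definitive call: the URL path is an assumption (the request URL is not shown in this section), and the endpoint, key, and default API version are placeholder values. Check them against the authoring API reference before use. The sketch only builds the request and does not send it.

```python
import json
import urllib.request
from urllib.parse import quote


def build_deployment_request(endpoint, project, deployment, model_label, key,
                             api_version="2022-10-01-preview"):
    """Build (but do not send) a PUT request that assigns a trained model to a deployment."""
    # ASSUMPTION: this path is illustrative; the request URL is not shown in this section.
    url = (
        f"{endpoint.rstrip('/')}/language/authoring/analyze-conversations"
        f"/projects/{quote(project, safe='')}"
        f"/deployments/{quote(deployment, safe='')}"
        f"?api-version={api_version}"
    )
    # The body carries only the trained model label, with no trailing comma.
    body = json.dumps({"trainedModelLabel": model_label}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


request = build_deployment_request(
    "https://contosoresource.cognitiveservices.azure.com",  # hypothetical endpoint
    "myProject",
    "staging",
    "myModel",
    "<your-resource-key>",  # placeholder, not a real key
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. A 202 response would then carry the operation-location header described in the next step.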
Once you send your API request, you will receive a 202 response indicating success. In the response headers, extract the operation-location value. It will be formatted like this:
The name for your project. This value is case-sensitive.
myProject
{DEPLOYMENT-NAME}
The name for your deployment. This value is case-sensitive.
staging
{JOB-ID}
The ID for locating your model's deployment status. This is in the operation-location header value you received from the API in response to your model deployment request.
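As a rough sketch, the job ID can be pulled out of the operation-location header with the standard library. The header format is not shown in this section, so the code assumes the URL contains a `/jobs/{JOB-ID}` segment pair. The helper name and the sample URL are hypothetical.

```python
from urllib.parse import urlparse


def job_id_from_operation_location(operation_location):
    """Return the path segment that follows 'jobs' in an operation-location URL."""
    segments = [s for s in urlparse(operation_location).path.split("/") if s]
    # ASSUMPTION: the URL contains a '/jobs/{JOB-ID}' segment pair.
    index = segments.index("jobs")  # raises ValueError if there is no 'jobs' segment
    return segments[index + 1]


# Hypothetical header value, for illustration only.
example = (
    "https://contosoresource.cognitiveservices.azure.com/language/authoring/"
    "analyze-conversations/projects/myProject/deployments/staging/"
    "jobs/1234-abcd?api-version=2022-10-01-preview"
)
print(job_id_from_operation_location(example))
```

In practice you can also poll the full operation-location URL directly, which avoids parsing it at all.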
To delete a deployment from within Language Studio, go to the Deploy model page. Select the deployment you want to delete, and select Delete deployment from the top menu.
Create a DELETE request using the following URL, headers, and JSON body to delete a conversational language understanding deployment.
Go to the Deploying a model page in Language Studio.
Select the Regions tab.
Select Add deployment resource.
Select a Language resource in another region.
You are now ready to deploy your project to the regions where you have assigned resources.
Assigning deployment resources programmatically requires Microsoft Entra authentication. Microsoft Entra ID is used to confirm you have access to the resources you are interested in assigning to your project for multi-region deployment. To programmatically use Microsoft Entra authentication when making REST API calls, see the Azure AI services authentication documentation.
Assign resource
Submit a POST request using the following URL, headers, and JSON body to assign deployment resources.
Request URL
Use the following URL when creating your API request. Replace the placeholder values below with your own values.
The custom subdomain of the resource you want to assign. Found in the Azure portal under the Keys and Endpoint tab for the resource, part of the Endpoint field in the URL https://<your-custom-subdomain>.cognitiveservices.azure.com/
contosoresource
region
{REGION-CODE}
A region code specifying the region of the resource you want to assign. Found in the Azure portal under the Keys and Endpoint tab for the resource, as part of the Location/Region field.
eastus
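If you already have the resource's endpoint, the custom subdomain can be derived from it rather than copied by hand. The following is a small sketch that assumes the endpoint follows the `https://<your-custom-subdomain>.cognitiveservices.azure.com/` pattern quoted above. The function name is hypothetical.

```python
from urllib.parse import urlparse

ENDPOINT_SUFFIX = ".cognitiveservices.azure.com"


def custom_subdomain(endpoint):
    """Return the custom subdomain part of a resource endpoint URL."""
    host = urlparse(endpoint).hostname or ""
    # Reject hosts that do not match the expected pattern or have an empty subdomain.
    if not host.endswith(ENDPOINT_SUFFIX) or len(host) == len(ENDPOINT_SUFFIX):
        raise ValueError(f"unexpected endpoint host: {host!r}")
    return host[: -len(ENDPOINT_SUFFIX)]


print(custom_subdomain("https://contosoresource.cognitiveservices.azure.com/"))
```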
Get assign resource status
Use the following GET request to get the status of your assign deployment resource job. Replace the placeholder values below with your own values.
The name for your project. This value is case-sensitive.
myProject
{JOB-ID}
The job ID for getting your assign deployment status. This is in the operation-location header value you received from the API in response to your assign deployment resource request.
xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx
{API-VERSION}
The version of the API you're calling.
2022-10-01-preview
Headers
Use the following header to authenticate your request.
| Key | Value |
|---|---|
| Ocp-Apim-Subscription-Key | The key to your resource. Used for authenticating your API requests. |
Response Body
Once you send the request, you will get the following response. Keep polling this endpoint until the status parameter changes to "succeeded".
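The polling step can be sketched as a small loop. To keep the example runnable without network access, it takes a hypothetical `fetch_status` callable that should perform the GET request and return the status string from the response. Only "succeeded" comes from the text above. The other status values here are assumptions used for illustration.

```python
import time


def poll_until_done(fetch_status, interval_seconds=2.0, max_attempts=30):
    """Call fetch_status() until it returns a terminal status or attempts run out."""
    # ASSUMPTION: "failed" and "cancelled" are guessed terminal values;
    # only "succeeded" is named in the text.
    terminal = {"succeeded", "failed", "cancelled"}
    for attempt in range(max_attempts):
        status = fetch_status()
        if status in terminal:
            return status
        if attempt < max_attempts - 1:
            time.sleep(interval_seconds)  # wait before asking again
    raise TimeoutError(f"job still not finished after {max_attempts} attempts")


# Stand-in for real GET calls: a canned sequence of hypothetical status values.
canned = iter(["notStarted", "running", "succeeded"])
final_status = poll_until_done(lambda: next(canned), interval_seconds=0)
print(final_status)
```

Capping the number of attempts keeps a stuck job from blocking the caller indefinitely. A production client would likely also add backoff between polls.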
When unassigning or removing a deployment resource from a project, you will also delete all the deployments that have been deployed to that resource's region.
The name for your project. This value is case-sensitive.
myProject
{JOB-ID}
The job ID for getting your unassign deployment status. This is in the operation-location header value you received from the API in response to your unassign deployment resource request.
xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx
{API-VERSION}
The version of the API you're calling.
2022-10-01-preview
Headers
Use the following header to authenticate your request.
| Key | Value |
|---|---|
| Ocp-Apim-Subscription-Key | The key to your resource. Used for authenticating your API requests. |
Response Body
Once you send the request, you will get the following response. Keep polling this endpoint until the status parameter changes to "succeeded".