
Deploy Microsoft Foundry Models using code

Note

This document refers to the Microsoft Foundry (classic) portal. Switch to the Microsoft Foundry (new) documentation if you're using the new portal.

Important

If you're currently using an Azure AI Inference beta SDK with Microsoft Foundry Models or Azure OpenAI service, we strongly recommend that you transition to the generally available OpenAI/v1 API, which works with the stable OpenAI SDKs.

For more information on how to migrate to the OpenAI/v1 API by using an SDK in your programming language of choice, see Migrate from Azure AI Inference SDK to OpenAI SDK.

You can choose and configure which models are available for inference in your Microsoft Foundry resource. After you configure a model, you can generate predictions from it by specifying its model name or deployment name in your requests. You don't need to make any other changes in your code to use the model.

In this article, you learn how to add a new model to a Foundry Models endpoint.

Prerequisites

To complete this article, you need:

  • Install the Azure CLI and the cognitiveservices extension for Foundry Tools.

    az extension add -n cognitiveservices
    
  • Some of the commands in this tutorial use the jq tool, which might not be installed on your system. For installation instructions, see Download jq.

  • Identify the following information:

    • Your Azure subscription ID.

    • Your Foundry Tools resource name.

    • The resource group where you deployed the Foundry Tools resource.
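
If you're already signed in with the Azure CLI, you can look up your subscription ID directly instead of searching for it in the portal:

az account show --query id -o tsv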

Add models

To add a model, first identify the model that you want to deploy. You can query the available models as follows:

  1. Sign in to your Azure subscription.

    az login
    
  2. If you have more than one subscription, select the subscription where your resource is located.

    az account set --subscription $subscriptionId
    
  3. Set the following environment variables with the name of the Foundry Tools resource you plan to use and its resource group.

    accountName="<ai-services-resource-name>"
    resourceGroupName="<resource-group>"
    location="eastus2"
    
  4. If you haven't created a Foundry Tools account yet, create one.

    az cognitiveservices account create -n $accountName -g $resourceGroupName --custom-domain $accountName --location $location --kind AIServices --sku S0
    
  5. Check which models are available to you and under which SKU. SKUs, also known as deployment types, define how Azure infrastructure is used to process requests. Models might offer different deployment types. The following command lists all the model definitions available:

    az cognitiveservices account list-models \
        -n $accountName \
        -g $resourceGroupName \
    | jq '.[] | { name: .name, format: .format, version: .version, sku: .skus[0].name, capacity: .skus[0].capacity.default }'
    
  6. The output looks like the following example:

    {
      "name": "Phi-3.5-vision-instruct",
      "format": "Microsoft",
      "version": "2",
      "sku": "GlobalStandard",
      "capacity": 1
    }
    
  7. Identify the model you want to deploy. You need the properties name, format, version, and sku. The format property indicates the provider offering the model. Depending on the deployment type, you might also need capacity. To locate a specific model without scanning the full output, see the filtering sketch after these steps.

  8. Add the model deployment to the resource. The following example adds Phi-3.5-vision-instruct:

    az cognitiveservices account deployment create \
        -n $accountName \
        -g $resourceGroupName \
        --deployment-name Phi-3.5-vision-instruct \
        --model-name Phi-3.5-vision-instruct \
        --model-version 2 \
        --model-format Microsoft \
        --sku-capacity 1 \
        --sku-name GlobalStandard
    
  9. The model is ready to use.

If needed, you can deploy the same model multiple times, as long as each deployment uses a different deployment name. This capability can be useful when you want to test different configurations for a given model, including content filters.
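
If you already know which model you want, you can filter the list from step 5 instead of scanning the full output. A minimal jq sketch, using the model from this article's example:

az cognitiveservices account list-models \
    -n $accountName \
    -g $resourceGroupName \
| jq '.[] | select(.name == "Phi-3.5-vision-instruct") | { name: .name, format: .format, version: .version, sku: .skus[0].name, capacity: .skus[0].capacity.default }'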

Use the model

Deployed models can be consumed through the Azure AI Model Inference endpoint of the resource. When constructing your request, set the model parameter to the deployment name you created. You can programmatically get the URI for the inference endpoint using the following code:

Inference endpoint

az cognitiveservices account show -n $accountName -g $resourceGroupName | jq '.properties.endpoints["Azure AI Model Inference API"]'

To make requests to the Microsoft Foundry Models endpoint, append the route /models to the resource endpoint, for example https://<resource>.services.ai.azure.com/models. You can find the API reference for the endpoint on the Azure AI Model Inference API reference page.

Inference keys

az cognitiveservices account keys list -n $accountName -g $resourceGroupName
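
With the endpoint and one of the keys, you can send a test request to your deployment. The following curl sketch targets the chat completions route; the api-version shown is an example value, so confirm the current one on the Azure AI Model Inference API reference page:

curl -X POST "https://<resource>.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview" \
    -H "Content-Type: application/json" \
    -H "api-key: <your-inference-key>" \
    -d '{
        "model": "Phi-3.5-vision-instruct",
        "messages": [
            { "role": "user", "content": "Hello!" }
        ]
    }'

The model value is the deployment name you created, which is why no other changes to your code are needed when you switch deployments.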

Manage deployments

You can see all the deployments available using the CLI:

  1. Run the following command to see all the active deployments:

    az cognitiveservices account deployment list -n $accountName -g $resourceGroupName
    
  2. You can see the details of a given deployment:

    az cognitiveservices account deployment show \
        --deployment-name "Phi-3.5-vision-instruct" \
        -n $accountName \
        -g $resourceGroupName
    
  3. You can delete a given deployment as follows:

    az cognitiveservices account deployment delete \
        --deployment-name "Phi-3.5-vision-instruct" \
        -n $accountName \
        -g $resourceGroupName
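
If you manage many deployments, you can reduce the list to just the fields you care about. A small jq sketch; the field paths assume the standard deployment resource shape returned by the list command, so adjust them if your output differs:

az cognitiveservices account deployment list -n $accountName -g $resourceGroupName \
| jq '.[] | { deployment: .name, model: .properties.model.name, version: .properties.model.version, sku: .sku.name }'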
    

Deploy by using Bicep

Alternatively, you can describe model deployments in Bicep templates and roll them out with the Azure CLI. The rest of this article shows how to deploy the same model this way.

Prerequisites

To complete this article, you need:

  • Install the Azure CLI.

  • Identify the following information:

    • Your Azure subscription ID.

    • Your Microsoft Foundry resource (formerly known as Azure AI Services resource) name.

    • The resource group where the Foundry resource is deployed.

    • The model name, provider, version, and SKU you want to deploy. You can use the Foundry portal or the Azure CLI to find this information, as shown in the sketch after this list. In this example, you deploy the following model:

      • Model name: Phi-3.5-vision-instruct
      • Provider: Microsoft
      • Version: 2
      • Deployment type: Global standard
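
To look these values up with the Azure CLI rather than the portal, you can filter the model list with a JMESPath query. A minimal sketch, assuming the variable names used later in this article:

az cognitiveservices account list-models \
    -n $ACCOUNT_NAME \
    -g $RESOURCE_GROUP \
    --query "[?name=='Phi-3.5-vision-instruct'].{name:name, format:format, version:version, sku:skus[0].name}" \
    -o table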

About this tutorial

The example in this article is based on code samples contained in the Azure-Samples/azureai-model-inference-bicep repository. To run the commands locally without copying or pasting file content, use the following commands to clone the repository and go to the folder for your coding language:

git clone https://github.com/Azure-Samples/azureai-model-inference-bicep

The files for this example are in:

cd azureai-model-inference-bicep/infra

Permissions required to subscribe to Models from Partners and Community

Foundry Models from partners and the community (for example, Cohere models) are made available for deployment through Azure Marketplace. Model providers define the license terms and set the price for the use of their models through Azure Marketplace.

When deploying third-party models, ensure you have the following permissions in your account:

  • On the Azure subscription:
    • Microsoft.MarketplaceOrdering/agreements/offers/plans/read
    • Microsoft.MarketplaceOrdering/agreements/offers/plans/sign/action
    • Microsoft.MarketplaceOrdering/offerTypes/publishers/offers/plans/agreements/read
    • Microsoft.Marketplace/offerTypes/publishers/offers/plans/agreements/read
    • Microsoft.SaaS/register/action
  • On the resource group, to create and use the SaaS resource:
    • Microsoft.SaaS/resources/read
    • Microsoft.SaaS/resources/write
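
These permissions come from your Azure role assignments. To review what you currently hold at the subscription scope, a quick sketch (assuming $subscriptionId is set as in the first part of this article):

az role assignment list \
    --assignee $(az ad signed-in-user show --query userPrincipalName -o tsv) \
    --scope "/subscriptions/$subscriptionId" \
    -o table

This lists roles rather than individual actions, so confirm that one of your roles includes the actions above.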

Add the model

  1. Use the template ai-services-deployment-template.bicep to describe model deployments:

    ai-services-deployment-template.bicep

    @description('Name of the Azure AI services account')
    param accountName string
    
    @description('Name of the model to deploy')
    param modelName string
    
    @description('Version of the model to deploy')
    param modelVersion string
    
    @allowed([
      'AI21 Labs'
      'Cohere'
      'Core42'
      'DeepSeek'
      'xAI'
      'Meta'
      'Microsoft'
      'Mistral AI'
      'OpenAI'
    ])
    @description('Model provider')
    param modelPublisherFormat string
    
    @allowed([
        'GlobalStandard'
        'DataZoneStandard'
        'Standard'
        'GlobalProvisioned'
        'Provisioned'
    ])
    @description('Model deployment SKU name')
    param skuName string = 'GlobalStandard'
    
    @description('Content filter policy name')
    param contentFilterPolicyName string = 'Microsoft.DefaultV2'
    
    @description('Model deployment capacity')
    param capacity int = 1
    
    resource modelDeployment 'Microsoft.CognitiveServices/accounts/deployments@2024-04-01-preview' = {
      name: '${accountName}/${modelName}'
      sku: {
        name: skuName
        capacity: capacity
      }
      properties: {
        model: {
          format: modelPublisherFormat
          name: modelName
          version: modelVersion
        }
        raiPolicyName: contentFilterPolicyName == null ? 'Microsoft.Nill' : contentFilterPolicyName
      }
    }
    
  2. Run the deployment (to preview changes without applying them, see the what-if sketch after these steps):

    RESOURCE_GROUP="<resource-group-name>"
    ACCOUNT_NAME="<azure-ai-model-inference-name>" 
    MODEL_NAME="Phi-3.5-vision-instruct"
    PROVIDER="Microsoft"
    VERSION=2
    
    az deployment group create \
        --resource-group $RESOURCE_GROUP \
        --template-file ai-services-deployment-template.bicep \
        --parameters accountName=$ACCOUNT_NAME modelName=$MODEL_NAME modelVersion=$VERSION modelPublisherFormat=$PROVIDER
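
Optionally, you can preview the changes the template would make without applying them by using the what-if operation with the same parameters:

az deployment group what-if \
    --resource-group $RESOURCE_GROUP \
    --template-file ai-services-deployment-template.bicep \
    --parameters accountName=$ACCOUNT_NAME modelName=$MODEL_NAME modelVersion=$VERSION modelPublisherFormat=$PROVIDER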
    

The deployed model can be consumed the same way as described in the Use the model section earlier in this article.