How to deploy Azure OpenAI models with Azure AI Studio

Note

Azure AI Studio is currently in public preview. This preview is provided without a service-level agreement, and we don't recommend it for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

Azure OpenAI Service offers a diverse set of models with different capabilities and price points. Model availability varies by region. You can create Azure OpenAI model deployments in Azure AI Studio and consume them with prompt flow or your favorite tool. To learn more about the details of each model see Azure OpenAI Service models.

Deploying an Azure OpenAI model from the model catalog

To modify and interact with an Azure OpenAI model in the Azure AI Studio playground, first you need to deploy a base Azure OpenAI model to your project. Once the model is deployed and available in your project, you can consume its REST API endpoint as-is or customize further with your own data and other components (embeddings, indexes, etcetera).

  1. Choose a model you want to deploy from Azure AI Studio model catalog. Alternatively, you can initiate deployment by selecting + Create from your project>deployments

  2. Select Deploy to project on the model card details page.

  3. Choose the project you want to deploy the model to. For Azure OpenAI models, the Azure AI Content Safety filter is automatically turned on.

  4. Select Deploy.

  5. You land in the playground. Select View Code to obtain code samples that can be used to consume the deployed model in your application.

Regional availability and quota limits of a model

For Azure OpenAI models, the default quota for models varies by model and region. Certain models might only be available in some regions. For more information, see Azure OpenAI Service quotas and limits.

Quota for deploying and inferencing a model

For Azure OpenAI models, deploying and inferencing consumes quota that is assigned to your subscription on a per-region, per-model basis in units of Tokens-per-Minutes (TPM). When you sign up for Azure AI Studio, you receive default quota for most available models. Then, you assign TPM to each deployment as it is created, and the available quota for that model will be reduced by that amount. You can continue to create deployments and assign them TPM until you reach your quota limit.

Once that happens, you can only create new deployments of that model by:

See Azure AI Studio quota and Manage Azure OpenAI Service quota to learn more about quota.

Next steps