Bring Your Own Model (BYOM) with Voice Live API

The Voice Live API provides Bring Your Own Model (BYOM) capabilities, allowing you to integrate your custom models into the voice interaction workflow. BYOM is useful for the following scenarios:

  • Fine-tuned models: Use your custom fine-tuned Azure OpenAI or Azure AI Foundry models
  • Provisioned throughput: Use your provisioned throughput unit (PTU) deployments for consistent performance
  • Content safety: Apply customized content safety configurations with your LLM

Important

You can integrate any model deployed in the same Azure AI Foundry resource you're using to call the Voice Live API.

Tip

When you use your own model deployment with Voice Live, we recommend you set its content filtering configuration to Asynchronous filtering to reduce latency. Content filtering settings can be configured in the Azure AI Foundry portal.

Authentication setup

When you use Microsoft Entra ID authentication with the Voice Live API in byom-azure-openai-chat-completion mode, you must configure permissions for your Foundry resource. Because tokens can expire during long sessions, the resource's system-assigned managed identity needs access to the model deployments so the service can refresh access on your behalf.

Run the following Azure CLI commands to configure the necessary permissions:

export subscription_id=<your-subscription-id>
export resource_group=<your-resource-group>
export foundry_resource=<your-foundry-resource>

# Enable system-assigned managed identity for the foundry resource
az cognitiveservices account identity assign --name ${foundry_resource} --resource-group ${resource_group} --subscription ${subscription_id}

# Get the system-assigned managed identity object ID
identity_principal_id=$(az cognitiveservices account show --name ${foundry_resource} --resource-group ${resource_group} --subscription ${subscription_id} --query "identity.principalId" -o tsv)

# Assign the Azure AI User role to the system identity of the foundry resource
az role assignment create --assignee-object-id ${identity_principal_id} --role "Azure AI User" --scope /subscriptions/${subscription_id}/resourceGroups/${resource_group}/providers/Microsoft.CognitiveServices/accounts/${foundry_resource}

Choose BYOM integration mode

The Voice Live API supports two BYOM integration modes:

  • byom-azure-openai-realtime: Azure OpenAI realtime models for streaming voice interactions. Example models: gpt-realtime, gpt-realtime-mini
  • byom-azure-openai-chat-completion: Azure OpenAI chat completion models for text-based interactions; also applies to other Foundry models. Example models: gpt-4.1, gpt-5-chat, grok-3
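If a client needs to pick the right profile for a given deployment, the mode-to-model pairing above can be captured in code. The sketch below is illustrative only: it covers just the example models listed here, and the function name is hypothetical, not part of any SDK.

```python
# Illustrative mapping of the two BYOM integration modes to the example
# models listed above. This is NOT an exhaustive list of supported models.
BYOM_PROFILES = {
    "byom-azure-openai-realtime": ["gpt-realtime", "gpt-realtime-mini"],
    "byom-azure-openai-chat-completion": ["gpt-4.1", "gpt-5-chat", "grok-3"],
}

def profile_for_model(model: str) -> str:
    """Return the BYOM profile whose example list contains `model`."""
    for profile, models in BYOM_PROFILES.items():
        if model in models:
            return profile
    raise ValueError(f"No known BYOM profile for model {model!r}")
```

In practice, check your own deployment against the mode descriptions above rather than a hardcoded list.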

Integrate BYOM

Update the endpoint URL in your API call to include your BYOM configuration:

wss://<your-foundry-resource>.cognitiveservices.azure.com/voice-live/realtime?api-version=2025-10-01&profile=<your-byom-mode>&model=<your-model-deployment>

Get the <your-model-deployment> value from the AI Foundry portal. It corresponds to the name you gave the model at deployment time.
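To avoid hand-assembling the WebSocket URL, the template above can be built programmatically. This is a minimal sketch: the function name and the placeholder values in the example are hypothetical, and only the URL shape comes from this article.

```python
from urllib.parse import urlencode

def voice_live_byom_url(
    foundry_resource: str,
    byom_mode: str,
    model_deployment: str,
    api_version: str = "2025-10-01",
) -> str:
    """Build the Voice Live WebSocket endpoint URL for a BYOM session,
    following the wss:// template shown above."""
    query = urlencode({
        "api-version": api_version,
        "profile": byom_mode,
        "model": model_deployment,
    })
    return (
        f"wss://{foundry_resource}.cognitiveservices.azure.com"
        f"/voice-live/realtime?{query}"
    )

# Example with hypothetical placeholder values:
url = voice_live_byom_url(
    "my-foundry", "byom-azure-openai-chat-completion", "my-gpt-4-1"
)
```

Pass the resulting URL to your WebSocket client along with your usual authentication headers or token.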