Hello @Muhammad Umar
Thank you for reaching out to Microsoft Q&A.
Based on the current Azure OpenAI implementation, the behavior you are seeing is expected.
At this time, Azure OpenAI supports:
- realtime voice-to-voice interactions through GPT-4o realtime models (such as gpt-4o-realtime-preview)
- Whisper-based transcription using models like whisper-1 and gpt-4o-transcribe
- streaming Speech-to-Text scenarios through Azure Speech services
However, Azure OpenAI does not currently expose a dedicated gpt-4o-realtime-whisper model within the Azure realtime endpoint (/openai/v1/realtime) for simultaneous live transcription during active voice-to-voice sessions.
In other words, running a realtime voice conversation with gpt-4o-realtime-whisper as an integrated parallel transcription model is not currently supported in Azure OpenAI in the way it may work on the public OpenAI platform.
At present, if live transcription is required alongside a realtime voice session in Azure, the recommended approaches are:
- Run a parallel Speech-to-Text streaming connection. You can use Azure Speech SDK Conversation Transcription, standard Speech-to-Text streaming APIs, or whisper-1/gpt-4o-transcribe alongside the realtime GPT-4o voice session.
- Monitor model availability in your Azure region. Realtime model support in Azure OpenAI depends on region availability, API version, deployment type, and staged rollout status.
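As a rough illustration of the parallel-connection approach, the sketch below copies each captured audio chunk to two consumers: one feeding the realtime voice session and one feeding a separate transcription stream. It uses only the Python standard library. The names `fan_out_audio`, `realtime_sink`, and `transcription_sink` are hypothetical and not part of any Azure SDK; in a real application each sink would wrap whatever client you use (for example, the realtime WebSocket connection and a Speech SDK push stream).

```python
import asyncio
from typing import Awaitable, Callable, Iterable, Optional

# Hypothetical sink type: any coroutine that accepts one chunk of raw audio.
AudioSink = Callable[[bytes], Awaitable[None]]


async def _drain(queue: "asyncio.Queue[Optional[bytes]]", sink: AudioSink) -> None:
    """Forward chunks from one queue to one sink until the None sentinel arrives."""
    while True:
        chunk = await queue.get()
        if chunk is None:
            return
        await sink(chunk)


async def fan_out_audio(
    chunks: Iterable[bytes],
    realtime_sink: AudioSink,       # assumption: sends audio to the voice session
    transcription_sink: AudioSink,  # assumption: sends audio to a transcription stream
    max_buffered: int = 32,
) -> None:
    """Deliver every audio chunk to both sinks, in order."""
    queues = [asyncio.Queue(maxsize=max_buffered) for _ in range(2)]

    async def produce() -> None:
        for chunk in chunks:
            for queue in queues:
                await queue.put(chunk)
        for queue in queues:
            await queue.put(None)  # sentinel: no more audio

    tasks = [
        asyncio.create_task(produce()),
        asyncio.create_task(_drain(queues[0], realtime_sink)),
        asyncio.create_task(_drain(queues[1], transcription_sink)),
    ]
    try:
        await asyncio.gather(*tasks)
    finally:
        # If one sink failed, stop the remaining tasks instead of leaving them blocked.
        for task in tasks:
            task.cancel()
```

Because the queues are bounded, a slow transcription sink will eventually slow down delivery to the realtime sink as well. If voice latency matters more than transcript completeness, you may prefer to drop chunks for the transcription side rather than block.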
If and when gpt-4o-realtime-whisper becomes available in Azure OpenAI, it will appear in the supported model list for your region and API version.
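If you want to automate that check, a small helper like the one below can scan a model list you have already retrieved. This is only a sketch: it assumes the list is a JSON object of the form `{"data": [{"id": "..."}]}`, which is the shape used by OpenAI-style model listings and may differ for your API version. The function name `model_is_listed` and the sample payload are hypothetical.

```python
from typing import Any, Mapping


def model_is_listed(models_payload: Mapping[str, Any], model_id: str) -> bool:
    """Return True if model_id appears among the entries under "data".

    Assumption: each entry is a mapping with an "id" field. Adjust the keys
    if the payload returned for your resource is shaped differently.
    """
    entries = models_payload.get("data", [])
    return any(entry.get("id") == model_id for entry in entries)


# Hypothetical payload for illustration only, not real service output.
sample_payload = {
    "data": [
        {"id": "gpt-4o-realtime-preview"},
        {"id": "gpt-4o-transcribe"},
    ]
}

if model_is_listed(sample_payload, "gpt-4o-realtime-whisper"):
    print("Model is listed; check that it can be deployed in your region.")
else:
    print("Model is not listed yet; keep using a parallel transcription stream.")
```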
Based on current availability, the integrated realtime Whisper functionality appears to be available today on the public OpenAI platform and in certain Microsoft Foundry scenarios, but it is not yet fully exposed through the Azure OpenAI realtime APIs.
Thank you!