There is no region-specific or feature-gating prerequisite for input_audio_transcription documented for Azure OpenAI Realtime beyond using the correct models and schema. A 500 response when adding input_audio_transcription indicates a service-side error rather than a validation or configuration issue on the client.
From the Realtime REST reference:
- input_audio_transcription is an optional configuration object on the Realtime session. It “defaults to off and can be set to null to turn off once on.”
- It is not tied to a special feature flag in the public documentation.
- The supported transcription models for this property are:
  - whisper-1
  - gpt-4o-mini-transcribe
  - gpt-4o-mini-transcribe-2025-12-15
  - gpt-4o-transcribe
  - gpt-4o-transcribe-diarize
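To rule out a client-side mistake before sending anything, the model check can be done locally. The following is a minimal sketch, not part of any SDK: the helper name check_transcription_config is hypothetical, and the model names are copied from the list above.

```python
# Supported transcription models, copied from the list above.
SUPPORTED_TRANSCRIPTION_MODELS = frozenset({
    "whisper-1",
    "gpt-4o-mini-transcribe",
    "gpt-4o-mini-transcribe-2025-12-15",
    "gpt-4o-transcribe",
    "gpt-4o-transcribe-diarize",
})


def check_transcription_config(config):
    """Return a list of problems found in an input_audio_transcription value.

    None is accepted because the reference says the property can be set
    to null to turn transcription off.
    """
    if config is None:
        return []
    if not isinstance(config, dict):
        return ["input_audio_transcription must be an object or null"]
    model = config.get("model")
    if model not in SUPPORTED_TRANSCRIPTION_MODELS:
        return [f"unsupported transcription model: {model!r}"]
    return []
```

An empty list means the value passes this local check; it does not guarantee that the service will accept the request.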
The only documented prerequisites for using input_audio_transcription are therefore:
- Use the GA Realtime API endpoints and protocol
  - Path: POST {endpoint}/openai/v1/realtime/client_secrets
  - Optional api-version query: v1 or preview (defaults to v1 if omitted).
- Provide a valid Realtime session payload
  - session.type must be set appropriately for a Realtime session.
  - If input_audio_transcription is included, its model value must be one of the supported transcription models listed above.
  - Optional fields like language and prompt follow the documented schema.
- Use a supported Realtime model deployment
  - For WebRTC/Realtime, use one of the Realtime models listed for the region (for example gpt-4o-mini-realtime-preview, gpt-4o-realtime-preview, gpt-realtime, or gpt-realtime-mini as documented for East US 2 and Sweden Central).
  - Ensure the deployment name in the session configuration matches the actual deployment.
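As a rough illustration of the prerequisites above, the request URL and body could be assembled as follows. This is a sketch under stated assumptions, not the documented schema: the function name is hypothetical, the "realtime" value for session.type is a placeholder, and both the "model" key for the deployment name and the placement of input_audio_transcription directly under session are assumptions to verify against the REST reference. Nothing is sent over the network.

```python
import json
from urllib.parse import urlencode


def build_client_secrets_request(endpoint, deployment,
                                 transcription_model=None, api_version=None):
    """Return (url, body) for POST {endpoint}/openai/v1/realtime/client_secrets."""
    url = endpoint.rstrip("/") + "/openai/v1/realtime/client_secrets"
    if api_version is not None:
        # Documented values are v1 and preview; v1 is used when omitted.
        url += "?" + urlencode({"api-version": api_version})

    session = {
        "type": "realtime",   # ASSUMPTION: placeholder value for session.type
        "model": deployment,  # ASSUMPTION: deployment name goes under "model"
    }
    if transcription_model is not None:
        # ASSUMPTION: the object sits directly on the session.
        session["input_audio_transcription"] = {"model": transcription_model}

    return url, json.dumps({"session": session})
```

Calling it once with and once without transcription_model gives the two payloads needed to reproduce the 200 versus 500 difference.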
Given that:
- The same request succeeds (200) when input_audio_transcription is omitted, and
- The documentation does not list any additional region-specific requirements or feature flags for input_audio_transcription,
this pattern is consistent with a backend issue in the Sweden Central Realtime service rather than a missing prerequisite in the request. The next steps that align with the documentation are:
- Verify that the input_audio_transcription.model is one of the supported transcription models listed above.
- Confirm that the Realtime model deployment being used is supported in Sweden Central and that the deployment name is correct.
- If the 500 persists with a valid payload and supported model, treat it as a service-side incident and raise it via Azure support, including:
  - Region (Sweden Central), resource name, and subscription ID.
  - The exact POST /openai/v1/realtime/client_secrets payload with and without input_audio_transcription.
  - Timestamps and apim-request-id values from the failing 500 responses.
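The details for the support ticket can be gathered with a couple of small helpers. This is only a sketch: both function names are hypothetical, and it assumes the payload is a dict with input_audio_transcription directly under a "session" key. The headers argument is whatever mapping the HTTP client returns for the response.

```python
import copy
from datetime import datetime, timezone


def without_transcription(payload):
    """Return a copy of the payload with input_audio_transcription removed."""
    control = copy.deepcopy(payload)
    control.get("session", {}).pop("input_audio_transcription", None)
    return control


def support_record(status, headers, payload, region="Sweden Central"):
    """Collect the details listed above for one response."""
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "region": region,
        "status": status,
        "apim_request_id": lowered.get("apim-request-id"),
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "payload": payload,
    }
```

Recording one entry for the failing request and one for the control payload from without_transcription gives support the side-by-side comparison they need.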
There is no documented feature-gating or extra configuration required beyond the above; a persistent 500 in this scenario requires service investigation.