Where can I find the list of supported languages for Azure real-time speech-to-text transcription?

Johan S Daniel 0 Reputation points
2025-01-05T18:31:31.2766667+00:00

https://learn.microsoft.com/en-us/azure/ai-services/speech-service/language-support?tabs=stt

Hello, I'm reaching out regarding the list of supported languages for the real-time speech-to-text transcription model.

I understand it is a Universal model under the hood and that fast transcription is a separate model.

I tested several Indic languages to compare fast transcription with real-time speech transcription, but the API documentation does not say which languages are supported for real-time speech transcription.

For some languages, the result contains no Unicode text, although the end of each sentence is still detected.

I would be really grateful if a support agent could point me to a resource that lists the languages supported for real-time STT transcription.

Thank you.

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

1 answer

  1. Marcin Policht 50,570 Reputation points MVP Volunteer Moderator
    2025-01-05T20:15:17.6966667+00:00

    As per https://learn.microsoft.com/en-us/azure/ai-services/speech-service/language-support?tabs=stt "The table in this section summarizes the locales supported for speech to text (real-time and batch transcription)."
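
To try a locale from that table against real-time recognition, a minimal sketch along the following lines may help. It assumes the `azure-cognitiveservices-speech` Python package is installed and that you have a Speech resource key and region. The helper names `normalize_locale` and `recognize_once` are hypothetical and not part of the SDK; the locale you pass should be one you have looked up in the table yourself.

```python
def normalize_locale(tag: str) -> str:
    """Normalize a two-part tag such as 'hi-in' or 'TA_IN' to 'hi-IN' / 'ta-IN'.

    Hypothetical helper: it only handles language-REGION tags.
    """
    parts = tag.strip().replace("_", "-").split("-")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected a language-REGION tag, got {tag!r}")
    language, region = parts
    return f"{language.lower()}-{region.upper()}"


def recognize_once(key: str, region: str, locale: str) -> str:
    """Recognize a single utterance from the default microphone.

    Returns the recognized text, or an empty string if nothing was recognized.
    """
    # Imported lazily so that normalize_locale works without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    # The recognition language is set per request as a locale string.
    speech_config.speech_recognition_language = normalize_locale(locale)
    audio_config = speechsdk.audio.AudioConfig(use_default_microphone=True)
    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config, audio_config=audio_config
    )
    result = recognizer.recognize_once()
    if result.reason == speechsdk.ResultReason.RecognizedSpeech:
        return result.text
    return ""
```

Calling `recognize_once(key, region, "hi-IN")` with your own key and region returns the text of one spoken utterance, which you can then compare across locales.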

    Regarding your comments, real-time transcription leverages the Universal model, which is designed to cover a broad range of languages. It may support sentence detection even when full Unicode character recognition isn't available, which is why some Indic languages may appear to handle sentence endings but not produce expected text outputs. Sentence-ending recognition (e.g., punctuation or pauses) might still work because it relies on acoustic and prosodic cues rather than text encoding.
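
One way to check the symptom described in the question is to measure how much of a returned transcript is actually written in the expected script. The sketch below uses only Python's standard `unicodedata` module. The function name `script_share` is hypothetical, and matching on Unicode character-name prefixes such as `DEVANAGARI` or `TAMIL` is a rough heuristic, not something the Speech service provides.

```python
import unicodedata


def script_share(text: str, script_prefix: str) -> float:
    """Return the fraction of letters and combining marks in `text` whose
    Unicode character name starts with `script_prefix` (e.g. 'DEVANAGARI').

    Digits, punctuation, whitespace, and format characters are ignored.
    Returns 0.0 when the text has no letters or marks at all.
    """
    relevant = [ch for ch in text if unicodedata.category(ch)[0] in ("L", "M")]
    if not relevant:
        return 0.0
    hits = sum(
        1 for ch in relevant
        if unicodedata.name(ch, "").startswith(script_prefix)
    )
    return hits / len(relevant)
```

A transcript for `hi-IN` that scores near 0.0 for `DEVANAGARI` (or is empty apart from punctuation) would match the behaviour reported in the question and is worth including in a support request.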

    Fast transcription indeed uses a separate model optimized for specific languages. It tends to support fewer languages, but it is faster and more efficient for bulk transcription.


    If the above response helps answer your question, remember to "Accept Answer" so that others in the community facing similar issues can easily find the solution. Your contribution is highly appreciated.

    hth

    Marcin

    1 person found this answer helpful.
