Multiple locales error, REST, Speech to Text, fast transcription API

Question

It is VMS 100

Here's the issue

I use multiple locales as described at https://learn.microsoft.com/en-us/azure/ai-services/speech-service/fast-transcription-create?tabs=multilingual-transcription-on#request-configuration-options

The locales given were "hi-IN" and "en-IN":

    'locales' => ['hi-IN', 'en-IN'],

Then, in part of the text, I get something like: कर रही थी एंड इट वास् सो इंटेंस

I was expecting "एंड इट वास् सो इंटेंस" to be "and it was so intense".

I guess this behaviour is normal since this is in preview mode. Or am I missing something here?

Thanks in advance!

Accepted answer

Answer 1

Hello!

Thank you for posting on Microsoft Learn.

What you are seeing is a known limitation of the Azure Speech-to-Text Fast Transcription API when using multiple locales (multilingual transcription).

Your configuration tells Azure to expect speech in both Hindi (India) and English (India). The API attempts to auto-detect and transcribe speech from either language within the same audio stream.
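To make the configuration concrete, here is a minimal sketch of how the request definition might be built for the two-locale setup from the question and for a single-locale alternative. It assumes the `definition` form field takes a JSON object with a `locales` array, as in the linked documentation; `build_definition` is a hypothetical helper name, and the sketch does not send any request.

```python
import json


def build_definition(locales):
    """Return the JSON string for the `definition` form field.

    Hypothetical helper: it only serializes the locale list and
    does not call the service.
    """
    if not locales:
        raise ValueError("at least one locale is required")
    return json.dumps({"locales": list(locales)})


# Two candidate locales: the service has to pick a language itself.
multi = build_definition(["hi-IN", "en-IN"])

# One locale: no language detection is involved.
single = build_definition(["en-IN"])

print(multi)
print(single)
```

The resulting string would go into the `definition` part of the multipart request, next to the `audio` part.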

However, in preview mode, this multilingual support can sometimes:

Fail to switch languages accurately mid-sentence
Phonetically transcribe English words in Devanagari (Hindi script), especially when speakers switch languages rapidly or with an accent

So, instead of "and it was so intense", you're getting the text you showed.

This is a phonetic transliteration of English words into Hindi script, not a true language detection and switch.

I recommend setting the primary language explicitly if most of the audio is in one language. That at least avoids the unexpected transliteration of English into Hindi script.

"locales": ["en-IN"]

If you need to handle code-switching, consider segmenting the audio into clearer monolingual chunks where possible, or simply use single-locale transcription and then apply post-processing.
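As one possible form of post-processing, here is a minimal sketch that maps known Devanagari spellings of English words back to Latin script. The lookup table is hypothetical and hand-built from the example in the question; this is not something the service provides, and a real table would have to be collected from your own transcripts.

```python
import unicodedata

# Hypothetical, hand-built table (assumption): Devanagari spellings of
# English words taken from the example in the question.
_RAW_TABLE = {
    "एंड": "and",
    "इट": "it",
    "वास्": "was",
    "सो": "so",
    "इंटेंस": "intense",
}

# Normalize keys so that lookups do not depend on how combining marks
# were encoded in the transcript.
TRANSLIT_TO_ENGLISH = {
    unicodedata.normalize("NFC", key): value
    for key, value in _RAW_TABLE.items()
}


def restore_english(text, table=TRANSLIT_TO_ENGLISH):
    """Replace tokens found in the table; leave all other tokens as they are."""
    restored = []
    for token in text.split():
        key = unicodedata.normalize("NFC", token)
        restored.append(table.get(key, token))
    return " ".join(restored)


print(restore_english("कर रही थी एंड इट वास् सो इंटेंस"))
```

A plain word lookup like this is crude: some spellings are also real Hindi words (for example "सो"), so you would likely need context checks before replacing them.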
