It seems the language isn't being detected correctly, so the service falls back to the default of English. I think this could be due to the last reason mentioned in the guidelines and limitations:
- If the audio contains languages other than the supported list, the result is unexpected.
- If Azure AI Video Indexer can't identify the language with a high enough confidence (greater than 0.6), the fallback language is English.
- Currently, there isn't support for files with mixed language audio. If the audio contains mixed languages, the result is unexpected.
- Low-quality audio may affect the model results.
- The model requires at least one minute of speech in the audio.
- The model is designed to recognize a spontaneous conversational speech (not voice commands, singing, and so on).
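To make the fallback rule above concrete, here is a minimal Python sketch of how a confidence threshold like this could behave. It is only an illustration, not Azure AI Video Indexer's actual implementation: the function name `pick_language`, the score dictionary, and the `en-US` fallback code are all assumptions made for the example. Only the 0.6 threshold comes from the limitations above.

```python
from typing import Dict

FALLBACK_LANGUAGE = "en-US"   # assumed code for the English fallback
CONFIDENCE_THRESHOLD = 0.6    # threshold quoted in the limitations above


def pick_language(scores: Dict[str, float]) -> str:
    """Return the top-scoring language, or the fallback if confidence is too low.

    `scores` is a hypothetical mapping of language code to detection confidence.
    """
    if not scores:
        return FALLBACK_LANGUAGE
    best_language, best_score = max(scores.items(), key=lambda item: item[1])
    # The confidence must be strictly greater than the threshold to be accepted.
    if best_score > CONFIDENCE_THRESHOLD:
        return best_language
    return FALLBACK_LANGUAGE


# A low-confidence detection ends up as English, which matches what you are seeing.
print(pick_language({"fr-FR": 0.55, "es-ES": 0.30}))
print(pick_language({"fr-FR": 0.90, "es-ES": 0.05}))
```

So if your audio is short, noisy, or not conversational, the top score may stay below the threshold and the result will be English even when the spoken language is different.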
Did you also try the multi-language auto-detect option to check whether it gives better results?
If this answers your query, please click Accept Answer and Yes for "Was this answer helpful". If you have any further queries, do let us know.