Seeking Optimal Speech Transcription Service for Mixed Chinese and English Scenarios

Question

Our speech recognition scenario mainly involves a mix of Chinese and English. Currently, we have chosen the Chinese language recognition type (as there is no specific type for mixed Chinese and English). Besides manually adding hotwords and conducting plain-text training, is there a more suitable speech transcription service for a mixed Chinese and English scenario?

Accepted Answer

@hexarrior Welcome to the Microsoft Q&A forum, and thank you for posting your query here!

Automatic multi-lingual speech translation is available in public preview. It accepts audio in which the spoken language is not known in advance, or changes partway through, and translates everything into a single target language.

More info is available here.

Key Highlights

Unspecified input language: Multi-lingual speech translation can receive audio in a wide range of languages, and there's no need to specify the expected input language. This makes it useful for multi-language conversations where the languages cannot be preset.
Language switching: Multi-lingual speech translation allows multiple languages to be spoken during the same session and translates them all into the same target language. There's no need to restart the session or take any other action when the input language changes.

How to access?

Refer to the code samples in the "How to translate speech" article. This feature is supported by Speech SDK versions 1.37.0 and later.
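
As a rough sketch of what those samples do, the Python snippet below sets up a translation recognizer with no fixed input language. It assumes the `azure-cognitiveservices-speech` package at version 1.37.0 or later. The v2 endpoint URL format and the helper names (`v2_endpoint`, `sdk_supports_multilingual`, `build_recognizer`) are illustrative assumptions, not taken from this answer, so please check them against the official samples before relying on them.

```python
def v2_endpoint(region: str) -> str:
    """Build the v2 universal endpoint URL (format is an assumption)."""
    return f"wss://{region}.stt.speech.microsoft.com/speech/universal/v2"


def sdk_supports_multilingual(version: str) -> bool:
    """True if a plain numeric SDK version such as '1.37.0' is 1.37.0 or later."""
    parts = [int(p) for p in version.split(".")[:3]]
    parts += [0] * (3 - len(parts))  # pad '1.40' to [1, 40, 0]
    return tuple(parts) >= (1, 37, 0)


def build_recognizer(key: str, region: str, audio_file: str, target: str = "en"):
    """Create a TranslationRecognizer that auto-detects the spoken language."""
    # Imported here so the helpers above work without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    config = speechsdk.translation.SpeechTranslationConfig(
        subscription=key, endpoint=v2_endpoint(region)
    )
    # Everything that is spoken gets translated into this one language.
    config.add_target_language(target)

    # No candidate languages are listed, so the service detects them itself.
    auto_detect = speechsdk.languageconfig.AutoDetectSourceLanguageConfig()
    audio = speechsdk.audio.AudioConfig(filename=audio_file)
    return speechsdk.translation.TranslationRecognizer(
        translation_config=config,
        audio_config=audio,
        auto_detect_source_language_config=auto_detect,
    )
```

With your own key, region, and audio file, `build_recognizer(...).recognize_once()` should return a result whose `translations` dictionary holds the translated text for the chosen target language. For longer audio with language switches, continuous recognition is likely the better fit.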

Batch transcription provides models with new architecture for these locales: es-ES, es-MX, fr-FR, it-IT, ja-JP, ko-KR, pt-BR, and zh-CN. These models significantly enhance readability and entity recognition.
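
If batch transcription of zh-CN audio fits your scenario, a job request might be assembled roughly as follows. This is only a sketch: it assumes the v3.1 batch transcription REST path and its `contentUrls`, `locale`, and `displayName` fields, the helper name `build_batch_request` is hypothetical, and the API version may differ for your resource. It does not show how to select a particular model.

```python
import json
import urllib.request


def build_batch_request(region: str, key: str, audio_urls: list[str],
                        locale: str = "zh-CN") -> urllib.request.Request:
    """Build (but do not send) a request that creates a batch transcription job."""
    body = {
        "contentUrls": audio_urls,  # publicly reachable or SAS-signed audio URLs
        "locale": locale,           # one of the locales listed above
        "displayName": f"Batch transcription ({locale})",
    }
    # API version in the path is an assumption; check the current documentation.
    url = (f"https://{region}.api.cognitive.microsoft.com"
           "/speechtotext/v3.1/transcriptions")
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit the job; the region, key, and audio URLs are placeholders you need to supply.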

For multi-lingual speech translation, these are the languages the Speech service can automatically detect and switch between from the input: Arabic (ar), Basque (eu), Bosnian (bs), Bulgarian (bg), Chinese Simplified (zh), Chinese Traditional (zhh), Czech (cs), Danish (da), Dutch (nl), English (en), Estonian (et), Finnish (fi), French (fr), Galician (gl), German (de), Greek (el), Hindi (hi), Hungarian (hu), Indonesian (id), Italian (it), Japanese (ja), Korean (ko), Latvian (lv), Lithuanian (lt), Macedonian (mk), Norwegian (nb), Polish (pl), Portuguese (pt), Romanian (ro), Russian (ru), Serbian (sr), Slovak (sk), Slovenian (sl), Spanish (es), Swedish (sv), Thai (th), Turkish (tr), Ukrainian (uk), Vietnamese (vi), and Welsh (cy).

For a list of the supported output (target) languages, see the Translate to text language table in the language and voice support documentation.

Hope this helps. If you have any follow-up questions, please let me know. I would be happy to help.

Please do not forget to "Accept the answer" and "up-vote" wherever the information provided helps you; this can be beneficial to other community members.
