Language and voice support for the Speech service
The following tables summarize language support for speech to text, text to speech, pronunciation assessment, speech translation, speaker recognition, and other Speech service features.
You can also get a list of locales and voices supported for each specific region or endpoint via:
Supported languages
Language support varies by Speech service functionality.
Note
See speech containers and embedded speech documentation for their supported languages.
The table in this section summarizes the locales supported for speech to text. For details, see the table footnotes.
Additional remarks on speech to text locales appear in the Custom speech section of this article.
Tip
Try out the real-time speech to text tool without having to use any code.
Locale (BCP-47) | Language | Custom speech support |
---|---|---|
af-ZA | Afrikaans (South Africa) | Plain text |
am-ET | Amharic (Ethiopia) | Plain text |
ar-AE | Arabic (United Arab Emirates) | Audio + human-labeled transcript, Plain text |
ar-BH | Arabic (Bahrain) | Audio + human-labeled transcript, Plain text |
ar-DZ | Arabic (Algeria) | Audio + human-labeled transcript, Plain text |
ar-EG | Arabic (Egypt) | Audio + human-labeled transcript, Plain text |
ar-IL | Arabic (Israel) | Audio + human-labeled transcript, Plain text |
ar-IQ | Arabic (Iraq) | Audio + human-labeled transcript, Plain text |
ar-JO | Arabic (Jordan) | Audio + human-labeled transcript, Plain text |
ar-KW | Arabic (Kuwait) | Audio + human-labeled transcript, Plain text |
ar-LB | Arabic (Lebanon) | Audio + human-labeled transcript, Plain text |
ar-LY | Arabic (Libya) | Audio + human-labeled transcript, Plain text |
ar-MA | Arabic (Morocco) | Audio + human-labeled transcript, Plain text |
ar-OM | Arabic (Oman) | Audio + human-labeled transcript, Plain text |
ar-PS | Arabic (Palestinian Authority) | Audio + human-labeled transcript, Plain text |
ar-QA | Arabic (Qatar) | Audio + human-labeled transcript, Plain text |
ar-SA | Arabic (Saudi Arabia) | Audio + human-labeled transcript, Plain text, Phrase list |
ar-SY | Arabic (Syria) | Audio + human-labeled transcript, Plain text |
ar-TN | Arabic (Tunisia) | Audio + human-labeled transcript, Plain text |
ar-YE | Arabic (Yemen) | Audio + human-labeled transcript, Plain text |
az-AZ | Azerbaijani (Latin, Azerbaijan) | Plain text |
bg-BG | Bulgarian (Bulgaria) | Plain text |
bn-IN | Bengali (India) | Plain text |
bs-BA | Bosnian (Bosnia and Herzegovina) | Plain text |
ca-ES | Catalan | Plain text, Pronunciation |
cs-CZ | Czech (Czechia) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation |
cy-GB | Welsh (United Kingdom) | Plain text |
da-DK | Danish (Denmark) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Pronunciation |
de-AT | German (Austria) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation |
de-CH | German (Switzerland) | Audio + human-labeled transcript, Plain text, Pronunciation, Phrase list |
de-DE | German (Germany) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Pronunciation, Phrase list |
el-GR | Greek (Greece) | Plain text |
en-AU | English (Australia) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Output format, Pronunciation, Phrase list |
en-CA | English (Canada) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Output format, Pronunciation, Phrase list |
en-GB | English (United Kingdom) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Output format, Pronunciation, Phrase list |
en-GH | English (Ghana) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Pronunciation |
en-HK | English (Hong Kong SAR) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Output format, Pronunciation |
en-IE | English (Ireland) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Output format, Pronunciation, Phrase list |
en-IN | English (India) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Pronunciation, Phrase list |
en-KE | English (Kenya) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Pronunciation |
en-NG | English (Nigeria) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Output format, Pronunciation |
en-NZ | English (New Zealand) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Output format, Pronunciation |
en-PH | English (Philippines) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Output format, Pronunciation |
en-SG | English (Singapore) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Output format, Pronunciation |
en-TZ | English (Tanzania) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Pronunciation |
en-US | English (United States) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Output format, Pronunciation, Phrase list |
en-ZA | English (South Africa) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Pronunciation, Phrase list |
es-AR | Spanish (Argentina) | Plain text, Structured text, Pronunciation |
es-BO | Spanish (Bolivia) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation |
es-CL | Spanish (Chile) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation |
es-CO | Spanish (Colombia) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation |
es-CR | Spanish (Costa Rica) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation |
es-CU | Spanish (Cuba) | Plain text, Structured text, Pronunciation |
es-DO | Spanish (Dominican Republic) | Plain text, Structured text, Pronunciation |
es-EC | Spanish (Ecuador) | Plain text, Structured text, Pronunciation |
es-ES | Spanish (Spain) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Pronunciation, Phrase list |
es-GQ | Spanish (Equatorial Guinea) | Audio + human-labeled transcript, Plain text, Structured text |
es-GT | Spanish (Guatemala) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation |
es-HN | Spanish (Honduras) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation |
es-MX | Spanish (Mexico) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Pronunciation, Phrase list |
es-NI | Spanish (Nicaragua) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation |
es-PA | Spanish (Panama) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation |
es-PE | Spanish (Peru) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation |
es-PR | Spanish (Puerto Rico) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation |
es-PY | Spanish (Paraguay) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation |
es-SV | Spanish (El Salvador) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation |
es-US | Spanish (United States) | Plain text, Structured text, Pronunciation, Phrase list |
es-UY | Spanish (Uruguay) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation |
es-VE | Spanish (Venezuela) | Plain text, Structured text, Pronunciation |
et-EE | Estonian (Estonia) | Plain text, Pronunciation |
eu-ES | Basque | Plain text |
fa-IR | Persian (Iran) | Plain text |
fi-FI | Finnish (Finland) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Pronunciation |
fil-PH | Filipino (Philippines) | Plain text, Pronunciation |
fr-BE | French (Belgium) | Plain text |
fr-CA | French (Canada) | Plain text, Structured text, Output format, Pronunciation, Phrase list |
fr-CH | French (Switzerland) | Plain text, Pronunciation |
fr-FR | French (France) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Pronunciation, Phrase list |
ga-IE | Irish (Ireland) | Plain text, Pronunciation |
gl-ES | Galician | Plain text |
gu-IN | Gujarati (India) | Plain text |
he-IL | Hebrew (Israel) | Audio + human-labeled transcript, Plain text |
hi-IN | Hindi (India) | Audio + human-labeled transcript, Plain text, Output format, Phrase list |
hr-HR | Croatian (Croatia) | Plain text, Pronunciation |
hu-HU | Hungarian (Hungary) | Plain text, Pronunciation |
hy-AM | Armenian (Armenia) | Plain text |
id-ID | Indonesian (Indonesia) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation, Phrase list |
is-IS | Icelandic (Iceland) | Plain text |
it-CH | Italian (Switzerland) | Plain text |
it-IT | Italian (Italy) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Pronunciation, Phrase list |
ja-JP | Japanese (Japan) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Phrase list |
jv-ID | Javanese (Latin, Indonesia) | Plain text |
ka-GE | Georgian (Georgia) | Plain text |
kk-KZ | Kazakh (Kazakhstan) | Plain text |
km-KH | Khmer (Cambodia) | Plain text |
kn-IN | Kannada (India) | Plain text |
ko-KR | Korean (Korea) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Phrase list |
lo-LA | Lao (Laos) | Plain text |
lt-LT | Lithuanian (Lithuania) | Plain text, Pronunciation |
lv-LV | Latvian (Latvia) | Plain text, Pronunciation |
mk-MK | Macedonian (North Macedonia) | Plain text |
ml-IN | Malayalam (India) | Plain text |
mn-MN | Mongolian (Mongolia) | Plain text |
mr-IN | Marathi (India) | Plain text |
ms-MY | Malay (Malaysia) | Plain text |
mt-MT | Maltese (Malta) | Plain text |
my-MM | Burmese (Myanmar) | Plain text |
nb-NO | Norwegian Bokmål (Norway) | Plain text, Output format |
ne-NP | Nepali (Nepal) | Plain text |
nl-BE | Dutch (Belgium) | Plain text |
nl-NL | Dutch (Netherlands) | Plain text, Output format, Pronunciation, Phrase list |
pa-IN | Punjabi (India) | Audio + human-labeled transcript |
pl-PL | Polish (Poland) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Pronunciation, Phrase list |
ps-AF | Pashto (Afghanistan) | Plain text |
pt-BR | Portuguese (Brazil) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Pronunciation, Phrase list |
pt-PT | Portuguese (Portugal) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Pronunciation, Phrase list |
ro-RO | Romanian (Romania) | Plain text, Pronunciation |
ru-RU | Russian (Russia) | Audio + human-labeled transcript, Plain text, Phrase list |
si-LK | Sinhala (Sri Lanka) | Plain text |
sk-SK | Slovak (Slovakia) | Plain text, Pronunciation |
sl-SI | Slovenian (Slovenia) | Plain text, Pronunciation |
so-SO | Somali (Somalia) | Plain text |
sq-AL | Albanian (Albania) | Plain text |
sr-RS | Serbian (Cyrillic, Serbia) | Plain text |
sv-SE | Swedish (Sweden) | Audio + human-labeled transcript, Plain text, Output format, Pronunciation, Phrase list |
sw-KE | Kiswahili (Kenya) | Plain text |
sw-TZ | Kiswahili (Tanzania) | Plain text |
ta-IN | Tamil (India) | Plain text |
te-IN | Telugu (India) | Plain text |
th-TH | Thai (Thailand) | Audio + human-labeled transcript, Plain text, Structured text, Phrase list |
tr-TR | Turkish (Türkiye) | Audio + human-labeled transcript, Plain text, Structured text, Output format |
uk-UA | Ukrainian (Ukraine) | Plain text |
ur-IN | Urdu (India) | Audio + human-labeled transcript |
uz-UZ | Uzbek (Latin, Uzbekistan) | Plain text |
vi-VN | Vietnamese (Vietnam) | Plain text, Phrase list |
wuu-CN | Chinese (Wu, Simplified) | Plain text |
yue-CN | Chinese (Cantonese, Simplified) | Plain text |
zh-CN | Chinese (Mandarin, Simplified) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Phrase list |
zh-CN-shandong | Chinese (Jilu Mandarin, Simplified) | Plain text |
zh-CN-sichuan | Chinese (Southwestern Mandarin, Simplified) | Plain text |
zh-HK | Chinese (Cantonese, Traditional) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Phrase list |
zh-TW | Chinese (Taiwanese Mandarin, Traditional) | Audio + human-labeled transcript, Plain text, Structured text, Phrase list |
zu-ZA | isiZulu (South Africa) | Plain text |
1 The model is bilingual and also supports English.
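When choosing a locale programmatically, it can help to treat the Custom speech support column as a set of customization types per locale. The following is a minimal Python sketch under that assumption. It copies only five rows from the table above, and the names `CUSTOM_SPEECH_SUPPORT`, `supports`, and `locales_supporting` are hypothetical helpers for illustration, not part of the Speech SDK.

```python
# Excerpt of the speech to text table above (five rows only).
# A real lookup would need every row; this dictionary is illustrative.
CUSTOM_SPEECH_SUPPORT = {
    "af-ZA": {"Plain text"},
    "ca-ES": {"Plain text", "Pronunciation"},
    "de-CH": {"Audio + human-labeled transcript", "Plain text",
              "Pronunciation", "Phrase list"},
    "ja-JP": {"Audio + human-labeled transcript", "Plain text",
              "Structured text", "Output format", "Phrase list"},
    "pa-IN": {"Audio + human-labeled transcript"},
}


def supports(locale, feature, table=CUSTOM_SPEECH_SUPPORT):
    """Return True if the row for `locale` lists `feature`.

    Locales missing from the excerpt are reported as unsupported.
    """
    return feature in table.get(locale, set())


def locales_supporting(feature, table=CUSTOM_SPEECH_SUPPORT):
    """Return the sorted locales whose row lists `feature`."""
    return sorted(loc for loc, feats in table.items() if feature in feats)


if __name__ == "__main__":
    print(supports("ja-JP", "Phrase list"))
    print(locales_supporting("Pronunciation"))
```

Keeping the data as plain sets makes it easy to intersect requirements, for example to find locales that list both Structured text and Phrase list.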
Custom speech
To improve speech to text recognition accuracy, customization is available for some languages and base models. Depending on the locale, you can upload audio + human-labeled transcripts, plain text, structured text, and pronunciation data. By default, plain text customization is supported for all available base models. To learn more about customization, see custom speech.
The following locales support the display text format feature (the rows marked Output format in the table above): da-DK, de-DE, en-AU, en-CA, en-GB, en-HK, en-IE, en-IN, en-NG, en-NZ, en-PH, en-SG, en-US, es-ES, es-MX, fi-FI, fr-CA, fr-FR, hi-IN, it-IT, ja-JP, ko-KR, nb-NO, nl-NL, pl-PL, pt-BR, pt-PT, sv-SE, tr-TR, zh-CN, zh-HK.
Fast transcription
The supported locales for the fast transcription API are: en-US, es-ES, es-MX, fr-FR, hi-IN, it-IT, ja-JP, ko-KR, pt-BR, and zh-CN. You can only specify one locale per transcription request.
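A client can check both constraints stated above (a supported locale, and exactly one locale per request) before sending any audio. The sketch below is one possible pre-flight check in Python. The function name `validate_fast_transcription_locale` is hypothetical, the locale set is copied from the sentence above, and no call to the service itself is shown.

```python
# Locales listed above as supported by the fast transcription API.
FAST_TRANSCRIPTION_LOCALES = frozenset({
    "en-US", "es-ES", "es-MX", "fr-FR", "hi-IN",
    "it-IT", "ja-JP", "ko-KR", "pt-BR", "zh-CN",
})


def validate_fast_transcription_locale(locales):
    """Return the single locale to send, or raise ValueError.

    Accepts either one locale string or an iterable of locale strings,
    and enforces the one-locale-per-request rule described above.
    """
    if isinstance(locales, str):
        locales = [locales]
    locales = list(locales)

    if len(locales) != 1:
        raise ValueError(
            f"expected exactly one locale per request, got {len(locales)}"
        )

    locale = locales[0]
    if locale not in FAST_TRANSCRIPTION_LOCALES:
        raise ValueError(f"locale {locale!r} is not in the supported list")
    return locale


if __name__ == "__main__":
    print(validate_fast_transcription_locale("ja-JP"))
```

Failing early on the client side avoids uploading audio for a request that the locale rules above would rule out.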