Language and voice support for the Speech service

The following tables summarize language support for speech to text, text to speech, pronunciation assessment, speech translation, speaker recognition, and other service features.

You can also get a list of locales and voices supported for each specific region or endpoint via:

Supported languages

Language support varies by Speech service functionality.

Note

See speech containers and embedded speech documentation for their supported languages.


The table in this section summarizes the locales supported for speech to text. For details, see the table footnotes.

Additional notes on speech to text locales appear in the Custom speech section of this article.

Tip

Try out the real-time speech to text tool without having to use any code.

Locale (BCP-47) | Language | Custom speech support
af-ZA | Afrikaans (South Africa) | Plain text
am-ET | Amharic (Ethiopia) | Plain text
ar-AE | Arabic (United Arab Emirates) | Audio + human-labeled transcript, Plain text
ar-BH | Arabic (Bahrain) | Audio + human-labeled transcript, Plain text
ar-DZ | Arabic (Algeria) | Audio + human-labeled transcript, Plain text
ar-EG | Arabic (Egypt) | Audio + human-labeled transcript, Plain text
ar-IL | Arabic (Israel) | Audio + human-labeled transcript, Plain text
ar-IQ | Arabic (Iraq) | Audio + human-labeled transcript, Plain text
ar-JO | Arabic (Jordan) | Audio + human-labeled transcript, Plain text
ar-KW | Arabic (Kuwait) | Audio + human-labeled transcript, Plain text
ar-LB | Arabic (Lebanon) | Audio + human-labeled transcript, Plain text
ar-LY | Arabic (Libya) | Audio + human-labeled transcript, Plain text
ar-MA | Arabic (Morocco) | Audio + human-labeled transcript, Plain text
ar-OM | Arabic (Oman) | Audio + human-labeled transcript, Plain text
ar-PS | Arabic (Palestinian Authority) | Audio + human-labeled transcript, Plain text
ar-QA | Arabic (Qatar) | Audio + human-labeled transcript, Plain text
ar-SA | Arabic (Saudi Arabia) | Audio + human-labeled transcript, Plain text, Phrase list
ar-SY | Arabic (Syria) | Audio + human-labeled transcript, Plain text
ar-TN | Arabic (Tunisia) | Audio + human-labeled transcript, Plain text
ar-YE | Arabic (Yemen) | Audio + human-labeled transcript, Plain text
az-AZ | Azerbaijani (Latin, Azerbaijan) | Plain text
bg-BG | Bulgarian (Bulgaria) | Plain text
bn-IN | Bengali (India) | Plain text
bs-BA | Bosnian (Bosnia and Herzegovina) | Plain text
ca-ES | Catalan | Plain text, Pronunciation
cs-CZ | Czech (Czechia) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation
cy-GB | Welsh (United Kingdom) | Plain text
da-DK | Danish (Denmark) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Pronunciation
de-AT | German (Austria) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation
de-CH | German (Switzerland) | Audio + human-labeled transcript, Plain text, Pronunciation, Phrase list
de-DE | German (Germany) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Pronunciation, Phrase list
el-GR | Greek (Greece) | Plain text
en-AU | English (Australia) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Output format, Pronunciation, Phrase list
en-CA | English (Canada) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Output format, Pronunciation, Phrase list
en-GB | English (United Kingdom) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Output format, Pronunciation, Phrase list
en-GH | English (Ghana) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Pronunciation
en-HK | English (Hong Kong SAR) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Output format, Pronunciation
en-IE | English (Ireland) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Output format, Pronunciation, Phrase list
en-IN | English (India) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Pronunciation, Phrase list
en-KE | English (Kenya) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Pronunciation
en-NG | English (Nigeria) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Output format, Pronunciation
en-NZ | English (New Zealand) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Output format, Pronunciation
en-PH | English (Philippines) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Output format, Pronunciation
en-SG | English (Singapore) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Output format, Pronunciation
en-TZ | English (Tanzania) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Pronunciation
en-US | English (United States) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Output format, Pronunciation, Phrase list
en-ZA | English (South Africa) | Audio + human-labeled transcript, Audio, Plain text, Structured text, Pronunciation, Phrase list
es-AR | Spanish (Argentina) | Plain text, Structured text, Pronunciation
es-BO | Spanish (Bolivia) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation
es-CL | Spanish (Chile) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation
es-CO | Spanish (Colombia) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation
es-CR | Spanish (Costa Rica) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation
es-CU | Spanish (Cuba) | Plain text, Structured text, Pronunciation
es-DO | Spanish (Dominican Republic) | Plain text, Structured text, Pronunciation
es-EC | Spanish (Ecuador) | Plain text, Structured text, Pronunciation
es-ES | Spanish (Spain) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Pronunciation, Phrase list
es-GQ | Spanish (Equatorial Guinea) | Audio + human-labeled transcript, Plain text, Structured text
es-GT | Spanish (Guatemala) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation
es-HN | Spanish (Honduras) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation
es-MX | Spanish (Mexico) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Pronunciation, Phrase list
es-NI | Spanish (Nicaragua) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation
es-PA | Spanish (Panama) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation
es-PE | Spanish (Peru) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation
es-PR | Spanish (Puerto Rico) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation
es-PY | Spanish (Paraguay) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation
es-SV | Spanish (El Salvador) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation
es-US | Spanish (United States) | Plain text, Structured text, Pronunciation, Phrase list
es-UY | Spanish (Uruguay) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation
es-VE | Spanish (Venezuela) | Plain text, Structured text, Pronunciation
et-EE | Estonian (Estonia) | Plain text, Pronunciation
eu-ES | Basque | Plain text
fa-IR | Persian (Iran) | Plain text
fi-FI | Finnish (Finland) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Pronunciation
fil-PH | Filipino (Philippines) | Plain text, Pronunciation
fr-BE | French (Belgium) | Plain text
fr-CA | French (Canada) | Plain text, Structured text, Output format, Pronunciation, Phrase list
fr-CH | French (Switzerland) | Plain text, Pronunciation
fr-FR | French (France) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Pronunciation, Phrase list
ga-IE | Irish (Ireland) | Plain text, Pronunciation
gl-ES | Galician | Plain text
gu-IN | Gujarati (India) | Plain text
he-IL | Hebrew (Israel) | Audio + human-labeled transcript, Plain text
hi-IN | Hindi (India) | Audio + human-labeled transcript, Plain text, Output format, Phrase list
hr-HR | Croatian (Croatia) | Plain text, Pronunciation
hu-HU | Hungarian (Hungary) | Plain text, Pronunciation
hy-AM | Armenian (Armenia) | Plain text
id-ID | Indonesian (Indonesia) | Audio + human-labeled transcript, Plain text, Structured text, Pronunciation, Phrase list
is-IS | Icelandic (Iceland) | Plain text
it-CH | Italian (Switzerland) | Plain text
it-IT | Italian (Italy) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Pronunciation, Phrase list
ja-JP | Japanese (Japan) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Phrase list
jv-ID | Javanese (Latin, Indonesia) | Plain text
ka-GE | Georgian (Georgia) | Plain text
kk-KZ | Kazakh (Kazakhstan) | Plain text
km-KH | Khmer (Cambodia) | Plain text
kn-IN | Kannada (India) | Plain text
ko-KR | Korean (Korea) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Phrase list
lo-LA | Lao (Laos) | Plain text
lt-LT | Lithuanian (Lithuania) | Plain text, Pronunciation
lv-LV | Latvian (Latvia) | Plain text, Pronunciation
mk-MK | Macedonian (North Macedonia) | Plain text
ml-IN | Malayalam (India) | Plain text
mn-MN | Mongolian (Mongolia) | Plain text
mr-IN | Marathi (India) | Plain text
ms-MY | Malay (Malaysia) | Plain text
mt-MT | Maltese (Malta) | Plain text
my-MM | Burmese (Myanmar) | Plain text
nb-NO | Norwegian Bokmål (Norway) | Plain text, Output format
ne-NP | Nepali (Nepal) | Plain text
nl-BE | Dutch (Belgium) | Plain text
nl-NL | Dutch (Netherlands) | Plain text, Output format, Pronunciation, Phrase list
pa-IN | Punjabi (India) | Audio + human-labeled transcript
pl-PL | Polish (Poland) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Pronunciation, Phrase list
ps-AF | Pashto (Afghanistan) | Plain text
pt-BR | Portuguese (Brazil) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Pronunciation, Phrase list
pt-PT | Portuguese (Portugal) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Pronunciation, Phrase list
ro-RO | Romanian (Romania) | Plain text, Pronunciation
ru-RU | Russian (Russia) | Audio + human-labeled transcript, Plain text, Phrase list
si-LK | Sinhala (Sri Lanka) | Plain text
sk-SK | Slovak (Slovakia) | Plain text, Pronunciation
sl-SI | Slovenian (Slovenia) | Plain text, Pronunciation
so-SO | Somali (Somalia) | Plain text
sq-AL | Albanian (Albania) | Plain text
sr-RS | Serbian (Cyrillic, Serbia) | Plain text
sv-SE | Swedish (Sweden) | Audio + human-labeled transcript, Plain text, Output format, Pronunciation, Phrase list
sw-KE | Kiswahili (Kenya) | Plain text
sw-TZ | Kiswahili (Tanzania) | Plain text
ta-IN | Tamil (India) | Plain text
te-IN | Telugu (India) | Plain text
th-TH | Thai (Thailand) | Audio + human-labeled transcript, Plain text, Structured text, Phrase list
tr-TR | Turkish (Türkiye) | Audio + human-labeled transcript, Plain text, Structured text, Output format
uk-UA | Ukrainian (Ukraine) | Plain text
ur-IN | Urdu (India) | Audio + human-labeled transcript
uz-UZ | Uzbek (Latin, Uzbekistan) | Plain text
vi-VN | Vietnamese (Vietnam) | Plain text, Phrase list
wuu-CN | Chinese (Wu, Simplified) | Plain text
yue-CN | Chinese (Cantonese, Simplified) | Plain text
zh-CN | Chinese (Mandarin, Simplified) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Phrase list
zh-CN-shandong | Chinese (Jilu Mandarin, Simplified) | Plain text
zh-CN-sichuan | Chinese (Southwestern Mandarin, Simplified) | Plain text
zh-HK | Chinese (Cantonese, Traditional) | Audio + human-labeled transcript, Plain text, Structured text, Output format, Phrase list
zh-TW | Chinese (Taiwanese Mandarin, Traditional) | Audio + human-labeled transcript, Plain text, Structured text, Phrase list
zu-ZA | isiZulu (South Africa) | Plain text

1 The model is bilingual and also supports English.

Custom speech

To improve speech to text recognition accuracy, customization is available for some languages and base models. Depending on the locale, you can upload audio + human-labeled transcripts, plain text, structured text, and pronunciation data. By default, plain text customization is supported for all available base models. To learn more about customization, see custom speech.
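As a rough illustration of how the speech to text table might be consulted in code, the following Python sketch copies a handful of its rows into a dictionary and filters locales by customization type. The names CUSTOM_SPEECH_SUPPORT and locales_supporting are hypothetical, only six locales are included, and nothing here is part of the Speech SDK.

```python
# Excerpt of the speech to text table in this article (six rows only).
# CUSTOM_SPEECH_SUPPORT is a hypothetical name, not a Speech SDK object.
CUSTOM_SPEECH_SUPPORT = {
    "af-ZA": ("Plain text",),
    "ca-ES": ("Plain text", "Pronunciation"),
    "de-CH": ("Audio + human-labeled transcript", "Plain text",
              "Pronunciation", "Phrase list"),
    "hi-IN": ("Audio + human-labeled transcript", "Plain text",
              "Output format", "Phrase list"),
    "ja-JP": ("Audio + human-labeled transcript", "Plain text",
              "Structured text", "Output format", "Phrase list"),
    "pa-IN": ("Audio + human-labeled transcript",),
}


def locales_supporting(feature, table=CUSTOM_SPEECH_SUPPORT):
    """Return the sorted locales whose table row lists the given feature."""
    return sorted(
        locale for locale, features in table.items() if feature in features
    )


def supports(locale, feature, table=CUSTOM_SPEECH_SUPPORT):
    """Return True if the locale is in the excerpt and lists the feature."""
    return feature in table.get(locale, ())
```

For example, locales_supporting("Phrase list") returns the locales from this excerpt whose rows include a phrase list entry, and supports("pa-IN", "Plain text") reports what the excerpted row says for that locale.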

The following locales support the display text format feature, shown as Output format in the table above: da-DK, de-DE, en-AU, en-CA, en-GB, en-HK, en-IE, en-IN, en-NG, en-NZ, en-PH, en-SG, en-US, es-ES, es-MX, fi-FI, fr-CA, fr-FR, hi-IN, it-IT, ja-JP, ko-KR, nb-NO, nl-NL, pl-PL, pt-BR, pt-PT, sv-SE, tr-TR, zh-CN, zh-HK.

Fast transcription

The supported locales for the fast transcription API are: en-US, es-ES, es-MX, fr-FR, hi-IN, it-IT, ja-JP, ko-KR, pt-BR, and zh-CN. You can only specify one locale per transcription request.
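The two constraints above (a fixed locale list and one locale per request) can be checked on the client before a request is sent. The following Python sketch shows one way to do that; validate_fast_transcription_locale is a hypothetical helper, not part of the Speech SDK or the REST API, and the locale set is copied from the list in this section.

```python
# Locales listed in this section as supported by the fast transcription API.
FAST_TRANSCRIPTION_LOCALES = frozenset({
    "en-US", "es-ES", "es-MX", "fr-FR", "hi-IN",
    "it-IT", "ja-JP", "ko-KR", "pt-BR", "zh-CN",
})


def validate_fast_transcription_locale(locales):
    """Return the single locale to send, or raise ValueError.

    Hypothetical helper: accepts either one locale string or an iterable
    of locale strings, and enforces the one-locale-per-request rule and
    the supported-locale list stated in this article.
    """
    if isinstance(locales, str):
        requested = [locales]
    else:
        requested = list(locales)

    if len(requested) != 1:
        raise ValueError(
            f"expected exactly one locale per request, got {len(requested)}"
        )

    locale = requested[0]
    if locale not in FAST_TRANSCRIPTION_LOCALES:
        raise ValueError(f"locale {locale!r} is not in the supported list")
    return locale
```

Calling validate_fast_transcription_locale("ja-JP") returns the locale unchanged, while passing two locales, or a locale outside the list such as de-DE, raises ValueError. The comparison is case-sensitive, so callers are assumed to pass locales exactly as written in this article.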

Next steps