Language support: custom models

This content applies to: v4.0 (preview) | Previous versions: v3.1 (GA), v3.0 (GA), v2.1 (GA)

Azure AI Document Intelligence models provide multilingual document processing support, so your applications can process documents in the languages your users work in. Custom models are trained on your labeled datasets to extract distinct data from structured, semi-structured, and unstructured documents specific to your use cases. Standalone custom models can be combined to create composed models. The following tables list the available language and locale support by model and feature:

Custom classifier

Language (locale code): English (United States), en-US
Default: English (United States), en-US
Language Code (optional)
Afrikaans af
Albanian sq
Arabic ar
Bulgarian bg
Chinese Simplified zh-Hans
Chinese Traditional zh-Hant
Croatian hr
Czech cs
Danish da
Dutch nl
Estonian et
Finnish fi
French fr
German de
Hebrew he
Hindi hi
Hungarian hu
Indonesian id
Italian it
Japanese ja
Korean ko
Latvian lv
Lithuanian lt
Macedonian mk
Marathi mr
Modern Greek (1453-) el
Nepali (macrolanguage) ne
Norwegian no
Panjabi pa
Persian fa
Polish pl
Portuguese pt
Romanian ro
Russian ru
Slovak sk
Slovenian sl
Somali (Arabic) so
Somali (Latin) so-latn
Spanish es
Swahili (macrolanguage) sw
Swedish sv
Tamil ta
Thai th
Turkish tr
Ukrainian uk
Urdu ur
Vietnamese vi

Custom neural

The following table lists the supported languages for printed text.

Language Code (optional)
Afrikaans af
Albanian sq
Arabic ar
Bulgarian bg
Chinese Simplified zh-Hans
Chinese Traditional zh-Hant
Croatian hr
Czech cs
Danish da
Dutch nl
Estonian et
Finnish fi
French fr
German de
Hebrew he
Hindi hi
Hungarian hu
Indonesian id
Italian it
Japanese ja
Korean ko
Latvian lv
Lithuanian lt
Macedonian mk
Marathi mr
Modern Greek (1453-) el
Nepali (macrolanguage) ne
Norwegian no
Panjabi pa
Persian fa
Polish pl
Portuguese pt
Romanian ro
Russian ru
Slovak sk
Slovenian sl
Somali (Arabic) so
Somali (Latin) so-latn
Spanish es
Swahili (macrolanguage) sw
Swedish sv
Tamil ta
Thai th
Turkish tr
Ukrainian uk
Urdu ur
Vietnamese vi

Neural models support additional languages in the v3.1 and later APIs.

Languages API version
English v4.0:2024-02-29-preview, 2023-10-31-preview, v3.1:2023-07-31 (GA), v3.0:2022-08-31 (GA)
German v4.0:2024-02-29-preview, 2023-10-31-preview, v3.1:2023-07-31 (GA)
Italian v4.0:2024-02-29-preview, 2023-10-31-preview, v3.1:2023-07-31 (GA)
French v4.0:2024-02-29-preview, 2023-10-31-preview, v3.1:2023-07-31 (GA)
Spanish v4.0:2024-02-29-preview, 2023-10-31-preview, v3.1:2023-07-31 (GA)
Dutch v4.0:2024-02-29-preview, 2023-10-31-preview, v3.1:2023-07-31 (GA)
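
As a minimal sketch, the API version table above can be transcribed into a small lookup that checks whether a language is listed for a given API version. The names `NEURAL_LANGUAGE_VERSIONS` and `neural_supports` are hypothetical helpers for illustration and are not part of any Document Intelligence SDK; the data is copied from the table, and the reading that a missing version means "not listed" is an assumption.

```python
# API versions named in the table above, newest first.
ALL_VERSIONS = (
    "2024-02-29-preview",  # v4.0 preview
    "2023-10-31-preview",  # v4.0 preview
    "2023-07-31",          # v3.1 (GA)
    "2022-08-31",          # v3.0 (GA)
)
# Every version except v3.0, matching the rows for the added languages.
V31_AND_LATER = ALL_VERSIONS[:3]

# Hypothetical lookup transcribed from the table; not an SDK object.
NEURAL_LANGUAGE_VERSIONS = {
    "English": ALL_VERSIONS,
    "German": V31_AND_LATER,
    "Italian": V31_AND_LATER,
    "French": V31_AND_LATER,
    "Spanish": V31_AND_LATER,
    "Dutch": V31_AND_LATER,
}


def neural_supports(language: str, api_version: str) -> bool:
    """Return True if the table lists `language` for `api_version`."""
    return api_version in NEURAL_LANGUAGE_VERSIONS.get(language, ())


if __name__ == "__main__":
    print(neural_supports("English", "2022-08-31"))
    print(neural_supports("German", "2022-08-31"))
```

A check like this could run before a training or analysis request is submitted, so that an unsupported language and API version pairing is caught locally.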

Custom template

The following table lists the supported languages for printed text.

Language Code (optional)
Abaza abq
Abkhazian ab
Achinese ace
Acoli ach
Adangme ada
Adyghe ady
Afar aa
Afrikaans af
Akan ak
Albanian sq
Algonquin alq
Angika (Devanagari) anp
Arabic ar
Asturian ast
Asu (Tanzania) asa
Avaric av
Awadhi-Hindi (Devanagari) awa
Aymara ay
Azerbaijani (Latin) az
Bafia ksf
Bagheli bfy
Bambara bm
Bashkir ba
Basque eu
Belarusian (Cyrillic) be, be-cyrl
Belarusian (Latin) be, be-latn
Bemba (Zambia) bem
Bena (Tanzania) bez
Bhojpuri-Hindi (Devanagari) bho
Bikol bik
Bini bin
Bislama bi
Bodo (Devanagari) brx
Bosnian (Latin) bs
Brajbha bra
Breton br
Bulgarian bg
Bundeli bns
Buryat (Cyrillic) bua
Catalan ca
Cebuano ceb
Chamling rab
Chamorro ch
Chechen ce
Chhattisgarhi (Devanagari) hne
Chiga cgg
Chinese Simplified zh-Hans
Chinese Traditional zh-Hant
Choctaw cho
Chukot ckt
Chuvash cv
Cornish kw
Corsican co
Cree cr
Creek mus
Crimean Tatar (Latin) crh
Croatian hr
Crow cro
Czech cs
Danish da
Dargwa dar
Dari prs
Dhimal (Devanagari) dhi
Dogri (Devanagari) doi
Duala dua
Dungan dng
Dutch nl
Efik efi
English en
Erzya (Cyrillic) myv
Estonian et
Faroese fo
Fijian fj
Filipino fil
Finnish fi
Fon fon
French fr
Friulian fur
Ga gaa
Gagauz (Latin) gag
Galician gl
Ganda lg
Gayo gay
German de
Gilbertese gil
Gondi (Devanagari) gon
Greek el
Greenlandic kl
Guarani gn
Gurung (Devanagari) gvr
Gusii guz
Haitian Creole ht
Halbi (Devanagari) hlb
Hani hni
Haryanvi bgc
Hawaiian haw
Hebrew he
Herero hz
Hiligaynon hil
Hindi hi
Hmong Daw (Latin) mww
Ho (Devanagari) hoc
Hungarian hu
Iban iba
Icelandic is
Igbo ig
Iloko ilo
Inari Sami smn
Indonesian id
Ingush inh
Interlingua ia
Inuktitut (Latin) iu
Irish ga
Italian it
Japanese ja
Jaunsari (Devanagari) jns
Javanese jv
Jola-Fonyi dyo
Kabardian kbd
Kabuverdianu kea
Kachin (Latin) kac
Kalenjin kln
Kalmyk xal
Kangri (Devanagari) xnr
Kanuri kr
Karachay-Balkar krc
Kara-Kalpak (Cyrillic) kaa-cyrl
Kara-Kalpak (Latin) kaa
Kashubian csb
Kazakh (Cyrillic) kk-cyrl
Kazakh (Latin) kk-latn
Khakas kjh
Khaling klr
Khasi kha
K'iche' quc
Kikuyu ki
Kildin Sami sjd
Kinyarwanda rw
Komi kv
Kongo kg
Korean ko
Korku kfq
Koryak kpy
Kosraean kos
Kpelle kpe
Kuanyama kj
Kumyk (Cyrillic) kum
Kurdish (Arabic) ku-arab
Kurdish (Latin) ku-latn
Kurukh (Devanagari) kru
Kyrgyz (Cyrillic) ky
Lak lbe
Lakota lkt
Latin la
Latvian lv
Lezghian lex
Lingala ln
Lithuanian lt
Lower Sorbian dsb
Lozi loz
Lule Sami smj
Luo (Kenya and Tanzania) luo
Luxembourgish lb
Luyia luy
Macedonian mk
Machame jmc
Madurese mad
Mahasu Pahari (Devanagari) bfz
Makhuwa-Meetto mgh
Makonde kde
Malagasy mg
Malay (Latin) ms
Maltese mt
Malto (Devanagari) kmj
Mandinka mnk
Manx gv
Maori mi
Mapudungun arn
Marathi mr
Mari (Russia) chm
Masai mas
Mende (Sierra Leone) men
Meru mer
Meta' mgo
Minangkabau min
Mohawk moh
Mongolian (Cyrillic) mn
Mongondow mog
Montenegrin (Cyrillic) cnr-cyrl
Montenegrin (Latin) cnr-latn
Morisyen mfe
Mundang mua
Nahuatl nah
Navajo nv
Ndonga ng
Neapolitan nap
Nepali ne
Ngomba jgo
Niuean niu
Nogay nog
North Ndebele nd
Northern Sami (Latin) sme
Norwegian no
Nyanja ny
Nyankole nyn
Nzima nzi
Occitan oc
Ojibwa oj
Oromo om
Ossetic os
Pampanga pam
Pangasinan pag
Papiamento pap
Pashto ps
Pedi nso
Persian fa
Polish pl
Portuguese pt
Punjabi (Arabic) pa
Quechua qu
Ripuarian ksh
Romanian ro
Romansh rm
Rundi rn
Russian ru
Rwa rwk
Sadri (Devanagari) sck
Sakha sah
Samburu saq
Samoan (Latin) sm
Sango sg
Sangu (Gabon) snq
Sanskrit (Devanagari) sa
Santali (Devanagari) sat
Scots sco
Scottish Gaelic gd
Sena seh
Serbian (Cyrillic) sr-cyrl
Serbian (Latin) sr, sr-latn
Shambala ksb
Shona sn
Siksika bla
Sirmauri (Devanagari) srx
Skolt Sami sms
Slovak sk
Slovenian sl
Soga xog
Somali (Arabic) so
Somali (Latin) so-latn
Songhai son
South Ndebele nr
Southern Altai alt
Southern Sami sma
Southern Sotho st
Spanish es
Sundanese su
Swahili (Latin) sw
Swati ss
Swedish sv
Tabassaran tab
Tachelhit shi
Tahitian ty
Taita dav
Tajik (Cyrillic) tg
Tamil ta
Tatar (Cyrillic) tt-cyrl
Tatar (Latin) tt
Teso teo
Tetum tet
Thai th
Thangmi thf
Tok Pisin tpi
Tongan to
Tsonga ts
Tswana tn
Turkish tr
Turkmen (Latin) tk
Tuvan tyv
Udmurt udm
Uyghur (Cyrillic) ug-cyrl
Ukrainian uk
Upper Sorbian hsb
Urdu ur
Uyghur (Arabic) ug
Uzbek (Arabic) uz-arab
Uzbek (Cyrillic) uz-cyrl
Uzbek (Latin) uz
Vietnamese vi
Volapük vo
Vunjo vun
Walser wae
Welsh cy
Western Frisian fy
Wolof wo
Xhosa xh
Yucatec Maya yua
Zapotec zap
Zarma dje
Zhuang za
Zulu zu
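
Some rows in the custom template table list more than one code (for example `be, be-cyrl`), and a short code such as `be` appears under more than one script. The following sketch shows one possible way to turn rows in this "Language Code" layout into a code-to-language index. `ROW_PATTERN` and `build_code_index` are hypothetical names chosen for illustration, and the sample rows are a small subset copied from the table.

```python
import re

# A few rows copied from the custom template table above.
ROWS = """\
Belarusian (Cyrillic) be, be-cyrl
Belarusian (Latin) be, be-latn
Luo (Kenya and Tanzania) luo
Serbian (Latin) sr, sr-latn
Zulu zu
"""

# Language name, whitespace, then one or more comma-separated codes at line end.
ROW_PATTERN = re.compile(
    r"^(?P<name>.+?)\s+(?P<codes>[A-Za-z-]+(?:,\s*[A-Za-z-]+)*)$"
)


def build_code_index(text: str) -> dict[str, list[str]]:
    """Map each lowercase code to the language names that list it."""
    index: dict[str, list[str]] = {}
    for line in text.splitlines():
        match = ROW_PATTERN.match(line.strip())
        if match is None:
            # Blank line, or a header such as "Language Code (optional)".
            continue
        for code in match["codes"].split(","):
            index.setdefault(code.strip().lower(), []).append(match["name"])
    return index


CODE_INDEX = build_code_index(ROWS)

if __name__ == "__main__":
    for code, names in sorted(CODE_INDEX.items()):
        print(code, names)
```

Keeping a list of names per code makes the ambiguity visible: a lookup for `be` returns both Belarusian scripts, while `be-cyrl` or `be-latn` resolves to exactly one.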