Phonetic Alphabet Reference

A phonetic alphabet is a set of symbols (combinations of letters, numbers, and other characters) known as “phones”. Each phone represents a discrete sound in a spoken language. Phones are combined into phonetic spellings that specify how a word is pronounced, so that it can be recognized or spoken correctly. System.Speech supports three phonetic alphabets:

  • International Phonetic Alphabet (IPA). A system of phonetic notation based in part on the Latin alphabet, devised as a standardized representation of the sounds of spoken language. You can use this phonetic alphabet to specify pronunciations for any language.

  • Universal Phone Set (UPS). A machine-readable phonetic alphabet, created by Microsoft and based on the International Phonetic Alphabet (IPA). You can use this phonetic alphabet to specify pronunciations for any language except those that use the SAPI Phone Set (see the next item).

  • Speech API (SAPI) Phone Set. The pronunciation alphabet used in System.Speech for the following languages:

    Language-Culture Code    Language Name                        Language ID

                             Chinese (Taiwan)
                             Chinese (PRC)
                             English (United States)
                             French (Standard)
                             German (Standard)
                             Spanish (Spain, Traditional Sort)
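
As a rough illustration of how a phonetic spelling is attached to a word, the following Python sketch builds an SSML phoneme element. The "alphabet" and "ph" attributes are defined by the W3C SSML specification; the helper name phoneme_element and the sample IPA spelling of "hello" are assumptions made for this example and are not taken from System.Speech.

```python
import xml.etree.ElementTree as ET


def phoneme_element(word: str, phones: str, alphabet: str = "ipa") -> str:
    """Return an SSML <phoneme> element pairing a word with its phonetic spelling."""
    element = ET.Element("phoneme", {"alphabet": alphabet, "ph": phones})
    element.text = word
    # encoding="unicode" returns a str, so IPA characters stay readable
    return ET.tostring(element, encoding="unicode")


# Hypothetical IPA spelling of "hello", chosen only for illustration
print(phoneme_element("hello", "həˈloʊ"))
```

Whether a given speech engine accepts a particular alphabet name is engine-specific, so the default of "ipa" here should be read as a placeholder rather than a recommendation.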


Speech Sounds

Humans create speech sounds by generating airflow with one or more of the lungs, ribs, diaphragm, larynx, tongue, or cheeks and by modifying the airflow in the vocal tract. Typically, some part of the tongue moves relative to some part of the roof of the mouth to restrict the airflow in varying degrees.

From greatest to least stricture, speech sounds may be classified as stop consonants (with occlusion, or blocked airflow), fricative consonants (with partially blocked and therefore strongly turbulent airflow), approximants (with reduced airflow but no turbulence), and vowels (with full unimpeded airflow). An affricate is a sequence of a stop consonant plus a fricative consonant that behaves as a single phoneme.
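
The ordering by stricture described above can be sketched in Python as follows. The class name Stricture, the numeric ranks, and the handful of IPA phones are assumptions chosen for illustration; they are not part of System.Speech and do not form a complete inventory.

```python
from enum import IntEnum


class Stricture(IntEnum):
    """Manner classes ranked from least (1) to greatest (4) stricture."""
    VOWEL = 1        # full, unimpeded airflow
    APPROXIMANT = 2  # reduced airflow, no turbulence
    FRICATIVE = 3    # partially blocked, turbulent airflow
    STOP = 4         # occlusion: airflow fully blocked


# Illustrative IPA phones only
SAMPLE_PHONES = {
    "a": Stricture.VOWEL,
    "j": Stricture.APPROXIMANT,
    "s": Stricture.FRICATIVE,
    "ʃ": Stricture.FRICATIVE,
    "p": Stricture.STOP,
    "t": Stricture.STOP,
}

# An affricate is modeled as a stop followed by a fricative
AFFRICATES = {"tʃ": ("t", "ʃ")}


def by_stricture(phones):
    """Sort phones from greatest to least stricture."""
    return sorted(phones, key=lambda p: SAMPLE_PHONES[p], reverse=True)


print(by_stricture(["a", "s", "p", "j"]))
for affricate, parts in AFFRICATES.items():
    print(affricate, [SAMPLE_PHONES[p].name for p in parts])
```

Modeling the affricate as a pair keeps the two-part articulation visible while still letting it be looked up as a single unit.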

Phone Tables

This section contains lists of phones for each of the speech sound classifications. The tables encompass the phonetic alphabets that System.Speech supports, and include Unicode and ASCII equivalents, where applicable.

  • Consonants are speech sounds that are articulated with complete or partial closure of the vocal tract.

  • Vowels are speech sounds that are articulated with an open vocal tract.

  • Diacritics are used to modify segmental phones (vowels, consonants, clicks, and ejectives) with additional phonetic detail.

  • Suprasegmentals describe the features of a language above the level of individual consonants and vowels, such as prosody, tone, length, and stress.

  • Clicks and Ejectives are voiceless consonants with specific velaric and glottalic airflow features, respectively. An example of a click in US English is the sound written as “tsk! tsk!”.

  • Tones describe the use of pitch in speech sounds to distinguish lexical or grammatical meaning.

  • Other Phones contains rare phones that are not included in the main IPA consonant table.
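
Because the phone tables list Unicode equivalents, it can help to see how an IPA symbol maps to its code point. The following Python sketch prints the code point and Unicode character name for a few symbols. The sample symbols, the category labels, and the helper name code_point are assumptions chosen for illustration and are not copied from the tables.

```python
import unicodedata

# A few IPA symbols, labeled with the table category they would fall under
SAMPLES = [
    ("ʃ", "consonant"),
    ("ə", "vowel"),
    ("ˈ", "suprasegmental (primary stress)"),
    ("ː", "suprasegmental (length)"),
]


def code_point(symbol: str) -> str:
    """Format a single character as a U+XXXX code point."""
    return f"U+{ord(symbol):04X}"


for symbol, category in SAMPLES:
    # unicodedata.name returns the official Unicode character name
    print(symbol, code_point(symbol), unicodedata.name(symbol), category, sep="\t")
```

Note that the stress and length marks are dedicated modifier letters, not the ASCII apostrophe and colon they resemble, which is one reason the tables list explicit code points.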