Note

See the Azure Cognitive Services for Speech documentation for the latest supported speech solutions.

Microsoft Speech Platform

SPVISEMES

SPVISEMES lists the visemes defined by the Speech Platform. The set is based on the Disney 13 visemes. The examples given use the SAPI English phoneme set.

    typedef enum SPVISEMES {
                         // English examples
                         //------------------
        SP_VISEME_0,     // silence
        SP_VISEME_1,     // ae, ax, ah
        SP_VISEME_2,     // aa
        SP_VISEME_3,     // ao
        SP_VISEME_4,     // ey, eh, uh
        SP_VISEME_5,     // er
        SP_VISEME_6,     // y, iy, ih, ix
        SP_VISEME_7,     // w, uw
        SP_VISEME_8,     // ow
        SP_VISEME_9,     // aw
        SP_VISEME_10,    // oy
        SP_VISEME_11,    // ay
        SP_VISEME_12,    // h
        SP_VISEME_13,    // r
        SP_VISEME_14,    // l
        SP_VISEME_15,    // s, z
        SP_VISEME_16,    // sh, ch, jh, zh
        SP_VISEME_17,    // th, dh
        SP_VISEME_18,    // f, v
        SP_VISEME_19,    // d, t, n
        SP_VISEME_20,    // k, g, ng
        SP_VISEME_21     // p, b, m
    } SPVISEMES;
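
Because the listing assigns no explicit values, ordinary C enumeration rules number the members 0 through 21, for 22 visemes in total. The following is a minimal sketch of how an application might use that fact to size a table or validate an incoming value. The enumeration is copied locally so the sketch compiles without the Speech Platform headers; `VISEME_COUNT` and `is_valid_viseme` are hypothetical names chosen for illustration and are not part of the Speech Platform API.

```c
/* Local copy of the enumeration from the listing above, so this sketch
   compiles without the Speech Platform headers. Real code would include
   the SDK header instead of redeclaring the type. */
typedef enum SPVISEMES {
    SP_VISEME_0,  SP_VISEME_1,  SP_VISEME_2,  SP_VISEME_3,
    SP_VISEME_4,  SP_VISEME_5,  SP_VISEME_6,  SP_VISEME_7,
    SP_VISEME_8,  SP_VISEME_9,  SP_VISEME_10, SP_VISEME_11,
    SP_VISEME_12, SP_VISEME_13, SP_VISEME_14, SP_VISEME_15,
    SP_VISEME_16, SP_VISEME_17, SP_VISEME_18, SP_VISEME_19,
    SP_VISEME_20, SP_VISEME_21
} SPVISEMES;

/* Hypothetical constant: the number of visemes in the enumeration.
   Members are numbered from 0, so the count is the last member plus one. */
enum { VISEME_COUNT = SP_VISEME_21 + 1 };

/* Hypothetical helper: returns 1 if value is in the range
   SP_VISEME_0..SP_VISEME_21, and 0 otherwise. */
static int is_valid_viseme(int value)
{
    return value >= SP_VISEME_0 && value <= SP_VISEME_21;
}
```

An array declared with `VISEME_COUNT` elements, for instance one mouth-shape asset per viseme, can then be indexed directly by any value that passes `is_valid_viseme`.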

Elements

  • SP_VISEME_0
    Silence
  • SP_VISEME_1
    ae, ax, ah
  • SP_VISEME_2
    aa
  • SP_VISEME_3
    ao
  • SP_VISEME_4
    ey, eh, uh
  • SP_VISEME_5
    er
  • SP_VISEME_6
    y, iy, ih, ix
  • SP_VISEME_7
    w, uw
  • SP_VISEME_8
    ow
  • SP_VISEME_9
    aw
  • SP_VISEME_10
    oy
  • SP_VISEME_11
    ay
  • SP_VISEME_12
    h
  • SP_VISEME_13
    r
  • SP_VISEME_14
    l
  • SP_VISEME_15
    s, z
  • SP_VISEME_16
    sh, ch, jh, zh
  • SP_VISEME_17
    th, dh
  • SP_VISEME_18
    f, v
  • SP_VISEME_19
    d, t, n
  • SP_VISEME_20
    k, g, ng
  • SP_VISEME_21
    p, b, m

Remarks

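A viseme is the basic position of the mouth and face when pronouncing a phoneme; it is the visual counterpart of that phoneme. Several phonemes can map to the same viseme: for example, p, b, and m all correspond to SP_VISEME_21.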
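
The phoneme-to-viseme relationship described here can be sketched as a lookup table built from the Elements list. This is an illustrative sketch only, not how the Speech Platform itself performs the mapping: the enumeration is copied locally so the code compiles without the SDK headers, and `PhonemeViseme`, `kPhonemeVisemes`, and `viseme_for_phoneme` are hypothetical names. Silence (SP_VISEME_0) is left out of the table because the list gives no phoneme symbol for it.

```c
#include <stddef.h>
#include <string.h>

/* Local copy of the enumeration so this sketch compiles without the
   Speech Platform headers. */
typedef enum SPVISEMES {
    SP_VISEME_0,  SP_VISEME_1,  SP_VISEME_2,  SP_VISEME_3,
    SP_VISEME_4,  SP_VISEME_5,  SP_VISEME_6,  SP_VISEME_7,
    SP_VISEME_8,  SP_VISEME_9,  SP_VISEME_10, SP_VISEME_11,
    SP_VISEME_12, SP_VISEME_13, SP_VISEME_14, SP_VISEME_15,
    SP_VISEME_16, SP_VISEME_17, SP_VISEME_18, SP_VISEME_19,
    SP_VISEME_20, SP_VISEME_21
} SPVISEMES;

/* Hypothetical pair type: one SAPI English phoneme symbol and its viseme. */
typedef struct PhonemeViseme {
    const char *phoneme;
    SPVISEMES   viseme;
} PhonemeViseme;

/* Hypothetical table transcribed from the Elements list. */
static const PhonemeViseme kPhonemeVisemes[] = {
    {"ae", SP_VISEME_1},  {"ax", SP_VISEME_1},  {"ah", SP_VISEME_1},
    {"aa", SP_VISEME_2},  {"ao", SP_VISEME_3},
    {"ey", SP_VISEME_4},  {"eh", SP_VISEME_4},  {"uh", SP_VISEME_4},
    {"er", SP_VISEME_5},
    {"y",  SP_VISEME_6},  {"iy", SP_VISEME_6},  {"ih", SP_VISEME_6},
    {"ix", SP_VISEME_6},
    {"w",  SP_VISEME_7},  {"uw", SP_VISEME_7},
    {"ow", SP_VISEME_8},  {"aw", SP_VISEME_9},  {"oy", SP_VISEME_10},
    {"ay", SP_VISEME_11}, {"h",  SP_VISEME_12}, {"r",  SP_VISEME_13},
    {"l",  SP_VISEME_14},
    {"s",  SP_VISEME_15}, {"z",  SP_VISEME_15},
    {"sh", SP_VISEME_16}, {"ch", SP_VISEME_16}, {"jh", SP_VISEME_16},
    {"zh", SP_VISEME_16},
    {"th", SP_VISEME_17}, {"dh", SP_VISEME_17},
    {"f",  SP_VISEME_18}, {"v",  SP_VISEME_18},
    {"d",  SP_VISEME_19}, {"t",  SP_VISEME_19}, {"n",  SP_VISEME_19},
    {"k",  SP_VISEME_20}, {"g",  SP_VISEME_20}, {"ng", SP_VISEME_20},
    {"p",  SP_VISEME_21}, {"b",  SP_VISEME_21}, {"m",  SP_VISEME_21}
};

/* Hypothetical helper: returns the viseme value for a phoneme symbol,
   or -1 if the symbol is NULL or not in the table. The comparison is
   case-sensitive and expects the lowercase symbols used in the list. */
static int viseme_for_phoneme(const char *phoneme)
{
    size_t count = sizeof kPhonemeVisemes / sizeof kPhonemeVisemes[0];
    if (phoneme == NULL)
        return -1;
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(kPhonemeVisemes[i].phoneme, phoneme) == 0)
            return (int)kPhonemeVisemes[i].viseme;
    }
    return -1;
}
```

With this table, `viseme_for_phoneme("p")`, `viseme_for_phoneme("b")`, and `viseme_for_phoneme("m")` all return the value of SP_VISEME_21, which illustrates that the mapping is many-to-one. A linear search is used for brevity; with roughly forty entries, a faster structure is unlikely to matter.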