Get facial pose events - Text-to-Speech

Question

Hi!

I see that viseme events are currently available only for the en-US-AriaNeural voice.
I was wondering if support for other languages (especially French) is planned in the near future?
And if so, is an approximate date known?

Thanks,
gma

Accepted Answer

Thanks for your answer, @Ramr-msft. I am a developer at a company, and we do indeed want to animate 3D characters. We also need real-time visemes.
To be more precise, we are looking for something like this:

{"time":0,"type":"sentence","start":0,"end":23,"value":"Mary had a little lamb."}
{"time":6,"type":"word","start":0,"end":4,"value":"Mary"}
{"time":6,"type":"viseme","value":"p"}
{"time":73,"type":"viseme","value":"E"}
{"time":180,"type":"viseme","value":"r"}
{"time":292,"type":"viseme","value":"i"}
...
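
To illustrate how we would consume such events, here is a minimal Python sketch. It assumes one JSON object per line, as in the sample above, and that "time" is an offset in milliseconds (an assumption; the sample does not state the unit). The function names viseme_timeline and viseme_at are hypothetical helpers, not part of any SDK.

```python
import json

# Sample events copied from the post above: one JSON object per line.
SAMPLE = """\
{"time":0,"type":"sentence","start":0,"end":23,"value":"Mary had a little lamb."}
{"time":6,"type":"word","start":0,"end":4,"value":"Mary"}
{"time":6,"type":"viseme","value":"p"}
{"time":73,"type":"viseme","value":"E"}
{"time":180,"type":"viseme","value":"r"}
{"time":292,"type":"viseme","value":"i"}
"""


def viseme_timeline(text):
    """Return (time, viseme) pairs sorted by time, ignoring other event types."""
    events = (json.loads(line) for line in text.splitlines() if line.strip())
    return sorted((e["time"], e["value"]) for e in events if e.get("type") == "viseme")


def viseme_at(timeline, t):
    """Return the latest viseme that started at or before time t, or None."""
    current = None
    for start, value in timeline:
        if start > t:
            break
        current = value
    return current


timeline = viseme_timeline(SAMPLE)
print(timeline)                  # viseme events only, ordered by time
print(viseme_at(timeline, 100))  # viseme active at t = 100 (assumed ms)
```

An animation loop could call viseme_at once per frame with the current audio playback position to pick the mouth shape to display.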

Your solution for getting facial pose events may interest us, but we need it for more than just the en-US-AriaNeural voice. In particular, a French voice is important to us.
Hence my question about an approximate date for this functionality in other languages.

Thanks,
gma
