Hello Ananth!
Thank you for posting on Microsoft Learn.
This functionality is in preview and not available by default. You’ll need to:
- Apply for access via Microsoft. This usually involves filling out a preview request form or contacting your Microsoft AI or Azure account representative.
- Once approved, navigate to your Speech resource in the Azure portal.
- Under Preview features (or similar), look for Text-to-Speech Avatars or Avatar real-time streaming and turn it on.
After approval and enabling preview:
- Verify that you’re using Speech SDK v1.40 or later
- Use a region that supports avatars, and make sure your Speech resource is on the S0 (Standard) tier
- In code, create a WebSocket or WebRTC connection using the real-time avatar endpoint. Microsoft provides detailed samples for JavaScript, Python, C#, and JS mobile clients, showing how to consume both the live video and viseme (lip-sync) streams.
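Before attempting the connection, it can help to check the SDK version programmatically. This helper and its name are my own illustration; only the v1.40 minimum comes from the requirement above:

```python
def meets_sdk_requirement(installed: str, minimum: str = "1.40.0") -> bool:
    """Return True if the installed Speech SDK version is >= the minimum.

    Compares dotted numeric versions (e.g. "1.40.0") component by
    component. The 1.40.0 floor reflects the v1.40+ requirement above.
    """
    def parts(version: str) -> list[int]:
        return [int(p) for p in version.split(".")]

    a, b = parts(installed), parts(minimum)
    # Pad the shorter list with zeros so "1.40" compares equal to "1.40.0".
    n = max(len(a), len(b))
    a += [0] * (n - len(a))
    b += [0] * (n - len(b))
    return a >= b
```

For example, `meets_sdk_requirement("1.38.0")` fails the check, while `meets_sdk_requirement("1.41.1")` passes.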
Links to help you:
https://www.youtube.com/watch?v=JUIs063K6z
Your resource should be located in one of these regions and use the S0 (Standard) tier:
- Southeast Asia
- North Europe
- West Europe
- Sweden Central
- South Central US
- East US 2
- West US 2
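The region and tier requirements above can be folded into a small pre-flight check. The function and its messages are illustrative, but the region list and the S0 requirement are exactly as stated:

```python
# Regions that support real-time avatars, per the list above.
AVATAR_REGIONS = {
    "southeastasia",
    "northeurope",
    "westeurope",
    "swedencentral",
    "southcentralus",
    "eastus2",
    "westus2",
}

def validate_speech_resource(region: str, sku: str) -> list[str]:
    """Return a list of configuration problems (empty list means OK)."""
    problems = []
    # Normalize "West Europe" -> "westeurope" before the membership test.
    if region.lower().replace(" ", "") not in AVATAR_REGIONS:
        problems.append(f"region '{region}' does not support avatars")
    if sku.upper() != "S0":
        problems.append(f"tier '{sku}' is not S0 (Standard)")
    return problems
```

For example, `validate_speech_resource("West Europe", "S0")` returns an empty list, while `validate_speech_resource("East US", "F0")` reports both a region and a tier problem.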
Supported voices include standard neural voices and other neural TTS voices listed in the region's voice catalog.