Dear Microsoft Community,
I am interested in using your Custom Text-to-Speech Avatars service for my company. We are looking to create more engaging video content using AI-generated avatars that can speak text with realistic voices and animations.
I have some questions I'm hoping you can help me with:
Pricing and Costs:
- What are the pricing options for using this service (pay-per-use, monthly subscriptions, etc.)?
- Roughly how much does it cost to generate one minute of avatar video?
- Are there any minimum usage requirements or upfront costs?
Technical Requirements:
- What kind of data or inputs are required to create a custom avatar voice (audio recordings, text transcripts, etc.)?
- How much data is typically needed for good quality results?
- Is there recommended recording equipment, or a minimum recording quality?
Getting Started:
- How do I request access to this service?
- What is the typical timeline for onboarding, avatar model creation, and deployment?
- Are there any documentation or guides I can review?
My company creates training videos, marketing content, virtual assistants, and more. Being able to use AI avatars would significantly enhance our content creation capabilities. Please let me know if you need any other details from me. I'd be happy to discuss our use case further.