Speech-to-text documentation
Speech-to-text, the speech recognition feature of the Speech service, transcribes audio streams into text in real time or in batch. When you also provide reference text, it supports real-time pronunciation assessment, giving speakers feedback on the accuracy and fluency of their speech.
About speech-to-text
Overview
- What is real-time speech-to-text?
- What is batch speech-to-text?
- What is Custom Speech?
- Use the Speech CLI for speech-to-text with no code
Quickstart
Develop with speech-to-text
How-To Guide
- Choose speech recognition mode
- Improve accuracy with Custom Speech
- Use compressed audio input formats
- Migrate from v3.0 to v3.1 of the speech-to-text REST API