Speech Synthesis Markup Language (SSML) overview
Speech Synthesis Markup Language (SSML) is an XML-based markup language for fine-tuning text-to-speech output attributes such as pitch, pronunciation, speaking rate, and volume. SSML gives you more control and flexibility than plain text input.
Tip
You can hear voices in different styles and pitches reading example text via the Voice Gallery.
Scenarios
You can use SSML to:
- Define the structure of the input text, which determines the structure, content, and other characteristics of the text-to-speech output. For example, you can use SSML to define a paragraph, a sentence, a break or pause, or silence. You can also wrap text with event tags such as bookmark or viseme that your application can process later.
- Choose the voice, language, name, style, and role. You can use multiple voices in a single SSML document. Adjust the emphasis, speaking rate, pitch, and volume. You can also use SSML to insert pre-recorded audio, such as a sound effect or a musical note.
- Control pronunciation of the output audio. For example, you can use SSML with phonemes and a custom lexicon to improve pronunciation. You can also use SSML to define how a word or mathematical expression is pronounced.
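As a rough illustration of the first two scenarios, the following Python sketch assembles a small SSML document with a paragraph, sentences, a pause, one voice, and prosody adjustments, then checks that the result is well-formed XML. The helper name `build_ssml`, the voice name `en-US-JennyNeural`, and the default rate, pitch, and pause values are assumptions chosen for illustration and do not come from this article. Check the element and attribute reference for the values your service accepts.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(sentences, voice_name="en-US-JennyNeural", lang="en-US",
               rate="+10%", pitch="-2st", pause="500ms"):
    """Return an SSML string: one paragraph, one <s> per sentence.

    All default values are illustrative placeholders, not recommendations.
    """
    # Escape the text so characters such as '&' and '<' keep the XML valid.
    pause_tag = f"<break time={quoteattr(pause)}/>"
    body = pause_tag.join(f"<s>{escape(s)}</s>" for s in sentences)
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(lang)}>'
        f"<voice name={quoteattr(voice_name)}>"
        f"<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)}>"
        f"<p>{body}</p>"
        "</prosody></voice></speak>"
    )


ssml = build_ssml(["Welcome to the demo.", "Tom & Jerry will read next."])

# Parsing confirms only that the document is well-formed XML; it does not
# validate the document against any speech service.
root = ET.fromstring(ssml)
print(root.tag)
print(ssml)
```

Building the document from escaped pieces keeps user-supplied text from breaking the markup, which matters because an unescaped `&` or `<` makes the whole document invalid.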
Use SSML
Important
You're billed for each character that's converted to speech, including punctuation. Although the SSML document itself is not billable, optional elements that are used to adjust how the text is converted to speech, like phonemes and pitch, are counted as billable characters. For more information, see text-to-speech pricing notes.
You can use SSML in the following ways:
- Audio Content Creation tool: Author plain text and SSML in Speech Studio. You can listen to the output audio and adjust the SSML to improve speech synthesis. For more information, see Speech synthesis with the Audio Content Creation tool.
- Batch synthesis API: Provide SSML via the inputs property.
- Speech CLI: Provide SSML via the spx synthesize --ssml SSML command-line argument.
- Speech SDK: Provide SSML via the "speak" SSML method.
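For the Speech CLI option above, a script can assemble the command as an argument list so that the SSML document is passed as a single argument without shell quoting problems. This is a minimal sketch: the helper name `spx_synthesize_command` is hypothetical, and only the `spx synthesize --ssml` form named in the list above is assumed. The sketch builds and prints the command but does not run it.

```python
import shlex
from typing import List


def spx_synthesize_command(ssml: str) -> List[str]:
    """Build the argument list for `spx synthesize --ssml SSML`.

    The SSML document is one list element, so quotes and angle brackets
    inside it need no extra escaping when the list is given to subprocess.
    """
    return ["spx", "synthesize", "--ssml", ssml]


ssml = (
    '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
    'xml:lang="en-US"><voice name="en-US-JennyNeural">Hello.</voice></speak>'
)
argv = spx_synthesize_command(ssml)

# shlex.join shows how the same command would look when typed in a POSIX shell.
print(shlex.join(argv))
```

To actually run the command, pass `argv` to `subprocess.run(argv, check=True)` on a machine where the Speech CLI is installed and configured with your resource key and region.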