Note

Please see Azure Cognitive Services for Speech documentation for the latest supported speech solutions.

Microsoft.Speech.Synthesis Namespace

The Microsoft.Speech.Synthesis namespace contains classes for initializing and configuring a speech synthesis engine, for creating prompts, for generating speech, for responding to events, and for modifying voice characteristics. Speech synthesis is often referred to as text-to-speech or TTS.

Initialize and Configure

The SpeechSynthesizer class provides access to the functionality of a speech synthesis engine in the Microsoft Speech Platform Runtime 11. A TTS engine can use any of the installed Runtime Languages as a voice to perform text-to-speech in a particular language, for example Microsoft Helen for US English.

A voice is an installed Runtime Language for speech synthesis (TTS, or text-to-speech). The Speech Platform Runtime 11 and Microsoft Speech Platform SDK 11 do not include any Runtime Languages for speech synthesis. You must download and install a Runtime Language for each language in which you want to generate synthesized speech. A Runtime Language includes the language model, acoustic model, and other data necessary to provision a speech engine to perform speech synthesis in a particular language. See InstalledVoice for more information.

To configure a SpeechSynthesizer instance to use one of the installed voices, call the SelectVoice(String) or SelectVoiceByHints() methods. To get information about which voices are installed, use the GetInstalledVoices() method.
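
As a minimal sketch of the calls above, the following console program lists the installed voices and then selects one. It assumes the project references the Microsoft.Speech assembly and that at least one Runtime Language for speech synthesis is installed; the class name ListAndSelectVoice is hypothetical.

```csharp
using System;
using Microsoft.Speech.Synthesis;

class ListAndSelectVoice
{
    static void Main()
    {
        using (SpeechSynthesizer synth = new SpeechSynthesizer())
        {
            string firstEnabled = null;

            // Enumerate every installed voice (Runtime Language for TTS).
            foreach (InstalledVoice voice in synth.GetInstalledVoices())
            {
                VoiceInfo info = voice.VoiceInfo;
                Console.WriteLine("{0} ({1}, {2}) enabled: {3}",
                    info.Name, info.Culture, info.Gender, voice.Enabled);

                // Remember the first voice that is enabled for use.
                if (firstEnabled == null && voice.Enabled)
                {
                    firstEnabled = info.Name;
                }
            }

            if (firstEnabled == null)
            {
                Console.WriteLine("No enabled voices found; install a Runtime Language first.");
                return;
            }

            // Option 1: select a voice by its exact name.
            synth.SelectVoice(firstEnabled);
            Console.WriteLine("Selected by name: " + synth.Voice.Name);

            // Option 2: ask for a voice that matches a hint such as gender.
            synth.SelectVoiceByHints(VoiceGender.Female);
            Console.WriteLine("Selected by hints: " + synth.Voice.Name);
        }
    }
}
```

Taking the name from GetInstalledVoices() rather than hard-coding it avoids depending on a particular voice being present on the machine.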

You can route the output of the SpeechSynthesizer to a stream, a file, the default audio device, or to a null device by using one of the methods in the SpeechSynthesizer class whose name begins with “SetOutputTo”.
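
The sketch below routes output to a wave file, to an in-memory stream, and to the default audio device in turn. It assumes an installed voice and an available audio device; the file name hello.wav and the class name RouteOutput are hypothetical.

```csharp
using System;
using System.IO;
using Microsoft.Speech.Synthesis;

class RouteOutput
{
    static void Main()
    {
        using (SpeechSynthesizer synth = new SpeechSynthesizer())
        {
            // 1. Write synthesized audio to a wave file (hypothetical file name).
            string wavPath = Path.Combine(Path.GetTempPath(), "hello.wav");
            synth.SetOutputToWaveFile(wavPath);
            synth.Speak("This sentence is written to a wave file.");

            // 2. Write synthesized audio to an in-memory stream.
            using (MemoryStream stream = new MemoryStream())
            {
                synth.SetOutputToWaveStream(stream);
                synth.Speak("This sentence is written to a stream.");

                // Detach the synthesizer from the stream before it is disposed.
                synth.SetOutputToNull();
                Console.WriteLine("Stream holds {0} bytes.", stream.Length);
            }

            // 3. Play synthesized audio on the default audio device.
            synth.SetOutputToDefaultAudioDevice();
            synth.Speak("This sentence is played on the default audio device.");
        }
    }
}
```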

Create Prompts

Use one of the methods of the PromptBuilder class whose name begins with “Append” to build content for prompts from text, Speech Synthesis Markup Language (SSML), files containing text or SSML markup, or prerecorded audio files.

See Construct a Complex Prompt (Microsoft.Speech) in the Microsoft Speech Programming Guide for more information and examples.
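
As an illustration, the following sketch builds a prompt from plain text, a break, and a fragment of SSML markup, and then prints the SSML that the PromptBuilder generates. The prompt wording and the class name BuildPrompt are assumptions made for the example; no voice is needed because nothing is spoken.

```csharp
using System;
using Microsoft.Speech.Synthesis;

class BuildPrompt
{
    static void Main()
    {
        PromptBuilder builder = new PromptBuilder();

        // Plain text.
        builder.AppendText("Your order has shipped.");

        // A pause between the two sentences.
        builder.AppendBreak(PromptBreak.Medium);

        // Text with an emphasis level applied.
        builder.AppendText("It should arrive on Friday.", PromptEmphasis.Moderate);

        // A fragment of raw SSML markup.
        builder.AppendSsmlMarkup("<prosody rate=\"slow\">Thank you.</prosody>");

        // Inspect the SSML document that the builder produced.
        Console.WriteLine(builder.ToXml());
    }
}
```

Printing ToXml() is a convenient way to check what a sequence of Append calls produces before passing the builder to a SpeechSynthesizer.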

Generate Speech

To generate speech from a string or from a Prompt or PromptBuilder object, use the Speak() or the SpeakAsync() methods. To generate speech from SSML markup, use the SpeakSsml(String) or the SpeakSsmlAsync(String) methods. See Speech Synthesis Markup Language Reference (Microsoft.Speech) for a guide to SSML markup.
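
A minimal sketch of the synchronous calls follows. It assumes an installed en-US voice and an available audio device; the spoken text and the class name GenerateSpeech are hypothetical.

```csharp
using System;
using Microsoft.Speech.Synthesis;

class GenerateSpeech
{
    static void Main()
    {
        using (SpeechSynthesizer synth = new SpeechSynthesizer())
        {
            synth.SetOutputToDefaultAudioDevice();

            // Speak a plain string.
            synth.Speak("Speaking a plain string.");

            // Speak the contents of a PromptBuilder object.
            PromptBuilder builder = new PromptBuilder();
            builder.AppendText("Speaking a PromptBuilder object.");
            synth.Speak(builder);

            // Speak a complete SSML document (xml:lang assumes an en-US voice).
            string ssml =
                "<speak version=\"1.0\" " +
                "xmlns=\"http://www.w3.org/2001/10/synthesis\" xml:lang=\"en-US\">" +
                "Speaking <emphasis>SSML</emphasis> markup." +
                "</speak>";
            synth.SpeakSsml(ssml);
        }
    }
}
```

Speak() and SpeakSsml() block until the prompt has finished; SpeakAsync() and SpeakSsmlAsync() return immediately and are usually paired with event handlers.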

You can guide the pronunciation of words by using the AppendTextWithHint() or AppendTextWithPronunciation(String, String) methods, and by adding or removing lexicons for a SpeechSynthesizer instance using the AddLexicon(Uri, String) and RemoveLexicon(Uri) methods.
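
The following sketch shows one of these options, AppendTextWithHint(), used with values from the SayAs enumeration. The prompt text and the class name PronunciationHints are assumptions made for the example; lexicons and AppendTextWithPronunciation are not shown.

```csharp
using System;
using Microsoft.Speech.Synthesis;

class PronunciationHints
{
    static void Main()
    {
        PromptBuilder builder = new PromptBuilder();

        builder.AppendText("The confirmation code is");
        // Ask the engine to spell the code out letter by letter.
        builder.AppendTextWithHint("QX7", SayAs.SpellOut);

        builder.AppendText("and you are caller number");
        // Ask the engine to read the digit as an ordinal number.
        builder.AppendTextWithHint("3", SayAs.NumberOrdinal);

        // Inspect the generated SSML, including the say-as elements.
        Console.WriteLine(builder.ToXml());
    }
}
```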

Respond to Events

The SpeechSynthesizer class includes events that inform a speech application that the SpeechSynthesizer encountered a specific feature in a prompt, as reported by the SpeakProgressEventArgs, BookmarkReachedEventArgs, and VoiceChangeEventArgs classes.

To find out when the SpeechSynthesizer starts and finishes speaking a prompt, use the SpeakStartedEventArgs and SpeakCompletedEventArgs classes.

See Use Speech Synthesis Events (Microsoft.Speech) in the Microsoft Speech Programming Guide for more information and examples.
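
As a rough sketch, the program below subscribes to several SpeechSynthesizer events and speaks a prompt asynchronously so that the handlers run while the prompt plays. It assumes an installed voice and an available audio device; the bookmark name "halfway" and the class name SynthesisEvents are hypothetical.

```csharp
using System;
using Microsoft.Speech.Synthesis;

class SynthesisEvents
{
    static void Main()
    {
        using (SpeechSynthesizer synth = new SpeechSynthesizer())
        {
            synth.SetOutputToDefaultAudioDevice();

            // Raised when the synthesizer begins speaking a prompt.
            synth.SpeakStarted += (sender, e) =>
                Console.WriteLine("Speak started.");

            // Raised as the synthesizer works through the prompt text.
            synth.SpeakProgress += (sender, e) =>
                Console.WriteLine("Progress: \"{0}\" at {1}", e.Text, e.AudioPosition);

            // Raised when the synthesizer reaches a bookmark in the prompt.
            synth.BookmarkReached += (sender, e) =>
                Console.WriteLine("Bookmark reached: " + e.Bookmark);

            // Raised when the synthesizer finishes speaking the prompt.
            synth.SpeakCompleted += (sender, e) =>
                Console.WriteLine("Speak completed. Press any key to exit.");

            PromptBuilder builder = new PromptBuilder();
            builder.AppendText("First part of the prompt.");
            builder.AppendBookmark("halfway");
            builder.AppendText("Second part of the prompt.");

            // Start speaking without blocking, then keep the process alive.
            synth.SpeakAsync(builder);
            Console.ReadKey();
        }
    }
}
```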

Modify Voice Characteristics

The PromptStyle class, together with the StartStyle(PromptStyle) and AppendText() methods of the PromptBuilder class, lets you modify the emphasis, rate, and volume of a SpeechSynthesizer voice. To modify characteristics of a voice such as culture, age, and gender, use one of the StartVoice() methods of the PromptBuilder class or the SelectVoiceByHints() methods of the SpeechSynthesizer class.

See Control Voice Attributes (Microsoft.Speech) in the Microsoft Speech Programming Guide for more information and examples.
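
The sketch below applies a PromptStyle, an emphasis level, and a voice request to separate parts of a prompt, then prints the resulting SSML. The prompt wording and the class name VoiceCharacteristics are assumptions made for the example; no voice is needed because nothing is spoken.

```csharp
using System;
using Microsoft.Speech.Synthesis;

class VoiceCharacteristics
{
    static void Main()
    {
        PromptBuilder builder = new PromptBuilder();

        // Apply rate and volume through a PromptStyle.
        PromptStyle style = new PromptStyle();
        style.Rate = PromptRate.Slow;
        style.Volume = PromptVolume.Loud;
        builder.StartStyle(style);
        builder.AppendText("This sentence is slow and loud.");
        builder.EndStyle();

        // Apply emphasis directly through an AppendText overload.
        builder.AppendText("This sentence is emphasized.", PromptEmphasis.Strong);

        // Request a voice by gender for part of the prompt.
        builder.StartVoice(VoiceGender.Female);
        builder.AppendText("This sentence requests a female voice.");
        builder.EndVoice();

        // Inspect the generated SSML.
        Console.WriteLine(builder.ToXml());
    }
}
```

Each StartStyle() or StartVoice() call is closed with the matching EndStyle() or EndVoice() call, so the setting applies only to the text appended in between.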

Classes

| Class | Description |
| --- | --- |
| BookmarkReachedEventArgs | Returns data from the BookmarkReached event. |
| FilePrompt | Represents a prompt created from a file. |
| InstalledVoice | Contains information about an installed speech synthesis voice. |
| Prompt | Represents information about what can be rendered, either text or an audio file, by the SpeechSynthesizer. |
| PromptBuilder | Creates an empty Prompt object and provides methods for adding content, selecting voices, controlling voice attributes, and controlling the pronunciation of spoken words. |
| PromptEventArgs | Represents the base class for EventArgs classes in the Microsoft.Speech.Synthesis namespace. |
| PromptStyle | Defines a style for speaking prompts that consists of settings for emphasis, rate, and volume. |
| ProprietaryEngineEventArgs | Returns data from an event raised by a proprietary speech synthesis engine. |
| SpeakCompletedEventArgs | Returns notification from the SpeakCompleted event. |
| SpeakProgressEventArgs | Returns data from the SpeakProgress event. |
| SpeakStartedEventArgs | Returns notification from the SpeakStarted event. |
| SpeechSynthesizer | Provides access to the functionality of an installed speech synthesis engine. |
| StateChangedEventArgs | Returns data from the StateChanged event. |
| VoiceChangeEventArgs | Returns data from the VoiceChange event. |
| VoiceInfo | Represents an installed Runtime Language for speech synthesis. |

Enumerations

| Enumeration | Description |
| --- | --- |
| PromptBreak | Enumerates values for intervals of prosodic separation (breaks) between word boundaries. |
| PromptEmphasis | Enumerates values for the levels of speaking emphasis in prompts. |
| PromptRate | Enumerates values for the speaking rate of prompts. |
| PromptVolume | Enumerates values for volume levels (loudness) in prompts. |
| SayAs | Enumerates the content types for the speaking of elements such as times, dates, and currency. |
| SynthesisMediaType | Enumerates the types of media files. |
| SynthesisTextFormat | Enumerates the types of text formats that may be used to construct a Prompt object. |
| SynthesizerState | Enumerates values for the state of the SpeechSynthesizer. |
| VoiceAge | Defines the values for the age of a synthesized voice. |
| VoiceGender | Defines the values for the gender of synthesized voices. |