SpeechSynthesizer Class
Definition
Important
Some information relates to prerelease product that may be substantially modified before it’s released. Microsoft makes no warranties, express or implied, with respect to the information provided here.
Provides access to the functionality of an installed speech synthesis engine (voice) for text-to-speech (TTS) services.
public ref class SpeechSynthesizer sealed : IClosable
/// [Windows.Foundation.Metadata.Activatable(65536, Windows.Foundation.UniversalApiContract)]
/// [Windows.Foundation.Metadata.ContractVersion(Windows.Foundation.UniversalApiContract, 65536)]
/// [Windows.Foundation.Metadata.MarshalingBehavior(Windows.Foundation.Metadata.MarshalingType.Agile)]
class SpeechSynthesizer final : IClosable
/// [Windows.Foundation.Metadata.ContractVersion(Windows.Foundation.UniversalApiContract, 65536)]
/// [Windows.Foundation.Metadata.MarshalingBehavior(Windows.Foundation.Metadata.MarshalingType.Agile)]
/// [Windows.Foundation.Metadata.Activatable(65536, "Windows.Foundation.UniversalApiContract")]
class SpeechSynthesizer final : IClosable
[Windows.Foundation.Metadata.Activatable(65536, typeof(Windows.Foundation.UniversalApiContract))]
[Windows.Foundation.Metadata.ContractVersion(typeof(Windows.Foundation.UniversalApiContract), 65536)]
[Windows.Foundation.Metadata.MarshalingBehavior(Windows.Foundation.Metadata.MarshalingType.Agile)]
public sealed class SpeechSynthesizer : System.IDisposable
[Windows.Foundation.Metadata.ContractVersion(typeof(Windows.Foundation.UniversalApiContract), 65536)]
[Windows.Foundation.Metadata.MarshalingBehavior(Windows.Foundation.Metadata.MarshalingType.Agile)]
[Windows.Foundation.Metadata.Activatable(65536, "Windows.Foundation.UniversalApiContract")]
public sealed class SpeechSynthesizer : System.IDisposable
function SpeechSynthesizer()
Public NotInheritable Class SpeechSynthesizer
Implements IDisposable
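- Inheritance: Object → SpeechSynthesizer
- Attributes: ActivatableAttribute, ContractVersionAttribute, MarshalingBehaviorAttribute
- Implements: IClosable (projected as IDisposable in C# and Visual Basic)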
Windows requirements
Device family | Windows 10 (introduced in 10.0.10240.0) |
API contract | Windows.Foundation.UniversalApiContract (introduced in v1.0) |
Examples
The following example shows how to generate a speech audio stream from a basic text string.
// The media object for controlling and playing audio.
MediaElement mediaElement = this.media;
// The object for controlling the speech synthesis engine (voice).
var synth = new Windows.Media.SpeechSynthesis.SpeechSynthesizer();
// Generate the audio stream from plain text.
SpeechSynthesisStream stream = await synth.SynthesizeTextToStreamAsync("Hello World");
// Send the stream to the media object.
mediaElement.SetSource(stream, stream.ContentType);
mediaElement.Play();
// The object for controlling the speech synthesis engine (voice).
synth = ref new SpeechSynthesizer();
// The media object for controlling and playing audio.
media = ref new MediaElement();
// The string to speak.
String^ text = "Hello World";
// Generate the audio stream from plain text.
task<SpeechSynthesisStream ^> speakTask = create_task(synth->SynthesizeTextToStreamAsync(text));
speakTask.then([this, text](SpeechSynthesisStream ^speechStream)
{
// Send the stream to the media object.
// media is the MediaElement XAML object.
media->SetSource(speechStream, speechStream->ContentType);
media->AutoPlay = true;
media->Play();
});
This example shows how to generate a speech audio stream from an SSML string, which includes some modulation elements that control the pitch, speaking rate, and volume of the speech output.
// The string to speak with SSML customizations.
string Ssml =
@"<speak version='1.0' " +
"xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='en-US'>" +
"Hello <prosody contour='(0%,+80Hz) (10%,+80%) (40%,+80Hz)'>World</prosody> " +
"<break time='500ms'/>" +
"Goodbye <prosody rate='slow' contour='(0%,+20Hz) (10%,+30%) (40%,+10Hz)'>World</prosody>" +
"</speak>";
// The media object for controlling and playing audio.
MediaElement mediaElement = this.media;
// The object for controlling the speech synthesis engine (voice).
var synth = new Windows.Media.SpeechSynthesis.SpeechSynthesizer();
// Generate the audio stream from SSML.
SpeechSynthesisStream stream = await synth.SynthesizeSsmlToStreamAsync(Ssml);
// Send the stream to the media object.
mediaElement.SetSource(stream, stream.ContentType);
mediaElement.Play();
// The object for controlling the speech synthesis engine (voice).
synth = ref new SpeechSynthesizer();
// The media object for controlling and playing audio.
media = ref new MediaElement();
// The SSML string to speak.
String^ ssml =
"<speak version='1.0' "
"xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='en-US'>"
"Hello <prosody contour='(0%,+80Hz) (10%,+80%) (40%,+80Hz)'>World</prosody>"
"<break time='500ms' /> "
"Goodbye <prosody rate='slow' contour='(0%,+20Hz) (10%,+30%) (40%,+10Hz)'>World</prosody>"
"</speak>";
// Generate the audio stream from SSML.
task<SpeechSynthesisStream ^> speakTask = create_task(synth->SynthesizeSsmlToStreamAsync(ssml));
speakTask.then([this, ssml](SpeechSynthesisStream ^speechStream)
{
// Send the stream to the media object.
// media is the MediaElement XAML object.
media->SetSource(speechStream, speechStream->ContentType);
media->AutoPlay = true;
media->Play();
});
Remarks
Only Microsoft-signed voices installed on the system can be used to generate speech.
Windows includes various Microsoft-signed voices that can be used for a number of languages. Each voice generates synthesized speech in a single language, as spoken in a specific country/region.
By default, a new SpeechSynthesizer object uses the current system voice (read the DefaultVoice property to find out which voice that is).
To use any of the other speech synthesis (text-to-speech) voices installed on the user's system, set the Voice property (to find out which voices are installed, read the AllVoices property).
If you don't specify a language, the voice that most closely corresponds to the language selected in the Language control panel is loaded.
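The voice selection described above might look roughly like the following C# sketch. The class name `VoiceSelectionSketch`, the method name `CreateWithPreferredVoice`, and the choice of a female en-US voice are illustrative assumptions, not part of the API. The sketch falls back to the default voice when no installed voice matches.

```csharp
using System.Linq;
using Windows.Media.SpeechSynthesis;

// Hypothetical helper class, for illustration only.
static class VoiceSelectionSketch
{
    public static SpeechSynthesizer CreateWithPreferredVoice()
    {
        var synth = new SpeechSynthesizer();

        // AllVoices is a static property listing the installed voices.
        // The language tag and gender used here are assumed preferences.
        VoiceInformation preferred = SpeechSynthesizer.AllVoices
            .FirstOrDefault(v => v.Language == "en-US" &&
                                 v.Gender == VoiceGender.Female);

        // Fall back to the system default voice when nothing matches.
        synth.Voice = preferred ?? SpeechSynthesizer.DefaultVoice;
        return synth;
    }
}
```

Because each voice speaks a single language, matching on the `Language` tag of a `VoiceInformation` object is one reasonable way to pick a voice; an app could equally match on `DisplayName` or `Id`.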
Use a SpeechSynthesizer object to:
- Generate speech from plain text using SynthesizeTextToStreamAsync, or from Speech Synthesis Markup Language (SSML) Version 1.1 using SynthesizeSsmlToStreamAsync.
- Play the generated audio stream through a MediaElement object, which lets you manage all media playback.
- Control the speech output with the various SpeechSynthesizerOptions settings exposed through SpeechSynthesizer.Options.
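As a rough illustration of the last item, the sketch below adjusts a few settings through `SpeechSynthesizer.Options` before synthesis. The class and method names are hypothetical, and the specific values are arbitrary. It assumes the target SDK exposes `SpeakingRate`, `AudioVolume`, and `IncludeWordBoundaryMetadata` on SpeechSynthesizerOptions; some of these were added after the Options property itself, so check availability for the Windows versions you target.

```csharp
using Windows.Media.SpeechSynthesis;

// Hypothetical helper class, for illustration only.
static class SpeechOptionsSketch
{
    public static void ApplyQuietSlowSpeech(SpeechSynthesizer synth)
    {
        // Options returns the SpeechSynthesizerOptions for this instance.
        SpeechSynthesizerOptions options = synth.Options;

        // Slightly slower than the normal rate (assumed default is 1.0).
        options.SpeakingRate = 0.8;

        // Half volume (assumed range is 0.0 to 1.0).
        options.AudioVolume = 0.5;

        // Ask the engine to add word boundary metadata to the stream.
        options.IncludeWordBoundaryMetadata = true;
    }
}
```

Call the helper before SynthesizeTextToStreamAsync or SynthesizeSsmlToStreamAsync so that the settings are in place when the audio stream is generated.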
Version history
Windows version | SDK version | Value added |
---|---|---|
1703 | 15063 | Options |
1709 | 16299 | TrySetDefaultVoiceAsync |
Constructors
SpeechSynthesizer() |
Initializes a new instance of a SpeechSynthesizer object. |
Properties
AllVoices |
Gets a collection of all installed speech synthesis engines (voices). |
DefaultVoice |
Gets the default speech synthesis engine (voice). |
Options |
Gets a reference to the collection of options that can be set on the SpeechSynthesizer object. |
Voice |
Gets or sets the speech synthesis engine (voice). |
Methods
Close() |
Closes the SpeechSynthesizer and releases system resources. |
Dispose() |
Performs application-defined tasks associated with freeing, releasing, or resetting unmanaged resources. |
SynthesizeSsmlToStreamAsync(String) |
Asynchronously generates and controls speech output from a Speech Synthesis Markup Language (SSML) Version 1.1 string. |
SynthesizeTextToStreamAsync(String) |
Asynchronously generates speech output from a string. |
TrySetDefaultVoiceAsync(VoiceInformation) |
Asynchronously attempts to set the voice used for speech synthesis on an IoT device. Note: This method is available only in Embedded mode. |