Text-to-speech (TTS) for Windows Phone 8

05/20/2016

[ This article is for Windows Phone 8 developers. If you’re developing for Windows 10, see the latest documentation. ]

You can use the Windows.Phone.Speech.Synthesis API to generate synthesized speech, also known as text-to-speech (TTS), in your Windows Phone 8 app. For example, your app could prompt the user for input, read the contents of a message, present search results, and more.

Note

To use TTS, you must set the ID_CAP_SPEECH_RECOGNITION capability in the app manifest. If you don’t set this capability, your app might not work correctly. For more info, see App capabilities and hardware requirements for Windows Phone 8.

Basic TTS example

The simplest way to generate TTS is to pass a plain text string to the SpeechSynthesizer.SpeakTextAsync(String) method. The following code example shows how to do this in the handler for the Click event of a button.

private async void ButtonSimpleTTS_Click(object sender, RoutedEventArgs e)
{
  SpeechSynthesizer synth = new SpeechSynthesizer();
    
  await synth.SpeakTextAsync("You have a meeting with Peter in 15 minutes.");
}

Typically, you'll apply the await operator to the call to SpeakTextAsync, and use the async modifier on the method that contains the call. SpeakTextAsync is asynchronous: the call starts speech output and returns without blocking the calling thread, so the UI stays responsive while the synthesizer speaks. The await operator suspends execution of the containing method at that point and resumes it only after SpeakTextAsync completes.
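To make the await behavior concrete, the following sketch disables the button while the phone is speaking and re-enables it once speech has finished. It assumes that the handler lives in a page class named MainPage and that the sender is a Button; both are illustrative assumptions and not part of the example above.

```csharp
using System;                       // Brings the await support for WinRT async operations into scope.
using System.Windows;
using System.Windows.Controls;
using Microsoft.Phone.Controls;
using Windows.Phone.Speech.Synthesis;

// MainPage is a hypothetical page name used only for illustration.
public partial class MainPage : PhoneApplicationPage
{
    private async void ButtonSimpleTTS_Click(object sender, RoutedEventArgs e)
    {
        // Assumption: this handler is wired to a Button.
        var button = (Button)sender;
        button.IsEnabled = false;

        try
        {
            var synth = new SpeechSynthesizer();

            // The handler is suspended here; the UI thread is free to process input.
            await synth.SpeakTextAsync("You have a meeting with Peter in 15 minutes.");

            // Execution resumes on this line only after the text has been spoken.
        }
        finally
        {
            // Re-enable the button whether speech finished or failed.
            button.IsEnabled = true;
        }
    }
}
```

Because the handler resumes only after speech completes, the button stays disabled for the whole utterance, which prevents overlapping requests from repeated taps.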

Selecting a speaking voice

Windows Phone 8 includes speaking voices for a variety of languages. Each voice generates synthesized speech in a single language, as spoken in a specific country/region. After you create a SpeechSynthesizer object, you can specify the language of a voice to load. A SpeechSynthesizer instance can load any voice that is installed on the phone and use it to generate speech. If no language is specified, the API will load a voice that matches the language that the user selected in Settings/Speech on the phone.

The following example creates an instance of a speech synthesizer and sets its voice with the help of a LINQ query. The LINQ query searches through the VoiceInformation objects that describe each of the installed voices to find those whose Language property has a value of "fr-FR", which indicates the French language as spoken in France. The variable frenchVoices is declared as an IEnumerable<VoiceInformation> collection that holds the matching voices.

The argument to the SpeechSynthesizer.SetVoice(VoiceInformation) method selects one voice, by index, from the voices returned by the LINQ query. In this example the query returns two voices, because there are two French voices installed: one female and one male. To return only the female or only the male voice that speaks French, you can add an expression to the where clause that filters for gender.

// Declare the SpeechSynthesizer object at the class level.
SpeechSynthesizer synth;

// Handle the button click event.
private async void SpeakFrench_Click_1(object sender, RoutedEventArgs e)
{
  // Initialize the SpeechSynthesizer object.
  synth = new SpeechSynthesizer();

  // Query for the installed voices that speak French (France).
  IEnumerable<VoiceInformation> frenchVoices = from voice in InstalledVoices.All
                                               where voice.Language == "fr-FR"
                                               select voice;

  // Use the first voice that the query returns.
  // ElementAt(0) throws an exception if no fr-FR voice is installed.
  synth.SetVoice(frenchVoices.ElementAt(0));

  // Count in French.
  await synth.SpeakTextAsync("un, deux, trois, quatre");
}
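As a sketch of the gender filter mentioned earlier, the following variation adds a second condition to the where clause so that only female French voices match. The handler name SpeakFrenchFemale_Click and the page name MainPage are hypothetical. The null check is an added precaution for phones without a matching voice, in which case the synthesizer keeps its default voice.

```csharp
using System;                       // Brings the await support for WinRT async operations into scope.
using System.Linq;
using System.Windows;
using Microsoft.Phone.Controls;
using Windows.Phone.Speech.Synthesis;

// MainPage and SpeakFrenchFemale_Click are hypothetical names used for illustration.
public partial class MainPage : PhoneApplicationPage
{
    SpeechSynthesizer synth;

    private async void SpeakFrenchFemale_Click(object sender, RoutedEventArgs e)
    {
        synth = new SpeechSynthesizer();

        // Match only voices that speak French (France) and are female.
        var femaleFrenchVoices = from voice in InstalledVoices.All
                                 where voice.Language == "fr-FR"
                                    && voice.Gender == VoiceGender.Female
                                 select voice;

        // FirstOrDefault returns null instead of throwing when nothing matches.
        VoiceInformation selected = femaleFrenchVoices.FirstOrDefault();
        if (selected != null)
        {
            synth.SetVoice(selected);
        }

        // Count in French (spoken with the default voice if no match was found).
        await synth.SpeakTextAsync("un, deux, trois, quatre");
    }
}
```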

You can also select a speaking voice of a particular language using Speech Synthesis Markup Language (SSML). For more info, see Speech Synthesis Markup Language Reference.

Starting TTS

The speech synthesizer can speak either plain text or text that contains markup that conforms to the Speech Synthesis Markup Language (SSML) Version 1.0. You can either insert SSML markup inline in your code, or reference a standalone SSML document from your code. The speech synthesis API has three methods to initiate speech output, each of which speaks one of the accepted source formats.

SpeechSynthesizer.SpeakTextAsync(String). Speaks a string of plain text that you provide as an argument to the method. For more info, see the code example at the top of this topic.
SpeechSynthesizer.SpeakSsmlAsync(String). Speaks a string of text with SSML markup that you provide as an argument to the method.
SpeechSynthesizer.SpeakSsmlFromUriAsync(Uri). Speaks the contents of a standalone SSML document, which you reference in the argument to the method.

SpeakSsmlAsync code example

The following code example shows how to use the SpeakSsmlAsync method to speak a string of text with SSML markup.

// Speaks a string of text with SSML markup.
private async void SpeakSsml_Click(object sender, RoutedEventArgs e)
{
   SpeechSynthesizer synth = new SpeechSynthesizer();

   // Build an SSML prompt in a string.
   string ssmlPrompt = "<speak version=\"1.0\" ";
   ssmlPrompt += "xmlns=\"http://www.w3.org/2001/10/synthesis\" xml:lang=\"en-US\">";
   ssmlPrompt += "This voice speaks English. </speak>";

   // Speak the SSML prompt.
   await synth.SpeakSsmlAsync(ssmlPrompt);
}
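Concatenating SSML by hand works for fixed prompts, but text that contains characters such as & or < must be escaped, or the markup is no longer valid XML. One possible alternative, sketched below, builds the same speak element with LINQ to XML, which escapes the text automatically. SsmlBuilder and BuildPrompt are hypothetical names introduced for this sketch; they are not part of the speech API.

```csharp
using System.Xml.Linq;

// Hypothetical helper, not part of the Windows Phone speech API.
public static class SsmlBuilder
{
    // The SSML 1.0 namespace used in the example above.
    private static readonly XNamespace Ssml = "http://www.w3.org/2001/10/synthesis";

    // Wraps plain text in a <speak> element for the given language tag.
    public static string BuildPrompt(string text, string language)
    {
        var speak = new XElement(Ssml + "speak",
            new XAttribute("version", "1.0"),
            new XAttribute(XNamespace.Xml + "lang", language),
            text); // XElement escapes reserved XML characters in the text.

        return speak.ToString(SaveOptions.DisableFormatting);
    }
}
```

You could then pass the result to the synthesizer, for example await synth.SpeakSsmlAsync(SsmlBuilder.BuildPrompt("Tom & Jerry starts at 5.", "en-US")).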

SpeakSsmlFromUriAsync code example

The following code example shows how to use the SpeakSsmlFromUriAsync method to speak the contents of a standalone SSML document.

// Speaks the content of a standalone SSML file.
private async void SpeakSsmlFromFile_Click(object sender, RoutedEventArgs e)
{
   SpeechSynthesizer synth = new SpeechSynthesizer();

   // Set the path to the SSML-compliant XML file.
   string path = Package.Current.InstalledLocation.Path + "\\ChangeVoice.ssml";
   Uri changeVoice = new Uri(path, UriKind.Absolute);

   // Speak the SSML prompt.
   await synth.SpeakSsmlFromUriAsync(changeVoice);
}

To make sure that the SSML prompt file is correctly deployed, add the file to your solution using the Add > Existing Item command from Solution Explorer. Set the Build Action property for the file to Content, and set the Copy To Output Directory property to Copy if newer. Then use the path syntax given in the above code example to reference the SSML prompt.

TTS errors and exceptions

You might encounter TTS errors and exceptions when working with the feature. For more info about these errors and exceptions, see Handling errors in speech apps for Windows Phone 8.
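As a minimal sketch of defensive handling, the following helper wraps a call to SpeakTextAsync in a try/catch block and logs the HRESULT of any exception to the debug output. SafeSpeech and TrySpeakAsync are hypothetical names. The sketch makes no assumption about which error codes the synthesizer raises; see the topic referenced above for the specific values.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;
using Windows.Phone.Speech.Synthesis;

// Hypothetical helper, not part of the Windows Phone speech API.
public static class SafeSpeech
{
    // Returns true if the text was spoken, false if the synthesizer raised an error.
    public static async Task<bool> TrySpeakAsync(SpeechSynthesizer synth, string text)
    {
        try
        {
            await synth.SpeakTextAsync(text);
            return true;
        }
        catch (Exception ex)
        {
            // Log the HRESULT so that it can be compared against documented error codes.
            Debug.WriteLine("TTS failed (HRESULT 0x{0:X8}): {1}", ex.HResult, ex.Message);
            return false;
        }
    }
}
```

Catching the general Exception type keeps the sketch short. In a real app you would likely inspect the HRESULT and respond only to the errors that you can handle.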