Text-to-Speech

Article
11/29/2023

This article describes how to use the .NET Multi-platform App UI (.NET MAUI) ITextToSpeech interface. This interface lets an application use the device's built-in text-to-speech engines to speak text aloud. You can also use it to query the available languages.

The default implementation of the ITextToSpeech interface is available through the TextToSpeech.Default property. Both the ITextToSpeech interface and TextToSpeech class are contained in the Microsoft.Maui.Media namespace.

Get started

To access text-to-speech functionality, the following platform-specific setup is required.

If your project's Target Android version is set to Android 11 (R, API 30) or higher, you must update your Android manifest with an intent filter for the text-to-speech (TTS) engine. For more information about intents, see Android's documentation on Intents and Intent Filters.

In the Platforms/Android/AndroidManifest.xml file, add the following queries/intent nodes to the manifest node:

<queries>
  <intent>
    <action android:name="android.intent.action.TTS_SERVICE" />
  </intent>
</queries>

Using Text-to-Speech

Text-to-speech works by calling the SpeakAsync method with the text to speak, as the following code example demonstrates:

public async void Speak() =>
    await TextToSpeech.Default.SpeakAsync("Hello World");

The method accepts an optional CancellationToken, which you can use to stop the utterance after it starts.

CancellationTokenSource cts;

public async Task SpeakNowDefaultSettingsAsync()
{
    cts = new CancellationTokenSource();
    await TextToSpeech.Default.SpeakAsync("Hello World", cancelToken: cts.Token);

    // Execution resumes here once the utterance finishes; the calling thread isn't blocked while waiting.
}

// Cancel speech if a cancellation token source exists and cancellation hasn't already been requested.
public void CancelSpeech()
{
    if (cts?.IsCancellationRequested ?? true)
        return;
    
    cts.Cancel();
}

Text-to-speech automatically queues speech requests made from the same thread.

bool isBusy = false;

public void SpeakMultiple()
{
    isBusy = true;

    Task.WhenAll(
        TextToSpeech.Default.SpeakAsync("Hello World 1"),
        TextToSpeech.Default.SpeakAsync("Hello World 2"),
        TextToSpeech.Default.SpeakAsync("Hello World 3"))
        .ContinueWith((t) => { isBusy = false; }, TaskScheduler.FromCurrentSynchronizationContext());
}

Settings

To control the volume, pitch, and locale of the voice, use the SpeechOptions class. Pass an instance of that class to the SpeakAsync(String, SpeechOptions, CancellationToken) method. The GetLocalesAsync() method retrieves a collection of the locales provided by the operating system.

public async void SpeakSettings()
{
    IEnumerable<Locale> locales = await TextToSpeech.Default.GetLocalesAsync();

    SpeechOptions options = new SpeechOptions()
    {
        Pitch = 1.5f,   // 0.0 - 2.0
        Volume = 0.75f, // 0.0 - 1.0
        Locale = locales.FirstOrDefault()
    };

    await TextToSpeech.Default.SpeakAsync("How nice to meet you!", options);
}
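The first locale returned may not match the language of the text you want spoken. The following is a minimal sketch of choosing a locale by language instead. SpeakInLanguageAsync and its languageCode parameter are hypothetical names, not part of .NET MAUI. The sketch assumes that Locale.Language can hold either a bare code such as "en" or a fuller tag such as "en-US" depending on the platform, so it matches by prefix; verify this on the platforms you target.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Maui.Media;

public static class SpeechLocaleExample
{
    // Hypothetical helper: speaks the text with the first installed locale whose
    // language starts with languageCode, or with the platform default voice
    // when no locale matches.
    public static async Task SpeakInLanguageAsync(string text, string languageCode)
    {
        var locales = await TextToSpeech.Default.GetLocalesAsync();

        // Assumption: Language may be "en" or "en-US" depending on the platform,
        // so compare by prefix and ignore case.
        var match = locales.FirstOrDefault(l =>
            l.Language?.StartsWith(languageCode, StringComparison.OrdinalIgnoreCase) == true);

        // A null Locale leaves the choice of voice to the platform.
        var options = new SpeechOptions { Locale = match };

        await TextToSpeech.Default.SpeakAsync(text, options);
    }
}
```

For example, calling SpeakInLanguageAsync("Hello World", "en") would use an English voice if one is installed.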

The following are supported values for these parameters:

Parameter	Minimum	Maximum
`Pitch`	0	2.0
`Volume`	0	1.0
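When pitch or volume comes from user input, such as a slider, it can help to keep the values inside the ranges in the table before assigning them to SpeechOptions. The following is a small sketch of one way to do that; SpeechRange and its members are hypothetical names, not part of .NET MAUI.

```csharp
using System;

// Hypothetical helper: keeps pitch and volume inside the supported ranges.
public static class SpeechRange
{
    // Limits taken from the table of supported values.
    public const float MinPitch = 0.0f;
    public const float MaxPitch = 2.0f;
    public const float MinVolume = 0.0f;
    public const float MaxVolume = 1.0f;

    // Returns the pitch unchanged if it is in range, otherwise the nearest limit.
    public static float ClampPitch(float pitch) =>
        Math.Clamp(pitch, MinPitch, MaxPitch);

    // Returns the volume unchanged if it is in range, otherwise the nearest limit.
    public static float ClampVolume(float volume) =>
        Math.Clamp(volume, MinVolume, MaxVolume);
}
```

You could then write Pitch = SpeechRange.ClampPitch(userPitch) when building a SpeechOptions instance, where userPitch stands for whatever value your UI supplies.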

Limitations

Utterance queueing isn't guaranteed when SpeakAsync is called from multiple threads.
Background audio playback isn't officially supported.
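If your app might request speech from more than one thread, one possible workaround for the queueing limitation is to serialize the calls yourself. The following is a sketch only: SerializedSpeaker is a hypothetical wrapper, not part of .NET MAUI, and it takes the speak operation as a delegate so that it doesn't depend on any platform code. It prevents utterances from overlapping, but SemaphoreSlim doesn't promise strict first-come, first-served ordering among waiting callers.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical wrapper: allows only one utterance to run at a time,
// regardless of which thread requests it.
public sealed class SerializedSpeaker
{
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);
    private readonly Func<string, CancellationToken, Task> _speak;

    public SerializedSpeaker(Func<string, CancellationToken, Task> speak) =>
        _speak = speak ?? throw new ArgumentNullException(nameof(speak));

    public async Task SpeakAsync(string text, CancellationToken cancelToken = default)
    {
        // Wait until any utterance in progress has finished.
        await _gate.WaitAsync(cancelToken);
        try
        {
            await _speak(text, cancelToken);
        }
        finally
        {
            // Always release the gate, even if speaking fails or is canceled.
            _gate.Release();
        }
    }
}
```

In a .NET MAUI app, you could create a single shared instance with new SerializedSpeaker((text, token) => TextToSpeech.Default.SpeakAsync(text, cancelToken: token)) and route all speech requests through its SpeakAsync method.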
