What is speech translation?

In this article, you learn about the benefits and capabilities of the speech translation service, which enables real-time, multi-language speech-to-speech and speech-to-text translation of audio streams.

By using the Speech SDK or Speech CLI, you can give your applications, tools, and devices access to source transcriptions and translation outputs for the provided audio. Interim transcription and translation results are returned as speech is detected, and the final results can be converted into synthesized speech.

For a list of languages supported for speech translation, see Language and voice support.

Core features

  • Speech-to-text translation with recognition results.
  • Speech-to-speech translation.
  • Support for translation to multiple target languages.
  • Interim recognition and translation results.
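
As a rough illustration of the first and third features above, the following Python sketch translates a single utterance into several target languages with the Speech SDK (the azure-cognitiveservices-speech package). The function names translate_once and format_translations, the default language codes, and the choice of the default microphone as input are assumptions made for this example, not part of the service. You also need to supply your own Speech resource key and region.

```python
def format_translations(source_text, translations):
    """Render the recognized text and each translation as readable lines."""
    lines = [f"source: {source_text}"]
    for language in sorted(translations):
        lines.append(f"{language}: {translations[language]}")
    return lines


def translate_once(key, region, source_language="en-US", targets=("de", "fr")):
    """Recognize one utterance from the default microphone and translate it.

    key and region identify your Speech resource; the language codes are
    illustrative defaults chosen for this sketch.
    """
    # Imported here so format_translations can be used without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    config = speechsdk.translation.SpeechTranslationConfig(
        subscription=key, region=region)
    config.speech_recognition_language = source_language
    # Each added target language yields one entry in result.translations.
    for target in targets:
        config.add_target_language(target)

    audio = speechsdk.audio.AudioConfig(use_default_microphone=True)
    recognizer = speechsdk.translation.TranslationRecognizer(
        translation_config=config, audio_config=audio)

    # Single-shot recognition: returns after the first recognized utterance.
    result = recognizer.recognize_once()
    if result.reason == speechsdk.ResultReason.TranslatedSpeech:
        return format_translations(result.text, dict(result.translations))
    return [f"no translation (reason: {result.reason})"]
```

Calling translate_once with your key and region returns one line for the source transcription followed by one line per target language. If no speech is recognized or the request is canceled, it returns a single line stating the reason.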

Get started

As your first step, try the Speech translation quickstart. The speech translation service is available via the Speech SDK and the Speech CLI.

You'll find Speech SDK speech to text and translation samples on GitHub. These samples cover common scenarios, such as reading audio from a file or stream, continuous and single-shot recognition and translation, and working with custom models.
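
To give a feel for the continuous scenario mentioned above, here is a minimal Python sketch that reads audio from a file and records interim and final translation results as they arrive. It is not taken from the GitHub samples. The TranslationLog class, the translate_file function, the en-US source language, the German target, and the 30-second timeout are all hypothetical choices for this example.

```python
import threading


class TranslationLog:
    """Collects interim and final results in the order they arrive."""

    def __init__(self):
        self.entries = []

    def on_interim(self, text, translations):
        self.entries.append(("interim", text, dict(translations)))

    def on_final(self, text, translations):
        self.entries.append(("final", text, dict(translations)))

    def finals(self):
        return [entry for entry in self.entries if entry[0] == "final"]


def translate_file(key, region, wav_path, log, target="de", timeout_seconds=30):
    """Continuously translate a WAV file, feeding every result into log."""
    # Imported here so TranslationLog can be used without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    config = speechsdk.translation.SpeechTranslationConfig(
        subscription=key, region=region)
    config.speech_recognition_language = "en-US"  # assumed source language
    config.add_target_language(target)

    audio = speechsdk.audio.AudioConfig(filename=wav_path)
    recognizer = speechsdk.translation.TranslationRecognizer(
        translation_config=config, audio_config=audio)

    # Interim hypotheses arrive while speech is still being detected.
    recognizer.recognizing.connect(
        lambda evt: log.on_interim(evt.result.text, evt.result.translations))

    def handle_final(evt):
        # Only keep utterances that were actually translated.
        if evt.result.reason == speechsdk.ResultReason.TranslatedSpeech:
            log.on_final(evt.result.text, evt.result.translations)

    recognizer.recognized.connect(handle_final)

    # Stop waiting when the session ends or the request is canceled.
    done = threading.Event()
    recognizer.session_stopped.connect(lambda evt: done.set())
    recognizer.canceled.connect(lambda evt: done.set())

    recognizer.start_continuous_recognition()
    done.wait(timeout=timeout_seconds)
    recognizer.stop_continuous_recognition()
    return log
```

Keeping the result handling in a small class separate from the SDK wiring makes it possible to exercise the logging logic without a Speech resource. Single-shot recognition would replace the event handlers and the start/stop calls with a single recognize_once call.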

Next steps