Types of speech API services

You can use the Azure Cognitive Services Speech service to perform spoken language transformations, including speech-to-text, text-to-speech, speech translation, and speaker recognition.

Note

Use Azure Cognitive Service for Language if you want to gather insights on terms or phrases or get detailed contextual analysis of spoken or written language.

Services

  • Speech-to-text can convert audio streams to text in real time or in batch.
  • Text-to-speech enables applications to convert text to human-like speech.
  • Speech translation provides multi-language speech-to-speech and speech-to-text translation of audio streams.

How to choose a speech service

This flow chart can help you choose the speech service that suits your needs:

Diagram that shows how to choose a speech service.

The left side of the diagram illustrates audio-to-audio or audio-to-text processes.

  • Speech-to-text converts speech from an audio source to text.
  • Speech-to-speech translation, which is part of Speech translation, translates speech in one language to speech in another language.

The right side of the diagram illustrates text-to-audio processes.

  • Text-to-speech is used to generate spoken audio from a text source.
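The decision logic described above can be sketched as a small function. This is only an interpretation of the diagram as the text describes it. The Modality enum, the choose_speech_service function, and the translate flag are hypothetical names introduced for illustration.

```python
from enum import Enum


class Modality(Enum):
    AUDIO = "audio"
    TEXT = "text"


def choose_speech_service(
    source: Modality, target: Modality, translate: bool = False
) -> str:
    """Return the service name suggested for a source/target combination."""
    if source is Modality.AUDIO:
        # Left side of the diagram: audio-to-audio or audio-to-text.
        if translate:
            # Covers both speech-to-speech and speech-to-text translation.
            return "Speech translation"
        if target is Modality.TEXT:
            return "Speech-to-text"
    elif target is Modality.AUDIO and not translate:
        # Right side of the diagram: text-to-audio.
        return "Text-to-speech"
    # Anything else (for example, text-to-text) is outside this diagram.
    raise ValueError(
        f"No speech service covers {source.value}-to-{target.value}"
        f"{' with translation' if translate else ''}"
    )
```

For example, choose_speech_service(Modality.AUDIO, Modality.AUDIO, translate=True) selects Speech translation, while a text-to-text request raises an error because it isn't a speech scenario.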

Common use cases

The following table recommends services for some common use cases.

| Use case | Service to use |
| --- | --- |
| Provide closed captions for recorded or live videos | Speech-to-text |
| Create a transcript of a phone call or meeting | Speech-to-text |
| Implement automated note dictation | Speech-to-text |
| Determine intended user input for further processing | Speech-to-text |
| Generate spoken responses to user input | Text-to-speech |
| Create voice menus for telephone systems | Text-to-speech |
| Read email or text messages aloud in hands-free scenarios | Text-to-speech |
| Broadcast announcements in public locations, like railway stations or airports | Text-to-speech |
| Produce real-time closed captioning for a speech or simultaneous two-way translation of a spoken conversation | Speech translation |
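If you want to use these recommendations in code, one option is a simple lookup structure. The following sketch encodes an abridged subset of the rows above. The USE_CASES dictionary, its shortened keys, and the use_cases_by_service helper are hypothetical and exist only for this example.

```python
from collections import defaultdict

# Abridged subset of the table; keys are shortened use-case descriptions.
USE_CASES = {
    "closed captions for recorded or live videos": "Speech-to-text",
    "transcript of a phone call or meeting": "Speech-to-text",
    "automated note dictation": "Speech-to-text",
    "spoken responses to user input": "Text-to-speech",
    "voice menus for telephone systems": "Text-to-speech",
    "announcements in public locations": "Text-to-speech",
}


def use_cases_by_service(table: dict[str, str]) -> dict[str, list[str]]:
    """Invert the use-case table so each service lists its use cases."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for use_case, service in table.items():
        grouped[service].append(use_case)
    return dict(grouped)
```

Grouping by service makes it easy to see at a glance which scenarios a single Speech resource feature would cover.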

Contributors

This article is maintained by Microsoft.
