Conversation Transcription vs Speech-to-Text

Allen Hansen 1 Reputation point
2022-12-11T14:29:03.723+00:00

My team and I are looking for a service to transcribe audio files with diarization. We started using the Conversation Transcription speech service but came across the Speech-to-Text REST API while researching how best to architect our product, and we have a few questions.

  1. Diarization is supported in the Conversation Transcription service with the DifferentiateGuestSpeakers property enabled and in the Speech-to-Text v3.1 REST API with the diarizationEnabled property enabled, but these versions are both in preview. Does anyone know when each of these services will no longer be in preview?
  2. The Speech-to-Text REST API appears to be more feature-rich. For example, it supports passing a Blob Storage audio file URL as opposed to uploading the audio manually, word-level timestamps, and webhook support as opposed to polling for updates. The Conversation Transcription service doesn't appear to have any of these features. Does anyone know if the Conversation Transcription service utilizes the Speech-to-Text REST API behind the scenes, and if there is a way to utilize these extra features?
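
For context, the REST features mentioned in item 2 could be requested roughly as follows. This is only a minimal sketch, assuming the v3.1 batch transcription endpoint and the `diarizationEnabled` and `wordLevelTimestampsEnabled` properties. The region, key, display name, and Blob Storage URL are placeholders, and `build_transcription_request` is a hypothetical helper name, not part of any SDK. The request is built but not sent.

```python
import json
import urllib.request


def build_transcription_request(region, key, audio_url, locale="en-US"):
    """Build (but do not send) a batch transcription request.

    Assumes the Speech-to-Text v3.1 REST endpoint; all argument values
    are supplied by the caller and are placeholders in this sketch.
    """
    body = {
        "displayName": "diarization-sample",  # placeholder name
        "locale": locale,
        # Audio is referenced by URL (e.g. a Blob Storage URL) rather than uploaded.
        "contentUrls": [audio_url],
        "properties": {
            "diarizationEnabled": True,          # speaker separation (preview in v3.1)
            "wordLevelTimestampsEnabled": True,  # per-word timing in the result
        },
    }
    url = (
        f"https://{region}.api.cognitive.microsoft.com"
        "/speechtotext/v3.1/transcriptions"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Example with placeholder values; call urllib.request.urlopen(req) to send it.
req = build_transcription_request(
    "westus", "<subscription-key>", "https://example.blob.core.windows.net/audio/call.wav"
)
```

Webhook registration is a separate call and is not shown here; without it, the transcription status would need to be polled.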

Thank you~

Azure AI Speech

An Azure service that integrates speech processing into apps and services.

Foundry Tools

Formerly known as Azure AI Services or Azure Cognitive Services, this is a unified collection of prebuilt AI capabilities within the Microsoft Foundry platform.


1 answer

  1. Allen Hansen 1 Reputation point
    2022-12-19T01:50:52.357+00:00

    @Rohit Mungi Apologies for the delayed response. Yes, the information you have provided was helpful. Thank you~
