Conversation Transcription vs Speech-to-Text

Allen Hansen 1 Reputation point
2022-12-11T14:29:03.723+00:00

My team and I are looking for a service to transcribe audio files with diarization. We started using the Conversation Transcription speech service, but came across the Speech-to-Text REST API while researching how best to architect our product, and we have a few questions.

  1. Diarization is supported in the Conversation Transcription service with the DifferentiateGuestSpeakers property enabled and in the Speech-to-Text v3.1 REST API with the diarizationEnabled property enabled, but these versions are both in preview. Does anyone know when each of these services will no longer be in preview?
  2. The Speech-to-Text REST API appears to be more feature-rich. For example, it supports passing a Blob Storage audio file URL instead of uploading the audio manually, word-level timestamps, and webhooks instead of polling for updates. The Conversation Transcription service doesn't appear to have any of these features. Does anyone know if the Conversation Transcription service uses the Speech-to-Text REST API behind the scenes, and if there is a way to use these extra features?
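
For context, here is a minimal sketch of the kind of Speech-to-Text v3.1 batch transcription request we have in mind. It only assembles the request and does not send it. Only `diarizationEnabled` is taken from the description above; the endpoint path, the `Ocp-Apim-Subscription-Key` header, the `contentUrls` field, and the `wordLevelTimestampsEnabled` property are our reading of the REST API and should be treated as assumptions, and the region, key, display name, and Blob URL are placeholders.

```python
import json


def build_transcription_request(region, subscription_key, blob_url, locale="en-US"):
    """Assemble (but do not send) a batch transcription request.

    Assumed, not confirmed here: the v3.1 endpoint path, the header name,
    and every body field except diarizationEnabled.
    """
    url = (
        f"https://{region}.api.cognitive.microsoft.com"
        "/speechtotext/v3.1/transcriptions"
    )
    headers = {
        "Ocp-Apim-Subscription-Key": subscription_key,
        "Content-Type": "application/json",
    }
    body = {
        "displayName": "diarization-test",  # placeholder name
        "locale": locale,
        # Audio is referenced by URL (e.g. a Blob Storage SAS URL), not uploaded.
        "contentUrls": [blob_url],
        "properties": {
            "diarizationEnabled": True,          # property named in the question
            "wordLevelTimestampsEnabled": True,  # assumed property name
        },
    }
    return url, headers, json.dumps(body)


if __name__ == "__main__":
    url, headers, payload = build_transcription_request(
        region="westus",                       # placeholder region
        subscription_key="<your-speech-key>",  # placeholder key
        blob_url="https://example.blob.core.windows.net/audio/meeting.wav",
    )
    print(url)
    print(payload)
```

The payload could then be sent with any HTTP client as a POST to the returned URL. Webhook registration is a separate call and is not shown here.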

Thank you~

Azure AI Speech
An Azure service that integrates speech processing into apps and services.
Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

1 answer

  1. Allen Hansen 1 Reputation point
    2022-12-19T01:50:52.357+00:00

    @romungi-MSFT Apologies for the delayed response. Yes, the information you have provided was helpful. Thank you~
