Speech to text - Diarization Batch API does not work

Hi 11

Hi,

I am using STT API 3.0 (endpoint: https://southcentralus.api.cognitive.microsoft.com/speechtotext/v3.0/transcriptions)

I am using the Batch Transcription API since I am working with audio files.
I then retrieve the JSON results, specifically the "display" property from "combinedRecognizedPhrases".

My audio files contain interviews.
I set the property diarizationEnabled to true to distinguish between speakers, but it does not seem to work: nothing in the results tells me who is speaking.

Does it work with WAV files with 2 channels?
Do I need to do anything specific?

romungi-MSFT 45,311 Reputation points Microsoft Employee

2021-04-08T08:52:52.94+00:00

@Hi The API should support diarization and is capable of recognizing two speakers on mono-channel recordings. The setup is described in detail in this documentation. Hope this helps.
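
To make the setup more concrete, here is a minimal sketch in Python. It rests on assumptions that should be checked against the v3.0 documentation: that the request body takes "contentUrls", "locale", "displayName" and a "properties" object, that word-level timestamps need to be enabled alongside diarizationEnabled, and that the speaker label is reported per phrase under "recognizedPhrases" (field "speaker") rather than in "combinedRecognizedPhrases". The helper names and the sample values are hypothetical, and the sketch does not send any HTTP request.

```python
def build_transcription_request(content_url, locale="en-US"):
    """Build a JSON body for POST .../speechtotext/v3.0/transcriptions.

    Assumption: field names follow the v3.0 request format; the audio
    behind content_url is expected to be a mono recording.
    """
    return {
        "contentUrls": [content_url],
        "locale": locale,
        "displayName": "Interview with diarization",  # hypothetical label
        "properties": {
            "diarizationEnabled": True,
            # Assumption: diarization needs word-level timestamps in v3.0.
            "wordLevelTimestampsEnabled": True,
        },
    }


def speaker_turns(result):
    """Return (speaker, display text) pairs from a transcription result.

    Assumption: each entry of "recognizedPhrases" carries a "speaker"
    number and an "nBest" list whose first item holds the "display" text.
    """
    turns = []
    for phrase in result.get("recognizedPhrases", []):
        best = phrase.get("nBest") or [{}]
        turns.append((phrase.get("speaker"), best[0].get("display", "")))
    return turns


# Hypothetical result fragment, only to show the shape being read.
sample_result = {
    "combinedRecognizedPhrases": [{"channel": 0, "display": "Hello. Hi there."}],
    "recognizedPhrases": [
        {"channel": 0, "speaker": 1, "nBest": [{"display": "Hello."}]},
        {"channel": 0, "speaker": 2, "nBest": [{"display": "Hi there."}]},
    ],
}

for speaker, text in speaker_turns(sample_result):
    print(f"Speaker {speaker}: {text}")
```

If the "speaker" field is missing from your results, it may be worth checking whether the WAV file is stereo and converting it to mono before submitting it.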
