Can MultiChannel audios be Diarized using Azure STT Batch Rest APIs?

Question

Indira Priyadarshini 60

Hi. I am using the Azure STT Batch Transcription REST API on a multichannel audio file. When I set the property Diarize_Enabled = True, the job fails with an error saying the audio is not in the right format. Is this expected? Is there any other way to diarize multichannel audio?

Indira Priyadarshini 60 Reputation points

2024-02-22T11:10:58.39+00:00

Below is the error message I get when setting Diarize_Enabled = True and wordLevelTimeStamps = True:

"status": "Failed",
"errorMessage": "Reason BadChannelConfiguration, Details: Stereo audio is not supported when using diarization. (JobId 73d94bcd-5d9e-420e-a656-897d465871fd).",
"errorKind": "BadChannelConfiguration"

Answer 1

romungi-MSFT 48,911 Microsoft Employee Moderator

@Indira Priyadarshini As the error message indicates, the audio file you submitted is stereo: "Details: Stereo audio is not supported when using diarization". Stereo input is not a supported configuration for diarization. Please see this page for details, which confirms the same.

If this answers your query, please click Accept Answer and Yes for "Was this answer helpful". If you have any further queries, do let us know.

Indira Priyadarshini 60 Reputation points

2024-02-23T06:05:25.9533333+00:00

Does this mean there is no way to diarize stereo recordings with the Azure STT Batch Transcription REST API? Is there any other workaround?

romungi-MSFT 48,911 Reputation points Microsoft Employee Moderator

2024-02-23T07:07:10.3333333+00:00

Stereo audio is supported with the REST API for short audio. See this sample from GitHub for reference.

Indira Priyadarshini 60 Reputation points

2024-02-26T07:45:36.95+00:00

Hi @romungi-MSFT. By "short audio", do you mean there is a limit on the size of the audio? Is there any documentation on this? Meanwhile, I will try implementing the solution from the GitHub sample. Thanks!
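
Since the BadChannelConfiguration error quoted in this thread says that stereo audio is rejected when diarization is enabled, one possible workaround (not confirmed by anyone in this thread) is to downmix the recording to a single channel before submitting it. The sketch below uses only the Python standard library and assumes the input is an uncompressed 16-bit PCM stereo WAV file. The function name downmix_stereo_to_mono and the file paths are hypothetical.

```python
import array
import sys
import wave


def downmix_stereo_to_mono(src_path: str, dst_path: str) -> None:
    """Average both channels of a 16-bit PCM stereo WAV into one mono channel.

    Assumption: the input is uncompressed 16-bit PCM with exactly two channels.
    """
    with wave.open(src_path, "rb") as src:
        if src.getnchannels() != 2 or src.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM stereo input")
        frame_rate = src.getframerate()
        samples = array.array("h")
        samples.frombytes(src.readframes(src.getnframes()))

    # WAV data is little-endian; swap on big-endian hosts before doing math.
    if sys.byteorder == "big":
        samples.byteswap()

    # Samples are interleaved as L, R, L, R, ...; average each pair.
    mono = array.array(
        "h",
        ((samples[i] + samples[i + 1]) // 2 for i in range(0, len(samples), 2)),
    )

    # Restore little-endian byte order before writing.
    if sys.byteorder == "big":
        mono.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(frame_rate)
        dst.writeframes(mono.tobytes())
```

A call such as downmix_stereo_to_mono("call_stereo.wav", "call_mono.wav") would produce a mono file that could then be uploaded in place of the original. Downmixing discards the channel separation, so if each speaker is already recorded on a separate channel, transcribing the channels without diarization may be the better fit.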

Answer 2

Sedat SALMAN 14,180 MVP

Yes, you can diarize multichannel audio using Azure STT Batch REST APIs. Here's how:

Ensure your audio is in a supported format (WAV, MP3, etc.).
Use the Batch Transcription API endpoint, and in your JSON request:
- Set diarizationEnabled to true.
- Set wordLevelTimestampsEnabled to true.
- Provide the URLs of your audio files in contentUrls.

The output will include speaker labels (e.g., Speaker_1) and their corresponding timestamps. Here is an example configuration:

{
  "contentUrls": [
    "https://mystorage.blob.core.windows.net/audio/meeting_multichannel.wav"
  ],
  "properties": {
    "diarizationEnabled": true,
    "wordLevelTimestampsEnabled": true,
    "punctuationMode": "DictatedAndAutomatic",  
    "profanityFilterMode": "Masked"  
  },
  "locale": "en-US",
  "displayName": "Multichannel audio transcription with diarization"
}
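
As a minimal sketch, a body like the one above could be assembled and wrapped in an HTTP request as follows. The v3.1 endpoint path, the region value, and the helper name build_transcription_request are assumptions for illustration and are not taken from this thread. The helper only builds the request; sending it is left to the caller. Note that the properties are emitted as real JSON booleans, and that per the error reported in this thread the service rejects stereo input when diarizationEnabled is true.

```python
import json
import urllib.request


def build_transcription_request(region, subscription_key, content_urls,
                                locale="en-US", diarize=True):
    """Build (but do not send) a POST request that creates a batch transcription.

    Assumption: the v3.1 endpoint layout below matches your Speech resource.
    """
    body = {
        "contentUrls": list(content_urls),
        "properties": {
            # Python booleans serialize to JSON true/false, not the string "True".
            "diarizationEnabled": bool(diarize),
            "wordLevelTimestampsEnabled": True,
            "punctuationMode": "DictatedAndAutomatic",
            "profanityFilterMode": "Masked",
        },
        "locale": locale,
        "displayName": "Transcription with diarization",
    }
    url = (
        f"https://{region}.api.cognitive.microsoft.com"
        "/speechtotext/v3.1/transcriptions"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": subscription_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The returned object can be passed to urllib.request.urlopen once a real key, region, and audio URL are supplied.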

Indira Priyadarshini 60 Reputation points

2024-02-22T10:29:22.6866667+00:00

Hi, my audio is multichannel. It works well with the Batch Transcription REST API for features such as punctuation, multichannel, and profanity filtering, but it fails when I try diarization. I tried your suggested body parameters, enabling diarization and word-level timestamps as shown below. However, I get the error shown in the attachment.

"punctuationMode": "DictatedAndAutomatic",
"profanityFilterMode": "Removed",
"wordLevelTimestampsEnabled": "True",
"displayFormWordLevelTimestampsEnabled": "True",
"diarizationEnabled": "True"
