Can multichannel audio be diarized using the Azure STT Batch REST API?

Indira Priyadarshini 60 Reputation points
2024-02-22T10:01:19.8766667+00:00

Hi. I am using the Azure STT Batch Transcription REST API on a multichannel audio file. When I set the property Diarize_Enabled = True, it returns an error saying the audio is not in the right format. Is this expected? Is there any other way to diarize multichannel audio?

Azure AI Speech
An Azure service that integrates speech processing into apps and services.
Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

Accepted answer
  1. romungi-MSFT 48,911 Reputation points Microsoft Employee Moderator
    2024-02-23T05:59:46.0466667+00:00

    @Indira Priyadarshini As the error message indicates, the audio file you used is stereo. Details: "Stereo audio is not supported when using diarization". This is not a supported configuration for diarization. Please see this page for details, which confirms the same.

    If this answers your query, please click Accept Answer and Yes for "Was this answer helpful". If you have any further queries, do let us know.

    1 person found this answer helpful.
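
One possible workaround, which the accepted answer does not itself propose, is to downmix the stereo file to mono before submitting it, so that the diarization request no longer sees stereo audio. The sketch below uses only the Python standard library and assumes 16-bit PCM WAV input; `downmix_to_mono` and the file names are hypothetical, and whether diarization quality on the downmixed audio is acceptable has to be checked on your own data.

```python
import array
import sys
import wave


def downmix_to_mono(src_path, dst_path):
    """Average all channels of a 16-bit PCM WAV file into a single channel."""
    with wave.open(src_path, "rb") as src:
        n_channels = src.getnchannels()
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    if sample_width != 2:
        # Assumption of this sketch: only 16-bit samples are handled.
        raise ValueError("this sketch only handles 16-bit PCM WAV files")

    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian

    # Samples are interleaved (L, R, L, R, ...); average each frame.
    mono = array.array(
        "h",
        (
            sum(samples[i:i + n_channels]) // n_channels
            for i in range(0, len(samples), n_channels)
        ),
    )
    if sys.byteorder == "big":
        mono.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(sample_width)
        dst.setframerate(frame_rate)
        dst.writeframes(mono.tobytes())
```

For example, `downmix_to_mono("meeting_stereo.wav", "meeting_mono.wav")` would write a mono copy that can then be uploaded and referenced in `contentUrls`. Note that downmixing discards the channel separation, so speakers are then distinguished only by voice.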

1 additional answer

  1. Sedat SALMAN 14,180 Reputation points MVP
    2024-02-22T10:12:18.83+00:00

    Yes, you can diarize multichannel audio using Azure STT Batch REST APIs. Here's how:

    • Ensure your audio is in a supported format (WAV, MP3, etc.).
    • Use the Batch Transcription API endpoint, and in your JSON request:
      • Set diarizationEnabled to true.
      • Set wordLevelTimestampsEnabled to true.
      • Provide the URLs of your audio files in contentUrls.

    The output will include speaker labels (e.g., Speaker_1) and their corresponding timestamps. Here is an example configuration:

    {
      "contentUrls": [
        "https://mystorage.blob.core.windows.net/audio/meeting_multichannel.wav"
      ],
      "properties": {
        "diarizationEnabled": true,
        "wordLevelTimestampsEnabled": true,
        "punctuationMode": "DictatedAndAutomatic",  
        "profanityFilterMode": "Masked"  
      },
      "locale": "en-US",
      "displayName": "Multichannel audio transcription with diarization"
    }
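
As a rough illustration of how the configuration above might be sent, the following Python sketch builds the same JSON body and wraps it in a POST request without sending it. The endpoint path (`/speechtotext/v3.1/transcriptions`), the `Ocp-Apim-Subscription-Key` header, and the helper names are assumptions not taken from this thread, so check them against the API version you use. Also note that, per the accepted answer, the service rejects stereo audio when diarization is enabled, so the file behind `contentUrls` would need to be mono.

```python
import json
import urllib.request


def build_transcription_body(content_urls, locale="en-US",
                             display_name="Transcription with diarization"):
    """Return the request body from the example config as a dict."""
    return {
        "contentUrls": list(content_urls),
        "locale": locale,
        "displayName": display_name,
        "properties": {
            "diarizationEnabled": True,
            "wordLevelTimestampsEnabled": True,
            "punctuationMode": "DictatedAndAutomatic",
            "profanityFilterMode": "Masked",
        },
    }


def build_request(region, subscription_key, body):
    """Wrap the body in a POST request (assumed v3.1 endpoint, not sent here)."""
    url = (f"https://{region}.api.cognitive.microsoft.com"
           "/speechtotext/v3.1/transcriptions")
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": subscription_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit the job; the region and key are placeholders you would supply yourself.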
    
    
