How to split sdk.SpeechRecognizer results by speaker in "Azure Speech to Text" using the Node.js SDK?

Vovkotrub Bohdan 30 Reputation points
2023-02-28T21:35:53.93+00:00

I want to transcribe speech to text in this format:

Speaker #1: Hello, I am speaker 1.
Speaker #2: Hello, I am speaker 2.

I use the "microsoft-cognitiveservices-speech-sdk" package in Node.js.

The audio file is "example.wav".

I enabled differentiation of guest speakers:

speechConfig.setProperty("DifferentiateGuestSpeakers", true);

I get a result with text, but "privSpeakerId" is undefined.

What do I need to enable to get the speaker ID?


Accepted answer
  1. romungi-MSFT 48,911 Reputation points Microsoft Employee Moderator
    2023-03-01T10:32:21.1933333+00:00

    Vovkotrub Bohdan I think you are setting the DifferentiateGuestSpeakers property without also setting the property that enables conversation transcription. There is a similar issue about using this property when you have not enrolled voice profiles for users but still want speaker differentiation.

    There is a quickstart on setting up conversation transcription with this config, so you can recognize speakers with or without enrolling them. The quickstart snippet uses SpeechTranslationConfig(), which is incorrect; it should be SpeechConfig().

    I hope this helps. Thanks!!

        const sdk = require("microsoft-cognitiveservices-speech-sdk");
        const fs = require("fs");

        // subscriptionKey, region, and filepath are assumed to be defined elsewhere.
        var speechConfig = sdk.SpeechConfig.fromSubscription(subscriptionKey, region);
        var audioConfig = sdk.AudioConfig.fromWavFileInput(fs.readFileSync(filepath));
        // Enable conversation transcription.
        speechConfig.setProperty("ConversationTranscriptionInRoomAndOnline", "true");

        // en-us by default. Add this line to specify another language, such as zh-cn.
        speechConfig.speechRecognitionLanguage = "en-US";
        // Property values are strings, so pass "true" rather than the boolean true.
        speechConfig.setProperty("DifferentiateGuestSpeakers", "true");

        // create conversation and transcriber
        var conversation = sdk.Conversation.createConversationAsync(speechConfig, "myConversation");
        var transcriber = new sdk.ConversationTranscriber(audioConfig);
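
The snippet above stops after constructing the transcriber. Once the transcriber has joined the conversation and started (the quickstart covers those steps), each final result should carry a speaker ID alongside its text. As a rough sketch of how those results could be turned into the "Speaker #1: ..." format from the question, here is a small helper. `formatTranscript` and the `{ speakerId, text }` shape are hypothetical names of my own, not part of the SDK, and the "Guest_0"/"Guest_1" IDs are placeholders. I am assuming you first collect `e.result.speakerId` and `e.result.text` from the transcriber's `transcribed` event into an array.

```javascript
// Hypothetical helper, not part of the Speech SDK: renders utterances as
// "Speaker #N: text" lines. Each utterance is assumed to look like
// { speakerId, text }, e.g. values copied from a transcription result.
function formatTranscript(utterances) {
  const labels = new Map(); // speakerId -> "Speaker #N", numbered by first appearance
  const lines = [];
  for (const { speakerId, text } of utterances) {
    if (!text) continue; // skip empty results
    let label = "Unknown speaker"; // used when no speaker ID was returned
    if (speakerId) {
      if (!labels.has(speakerId)) {
        labels.set(speakerId, `Speaker #${labels.size + 1}`);
      }
      label = labels.get(speakerId);
    }
    const last = lines[lines.length - 1];
    if (last && last.label === label) {
      last.text += " " + text; // merge consecutive utterances from the same speaker
    } else {
      lines.push({ label, text });
    }
  }
  return lines.map((line) => `${line.label}: ${line.text}`).join("\n");
}

// Placeholder input; real speaker IDs come from the service.
const sample = [
  { speakerId: "Guest_0", text: "Hello, I am speaker 1." },
  { speakerId: "Guest_1", text: "Hello, I am speaker 2." },
];
console.log(formatTranscript(sample));
```

Merging consecutive utterances from the same speaker keeps the output close to the format asked for; drop that branch if you prefer one line per recognized phrase.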
    
    
    1 person found this answer helpful.
