Azure Speech SDK - Listen to multiple (agent and customer) audio streams

Arun Srinivasan · 70 Reputation points · 2024-02-09

We are using the Azure Speech SDK for speech recognition (microsoft-cognitiveservices-speech-sdk: 1.35.0). We are able to set up speechConfig and audioConfig, and we can recognize text in both live and batch mode. In live mode, the audio config is set to the default microphone input, and it picks up the microphone audio correctly via speechsdk.AudioConfig.fromDefaultMicrophoneInput():

async sttFromMic() {
      // Acquire an authorization token and build the speech config
      const tokenObj = await getTokenOrRefresh();
      const speechConfig = speechsdk.SpeechConfig.fromAuthorizationToken(tokenObj.authToken, tokenObj.region);
      speechConfig.speechRecognitionLanguage = 'en-US';

      // Capture audio from the default microphone (the agent's side of the call)
      const audioConfig = speechsdk.AudioConfig.fromDefaultMicrophoneInput();
      recognizer = new speechsdk.SpeechRecognizer(speechConfig, audioConfig);
      // ...

Assume it is a conversation between an agent and a customer. The agent's microphone input stream is being captured and recognized with the code above. But how do we listen to the customer's audio stream? Kindly advise.
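
For reference, below is a minimal sketch of what we are considering, assuming the customer's audio is available to our app as raw PCM chunks (for example, from the telephony or WebRTC layer; onCustomerAudioChunk is a hypothetical callback, not part of the SDK). The idea is to run a second recognizer fed by a push stream via AudioConfig.fromStreamInput. Would this be a reasonable approach, or is something like ConversationTranscriber recommended instead?

async sttFromCustomerStream() {
      const tokenObj = await getTokenOrRefresh();
      const speechConfig = speechsdk.SpeechConfig.fromAuthorizationToken(tokenObj.authToken, tokenObj.region);
      speechConfig.speechRecognitionLanguage = 'en-US';

      // Assumed format: 16 kHz, 16-bit, mono PCM for the customer audio
      const format = speechsdk.AudioStreamFormat.getWaveFormatPCM(16000, 16, 1);
      const pushStream = speechsdk.AudioInputStream.createPushStream(format);
      const audioConfig = speechsdk.AudioConfig.fromStreamInput(pushStream);

      const customerRecognizer = new speechsdk.SpeechRecognizer(speechConfig, audioConfig);
      customerRecognizer.recognized = (s, e) => {
            if (e.result.reason === speechsdk.ResultReason.RecognizedSpeech) {
                  console.log(`CUSTOMER: ${e.result.text}`);
            }
      };
      customerRecognizer.startContinuousRecognitionAsync();

      // Hypothetical source of customer audio chunks (ArrayBuffer of PCM bytes)
      onCustomerAudioChunk((chunk) => pushStream.write(chunk));
      // Call pushStream.close() when the call ends
}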
