Speech-to-text: detect speakers while using the microphone

2023-07-27T06:03:53.0633333+00:00

Hey there, we are facing an issue when using speech-to-text on a real-time stream. The microphone always picks up both the phone speaker's voice and the voice of the person who is speaking, and the mixed audio is then converted to text. We want to add an interrupt mechanism: when it detects that the user has started to speak, the phone speaker should automatically switch to silent mode. But because the voices are mixed, this cannot be done.

Is there any VAD mode or speech activity detection service in Azure that we can use, for example one that can detect how many people are speaking, or is there another way to resolve this?

Azure AI Speech
An Azure service that integrates speech processing into apps and services.