Azure Speech Service speaker separation: no speaker separation for Korean-language audio file

student_learn 0 Reputation points

I wonder whether speaker separation, run via the `dotnet run` command in the call-center sample on GitHub for Azure Speech Service, is unsupported for Korean.

Channels 0 and 1 were both displayed without distinction, and the results differed between the WAV, FLAC, and MP3 files.

The English audio file had good speaker separation, but the Korean audio file did not.


dotnet run --languageKey YourResourceKey --languageEndpoint YourResourceEndpoint --language ko --locale ko-KR --speechKey YourResourceKey --speechRegion YourResourceRegion --input --stereo --output summary.json

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

1 answer

  1. romungi-MSFT 38,646 Reputation points Microsoft Employee

    student_learn, I believe you are referring to the call-center sample from the Speech SDK repo that is used for the above.

    If so, the parameters you passed are correct, but because you are also passing the --stereo option, diarization will not be enabled: diarization is applied only to mono audio input. If your audio file is mono, speaker separation will not work while the --stereo option is set. Try running the command without --stereo and check whether it works.

    If it still does not work, the sample needs a small change to explicitly set diarizationEnabled to true, and you should use a mono audio recording.
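
    For reference, the call-center sample ultimately drives the Speech batch transcription REST API, where diarization is a property of the transcription request. Below is a minimal sketch of such a request body, built in Python so the shape is easy to inspect; the displayName and contentUrls values are placeholders, not real resources.

    ```python
    import json

    # Sketch of a batch transcription request body for the Speech to Text
    # REST API with diarization explicitly enabled. The contentUrls entry
    # is a placeholder, not a real recording; property names follow the
    # v3.x batch transcription API.
    payload = {
        "displayName": "call-center-ko",  # hypothetical name
        "locale": "ko-KR",                # Korean locale from the question
        "contentUrls": ["https://example.com/mono-recording.wav"],
        "properties": {
            "diarizationEnabled": True,   # speaker separation (mono audio only)
            "wordLevelTimestampsEnabled": True,
            "punctuationMode": "DictatedAndAutomatic",
            "profanityFilterMode": "Masked",
        },
    }

    print(json.dumps(payload, indent=2))
    ```

    Note that diarizationEnabled applies to mono recordings; stereo input is instead separated by channel, which is why the two options conflict.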

    I would personally first try Speech Studio: upload the audio file, check whether the required result is produced, and then modify the code to generate a similar output. I hope this helps!

    If this answers your query, please click Accept Answer and Yes for "Was this answer helpful". If you have any further questions, do let us know.
