Azure speech to text with diarization receives a Session Stopped event when there is a long pause in the audio file

Harish A 55 Reputation points
2023-11-01T07:11:13.22+00:00

Hi

I am continuously reading a wav file and sending it in frames to the Azure Cognitive Services Speech API to get the transcribed text, with diarization enabled (speechsdk.transcription.ConversationTranscriber).

Everything works fine when the audio file has speech in it, that is, when people speak continuously. However, if there is a long pause, for example when nobody speaks for a long time or there is music playing, the Session Stopped event gets fired and my program terminates.

Example: take a scenario where there is a long, 4-hour meeting with short breaks of 15-30 minutes in between. If I use this audio, whenever there is a 15-minute break, the "Session Stopped" event gets triggered and my Python script ends.

Is there a way to handle this? What I am looking for is that, regardless of whether there is speech in the audio, I should not receive a Session Stopped or Canceled event.

Is there any such property that I can set?

I tried properties like "Conversation_Initial_Silence_Timeout", but after reading through the documentation, I don't think it serves my purpose.
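
For reference, here is roughly how my setup looks (a simplified sketch: the key, region, and file name are placeholders, and in my real code I push the audio frames through a stream instead of passing the file name directly):

    import time
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription="YOUR_KEY", region="YOUR_REGION")
    audio_config = speechsdk.audio.AudioConfig(filename="meeting.wav")

    transcriber = speechsdk.transcription.ConversationTranscriber(
        speech_config=speech_config, audio_config=audio_config)

    done = False

    def on_transcribed(evt):
        # With diarization enabled, each result carries the text and a speaker id.
        print(f"{evt.result.speaker_id}: {evt.result.text}")

    def on_stopped_or_canceled(evt):
        # This is what fires after a long stretch of silence or music and ends my script.
        global done
        done = True

    transcriber.transcribed.connect(on_transcribed)
    transcriber.session_stopped.connect(on_stopped_or_canceled)
    transcriber.canceled.connect(on_stopped_or_canceled)

    transcriber.start_transcribing_async().get()
    while not done:
        time.sleep(0.5)
    transcriber.stop_transcribing_async().get()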


Accepted answer
  1. dupammi 8,615 Reputation points Microsoft External Staff
    2023-11-01T12:05:20.7166667+00:00

    Hi @Harish A,

    Thank you for using the Microsoft Q&A.

    I understand that you are looking for a way to increase the maximum duration of silence allowed before the conversation is considered complete. I will be happy to assist you with this.

    You can set either of the following two properties on your speech_config:

    conversationEndSilenceTimeoutMs (a service property, passed as a URI query parameter via speech_config.set_service_property())

    (OR)

    Speech_SegmentationSilenceTimeoutMs (an SDK property id, set via speech_config.set_property() using speechsdk.PropertyId.Speech_SegmentationSilenceTimeoutMs)

    By setting either of these two properties, you can allow for longer silences and prevent the "Session Stopped" event from being triggered before the conversation has actually ended. The snippets below set the conversation ending detection timeout accordingly.

    # Set conversation ending detection timeout (4 hours in seconds)
    conversation_ending_detection_timeout = 4 * 60 * 60
    speech_config.set_service_property("conversationEndSilenceTimeoutMs",
                                       str(conversation_ending_detection_timeout * 1000),  # milliseconds
                                       speechsdk.ServicePropertyChannel.UriQueryParameter)
    

    (OR)

    # Set the segmentation silence timeout (4 hours in seconds)
    conversation_ending_detection_timeout = 4 * 60 * 60
    speech_config.set_property(speechsdk.PropertyId.Speech_SegmentationSilenceTimeoutMs,
                               str(conversation_ending_detection_timeout * 1000))  # milliseconds
    

    Here is the link where you can find more details:
    How to recognize speech - Speech service - Azure AI services | Microsoft Learn
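
    For completeness, here is a short, illustrative sketch of where the call fits: the property has to be set on the SpeechConfig before the ConversationTranscriber is created (the key, region, and file name below are placeholders):

    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription="YOUR_KEY", region="YOUR_REGION")

    # Apply the silence timeout before constructing the transcriber (4 hours, sent as milliseconds).
    conversation_ending_detection_timeout = 4 * 60 * 60
    speech_config.set_service_property(
        "conversationEndSilenceTimeoutMs",
        str(conversation_ending_detection_timeout * 1000),
        speechsdk.ServicePropertyChannel.UriQueryParameter)

    audio_config = speechsdk.audio.AudioConfig(filename="meeting.wav")
    transcriber = speechsdk.transcription.ConversationTranscriber(
        speech_config=speech_config, audio_config=audio_config)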

    Hope this helps.

    Thank you!


    If this answers your query, please click Accept Answer and Yes for "Was this answer helpful". If you have any further queries, do let us know.

