Problem with InitialSilenceTimeout reason from Cognitive Services Speech to Text

Davaadulam Davaakhuu 46 Reputation points
2021-08-24T20:46:59.027+00:00

Many of our voice files fail to be recognized when we use SpeechRecognition. Most of them fail with the InitialSilenceTimeout reason.
When we listen to the files, there is no audible silence at the beginning, and several other voice files are recognized correctly. So we don't think it is a code issue, but we cannot figure out why recognition fails for some files. There is a Conversation_Initial_Silence_Timeout property, which we set to a few seconds, but that doesn't help.
Also, if I test the exact same voice files in Speech Studio with the same resource group, they pass the test and are recognized.
Has anyone had a similar issue? Is there a property that we missed configuring in the code?

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

1 answer

  1. romungi-MSFT 48,901 Reputation points Microsoft Employee Moderator
    2021-08-25T07:28:35.18+00:00

    @Davaadulam Davaakhuu Have you also tried setting the following property on the speech config?

    speechConfig.SetProperty(PropertyId.SpeechServiceConnection_InitialSilenceTimeoutMs, "45000");  
    

    This thread from the Speech SDK repo details the different silence timeout scenarios. Segmentation timeout and max segmentation timeout are a couple of other timeouts that are not exposed by the SDK, and if they are reached with RecognizeOnceAsync, I think this behavior could occur. It also depends on the audio quality, but since Speech Studio is able to recognize the files, I think this could be an issue with the SDK, or setting the above property could help.
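
    For context, here is a minimal sketch of where that property might sit in a single-shot recognition call, and how the NoMatch reason could be inspected afterwards. The subscription key, region, and file name are placeholders, and the overall program structure is an assumption for illustration, not the asker's actual code.

    ```csharp
    using System;
    using System.Threading.Tasks;
    using Microsoft.CognitiveServices.Speech;
    using Microsoft.CognitiveServices.Speech.Audio;

    class Program
    {
        static async Task Main()
        {
            // Placeholder credentials (assumption): replace with your own key and region.
            var speechConfig = SpeechConfig.FromSubscription("YOUR_KEY", "YOUR_REGION");

            // The property suggested above: initial silence timeout in milliseconds.
            speechConfig.SetProperty(
                PropertyId.SpeechServiceConnection_InitialSilenceTimeoutMs, "45000");

            // Placeholder file name (assumption).
            using var audioConfig = AudioConfig.FromWavFileInput("sample.wav");
            using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);

            var result = await recognizer.RecognizeOnceAsync();

            switch (result.Reason)
            {
                case ResultReason.RecognizedSpeech:
                    Console.WriteLine($"Recognized: {result.Text}");
                    break;
                case ResultReason.NoMatch:
                    // NoMatchDetails exposes the reason, e.g. InitialSilenceTimeout.
                    var noMatch = NoMatchDetails.FromResult(result);
                    Console.WriteLine($"No match: {noMatch.Reason}");
                    break;
                case ResultReason.Canceled:
                    var cancellation = CancellationDetails.FromResult(result);
                    Console.WriteLine(
                        $"Canceled: {cancellation.Reason}, {cancellation.ErrorDetails}");
                    break;
            }
        }
    }
    ```

    Logging the NoMatch reason for each file should make it easier to confirm whether raising the timeout changes the outcome for the files that currently fail.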

