Microsoft.CognitiveServices.Speech SDK recognizes silence every 15 seconds with StartContinuousRecognitionAsync and bills for it

Question

Dhilip Swaminathan 0

I have integrated the Microsoft.CognitiveServices.Speech SDK into my .NET Windows Service application. It listens to audio input for voice commands using StartContinuousRecognitionAsync. When there is a voice command, the Cognitive Service recognizes it and converts the speech to text. I am good with that.

But every 15 seconds, the Recognized event fires with an empty result text, and because of this the billing goes up every month.

We are supposed to be billed only if a voice is recognized.

Any help would be appreciated.

romungi-MSFT 48,911 Reputation points Microsoft Employee Moderator

2023-05-31T11:39:16.9366667+00:00

@Dhilip Swaminathan Did the below information help to clarify the billing in this case?

Dhilip Swaminathan 0 Reputation points

2023-06-02T07:11:11.6433333+00:00

Yes. I am facing another issue. I am running StartContinuousRecognitionAsync in a Windows service, and the Cognitive Service is listening to an audio stream. The Windows service runs 24 x 7. NAudio WaveIn listens to the audio input and writes to the audio stream when there is audio. The issue is that after a few hours the recognition stops: audio keeps coming in and is written to the audio stream, but the Cognitive Service does nothing. It works again only when the connection is re-established by restarting the Windows service.

romungi-MSFT 48,911 Reputation points Microsoft Employee Moderator

2023-06-05T11:10:03.98+00:00

@Dhilip Swaminathan I think the SDK might have internal timers that reset the connection if no audio is passed. I would suggest raising an issue with the SDK team to check how you can design your application to listen continuously without any connection resets.
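
For readers hitting the same stall, one possible application-side workaround is a small supervisor that restarts the recognition session when it ends or goes quiet for too long. This is only a sketch of the idea, not something confirmed by the SDK team: `RecognitionSupervisor`, `start_session`, and `max_idle_seconds` are hypothetical names, and which SDK callbacks you wire to `on_event` and `on_session_ended` is an assumption you would need to verify against your own setup.

```python
import time
from typing import Callable


class RecognitionSupervisor:
    """Restart a long-running recognition session when it looks stalled.

    Hypothetical helper: `start_session` is whatever function (re)creates the
    recognizer and starts continuous recognition in your application.
    """

    def __init__(
        self,
        start_session: Callable[[], None],
        max_idle_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._start_session = start_session
        self._max_idle = max_idle_seconds
        self._clock = clock
        self.restarts = 0
        self._start_session()
        self._last_event = self._clock()

    def on_event(self) -> None:
        """Call from any recognizer callback to record that the session is alive."""
        self._last_event = self._clock()

    def on_session_ended(self) -> None:
        """Call when the session reports that it stopped or was canceled."""
        self._restart()

    def check(self) -> None:
        """Call periodically; restarts if no event arrived within the idle limit."""
        if self._clock() - self._last_event > self._max_idle:
            self._restart()

    def _restart(self) -> None:
        self._start_session()
        self.restarts += 1
        self._last_event = self._clock()
```

The injected `clock` makes the idle logic testable without waiting in real time. Note that restarting does not reduce billing: audio streamed after the restart is processed and billed as before.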

Answer 1

@Dhilip Swaminathan If you are running continuous recognition, roughly 15 seconds is the approximate timeout limit when no utterance is detected. This is also documented here for reference. There is also a thread that details how the timeouts are set at the SDK level; this might help app developers change them accordingly.

With respect to billing, you are charged for the audio that is processed, billed per second. It is not based on whether a voice is detected in your input. Please see the pricing page for the billing criteria.

If you are running continuous recognition and there is no utterance, those 15 seconds of audio still count as processed audio.
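
To make the per-second billing concrete, here is a rough arithmetic sketch comparing an always-on stream with one that only sends audio containing commands. The usage figures (200 commands a day, 5 seconds each) and the `price_per_hour` value are made-up placeholders, not actual usage or Azure prices; check the pricing page for the real rate.

```python
SECONDS_PER_HOUR = 3600


def billed_hours(streamed_seconds: int) -> float:
    """Billing follows processed audio, so every streamed second counts,
    whether or not it contains speech."""
    return streamed_seconds / SECONDS_PER_HOUR


def estimate_cost(streamed_seconds: int, price_per_hour: float) -> float:
    """Cost estimate for a given amount of streamed audio.

    price_per_hour is a placeholder; look up the real rate on the pricing page.
    """
    return billed_hours(streamed_seconds) * price_per_hour


# Always-on service: 24 hours a day for a 30-day month.
always_on_seconds = 24 * SECONDS_PER_HOUR * 30

# Hypothetical gated service: 200 commands a day, 5 seconds each, 30 days.
gated_seconds = 200 * 5 * 30

print(billed_hours(always_on_seconds))        # 720.0
print(round(billed_hours(gated_seconds), 2))  # 8.33
```

Under these assumed numbers, the always-on stream is billed for 720 hours a month even if almost all of it is silence, which matches the behavior described in the question.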

Based on your scenario, instead of using continuous recognition to listen to input, consider using keyword recognition. Once a keyword is detected, use the audio after the keyword to convert to text and process it further based on the input. See this quickstart on creating a keyword, and the sample at the end of that page, which uses the keyword and saves the result as audio. I hope this helps!
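
The control flow behind that suggestion can be sketched without the SDK: stay in a local "waiting for keyword" state, and forward audio to the cloud recognizer only for a short window after the keyword is heard. The sketch below is illustrative only. `gate_after_keyword`, `is_keyword`, and `capture_chunks` are hypothetical names, and the string "chunks" stand in for real audio buffers; in a real application the keyword detection would come from the SDK's keyword recognition feature rather than a lambda.

```python
from typing import Callable, Iterable, List, TypeVar

Chunk = TypeVar("Chunk")


def gate_after_keyword(
    chunks: Iterable[Chunk],
    is_keyword: Callable[[Chunk], bool],
    capture_chunks: int,
) -> List[Chunk]:
    """Return only the chunks that directly follow a detected keyword.

    Everything else (silence, background noise, the keyword itself) is
    dropped and would never be sent for speech-to-text.
    """
    forwarded: List[Chunk] = []
    remaining = 0  # chunks still to forward in the current capture window
    for chunk in chunks:
        if remaining > 0:
            forwarded.append(chunk)
            remaining -= 1
        elif is_keyword(chunk):
            remaining = capture_chunks
    return forwarded


# Stand-in audio: labels instead of real buffers.
stream = ["silence", "silence", "hey-device", "turn", "on", "lights", "silence", "silence"]
sent = gate_after_keyword(stream, lambda c: c == "hey-device", capture_chunks=3)
print(sent)  # ['turn', 'on', 'lights']
```

Only the three chunks after the keyword are forwarded here, so only that audio would count as processed by the speech-to-text service; the idle silence would not.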

If this answers your query, do click Accept Answer and Yes for was this answer helpful. And, if you have any further query do let us know.
