Hi, I am running a Docker container for cognitive-services-speech and the session gets canceled when I transcribe an audio file.
The code I am running is taken from here: https://github.com/MicrosoftDocs/azure-docs/blob/main/articles/ai-services/speech-service/includes/quickstarts/stt-diarization/python.md
I only changed the recognition language and how speech_config is initialized, so that it talks to the container:
speech_config = speechsdk.SpeechConfig(host='ws://localhost:5000')
speech_config.speech_recognition_language="nl-NL"
This is the result I am getting:
SESSION STARTED: SessionEventArgs(session_id=eec27cda985349959130becbd8e83291)
CANCELED ConversationTranscriptionCanceledEventArgs(session_id=eec27cda985349959130becbd8e83291, result=ConversationTranscriptionResult(result_id=430b8ba53d6b4994ab527f037fc0ced5, speaker_id=, text=, reason=ResultReason.Canceled))
CLOSING ConversationTranscriptionCanceledEventArgs(session_id=eec27cda985349959130becbd8e83291, result=ConversationTranscriptionResult(result_id=430b8ba53d6b4994ab527f037fc0ced5, speaker_id=, text=, reason=ResultReason.Canceled))
SESSION STOPPED SessionEventArgs(session_id=eec27cda985349959130becbd8e83291)
CLOSING SessionEventArgs(session_id=eec27cda985349959130becbd8e83291)
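The output above shows that the session was canceled but not why. A minimal sketch of a helper that would print the underlying cause is below. It assumes the canceled event exposes a `cancellation_details` object with `reason`, `code` and `error_details` attributes, like the result object in the SpeechRecognizer loop further down; `describe_cancellation` and `fake_evt` are hypothetical names used only for illustration.

```python
from types import SimpleNamespace


def describe_cancellation(evt):
    """Build a one-line summary from a canceled event.

    Assumption: evt carries `cancellation_details` with `reason`, `code`
    and `error_details`. Any attribute that is missing is shown as None.
    """
    details = getattr(evt, "cancellation_details", None)
    if details is None:
        return "CANCELED: no cancellation details available"
    reason = getattr(details, "reason", None)
    code = getattr(details, "code", None)
    error_details = getattr(details, "error_details", None)
    return "CANCELED: reason={} code={} error_details={}".format(
        reason, code, error_details
    )


# Stand-in event with made-up values, so the sketch runs without the SDK.
fake_evt = SimpleNamespace(
    cancellation_details=SimpleNamespace(
        reason="Error", code="ConnectionFailure", error_details="example"
    )
)
print(describe_cancellation(fake_evt))
```

With the quickstart code, this could presumably be attached via conversation_transcriber.canceled.connect(lambda evt: print(describe_cancellation(evt))), so that the error code and message appear next to the CANCELED line.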
I also tried extending the silence timeouts, but that made no difference:
speech_config.set_property(speechsdk.PropertyId.SpeechServiceConnection_InitialSilenceTimeoutMs, "500000")
speech_config.set_property(speechsdk.PropertyId.SpeechServiceConnection_EndSilenceTimeoutMs, "50000")
The same audio file works just fine using a different speech recognition pipeline:
speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)
results = []
while True:
    result = speech_recognizer.recognize_once_async().get()
    results.append(result)
    if result.reason is speechsdk.ResultReason.RecognizedSpeech:
        print("Recognized: {}".format(result.text))
    elif result.reason is speechsdk.ResultReason.NoMatch:
        print("NOMATCH: Speech could not be recognized.")
    elif result.reason is speechsdk.ResultReason.Canceled:
        cancellation = result.cancellation_details
        print("Speech Recognition canceled: {}".format(cancellation.reason))
        if cancellation.reason == speechsdk.CancellationReason.Error:
            print("Error details: {}".format(cancellation.error_details))
        break
However, I need to know who is speaking (speaker diarization), not only the transcript, so I was hoping to use speechsdk.transcription.ConversationTranscriber(speech_config=speech_config, audio_config=audio_config) instead. Besides, some files are too long to process with repeated calls to speech_recognizer.recognize_once_async().
I am running azure.cognitiveservices.speech==1.38.0
Do you have any suggestions?
Thank you.