Getting CANCELED ConversationTranscriptionCanceledEventArgs - ResultReason.Canceled using a docker container

Julia Berezutskaya 0 Reputation points
2024-06-20T10:48:21.9633333+00:00

Hi, I am running a Docker container for cognitive-services-speech and getting a canceled session when transcribing an audio file.

The code I am running is taken from here: https://github.com/MicrosoftDocs/azure-docs/blob/main/articles/ai-services/speech-service/includes/quickstarts/stt-diarization/python.md

I only changed how I initialize the speech_config to communicate with the container and the language:

speech_config = speechsdk.SpeechConfig(host='ws://localhost:5000')
speech_config.speech_recognition_language="nl-NL"

This is the result I am getting:

SESSION STARTED: SessionEventArgs(session_id=eec27cda985349959130becbd8e83291)

CANCELED ConversationTranscriptionCanceledEventArgs(session_id=eec27cda985349959130becbd8e83291, result=ConversationTranscriptionResult(result_id=430b8ba53d6b4994ab527f037fc0ced5, speaker_id=, text=, reason=ResultReason.Canceled))

CLOSING ConversationTranscriptionCanceledEventArgs(session_id=eec27cda985349959130becbd8e83291, result=ConversationTranscriptionResult(result_id=430b8ba53d6b4994ab527f037fc0ced5, speaker_id=, text=, reason=ResultReason.Canceled))

SESSION STOPPED SessionEventArgs(session_id=eec27cda985349959130becbd8e83291)

CLOSING SessionEventArgs(session_id=eec27cda985349959130becbd8e83291)
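The printed events above do not show why the session was canceled. A minimal sketch of a handler that surfaces the reason is given below. It assumes the canceled event exposes a `cancellation_details` object with `reason`, `code` and `error_details` attributes, as the SDK's CancellationDetails does; `describe_cancellation` and `fake_evt` are hypothetical names used only for illustration, and the commented wiring assumes the quickstart's `conversation_transcriber` variable.

```python
from types import SimpleNamespace


def describe_cancellation(evt):
    """Build a readable message from a canceled event.

    Assumption: evt.cancellation_details exposes `reason`, `code` and
    `error_details`. Missing attributes are skipped rather than raising.
    """
    details = getattr(evt, "cancellation_details", None)
    if details is None:
        return "Canceled (no cancellation details on this event)"
    parts = ["Canceled: reason={}".format(details.reason)]
    code = getattr(details, "code", None)
    if code is not None:
        parts.append("code={}".format(code))
    error_details = getattr(details, "error_details", "")
    if error_details:
        parts.append("error_details={}".format(error_details))
    return ", ".join(parts)


# Hypothetical wiring into the quickstart's transcriber:
# conversation_transcriber.canceled.connect(
#     lambda evt: print(describe_cancellation(evt)))

# Stand-in event so the helper can be tried without the SDK or a container.
fake_evt = SimpleNamespace(
    cancellation_details=SimpleNamespace(
        reason="Error",
        code="ConnectionFailure",
        error_details="connection refused"))
print(describe_cancellation(fake_evt))
```

With a real event, the error details usually indicate whether the container rejected the connection, the endpoint path was not found, or the audio stream simply ended.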

I also tried extending the timeout for silences with no result:

speech_config.set_property(
    speechsdk.PropertyId.SpeechServiceConnection_InitialSilenceTimeoutMs, "500000")
speech_config.set_property(
    speechsdk.PropertyId.SpeechServiceConnection_EndSilenceTimeoutMs, "50000")

The same audio file works just fine using a different speech recognition pipeline:

import azure.cognitiveservices.speech as speechsdk

# speech_config and audio_config are assumed to be created beforehand.
speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)
results = []
# Recognize one utterance at a time until the recognizer reports a cancellation.
while True:
    result = speech_recognizer.recognize_once_async().get()
    results.append(result)
    if result.reason == speechsdk.ResultReason.RecognizedSpeech:
        print("Recognized: {}".format(result.text))
    elif result.reason == speechsdk.ResultReason.NoMatch:
        print("NOMATCH: Speech could not be recognized.")
    elif result.reason == speechsdk.ResultReason.Canceled:
        # Reaching the end of the audio file is also reported as a cancellation.
        cancellation = result.cancellation_details
        print("Speech Recognition canceled: {}".format(cancellation.reason))
        if cancellation.reason == speechsdk.CancellationReason.Error:
            print("Error details: {}".format(cancellation.error_details))
        break

But I need speaker recognition, not only transcript, so I was hoping to use speechsdk.transcription.ConversationTranscriber(speech_config=speech_config, audio_config=audio_config) instead. Besides, some files are too long to use speech_recognizer.recognize_once_async().

I am running azure.cognitiveservices.speech==1.38.0

Do you have any suggestions?

Thank you.

Azure AI Speech

1 answer

  1. VasaviLankipalle-MSFT 15,861 Reputation points
    2024-06-21T00:35:35.5066667+00:00

    Hello @Julia Berezutskaya , Thanks for using Microsoft Q&A Platform.

    Which pricing tier are you using? Please note that the maximum audio length for real-time diarization is 240 minutes per file on the standard pricing tier.

    Also, Microsoft limits access to speaker recognition. As a Limited Access feature, it requires registration: customers who wish to use it must apply by submitting a registration form through the Azure AI services speaker recognition limited access review. For more information, see Limited access for speaker recognition.

    I hope this helps.

    Regards,

    Vasavi

    Please accept the answer and vote 'yes' if you found it helpful, to support the community. Thanks.
