Why does the session stop automatically when using speech_recognizer.start_continuous_recognition() with a phrase list? (Python API)

Rakuwa 6 Reputation points
2022-11-17T02:04:22.99+00:00

Hi,

When I use the Python API and call 'speech_recognizer.start_continuous_recognition()' with a PhraseListGrammar added to recognize an audio file, the session stops automatically for some reason, so the resulting text covers only part of the audio. Without the PhraseListGrammar, the program runs normally and I get the full recognized text.

Is there any restriction on the length of the audio when using 'speechsdk.PhraseListGrammar.from_recognizer(speech_recognizer)'?
Or is there something wrong with my code?

The specific details are as follows:

Length of the audio file: about 1 hour.
The session stopped about 10 minutes after the speaker started speaking.

Number of phrases: over 1,000.

Python code:

import time

import azure.cognitiveservices.speech as speechsdk

# speech_key and service_region are assumed to be defined elsewhere.


def speech_recognize_continuous_from_file_addphrase(audiofile, phraselist):
    # <SpeechRecognitionWithFile>
    audio_filename = audiofile
    audio_input = speechsdk.audio.AudioConfig(filename=audio_filename)
    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
    # Ask for detailed recognition results
    speech_config.output_format = speechsdk.OutputFormat.Detailed
    # Create a recognizer with the given settings
    speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_input, language="ja-JP")

    # Add every phrase to the recognizer's phrase list
    phrase_list_grammar = speechsdk.PhraseListGrammar.from_recognizer(speech_recognizer)
    for phrase in phraselist:
        phrase_list_grammar.addPhrase(phrase)

    # Start continuous recognition and collect results until the session stops
    done = False
    result = []
    print("Recognizing...")

    def recognized(evt):
        # Called once per recognized utterance
        result.append(evt.result.text)

    def start(evt):
        print('SESSION STARTED: {}'.format(evt))

    def stop(evt):
        print('SESSION STOPPED {}'.format(evt))
        speech_recognizer.stop_continuous_recognition()
        nonlocal done
        done = True

    speech_recognizer.recognized.connect(recognized)
    speech_recognizer.session_started.connect(start)
    speech_recognizer.session_stopped.connect(stop)

    try:
        speech_recognizer.start_continuous_recognition()
        # Poll until the session_stopped callback sets the flag
        while not done:
            time.sleep(.5)
    except KeyboardInterrupt:
        print("bye.")
        speech_recognizer.recognized.disconnect_all()
        speech_recognizer.session_started.disconnect_all()
        speech_recognizer.session_stopped.disconnect_all()

    return result
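
The code above does not subscribe to the recognizer's canceled event, so an early stop caused by a service error or a timeout would only show up as SESSION STOPPED. A handler that logs the cancellation reason might help narrow this down. The following is only a minimal sketch: it assumes the event object carries a cancellation_details attribute with reason and error_details fields, which is my understanding of the Python Speech SDK. The names describe_cancellation and on_canceled are hypothetical helpers, and the stand-in event at the end exists only so the snippet runs without the SDK.

```python
from types import SimpleNamespace


def describe_cancellation(evt):
    """Return a one-line summary of a canceled event for logging."""
    # Assumption: the event exposes `cancellation_details` with `reason`
    # and `error_details`; getattr keeps the helper safe if either is missing.
    details = getattr(evt, "cancellation_details", None)
    if details is None:
        return "CANCELED (no details available)"
    reason = getattr(details, "reason", "unknown")
    error = getattr(details, "error_details", "") or "none"
    return "CANCELED reason={} error_details={}".format(reason, error)


def on_canceled(evt):
    # Print the summary so it appears next to the SESSION STOPPED line.
    print(describe_cancellation(evt))


# Hypothetical wiring, placed next to the other connect() calls:
#     speech_recognizer.canceled.connect(on_canceled)

# Stand-in event for a dry run without the SDK (illustrative values only).
fake_evt = SimpleNamespace(
    cancellation_details=SimpleNamespace(reason="Error", error_details="example message")
)
on_canceled(fake_evt)
```

If the session is being cut off by the service, the logged reason and error details should say so, which would separate a phrase-list problem from a connection or quota problem.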

Sorry for my poor English. If you need any more information, please let me know.
Thank you.

Rakuwa

Azure AI Speech
An Azure service that integrates speech processing into apps and services.
Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.