question

sanghunjeon-9189 asked romungi-MSFT commented

speech to text (realtime)

I need help.

I want to recognize real-time speech and see a list of predicted words.

To do that, I am trying to use the NBest feature from Python, but it doesn't work properly.
I would appreciate it if someone could tell me what is wrong with the simple code below.



import azure.cognitiveservices.speech as speechsdk
import requests

def get_token(subscription_key):
    # Request a short-lived access token from the regional token endpoint.
    fetch_token_url = 'https://koreacentral.api.cognitive.microsoft.com/sts/v1.0/issueToken'
    headers = {'Ocp-Apim-Subscription-Key': subscription_key}
    response = requests.post(fetch_token_url, headers=headers)
    access_token = str(response.text)
    print(access_token)

def from_mic(subscription_key):
    # Recognize a single utterance from the default microphone.
    speech_config = speechsdk.SpeechConfig(subscription=subscription_key, region="koreacentral")
    speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, language="ko-KR")
    result = speech_recognizer.recognize_once_async().get()
    # Keep only alphanumeric characters (drops spaces and punctuation).
    recognition = ''.join(filter(str.isalnum, result.text))
    print(recognition)


if __name__ == "__main__":
    subscription_key = 'xxxxxxxxxxxxxxx'

    while True:
        from_mic(subscription_key)
        get_token(subscription_key)
azure-speech

@sanghunjeon-9189 Did you get a chance to check and try the sample below?


1 Answer

romungi-MSFT answered romungi-MSFT edited

@sanghunjeon-9189 I am not sure whether the code snippet is incorrectly indented or whether this is an issue with the page formatting; part of the snippet does not seem to be correctly indented.

Calling both of these methods is not required; a simple call to recognize_once_async() should return the required result. The call to get_token() only prints the token, which is not used anywhere else in the script.
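If the token is actually needed, for example to avoid passing the subscription key to the recognizer, a minimal sketch could look like the following. It assumes the regional endpoint pattern seen in the question's URL and uses the SDK's auth_token argument of SpeechConfig. The helper names token_url and config_from_token are made up for illustration and are not part of the SDK.

```python
def token_url(region):
    # Same endpoint as in the question, with the region made a parameter.
    return "https://{}.api.cognitive.microsoft.com/sts/v1.0/issueToken".format(region)


def get_token(subscription_key, region="koreacentral"):
    # Imported lazily so token_url() can be used without requests installed.
    import requests

    headers = {"Ocp-Apim-Subscription-Key": subscription_key}
    response = requests.post(token_url(region), headers=headers)
    response.raise_for_status()  # surface HTTP errors instead of returning an error body
    return response.text  # return the token rather than only printing it


def config_from_token(token, region="koreacentral"):
    # Hypothetical helper: builds a SpeechConfig from the token instead of the key.
    import azure.cognitiveservices.speech as speechsdk

    return speechsdk.SpeechConfig(auth_token=token, region=region)
```

Tokens issued this way expire after a short time, so a long-running loop would need to fetch a new one periodically rather than calling get_token() on every iteration.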

You can refer to the usage of the method in this sample; the sample also uses pronunciation assessment, but that is not required. I think the following snippet should help you recognize the result correctly.

 import azure.cognitiveservices.speech as speechsdk

 speech_key='your_key'
 service_region='your_region'
    
 config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
    
 recognizer = speechsdk.SpeechRecognizer(speech_config=config)
    
 result = recognizer.recognize_once_async().get()
    
 if result.reason == speechsdk.ResultReason.RecognizedSpeech:
     print("Recognized: {}".format(result.text))
 elif result.reason == speechsdk.ResultReason.NoMatch:
     print("No speech could be recognized: {}".format(result.no_match_details))
 elif result.reason == speechsdk.ResultReason.Canceled:
     cancellation_details = result.cancellation_details
     print("Speech Recognition canceled: {}".format(cancellation_details.reason))
     if cancellation_details.reason == speechsdk.CancellationReason.Error:
         print("Error details: {}".format(cancellation_details.error_details))
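Since the question specifically asks about NBest, here is a rough sketch of one way to get the list of alternative hypotheses. It assumes the service only includes the NBest array when the output format is set to Detailed, and that each entry carries Confidence and Display fields in the JSON returned by result.json. The function names parse_nbest and recognize_nbest_from_mic are illustrative, not SDK names.

```python
import json


def parse_nbest(detailed_json):
    """Return (confidence, display_text) pairs from a Detailed-format result JSON."""
    payload = json.loads(detailed_json)
    # "NBest" may be missing when nothing was recognized, so default to an empty list.
    candidates = payload.get("NBest", [])
    pairs = [(c.get("Confidence", 0.0), c.get("Display", "")) for c in candidates]
    # Highest-confidence candidate first.
    return sorted(pairs, key=lambda pair: pair[0], reverse=True)


def recognize_nbest_from_mic(subscription_key, region="koreacentral", language="ko-KR"):
    # Imported lazily so parse_nbest() can be used without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=subscription_key, region=region)
    # Ask for detailed output; the simple format carries only the top result.
    speech_config.output_format = speechsdk.OutputFormat.Detailed
    recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, language=language)

    result = recognizer.recognize_once_async().get()
    if result.reason != speechsdk.ResultReason.RecognizedSpeech:
        return []
    # result.json holds the raw service response as a JSON string.
    return parse_nbest(result.json)
```

Calling recognize_nbest_from_mic('your_key') and printing each returned pair should list the candidate transcriptions for one utterance. Note that the candidates are whole-utterance alternatives, not per-word predictions.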

If an answer is helpful, please accept it or upvote it, which might help other community members reading this thread.