Pronunciation assessment SDK is getting stuck

Dan Tang 0 Reputation points
2024-07-23T14:24:57.8833333+00:00

I'm trying to integrate the pronunciation assessment feature of the Azure Speech Services Python SDK. Specifically, a web front-end uploads an audio file to a FastAPI backend, which uses Whisper to transcribe it and then sends the transcription together with the audio file to Microsoft's endpoint for evaluation. However, each time I do so, the call hangs and I get a (Timeout: no recognition result received) error after 10+ seconds.
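For reference, the flow described above can be sketched roughly as follows. This is a minimal sketch, not the code from the question: the helper names `wav_bytes_to_pcm` and `assess_pronunciation`, the `en-US` language, and the assumption that the upload is an uncompressed PCM WAV file are all illustrative. It strips the WAV header with the standard-library `wave` module, pushes the raw PCM into a `PushAudioInputStream`, and closes the stream before recognition so the recognizer is told that no more audio is coming, which is one thing worth ruling out when a call waits until it times out.

```python
import io
import wave
from typing import NamedTuple


class PcmAudio(NamedTuple):
    frames: bytes          # raw PCM samples, WAV header removed
    sample_rate: int       # e.g. 16000
    bits_per_sample: int   # e.g. 16
    channels: int          # e.g. 1


def wav_bytes_to_pcm(wav_bytes: bytes) -> PcmAudio:
    """Parse an in-memory PCM WAV file and return its samples and format."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return PcmAudio(
            frames=wav.readframes(wav.getnframes()),
            sample_rate=wav.getframerate(),
            bits_per_sample=wav.getsampwidth() * 8,
            channels=wav.getnchannels(),
        )


def assess_pronunciation(wav_bytes: bytes, reference_text: str, key: str, region: str) -> dict:
    """Hypothetical helper: score wav_bytes against reference_text."""
    # Imported here so the WAV helper above works without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    audio = wav_bytes_to_pcm(wav_bytes)
    # Tell the SDK the actual format of the audio instead of relying on defaults.
    stream_format = speechsdk.audio.AudioStreamFormat(
        samples_per_second=audio.sample_rate,
        bits_per_sample=audio.bits_per_sample,
        channels=audio.channels,
    )
    push_stream = speechsdk.audio.PushAudioInputStream(stream_format=stream_format)
    push_stream.write(audio.frames)
    push_stream.close()  # signal end of audio so recognition is not left waiting

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    audio_config = speechsdk.audio.AudioConfig(stream=push_stream)
    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config,
        audio_config=audio_config,
        language="en-US",  # assumption: the speaker is assessed in US English
    )
    pa_config = speechsdk.PronunciationAssessmentConfig(
        reference_text=reference_text,  # here, the Whisper transcription
        grading_system=speechsdk.PronunciationAssessmentGradingSystem.HundredMark,
        granularity=speechsdk.PronunciationAssessmentGranularity.Phoneme,
    )
    pa_config.apply_to(recognizer)

    result = recognizer.recognize_once()
    if result.reason == speechsdk.ResultReason.RecognizedSpeech:
        scores = speechsdk.PronunciationAssessmentResult(result)
        return {
            "accuracy": scores.accuracy_score,
            "fluency": scores.fluency_score,
            "completeness": scores.completeness_score,
            "pronunciation": scores.pronunciation_score,
        }
    if result.reason == speechsdk.ResultReason.Canceled:
        details = result.cancellation_details
        raise RuntimeError(f"Canceled: {details.reason}; {details.error_details}")
    raise RuntimeError(f"No speech recognized: {result.reason}")
```

Note that `wave.open` raises `wave.Error` if the bytes are not a PCM WAV file, which gives an early and explicit failure instead of a timeout further down the line.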

I suspect that the error might be similar to the one in "Microsoft Cognitive SpeechRecognizer Stuck", but (a) I'm using the Python SDK, which does not have the FromWavFileInput method, and (b) I tried adding 100 KB of empty buffer, and it still does not work.
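Because the audio comes from a web front-end, another thing that may be worth checking is whether the upload is actually a WAV file at all, since browser recordings are often WebM or Ogg rather than PCM WAV. The following is a minimal diagnostic sketch, assuming the raw upload bytes are available in the backend; `sniff_container` is a hypothetical helper that only looks at the leading magic bytes and is not part of any SDK.

```python
def sniff_container(data: bytes) -> str:
    """Guess the audio container from the first few bytes of the upload."""
    if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    if data[:4] == b"OggS":
        return "ogg"
    if data[:4] == b"\x1a\x45\xdf\xa3":  # EBML header used by WebM/Matroska
        return "webm"
    if data[:3] == b"ID3":  # MP3 file that starts with an ID3 tag
        return "mp3"
    return "unknown"
```

If this reports anything other than `wav`, the audio would need converting to PCM WAV (for example 16 kHz, 16-bit, mono) before it is handed to the recognizer as raw PCM.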

Does anyone have any suggestions? I've also posted my code at https://stackoverflow.com/questions/78783121/microsoft-cognitive-speech-services-sdk-python-is-getting-stuck?noredirect=1#comment138902133_78783121.
