Pronunciation assessment SDK is getting stuck
I'm trying to integrate pronunciation assessment using the Speech Services Python SDK. Specifically, a web front-end uploads an audio file to a FastAPI backend, which uses Whisper to transcribe it and then sends the transcription together with the audio file to Microsoft's endpoint for evaluation. However, each time I do so, the call hangs and I get a (Timeout: no recognition result received) error after 10+ seconds.
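Since the audio arrives as a browser upload, one thing that may be worth ruling out is whether the bytes handed to the recognizer are really PCM WAV. This is an assumption about a possible cause, not a diagnosis: the Speech SDK's default file input expects PCM WAV (typically 16-bit, mono, 16 kHz), while browser recordings are often in another container. A minimal sketch using only the standard library; `describe_wav` is a hypothetical helper name, not part of any SDK:

```python
import io
import wave


def describe_wav(data: bytes) -> dict:
    """Return basic format facts for WAV bytes, or an error entry if parsing fails."""
    try:
        with wave.open(io.BytesIO(data), "rb") as wav:
            frames = wav.getnframes()
            rate = wav.getframerate()
            return {
                "ok": True,
                "channels": wav.getnchannels(),
                "sample_width_bytes": wav.getsampwidth(),
                "sample_rate_hz": rate,
                "duration_s": frames / rate if rate else 0.0,
            }
    except (wave.Error, EOFError) as exc:
        # Raised when the bytes are not a RIFF/WAVE file the wave module can read,
        # or when the data is too short to contain a header.
        return {"ok": False, "error": str(exc)}
```

Calling this on the uploaded bytes in the FastAPI handler, before they reach the recognizer, would show whether the file is mono 16-bit PCM or something that needs converting first.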
I suspect the error might be similar to the one in "Microsoft Cognitive SpeechRecognizer Stuck", but (a) I'm using the Python SDK, which does not have the FromWavFileInput method, and (b) I tried adding 100 KB of empty buffer, and it still does not work.
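For readers unfamiliar with the padding workaround in (b), it can be sketched roughly as below. This is an illustration only, not the code from the linked post: it assumes the audio is PCM WAV, and `pad_wav_with_silence` and its parameters are hypothetical names.

```python
import io
import wave


def pad_wav_with_silence(data: bytes, padding_bytes: int = 100_000) -> bytes:
    """Return a new WAV with about `padding_bytes` of zero-valued PCM appended."""
    with wave.open(io.BytesIO(data), "rb") as src:
        params = src.getparams()
        frames = src.readframes(src.getnframes())

    frame_size = params.nchannels * params.sampwidth
    # Round down so the padding is a whole number of frames.
    # Zero bytes are silence for 16-bit signed PCM; 8-bit WAV is unsigned,
    # so silence there would be 0x80 instead.
    silence = b"\x00" * (padding_bytes // frame_size * frame_size)

    out = io.BytesIO()
    with wave.open(out, "wb") as dst:
        dst.setparams(params)
        # The wave module updates the header's frame count on close.
        dst.writeframes(frames + silence)
    return out.getvalue()
```

For 16 kHz, 16-bit mono audio, 100,000 bytes of padding corresponds to about 3.1 seconds of trailing silence.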
Does anyone have any suggestions? I've also posted my code at https://stackoverflow.com/questions/78783121/microsoft-cognitive-speech-services-sdk-python-is-getting-stuck?noredirect=1#comment138902133_78783121.