Is there a way to make Speech service transcription (with speakers differentiated via diarization) faster?
Currently, transcription takes about half the audio's duration for a WAV file, but runs at a 1:1 (real-time) ratio for an MP4 file fed through GStreamer.
From this post, it seems that half the audio's duration is the best achievable for a WAV file.
If that is true, how can I at least make transcription of the other formats handled by GStreamer (such as MP4) as fast as it is for a WAV file?
I am following this Python code from the docs:
https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-use-codec-compressed-audio-input-streams?tabs=linux%2Cdebian%2Cjava-android%2Cterminal&pivots=programming-language-python
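For context, my setup roughly follows that sample, as in the sketch below. The subscription key, region, language, and the meeting.mp4 filename are placeholders; I use ConversationTranscriber for the speaker diarization and AudioStreamContainerFormat.ANY so GStreamer detects the MP4 container:

```python
import threading
import azure.cognitiveservices.speech as speechsdk

# Placeholder credentials and language -- substitute your own values.
speech_config = speechsdk.SpeechConfig(subscription="YourSubscriptionKey", region="YourServiceRegion")
speech_config.speech_recognition_language = "en-US"

# Compressed (MP4) input: let GStreamer detect the container format.
stream_format = speechsdk.audio.AudioStreamFormat(
    compressed_stream_format=speechsdk.AudioStreamContainerFormat.ANY)
push_stream = speechsdk.audio.PushAudioInputStream(stream_format=stream_format)
audio_config = speechsdk.audio.AudioConfig(stream=push_stream)

# ConversationTranscriber returns a speaker ID with each transcribed phrase (diarization).
transcriber = speechsdk.transcription.ConversationTranscriber(
    speech_config=speech_config, audio_config=audio_config)

done = threading.Event()

def on_transcribed(evt):
    print(f"{evt.result.speaker_id}: {evt.result.text}")

def on_stopped(evt):
    done.set()

transcriber.transcribed.connect(on_transcribed)
transcriber.session_stopped.connect(on_stopped)
transcriber.canceled.connect(on_stopped)

transcriber.start_transcribing_async().get()

# Push the whole file as fast as it can be read -- no real-time pacing on my side.
# "meeting.mp4" is a placeholder filename.
with open("meeting.mp4", "rb") as f:
    while True:
        chunk = f.read(32000)
        if not chunk:
            break
        push_stream.write(chunk)
push_stream.close()

done.wait()
transcriber.stop_transcribing_async().get()
```

Even though I push the compressed audio without any pacing, the overall MP4 transcription still ends up at roughly real time, whereas the same audio as WAV finishes in about half that.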
Thank you for your help!