@alex Thanks. With Video Indexer you can upload audio files using Video Indexer Studio. See the sample here for Media Services Video Indexer.
Speech to text and speech analysis for videos and video streams.
Hello there.
I am working on a platform where I need to analyze what people talk about in videos and streams (using Extract key phrases, Named Entity Recognition (NER), and Find linked entities from the Azure Text Analytics service).
So far, for the videos, I am able to get the audio out of them, but the audio files are really huge (in both size and length).
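For reference, one way the audio size could be kept down is to extract it as 16 kHz, 16-bit, mono PCM WAV. This is a minimal sketch, assuming `ffmpeg` is installed and on the PATH and assuming that format is what the Speech service accepts by default (worth checking against the current documentation). The helper names `build_ffmpeg_command` and `extract_audio` are hypothetical.

```python
import subprocess
from typing import List


def build_ffmpeg_command(video_path: str, wav_path: str) -> List[str]:
    """Build an ffmpeg command that writes a 16 kHz mono 16-bit PCM WAV file."""
    return [
        "ffmpeg",
        "-y",                 # overwrite the output file if it already exists
        "-i", video_path,     # input video
        "-vn",                # drop the video stream, keep audio only
        "-ac", "1",           # downmix to a single (mono) channel
        "-ar", "16000",       # resample to 16 kHz
        "-c:a", "pcm_s16le",  # 16-bit little-endian PCM
        wav_path,
    ]


def extract_audio(video_path: str, wav_path: str) -> None:
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_ffmpeg_command(video_path, wav_path), check=True)
```

Calling `extract_audio("talk.mp4", "talk.wav")` would then produce a WAV file that is usually much smaller than stereo audio at a higher sample rate.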
My plan was to get the audio out of the videos, send it to the Azure Speech to Text service, get the transcription back, and run the transcription through the Azure Text Analytics service.
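Because the transcripts of long videos will be large, the last step of that plan probably needs the text split into smaller documents before it is sent for analysis. The sketch below shows only that splitting and batching step. The limits (5,120 characters per document, 5 documents per request) are assumptions to verify against the current Text Analytics documentation, and `analyze_batch` is a hypothetical callable standing in for whatever actually calls the service.

```python
from typing import Any, Callable, Iterator, List

MAX_CHARS = 5120  # assumed per-document character limit; check the service docs
BATCH_SIZE = 5    # assumed maximum number of documents per request


def split_transcript(text: str, max_chars: int = MAX_CHARS) -> List[str]:
    """Split text on whitespace into chunks of at most max_chars characters."""
    chunks: List[str] = []
    current = ""
    for word in text.split():
        # Hard-split any single token that is longer than the limit.
        while len(word) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_chars])
            word = word[max_chars:]
        candidate = word if not current else current + " " + word
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def batched(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of items with at most size elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def analyze_transcript(
    transcript: str,
    analyze_batch: Callable[[List[str]], List[Any]],
    max_chars: int = MAX_CHARS,
    batch_size: int = BATCH_SIZE,
) -> List[Any]:
    """Split the transcript, send it batch by batch, and collect the results."""
    results: List[Any] = []
    for batch in batched(split_transcript(transcript, max_chars), batch_size):
        results.extend(analyze_batch(batch))
    return results
```

With the `azure-ai-textanalytics` package, `analyze_batch` could wrap a `TextAnalyticsClient` method such as `extract_key_phrases`, `recognize_entities`, or `recognize_linked_entities`, each of which takes a list of documents. Passing the service call in as a parameter keeps the splitting logic testable without credentials.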
Is that the right approach? Is there a better way to do this? Should I use Batch transcription or the Speech SDK? Where can I find examples of this?
What's the best way to do the same for live streams?
I have been reading the Speech to Text documentation, but I really can't grasp enough from it to know how to do this. After wasting about 3 days on the documentation, I am more confused than before.
Thanks in advance and best wishes.