Thanks for reaching out. For your first question: you can do this with Speech Studio or the Speech SDK, but not with the REST API.
1. Can we get a captions file in WebVTT format from the Microsoft Speech service through the REST API?
The Speech service does support caption output formats such as SRT (SubRip Text) and WebVTT (Web Video Text Tracks), but through Speech Studio or the Speech SDK rather than the REST API. Both formats can be loaded into most video players, such as VLC, which then display the captions over your video.
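To make the difference between the two formats concrete, here is a minimal sketch of how the same caption cue looks in WebVTT and in SRT. This is only an illustration of the file layouts, not the Speech service's own output code; the function names format_timestamp and render_captions are hypothetical.

```python
def format_timestamp(seconds: float, srt: bool = False) -> str:
    """Format seconds as HH:MM:SS.mmm (WebVTT) or HH:MM:SS,mmm (SRT)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    separator = "," if srt else "."  # SRT uses a comma before the milliseconds
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def render_captions(cues, srt: bool = False) -> str:
    """Render (start_seconds, end_seconds, text) cues as WebVTT or SRT text."""
    # A WebVTT file starts with a "WEBVTT" header line; SRT has no header.
    blocks = [] if srt else ["WEBVTT"]
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{format_timestamp(start, srt)} --> {format_timestamp(end, srt)}"
        # SRT cues are numbered; the cue number is optional in WebVTT and omitted here.
        lines = [str(index), timing, text] if srt else [timing, text]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    sample = [(1.0, 4.0, "Welcome to Contoso.")]
    print(render_captions(sample))
    print(render_captions(sample, srt=True))
```

Either output can be saved next to the video with a .vtt or .srt extension and loaded as a subtitle track.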
2. Also, can we provide a custom name for the caption file?
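Yes. In the quickstart sample below, the --output option sets the name of the caption file:
python captioning.py --input caption.this.mp4 --format any --output caption.output.txt --srt --realTime --threshold 5 --delay 0 --profanity mask --phrases "Contoso;Jessie;Rehaan"
Pass the file name you want to --output (for example, --output my-captions.vtt). Note that --srt produces SRT output; according to the quickstart, the sample writes WebVTT by default when --srt is omitted.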
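If you call the sample from a script, a small helper like the one below can assemble the command with your own output name. This is only a sketch: build_captioning_command and the file name my-captions.vtt are hypothetical, the flags are the ones from the sample command above, and it assumes (per the linked quickstart) that leaving out --srt gives WebVTT output.

```python
import shlex


def build_captioning_command(input_path, output_path, use_srt=False, phrases=()):
    """Build the argument list for the quickstart's captioning.py sample."""
    args = [
        "python", "captioning.py",
        "--input", input_path,
        "--format", "any",
        "--output", output_path,  # custom caption file name goes here
        "--realTime",
        "--threshold", "5",
        "--delay", "0",
        "--profanity", "mask",
    ]
    if use_srt:
        # Assumption from the quickstart: without --srt the sample writes WebVTT.
        args.append("--srt")
    if phrases:
        # The sample takes phrases as one semicolon-separated string.
        args += ["--phrases", ";".join(phrases)]
    return args


if __name__ == "__main__":
    command = build_captioning_command(
        "caption.this.mp4",
        "my-captions.vtt",  # hypothetical output name
        phrases=["Contoso", "Jessie", "Rehaan"],
    )
    # Print a shell-safe version of the command for copy and paste.
    print(shlex.join(command))
```

To actually run it, pass the list to subprocess.run(command, check=True) from the folder that contains captioning.py, with your Speech resource key and region configured as the quickstart describes.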
Reference document: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/captioning-quickstart?tabs=linux%2Cterminal&pivots=programming-language-python#create-captions-from-speech
Please take a look at it and let me know if you have more questions.
Regards,
Yutong
- Please accept the answer and vote 'Yes' if you found it helpful; this supports the community. Thanks a lot.