Hello @tanmay
Thanks for reaching out on the Microsoft Q&A platform. I think you are referring to the workflow below, from the "Voice-enable your bot" tutorial.
The voice-enabled chat bot that you make in this tutorial follows these steps:
The sample client application is configured to connect to the Direct Line Speech channel and the echo bot.
When the user presses a button, voice audio streams from the microphone. Or audio is continuously recorded when a custom keyword is used.
If a custom keyword is used, keyword detection happens on the local device, gating audio streaming to the cloud.
By using the Speech SDK, the sample client application connects to the Direct Line Speech channel and streams audio.
Optionally, higher-accuracy keyword verification happens on the service.
The audio is passed to the speech recognition service and transcribed to text.
The recognized text is passed to the echo bot as a Bot Framework activity.
The response text is turned into audio by the text-to-speech service, and streamed back to the client application for playback.
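To make the order of these steps easier to follow, here is a rough Python sketch of one conversation turn. It is only a simulation under my own assumptions: audio is represented as a plain string, and `keyword_gate`, `speech_to_text`, `echo_bot`, `text_to_speech`, `run_turn`, and the simplified `Activity` class are all hypothetical stand-ins, not Speech SDK or Bot Framework calls. The "Echo: " reply prefix is also just an assumption for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Activity:
    """Simplified, hypothetical stand-in for a Bot Framework message activity."""
    type: str
    text: str


def keyword_gate(utterance: str, keyword: Optional[str]) -> Optional[str]:
    """Local keyword detection: audio only leaves the device after the keyword."""
    if keyword is None:
        return utterance  # push-to-talk case: everything is streamed
    if utterance.lower().startswith(keyword.lower()):
        return utterance[len(keyword):].strip()  # stream what follows the keyword
    return None  # keyword not heard, nothing is streamed to the cloud


def speech_to_text(audio: str) -> str:
    """Stand-in for the speech recognition service (audio is already text here)."""
    return audio.strip()


def echo_bot(incoming: Activity) -> Activity:
    """Stand-in for the echo bot: replies with the text it received."""
    return Activity(type="message", text=f"Echo: {incoming.text}")


def text_to_speech(text: str) -> bytes:
    """Stand-in for the text-to-speech service: bytes represent the audio."""
    return text.encode("utf-8")


def run_turn(utterance: str, keyword: Optional[str] = None) -> Optional[bytes]:
    """Run one turn: gate, recognize, call the bot, synthesize the reply."""
    streamed = keyword_gate(utterance, keyword)
    if streamed is None:
        return None
    recognized = speech_to_text(streamed)
    reply = echo_bot(Activity(type="message", text=recognized))
    return text_to_speech(reply.text)
```

For example, `run_turn("hello there")` models the button-press case, while `run_turn("Computer turn on the lights", keyword="computer")` models the custom keyword case. In the real tutorial the client application and the Direct Line Speech channel do this work for you, so please treat this only as a mental model.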
For more information, please refer to: https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/tutorial-voice-enable-your-bot-speech-sdk
Regards,
Yutong
- Please kindly accept the answer if you find it helpful, to support the community. Thanks a lot.