Hello @Rakesh Indla
Thanks for reaching out to us. From the Azure Speech service side, you can do this with the real-time diarization (Preview) feature.
You can run an application that performs speech to text transcription with real-time diarization. Diarization here means distinguishing between the different speakers participating in the conversation: for each part of the transcribed speech, the Speech service reports which speaker said it.
Please take a look at the documentation: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/get-started-stt-diarization?tabs=linux&pivots=programming-language-python
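To give a rough idea of what this looks like in Python, here is a minimal sketch based on the Speech SDK's `ConversationTranscriber`. It assumes the `azure-cognitiveservices-speech` package is installed and that your key and region are stored in the `SPEECH_KEY` and `SPEECH_REGION` environment variables. The function names `format_segment` and `transcribe_with_diarization` are only illustrative and are not part of the SDK. Please treat it as a starting point and not as a complete implementation.

```python
import os
import threading


def format_segment(speaker_id, text):
    """Render one transcribed phrase as 'speaker: text' (illustrative helper)."""
    return f"{speaker_id or 'Unknown'}: {text}"


def transcribe_with_diarization(wav_path, language="en-US"):
    """Transcribe a WAV file and return lines labeled with speaker IDs."""
    # Imported here so the helper above can be used without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    # Assumption: credentials are provided through these environment variables.
    speech_config = speechsdk.SpeechConfig(
        subscription=os.environ["SPEECH_KEY"],
        region=os.environ["SPEECH_REGION"],
    )
    speech_config.speech_recognition_language = language
    audio_config = speechsdk.audio.AudioConfig(filename=wav_path)
    transcriber = speechsdk.transcription.ConversationTranscriber(
        speech_config=speech_config, audio_config=audio_config
    )

    done = threading.Event()
    lines = []

    def on_transcribed(evt):
        # Final results carry both the recognized text and a speaker ID.
        if evt.result.reason == speechsdk.ResultReason.RecognizedSpeech:
            lines.append(format_segment(evt.result.speaker_id, evt.result.text))

    def on_stop(evt):
        # Fired when the session ends or is canceled (for example, bad credentials).
        done.set()

    transcriber.transcribed.connect(on_transcribed)
    transcriber.session_stopped.connect(on_stop)
    transcriber.canceled.connect(on_stop)

    transcriber.start_transcribing_async()
    done.wait()  # Block until the whole file has been processed.
    transcriber.stop_transcribing_async()
    return lines
```

You could then call `transcribe_with_diarization("meeting.wav")` (the file name is just a placeholder) and print each returned line. The speaker IDs are generic labels assigned by the service, so they tell the speakers apart but do not identify who they are.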
I hope this helps. Please let me know if you need further assistance.
Regards,
Yutong
-Please accept the answer and vote 'Yes' if you found it helpful, to support the community. Thanks a lot.