Azure Real-Time diarization

Question

Karyna Khinevich

Hi! I am working on a project in Python, in which I use Azure AI Speech Service.

More specifically, I implemented real-time diarization using the azure.cognitiveservices.speech.transcription.ConversationTranscriber class. Now I am working on speaker recognition, so that instead of Guest-1, the transcription displays the name of a speaker I previously saved in the system.

I found a suitable Participant class for this, to which I need to pass a voice signature, but the services that let you obtain a voice signature are either unavailable in Python or will be deprecated.

What alternatives does Azure currently offer for using the azure.cognitiveservices.speech.transcription.Participant class and similar ones? Or are these classes also planned to be deprecated?

Answer 1

Hello Karyna!

Thank you for posting on Microsoft Learn.

The Participant and voice signature flow belongs to the meeting transcription scenario, and Microsoft has announced that Speaker Recognition (voice profiles / voice signatures) will be retired on September 30, 2025. After that date, apps won’t be able to use speaker recognition. Real-time diarization with ConversationTranscriber, by contrast, intentionally does not use voice signatures; it only returns generic IDs such as Guest-1 and Guest-2.

https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speaker-recognition-overview

You can stay with ConversationTranscriber + diarization and map names yourself: keep a dictionary of {speakerId -> displayName} and let users claim a name the first time they speak, or pre-map by device/channel if you capture multichannel audio.

This is the supported real-time path. It returns stable speaker IDs per participant, but not identities. https://learn.microsoft.com/en-us/azure/ai-services/speech-service/get-started-stt-diarization
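As a rough illustration of the name-mapping idea, here is a minimal sketch in plain Python. The SpeakerNameRegistry class and its method names are hypothetical (they are not part of the Speech SDK); the only assumption about the SDK is that your transcribed-event handler can read the speaker ID and recognized text from the result.

```python
from typing import Dict, Optional


class SpeakerNameRegistry:
    """Hypothetical helper: maps diarization IDs (e.g. "Guest-1") to display names."""

    def __init__(self, premapped: Optional[Dict[str, str]] = None) -> None:
        # Optional pre-mapping, e.g. built from device/channel assignments.
        self._names: Dict[str, str] = dict(premapped or {})

    def claim(self, speaker_id: str, display_name: str) -> None:
        """Record that a user has claimed this speaker ID."""
        self._names[speaker_id] = display_name

    def resolve(self, speaker_id: str) -> str:
        """Return the display name, or the raw ID if nobody has claimed it yet."""
        return self._names.get(speaker_id, speaker_id)

    def format_line(self, speaker_id: str, text: str) -> str:
        """Render one transcript line with the resolved name."""
        return f"{self.resolve(speaker_id)}: {text}"


registry = SpeakerNameRegistry()

# Before anyone claims the ID, the generic label is shown.
print(registry.format_line("Guest-1", "Hello everyone."))

# After a user claims the ID, later lines show the chosen name.
registry.claim("Guest-1", "Karyna")
print(registry.format_line("Guest-1", "Let's get started."))
```

In a real application you would call something like registry.format_line(evt.result.speaker_id, evt.result.text) from the handler connected to the transcriber's transcribed event. Keep in mind that the mapping is only meaningful within the session that produced the IDs, so it would need to be rebuilt for each new conversation.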

For recordings, you can use the Fast Transcription REST API with diarization, then post-process the result to attach names. https://learn.microsoft.com/en-us/azure/ai-services/speech-service/rest-speech-to-text
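The post-processing step could look something like the sketch below. It assumes the transcription response has already been parsed into a dict with a "phrases" list whose items carry "speaker" and "text" fields; please verify that shape against the actual response you receive. The label_phrases function and the sample data are hypothetical and for illustration only.

```python
from typing import Any, Dict, List


def label_phrases(response: Dict[str, Any], names: Dict[int, str]) -> List[str]:
    """Attach display names to diarized phrases.

    Assumes each phrase has "speaker" and "text" keys (an assumption about
    the response shape, not a guaranteed contract).
    """
    lines: List[str] = []
    for phrase in response.get("phrases", []):
        speaker = phrase.get("speaker")
        # Fall back to a generic label when no name was assigned.
        name = names.get(speaker, f"Speaker {speaker}")
        lines.append(f"{name}: {phrase.get('text', '')}")
    return lines


# Made-up sample standing in for a parsed response.
sample_response = {
    "phrases": [
        {"speaker": 1, "text": "Hello."},
        {"speaker": 2, "text": "Hi there."},
    ]
}

for line in label_phrases(sample_response, {1: "Karyna"}):
    print(line)
```

The names dictionary here is something you would build yourself, for example by letting a user listen to a short clip per speaker and assign a name.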
