Azure Real-Time diarization

Karyna Khinevich 0 Reputation points
2025-07-11T10:34:16.3133333+00:00

Hi! I am working on a project in Python, in which I use Azure AI Speech Service.

More specifically, I implemented real-time diarization using the azure.cognitiveservices.speech.transcription.ConversationTranscriber class. I am now working on speaker recognition, so that instead of Guest-1 the transcription displays the name of a speaker I previously saved in the system.

I found a suitable Participant class for this, to which I need to pass a voice signature, but the services that let you obtain a voice signature are either unavailable in Python or scheduled to be deprecated.

What alternatives does Azure currently offer for using the azure.cognitiveservices.speech.transcription.Participant class and similar ones? Or are these classes also planned to be deprecated?

Azure Speech in Foundry Tools

  1. Amira Bedhiafi 41,386 Reputation points MVP Volunteer Moderator
    2025-08-25T18:59:29.0133333+00:00

    Hello Karyna!

    Thank you for posting on Microsoft Learn.

    The Participant and voice signature flow belongs to the meeting transcription scenario, and Microsoft has announced that Speaker Recognition (voice profiles / voice signatures) will be retired on September 30, 2025. After that date, apps won’t be able to use speaker recognition. Meanwhile, real-time diarization with ConversationTranscriber intentionally does not use voice signatures; it only returns generic IDs such as Guest-1 and Guest-2.

    https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speaker-recognition-overview

    You can stay with ConversationTranscriber + diarization and map names yourself: keep a dictionary of {speakerId -> displayName} and let users claim a name the first time they speak, or pre-map by device/channel if you capture multichannel audio.

    This is the supported real-time path. It returns stable speaker IDs per participant, but not identities. https://learn.microsoft.com/en-us/azure/ai-services/speech-service/get-started-stt-diarization
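
The dictionary approach described above could look roughly like the following. This is only a minimal sketch, not an official pattern: `SpeakerNameMap`, `format_line`, and `make_transcribed_handler` are hypothetical helper names, and the handler assumes the event object exposes `evt.result.speaker_id` and `evt.result.text`, as the `transcribed` event of ConversationTranscriber does.

```python
class SpeakerNameMap:
    """Maps diarization IDs (e.g. "Guest-1") to display names. Hypothetical helper."""

    def __init__(self, known=None):
        # Optional pre-mapping, e.g. built from device/channel assignments.
        self._names = dict(known or {})

    def claim(self, speaker_id, display_name):
        """Let a user claim (or re-claim) a name for a speaker ID."""
        self._names[speaker_id] = display_name

    def display_name(self, speaker_id):
        """Return the claimed name, or fall back to the raw ID."""
        return self._names.get(speaker_id, speaker_id or "Unknown")


def format_line(names, speaker_id, text):
    """Render one transcript line using the mapped name."""
    return f"{names.display_name(speaker_id)}: {text}"


def make_transcribed_handler(names, sink=print):
    """Build a callback for the transcriber's `transcribed` event.

    Assumption: evt.result has `speaker_id` and `text` attributes.
    """
    def handler(evt):
        result = evt.result
        if result.text:  # skip empty results
            sink(format_line(names, result.speaker_id, result.text))
    return handler
```

With the Speech SDK installed, the handler would be attached with something like `conversation_transcriber.transcribed.connect(make_transcribed_handler(names))`, and `names.claim("Guest-1", "Karyna")` would be called whenever a user claims an ID. Note that the mapping only holds for the session in which the IDs were assigned.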

    For recordings, you can use the Fast Transcription REST API with diarization, then post-process the results to attach names. https://learn.microsoft.com/en-us/azure/ai-services/speech-service/rest-speech-to-text
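
As a rough illustration of that post-processing step, the sketch below labels phrases from an already-parsed transcription response. It assumes the response is a dict with a `phrases` list whose items carry a `speaker` number and a `text` string; check the actual response of your API version before relying on these field names. `label_phrases` is a hypothetical helper, and the speaker-to-name mapping has to be supplied by you.

```python
def label_phrases(response, names):
    """Attach display names to diarized phrases.

    Assumptions: `response` is the parsed JSON body of a transcription
    result, with response["phrases"] items shaped like
    {"speaker": 1, "text": "..."}; `names` maps speaker numbers to names.
    """
    lines = []
    for phrase in response.get("phrases", []):
        speaker = phrase.get("speaker")
        if speaker is None:
            fallback = "Unknown"
        else:
            fallback = f"Speaker {speaker}"
        # Use the supplied name if present, otherwise a generic label.
        label = names.get(speaker, fallback)
        lines.append(f"{label}: {phrase.get('text', '')}")
    return lines
```

For example, `label_phrases(result_json, {1: "Karyna"})` would label every phrase attributed to speaker 1 with that name and leave the other speakers with generic labels.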
