Hello @vinit sawant, thanks for reaching out with a great question. I hope the information below helps with your initial query!
The Azure Percept Audio (sometimes called the Percept Ear) is a "System on a Module" or SoM.
The Azure Percept Audio SoM makes use of a couple of Azure services to process audio:
LUIS (Language Understanding): allows interaction with applications and devices using natural language.
Azure Cognitive Services Speech: an Azure service offering text-to-speech, speech-to-text, speech translation, and speaker recognition.
Updated 7/26/2022: below is the response from the AIML engineers on the initial query. I hope this helps!
"Speech Identification verifies and identifies speakers by their unique voice characteristics, by using voice biometry. You provide audio training data for a single speaker, which creates an enrollment profile based on the unique characteristics of the speaker's voice. You can then cross-check audio voice samples against this profile to verify that the speaker is the same person (speaker verification). You can also cross-check audio voice samples against a group of enrolled speaker profiles to see if it matches any profile in the group (speaker identification). Then you do recognition, it will recognize based on the file.
Characteristics somehow like nose, eye, lip,... in face recognition"
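If you want to experiment with this enroll-then-identify flow programmatically, below is a minimal sketch using the Speaker Recognition feature of the Speech SDK for Python (the azure-cognitiveservices-speech package). The subscription key, region, and WAV file names are placeholders, and the class and method names should be double-checked against the current SDK reference for your SDK version.

```python
import azure.cognitiveservices.speech as speechsdk

# Placeholder credentials for an Azure Speech resource -- replace with your own.
speech_config = speechsdk.SpeechConfig(subscription="YOUR_SPEECH_KEY", region="YOUR_REGION")

# 1. Create a voice profile and enroll it with training audio from a single speaker.
#    Enrollment builds the profile from the speaker's unique voice characteristics.
profile_client = speechsdk.VoiceProfileClient(speech_config=speech_config)
profile = profile_client.create_profile_async(
    speechsdk.VoiceProfileType.TextIndependentIdentification, "en-us"
).get()

enroll_audio = speechsdk.audio.AudioConfig(filename="enrollment_sample.wav")  # placeholder file
enroll_result = profile_client.enroll_profile_async(profile, enroll_audio).get()
print("Enrollment status:", enroll_result.reason)

# 2. Check an unknown audio sample against the group of enrolled profiles
#    (speaker identification).
test_audio = speechsdk.audio.AudioConfig(filename="unknown_speaker.wav")  # placeholder file
recognizer = speechsdk.SpeakerRecognizer(speech_config, test_audio)
identification_model = speechsdk.SpeakerIdentificationModel(profiles=[profile])
result = recognizer.recognize_once_async(identification_model).get()

print("Best matching profile:", result.profile_id)
print("Confidence score:", result.score)
```

For speaker verification (checking a sample against a single enrolled profile rather than a group), the SDK exposes a similar SpeakerVerificationModel that can be passed to the same recognizer.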
Please let us know if you need anything further on this; we're happy to help!