Text Independent - Identify Single Speaker
Identify Single Speaker Profile
Identifies who is speaking in input audio among a list of candidate profiles.
Limitations:
Minimum audio input length is 1 second
Maximum audio input length is 120 seconds
Minimum candidate speakers count is 1
Maximum candidate speakers count is 50
Minimum effective speech length (excluding silence and other non-speech frames) is 4 seconds. This limitation can be disabled by setting "ignoreMinLength" to true.
Minimum audio signal-to-noise ratio (SNR) is 2 dB
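The limits above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the service or any SDK: the function name `check_limits` and its parameters are hypothetical, and it assumes the caller already knows the audio duration and, optionally, an estimate of the effective speech length. The SNR limit is not checked here.

```python
# Limits copied from the list above.
MIN_AUDIO_SECONDS = 1
MAX_AUDIO_SECONDS = 120
MIN_CANDIDATES = 1
MAX_CANDIDATES = 50
MIN_SPEECH_SECONDS = 4


def check_limits(audio_seconds, candidate_count,
                 speech_seconds=None, ignore_min_length=False):
    """Return a list of violated limits (empty if none were found).

    Hypothetical helper: speech_seconds is the caller's own estimate of
    effective speech length; pass None to skip that check.
    """
    problems = []
    if not MIN_AUDIO_SECONDS <= audio_seconds <= MAX_AUDIO_SECONDS:
        problems.append("audio length must be between 1 and 120 seconds")
    if not MIN_CANDIDATES <= candidate_count <= MAX_CANDIDATES:
        problems.append("candidate count must be between 1 and 50")
    # The speech-length limit only applies when ignoreMinLength is not true.
    if (speech_seconds is not None and not ignore_min_length
            and speech_seconds < MIN_SPEECH_SECONDS):
        problems.append("effective speech must be at least 4 seconds")
    return problems
```

An empty list means none of the checked limits is violated; the service remains the authority on whether the audio is accepted.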
POST {endpoint}/speaker-recognition/identification/text-independent/profiles:identifySingleSpeaker?api-version=2021-09-05&profileIds={profileIds}
POST {endpoint}/speaker-recognition/identification/text-independent/profiles:identifySingleSpeaker?api-version=2021-09-05&profileIds={profileIds}&ignoreMinLength={ignoreMinLength}
URI Parameters
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | True | string | Supported Cognitive Services endpoints (protocol and hostname, for example: https://westus.api.cognitive.microsoft.com). |
api-version | query | True | string | Specifies the version of the operation to use for this request. |
profileIds | query | True | string[] | Comma-delimited profile IDs. Maximum supported number is 50 IDs. |
ignoreMinLength | query | | boolean | If true, the check for the minimum amount of speech needed for identification is skipped. Default is false. |
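As an illustration of how the URI parameters combine, the sketch below assembles the request URL with the Python standard library. The helper name `build_identify_url` is hypothetical; the path, the `api-version` value, and the parameter names are taken from the URI templates above.

```python
from urllib.parse import urlencode

API_VERSION = "2021-09-05"
PATH = ("/speaker-recognition/identification/text-independent"
        "/profiles:identifySingleSpeaker")


def build_identify_url(endpoint, profile_ids, ignore_min_length=None):
    """Build the identifySingleSpeaker URL (hypothetical helper)."""
    if not 1 <= len(profile_ids) <= 50:
        raise ValueError("between 1 and 50 profile IDs are required")
    params = {
        "api-version": API_VERSION,
        # profileIds is a single comma-delimited query value.
        "profileIds": ",".join(profile_ids),
    }
    # ignoreMinLength is optional; omit it to keep the default (false).
    if ignore_min_length is not None:
        params["ignoreMinLength"] = "true" if ignore_min_length else "false"
    # safe="," keeps the commas between profile IDs unescaped.
    return endpoint.rstrip("/") + PATH + "?" + urlencode(params, safe=",")
```

Leaving `ignore_min_length` as `None` produces the first URI form; passing `True` or `False` produces the second.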
Request Header
Media Types: "audio/wav; codecs=audio/pcm"
Name | Required | Type | Description |
---|---|---|---|
Ocp-Apim-Subscription-Key | True | string | |
Request Body
Media Types: "audio/wav; codecs=audio/pcm"
Name | Type | Description |
---|---|---|
audioData | object | Binary audio file. The supported format is audio/wav; codecs=audio/pcm. Supports audio up to 5 MB. |
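A client may want to confirm that the body is PCM WAV and within the size limit before uploading. The sketch below uses Python's standard `wave` module, which rejects non-PCM WAV data with `wave.Error`. The name `inspect_wav` is hypothetical, and reading "5MB" as 5 × 1024 × 1024 bytes is an assumption.

```python
import io
import wave

# Assumption: "5MB" is interpreted as 5 MiB; the service may count differently.
MAX_BODY_BYTES = 5 * 1024 * 1024


def inspect_wav(data):
    """Return (duration_seconds, size_ok) for PCM WAV bytes.

    Hypothetical helper. wave.open raises wave.Error if the data is not
    a WAV file in a format the wave module can read.
    """
    with wave.open(io.BytesIO(data), "rb") as wav:
        duration = wav.getnframes() / wav.getframerate()
    return duration, len(data) <= MAX_BODY_BYTES
```

The returned duration can then be compared against the 1 to 120 second limits listed earlier.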
Responses
Name | Type | Description |
---|---|---|
200 OK | IdentifiedSingleSpeakerInfo | OK |
Other Status Codes | SpeakerErrorInfo | Failure. Headers: x-ms-error-code: string |
Security
Ocp-Apim-Subscription-Key
Type: apiKey
In: header
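To show where the subscription key and media type go, here is a minimal sketch that prepares the request with `urllib.request` from the Python standard library without sending it. The function name `build_identify_request` is hypothetical; the header name and media type are the ones documented above.

```python
import urllib.request


def build_identify_request(url, subscription_key, wav_bytes):
    """Prepare (but do not send) the POST request (hypothetical helper)."""
    return urllib.request.Request(
        url,
        data=wav_bytes,  # raw WAV bytes are sent as the request body
        method="POST",
        headers={
            # The API key travels in a header, as stated under Security.
            "Ocp-Apim-Subscription-Key": subscription_key,
            "Content-Type": "audio/wav; codecs=audio/pcm",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Note that `urllib` normalizes the capitalization of stored header names, which is harmless because HTTP header names are case-insensitive.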
Examples
Successful Query
Sample request
POST https://westus.api.cognitive.microsoft.com/speaker-recognition/identification/text-independent/profiles:identifySingleSpeaker?api-version=2021-09-05&profileIds=3669fa29-1bf3-45ad-beea-6b348d058d7e,111f427c-3791-468f-b709-fcef7660fff9,0e196cd9-32d5-4883-8631-54a0e7c7cb3d,726e57d9-04e0-4214-b482-7f786fa83560,f95189fd-1bf5-4485-9c2e-e5897e0c98ca
"{binary file data}"
Sample response
200 OK
Content-Type: application/json
{
  "identifiedProfile": {
    "profileId": "111f427c-3791-468f-b709-fcef7660fff9",
    "score": 0.63
  },
  "profilesRanking": [
    {
      "profileId": "111f427c-3791-468f-b709-fcef7660fff9",
      "score": 0.63
    },
    {
      "profileId": "3669fa29-1bf3-45ad-beea-6b348d058d7e",
      "score": 0.49
    },
    {
      "profileId": "0e196cd9-32d5-4883-8631-54a0e7c7cb3d",
      "score": 0.4
    },
    {
      "profileId": "726e57d9-04e0-4214-b482-7f786fa83560",
      "score": 0.1
    },
    {
      "profileId": "f95189fd-1bf5-4485-9c2e-e5897e0c98ca",
      "score": 0.03
    }
  ]
}
Other Status Codes
Content-Type: application/json
x-ms-error-code: Error Code
{
  "error": {
    "code": "Error Code",
    "message": "Error Message"
  }
}
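The two sample bodies above can be handled with a small parser. This is a sketch under stated assumptions: the function name `parse_identify_response` and the choice to raise `RuntimeError` on failure are illustrative, and only the fields shown in the samples are read.

```python
import json


def parse_identify_response(status, body):
    """Return (profile_id, score, ranking) on 200, else raise RuntimeError.

    Hypothetical helper; ranking is a list of (profileId, score) tuples
    in the order the service returned them.
    """
    payload = json.loads(body)
    if status == 200:
        top = payload["identifiedProfile"]
        ranking = [(p["profileId"], p["score"])
                   for p in payload["profilesRanking"]]
        return top["profileId"], top["score"], ranking
    # Any other status carries an "error" object with code and message.
    error = payload.get("error", {})
    raise RuntimeError(f"{error.get('code')}: {error.get('message')}")
```

With the sample success body, the first element of the result is `111f427c-3791-468f-b709-fcef7660fff9`, the profile that also heads `profilesRanking`.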
Definitions
Name | Description |
---|---|
Error | |
IdentifiedSingleSpeakerInfo | |
IdentifyInfo | Identified speaker info |
SpeakerErrorInfo | Speaker error message |
Error
Name | Type | Description |
---|---|---|
code | string | |
message | string | |
IdentifiedSingleSpeakerInfo
Name | Type | Description |
---|---|---|
identifiedProfile | IdentifyInfo | Object containing data of the identified profile. |
profilesRanking | IdentifyInfo[] | Object containing data of the top 5 profiles (including the identified profile), sorted in descending order by score. |
IdentifyInfo
Identified speaker info
Name | Type | Description |
---|---|---|
profileId | string | ID of the identified profile. If no candidate is identified as the right speaker, the value is set to an empty GUID. |
score | number | A float indicating the similarity between the input audio and the targeted voice print. The value is between 0 and 1; a higher number means higher similarity. |
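Because an unidentified speaker is signalled through `profileId` rather than through an error, callers need to test for the empty GUID explicitly. The sketch below assumes that "empty GUID" means the all-zero value `00000000-0000-0000-0000-000000000000`; that reading, and the helper name `identified_profile_id`, are assumptions not confirmed by this page.

```python
import uuid

# Assumption: the "empty GUID" is the all-zero GUID.
EMPTY_GUID = uuid.UUID(int=0)


def identified_profile_id(identify_info):
    """Return the profile ID string, or None if no speaker was identified.

    Hypothetical helper; identify_info is a dict shaped like IdentifyInfo.
    """
    profile_id = uuid.UUID(identify_info["profileId"])
    return None if profile_id == EMPTY_GUID else str(profile_id)
```

Comparing parsed `uuid.UUID` values instead of raw strings avoids false mismatches caused by letter case.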
SpeakerErrorInfo
Speaker error message
Name | Type | Description |
---|---|---|
error | Error | |