Getting "Confidence": 0 in the response for every word

Anju
I am using the Speech to Text REST API in my work.
I am following the documentation at https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/rest-speech-to-text-short#pronunciation-assessment-parameters
The response does not match the documentation. Every word includes an extra field, "Confidence", whose value is always 0, e.g.
"Words": [
  {
    "Word": "how",
    "Offset": 600000,
    "Duration": 2400000,
    "ErrorType": "None",
    "Confidence": 0,
    "AccuracyScore": 62
  },
  {
    "Word": "is",
    "Offset": 3100000,
    "Duration": 900000,
    "ErrorType": "None",
    "Confidence": 0,
    "AccuracyScore": 92
  },
  {
    "Word": "the",
    "Offset": 4100000,
    "Duration": 1100000,
    "ErrorType": "None",
    "Confidence": 0,
    "AccuracyScore": 100
  }
]
Can you please explain the purpose of the "Confidence" field and why its value is always 0?
Thanks
Hi @Anju, the word-level "Confidence" score in the pronunciation assessment is a value ranging from 0.0 (no confidence) to 1.0 (full confidence).
If the value is always 0, I believe it could indicate that there is an issue with the audio input or the recognition settings.
Please share the code details with us so we can check on our end as well.
Hi,
Adding the response.
At the word level, if the accuracy score is 100.0 but the confidence score is 0.0, the two scores appear to be saying entirely opposite things.
@Anju, as mentioned earlier, there could be many reasons. Could you share the code you are using so that I can reproduce this on my end and help you further?
Hi,
I am using a curl command for this. Sharing the request below (with the API key removed):
curl -X POST "https://centralindia.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=en-US&format=simple" -H "accept: application/json" -H "Ocp-Apim-Subscription-Key: <api-key>" -H "Content-Type: audio/wav; codecs=audio/pcm; samplerate=16000" -H "Pronunciation-Assessment: ewogICJSZWZlcmVuY2VUZXh0IjogIkhvdyBkbyBJIHJ1biB0aGlzIHByb2dyYW0uIiwKICAiR3JhZGluZ1N5c3RlbSI6ICJIdW5kcmVkTWFyayIsCiAgIkdyYW51bGFyaXR5IjogIldvcmQiCn0=" --data-binary @./en-US_0.wav
The PronunciationAssessment block is created from this input block
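For anyone trying to reproduce this, here is a minimal Python sketch of how a Pronunciation-Assessment header value like the one in the curl command relates to its JSON input: the JSON is serialized and base64-encoded. The helper names `encode_assessment_params` and `decode_assessment_header` are hypothetical, used only for illustration. Decoding the header from the curl command gives the parameters that were actually sent.

```python
import base64
import json

# Header value copied verbatim from the curl command in this thread.
HEADER_FROM_CURL = (
    "ewogICJSZWZlcmVuY2VUZXh0IjogIkhvdyBkbyBJIHJ1biB0aGlzIHByb2dyYW0uIiwKICAi"
    "R3JhZGluZ1N5c3RlbSI6ICJIdW5kcmVkTWFyayIsCiAgIkdyYW51bGFyaXR5IjogIldvcmQiCn0="
)


def encode_assessment_params(params: dict) -> str:
    """Serialize params to JSON and base64-encode the result (hypothetical helper)."""
    raw = json.dumps(params).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_assessment_header(value: str) -> dict:
    """Reverse the encoding: base64-decode, then parse the JSON (hypothetical helper)."""
    return json.loads(base64.b64decode(value).decode("utf-8"))


if __name__ == "__main__":
    # Show the parameters carried by the header in the curl command.
    params = decode_assessment_header(HEADER_FROM_CURL)
    print(json.dumps(params, indent=2))

    # Build a header value from a plain dict; the byte-for-byte result can differ
    # from the original because of whitespace, but it decodes to the same parameters.
    print(encode_assessment_params(params))
```

The decoded header contains "ReferenceText": "How do I run this program.", "GradingSystem": "HundredMark" and "Granularity": "Word". It may be worth checking that this reference text matches what is spoken in en-US_0.wav, since the words in the response above are "how", "is", "the".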