Getting "Confidence": 0 in the response for every word

Anju 31

I am using the Speech to Text REST API in my work.

I am following the documentation at https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/rest-speech-to-text-short#pronunciation-assessment-parameters

The response does not match the documentation. Each word contains an additional field, "Confidence", whose value is 0, e.g.

 "Words": [
          {
            "Word": "how",
            "Offset": 600000,
            "Duration": 2400000,
            "ErrorType": "None",
            "Confidence": 0,
            "AccuracyScore": 62
          },
          {
            "Word": "is",
            "Offset": 3100000,
            "Duration": 900000,
            "ErrorType": "None",
            "Confidence": 0,
            "AccuracyScore": 92
          },
          {
            "Word": "the",
            "Offset": 4100000,
            "Duration": 1100000,
            "ErrorType": "None",
            "Confidence": 0,
            "AccuracyScore": 100
          }
]

Can you please explain the purpose of the "Confidence" field and why its value is always 0?

Thanks

VasaviLankipalle-MSFT 17,111 Reputation points

2023-03-11T02:23:02.5566667+00:00

Hi @Anju , Thanks for using Microsoft Q&A Platform.

Could you share the code snippets with us so that we can reproduce the issue? Have you checked the confidence score generated for the sentence as a whole?
VasaviLankipalle-MSFT 17,111 Reputation points

2023-03-14T02:38:31.23+00:00

Hi @Anju , the word-level "Confidence" score in the pronunciation assessment is a value from 0.0 (no confidence) to 1.0 (full confidence).
If the value is always 0, I believe it could indicate an issue with the audio input or the recognition settings.

Please share the code details with us so we can check on our end as well.

Anju 31

Hi,

Adding the response.

At the word level, the accuracy score is 100.0 while the confidence score is 0.0, so the two scores appear to contradict each other.

{"RecognitionStatus":"Success","Offset":10800000,"Duration":20000000,"NBest":[{"Confidence":0.9891657,"Lexical":"How do I run this program","ITN":"How do I run this program","MaskedITN":"how do i run this program","Display":"How do I run this program?","AccuracyScore":100.0,"Words":[{"Word":"How","Offset":10800000,"Duration":3300000,"Confidence":0.0,"AccuracyScore":97.0},{"Word":"do","Offset":14200000,"Duration":3200000,"Confidence":0.0,"AccuracyScore":100.0},{"Word":"I","Offset":17500000,"Duration":1000000,"Confidence":0.0,"AccuracyScore":100.0},{"Word":"run","Offset":18600000,"Duration":2900000,"Confidence":0.0,"AccuracyScore":100.0},{"Word":"this","Offset":21600000,"Duration":2300000,"Confidence":0.0,"AccuracyScore":100.0},{"Word":"program","Offset":24000000,"Duration":6800000,"Confidence":0.0,"AccuracyScore":100.0}]}],"DisplayText":"How do I run this program?"}
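To make the mismatch easier to see, here is a minimal Python sketch that parses a response of this shape and lists the utterance-level and word-level values side by side. The `summarize` helper is a hypothetical name used only for illustration, and the embedded string is a shortened copy of the response above (two words, fewer fields), not a new result.

```python
import json

# Shortened copy of the response posted above (assumption: only the fields
# needed for the comparison are kept, and only the first two words).
raw = (
    '{"RecognitionStatus":"Success","NBest":[{"Confidence":0.9891657,'
    '"Lexical":"How do I run this program","AccuracyScore":100.0,'
    '"Words":[{"Word":"How","Confidence":0.0,"AccuracyScore":97.0},'
    '{"Word":"do","Confidence":0.0,"AccuracyScore":100.0}]}]}'
)


def summarize(response_text):
    """Return (utterance confidence, [(word, confidence, accuracy), ...])."""
    best = json.loads(response_text)["NBest"][0]
    words = [(w["Word"], w["Confidence"], w["AccuracyScore"]) for w in best["Words"]]
    return best["Confidence"], words


utterance_confidence, words = summarize(raw)
print("utterance confidence:", utterance_confidence)
for word, confidence, accuracy in words:
    print(word, "confidence:", confidence, "accuracy:", accuracy)
```

In this response the utterance-level "Confidence" is populated, while every word-level "Confidence" is 0.0 regardless of its "AccuracyScore".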

VasaviLankipalle-MSFT 17,111 Reputation points

2023-03-16T15:12:28.31+00:00

@Anju , as mentioned earlier, there could be many reasons. Could you share the code you are using so that I can reproduce the issue on my end and help you further?
Anju 31 Reputation points

2023-03-22T12:12:16.3+00:00
Hi,

I'm using a curl command for this. Here is the request (with the API key removed):

curl -X POST "https://centralindia.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=en-US&format=simple" -H "accept: application/json" -H "Ocp-Apim-Subscription-Key: <api-key>" -H "Content-Type: audio/wav; codecs=audio/pcm; samplerate=16000" -H "Pronunciation-Assessment: ewogICJSZWZlcmVuY2VUZXh0IjogIkhvdyBkbyBJIHJ1biB0aGlzIHByb2dyYW0uIiwKICAiR3JhZGluZ1N5c3RlbSI6ICJIdW5kcmVkTWFyayIsCiAgIkdyYW51bGFyaXR5IjogIldvcmQiCn0=" --data-binary @./en-US_0.wav

The Pronunciation-Assessment header value is the Base64 encoding of this JSON input:

{ "ReferenceText": "How do I run this program.", "GradingSystem": "HundredMark", "Granularity": "Word" }
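For anyone reproducing this, a minimal Python sketch of how such a header value can be built and checked. `encode_header` is a hypothetical helper name; the parameters and the posted header string are copied from the request above. The freshly encoded value will not match the posted string character for character, because the posted one was encoded from pretty-printed JSON, but both decode to the same parameters.

```python
import base64
import json

# Parameters copied from the JSON block above.
params = {
    "ReferenceText": "How do I run this program.",
    "GradingSystem": "HundredMark",
    "Granularity": "Word",
}

# Header value copied from the curl command above.
posted_header = (
    "ewogICJSZWZlcmVuY2VUZXh0IjogIkhvdyBkbyBJIHJ1biB0aGlzIHByb2dyYW0uIiwK"
    "ICAiR3JhZGluZ1N5c3RlbSI6ICJIdW5kcmVkTWFyayIsCiAgIkdyYW51bGFyaXR5Ijog"
    "IldvcmQiCn0="
)


def encode_header(p):
    """Serialize the parameters to JSON and Base64-encode the UTF-8 bytes."""
    return base64.b64encode(json.dumps(p).encode("utf-8")).decode("ascii")


header_value = encode_header(params)
print(header_value)

# Decode both values back to dictionaries to compare their contents.
round_trip = json.loads(base64.b64decode(header_value))
posted_params = json.loads(base64.b64decode(posted_header))
print(posted_params == params)
```

Decoding the posted header this way is a quick check that the service received the intended "ReferenceText", "GradingSystem" and "Granularity".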
Anju 31 Reputation points

2023-04-06T14:13:42.12+00:00

@VasaviLankipalle-MSFT , any updates? Thanks
