Cognitive services pronunciation assessment always gives 100% score, even with badly pronounced words

Schoolblocks 0

I built a Svelte (JavaScript) application that uses the Microsoft Speech SDK (v1.36) to evaluate pronunciation in three languages: English, German, and French.

Initially I was using RecognizeOnceAsync(), which waits for silence at the end of the user's speech before evaluating the pronunciation. Since we use it in a classroom, I switched to startContinuousRecognitionAsync(), which lets the user start and stop the speech explicitly and works better in crowded rooms.
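
The start/stop flow described above can be sketched roughly as follows. This is a minimal illustration, not the application's actual code: `createAssessmentSession` is a hypothetical helper, and `recognizer` is assumed to be a Speech SDK `SpeechRecognizer` that already has a pronunciation assessment config applied. It relies only on the recognizer's `recognized` callback, `startContinuousRecognitionAsync()`, `stopContinuousRecognitionAsync()`, and the result's `json` property.

```javascript
// Hypothetical helper: collects every recognized segment between start() and stop().
// `recognizer` is assumed to be a SpeechRecognizer with pronunciation assessment applied.
function createAssessmentSession(recognizer) {
  const segments = [];

  // Continuous recognition raises `recognized` once per finalized segment.
  recognizer.recognized = (_sender, event) => {
    if (event && event.result && event.result.json) {
      // `json` holds the raw service response for this segment as a string.
      segments.push(JSON.parse(event.result.json));
    }
  };

  return {
    segments,
    // Called when the user presses "start".
    start(onError) {
      recognizer.startContinuousRecognitionAsync(undefined, onError);
    },
    // Called when the user presses "stop"; hands back everything collected so far.
    stop(onDone, onError) {
      recognizer.stopContinuousRecognitionAsync(() => onDone(segments), onError);
    },
  };
}
```

With continuous recognition each finalized segment arrives as its own result, so one spoken phrase may produce one or several entries in `segments`.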

The problem is that the pronunciation assessment almost always returns 100 for every parameter (accuracy, fluency, completeness, and prosody where available; currently only English has it). For shorter phrases, like "good morning" or "guten morgen", the score is always 100, no matter how strangely or incorrectly I pronounce them. If the phrase is slightly longer, I see better results, with some words marked at 80 accuracy, and so on.
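
To make the symptom easier to reproduce and report, the raw response can be reduced to the scores in question. The sketch below is an illustration under assumptions: `summarizeAssessment` is a hypothetical helper, and the field names (`NBest`, `PronunciationAssessment`, `Words`, `AccuracyScore`, `FluencyScore`, `CompletenessScore`, `ProsodyScore`, `ErrorType`) follow the detailed JSON result format and should be checked against your own responses.

```javascript
// Hypothetical helper: extracts overall and per-word scores from one parsed
// detailed-result JSON object and flags the "everything is 100" case.
function summarizeAssessment(resultJson) {
  const best = (resultJson.NBest || [])[0];
  if (!best) return null; // no recognition candidate in this response

  const overall = best.PronunciationAssessment || {};
  const words = (best.Words || []).map((w) => {
    const pa = w.PronunciationAssessment || {};
    return { word: w.Word, accuracy: pa.AccuracyScore, errorType: pa.ErrorType };
  });

  // ProsodyScore is only present for locales that support prosody assessment.
  const overallScores = [
    overall.AccuracyScore,
    overall.FluencyScore,
    overall.CompletenessScore,
    overall.ProsodyScore,
  ].filter((s) => typeof s === "number");

  const allPerfect =
    overallScores.length > 0 &&
    overallScores.every((s) => s === 100) &&
    words.every((w) => w.accuracy === 100);

  return { overall, words, allPerfect };
}
```

Logging this summary for the same phrase under both the one-shot and the continuous method gives a compact side-by-side comparison to attach to a support request.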

This makes the product unusable. We are evaluating children's pronunciation, and always getting a score of 100 defeats the purpose.

The results from the original one-shot method were much better than with continuous recognition: it assessed my pronunciation correctly in all languages and gave adequate scores for my speech.

Some information that might be useful for your reply:

I cloned the SDK repo (https://github.com/microsoft/cognitive-services-speech-sdk-js) and built it locally to use the minified SDK build, from the latest source.
I include the SDK using a <script> tag.
My project uses the Svelte framework (v3.48).
There is no backend; the frontend calls Microsoft directly through the SDK.
The scores of 100 I mention come straight from the raw Microsoft response, with no processing on my end.
The service settings are gradingSystem: HundredMark, granularity: Phoneme, and Dimension: Comprehensive, with prosody and miscue enabled.
I tried configuring the phoneme alphabet as both SAPI and IPA; the results are the same.
I inspected the JSON in the response, and both word scores and phoneme scores are all 100, no matter how badly I pronounce the text.
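
For readers who want to reproduce the setup, the settings listed above might be applied roughly like this. It is a sketch under assumptions, not the code from the application: `applyAssessmentConfig` is a hypothetical helper, `sdk` stands for the Speech SDK namespace (for example the global `SpeechSDK` exposed by the `<script>` build), and `recognizer` is an existing `SpeechRecognizer`. The sketch does not set Dimension explicitly; it assumes the SDK requests the comprehensive dimension itself.

```javascript
// Hypothetical helper: builds a pronunciation assessment config matching the
// settings described in the post and attaches it to a recognizer.
// `sdk` is assumed to be the Speech SDK namespace, passed in by the caller.
function applyAssessmentConfig(sdk, recognizer, referenceText, phonemeAlphabet) {
  const config = new sdk.PronunciationAssessmentConfig(
    referenceText,
    sdk.PronunciationAssessmentGradingSystem.HundredMark,
    sdk.PronunciationAssessmentGranularity.Phoneme,
    true // enableMiscue
  );

  // Prosody is only scored for locales that support it (English, per the post).
  config.enableProsodyAssessment = true;

  // "IPA" or "SAPI"; the post reports the same scores with both.
  config.phonemeAlphabet = phonemeAlphabet;

  // Attach the config to the recognizer before starting recognition.
  config.applyTo(recognizer);
  return config;
}
```

A call might look like `applyAssessmentConfig(SpeechSDK, recognizer, "good morning", "IPA")`, made before recognition starts.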

We are a paying customer, and this is an essential feature in our language learning product. What can I do about this? I see this as a huge fault in your service: a pronunciation evaluation that always gives perfect scores, no matter how the text is pronounced, is basically useless.

Any hints or suggestions are very welcome.

Please ask if you need any extra information.

VasaviLankipalle-MSFT 18,676 Reputation points Moderator

2024-05-17T20:50:29.52+00:00

Hello @Schoolblocks, thanks for using the Microsoft Q&A Platform.

For a deeper investigation, if you have a support plan, I suggest raising a support ticket in the Azure portal. Please let us know if you require one; we are happy to provide a one-time free support ticket.

Schoolblocks 0 Reputation points

2024-05-20T09:06:04.1166667+00:00

@VasaviLankipalle-MSFT we are an Azure customer, but we don't have a specific (paid) support plan. How can I access the one-time support ticket you mentioned? That is, how can I escalate this issue so we can get more technical support and, hopefully, solve it?
