Azure Pronunciation Assessment returning the same accuracy score for all phonemes

Dominar el ingles 20 Reputation points
2023-05-14T16:30:53.75+00:00

I am trying to incorporate Azure Pronunciation Assessment into my application, but I'm having a problem with the accuracy scores in the results. The same thing happens in the Speech Studio, where Azure lets me test the feature.

The scores for all phonemes in a word are nearly identical, no matter how poorly I pronounce a particular phoneme. For example, if the word is "happily" and I pronounce it "hoppily", all phonemes receive about the same low score, even though I mispronounced only one of them.

Is this a bug in the service? Or is this service simply unable to determine which phonemes are pronounced well and which ones are not?

Here's the phoneme output for this example from the Speech Studio. I only mispronounced the phoneme "ae":

"Phonemes": [
    {
        "Phoneme": "h",
        "PronunciationAssessment": {
            "AccuracyScore": 18
        },
        "Offset": 3600000,
        "Duration": 1200000
    },
    {
        "Phoneme": "ae",
        "PronunciationAssessment": {
            "AccuracyScore": 16
        },
        "Offset": 4900000,
        "Duration": 400000
    },
    {
        "Phoneme": "p",
        "PronunciationAssessment": {
            "AccuracyScore": 18
        },
        "Offset": 5400000,
        "Duration": 1300000
    },
    {
        "Phoneme": "ih",
        "PronunciationAssessment": {
            "AccuracyScore": 18
        },
        "Offset": 6800000,
        "Duration": 300000
    },
    {
        "Phoneme": "l",
        "PronunciationAssessment": {
            "AccuracyScore": 18
        },
        "Offset": 7200000,
        "Duration": 700000
    },
    {
        "Phoneme": "iy",
        "PronunciationAssessment": {
            "AccuracyScore": 18
        },
        "Offset": 8000000,
        "Duration": 2700000
    }
]
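
To put a number on how flat these scores are, here is a minimal Python sketch that summarizes the phoneme output above. It only assumes the JSON shape shown in this post ("Phoneme" and "PronunciationAssessment.AccuracyScore"); the helper name summarize_scores is my own and not part of any Azure SDK, and the offsets and durations are left out for brevity.

```python
import json
from statistics import mean, pstdev

# Phoneme entries copied from the Speech Studio output above
# (Offset and Duration fields omitted for brevity).
PHONEMES_JSON = """
[
    {"Phoneme": "h",  "PronunciationAssessment": {"AccuracyScore": 18}},
    {"Phoneme": "ae", "PronunciationAssessment": {"AccuracyScore": 16}},
    {"Phoneme": "p",  "PronunciationAssessment": {"AccuracyScore": 18}},
    {"Phoneme": "ih", "PronunciationAssessment": {"AccuracyScore": 18}},
    {"Phoneme": "l",  "PronunciationAssessment": {"AccuracyScore": 18}},
    {"Phoneme": "iy", "PronunciationAssessment": {"AccuracyScore": 18}}
]
"""


def summarize_scores(phonemes):
    """Hypothetical helper: summarize per-phoneme accuracy scores."""
    scores = [p["PronunciationAssessment"]["AccuracyScore"] for p in phonemes]
    lowest = min(phonemes, key=lambda p: p["PronunciationAssessment"]["AccuracyScore"])
    return {
        "min": min(scores),
        "max": max(scores),
        "spread": max(scores) - min(scores),  # gap between best and worst phoneme
        "mean": mean(scores),
        "stdev": pstdev(scores),              # population standard deviation
        "lowest_phoneme": lowest["Phoneme"],
    }


phonemes = json.loads(PHONEMES_JSON)
summary = summarize_scores(phonemes)
print(summary)
```

For this word, the mispronounced "ae" is the lowest-scoring phoneme, but it is only 2 points below the five phonemes I pronounced correctly.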
Azure AI Speech