How are syllable and word accuracy scores computed from phoneme-level accuracy scores in pronunciation assessments?

Danica Shi 0 Reputation points
2023-08-25T15:53:52.9833333+00:00

I received a pronunciation assessment result for the word "time" and have two questions about it:

  1. How are syllable and word accuracy scores computed from phoneme-level accuracy scores in pronunciation assessments?
  2. Since the word "time" has only one syllable, why does the word-level accuracy score (92) differ from the syllable-level accuracy score (72)?

Shown below is the pronunciation assessment result for the word "time". Thank you in advance.

                        {
                            "Word": "time",
                            "Offset": 14900000,
                            "Duration": 4800000,
                            "PronunciationAssessment": {
                                "AccuracyScore": 92,
                                "ErrorType": "None"
                            },
                            "Syllables": [
                                {
                                    "Syllable": "taɪm",
                                    "PronunciationAssessment": {
                                        "AccuracyScore": 72
                                    },
                                    "Offset": 14900000,
                                    "Duration": 4800000
                                }
                            ],
                            "Phonemes": [
                                {
                                    "Phoneme": "t",
                                    "PronunciationAssessment": {
                                        "AccuracyScore": 92,
                                        "NBestPhonemes": [
                                            {
                                                "Phoneme": "t",
                                                "Score": 100
                                            },
                                            {
                                                "Phoneme": "ə",
                                                "Score": 78
                                            },
                                            {
                                                "Phoneme": "ɝ",
                                                "Score": 50
                                            },
                                            {
                                                "Phoneme": "h",
                                                "Score": 34
                                            },
                                            {
                                                "Phoneme": "d",
                                                "Score": 19
                                            }
                                        ]
                                    },
                                    "Offset": 14900000,
                                    "Duration": 1000000
                                },
                                {
                                    "Phoneme": "aɪ",
                                    "PronunciationAssessment": {
                                        "AccuracyScore": 81,
                                        "NBestPhonemes": [
                                            {
                                                "Phoneme": "m",
                                                "Score": 84
                                            },
                                            {
                                                "Phoneme": "aɪ",
                                                "Score": 71
                                            },
                                            {
                                                "Phoneme": "r",
                                                "Score": 68
                                            },
                                            {
                                                "Phoneme": "ə",
                                                "Score": 60
                                            },
                                            {
                                                "Phoneme": "n",
                                                "Score": 49
                                            }
                                        ]
                                    },
                                    "Offset": 16000000,
                                    "Duration": 700000
                                },
                                {
                                    "Phoneme": "m",
                                    "PronunciationAssessment": {
                                        "AccuracyScore": 62,
                                        "NBestPhonemes": [
                                            {
                                                "Phoneme": "m",
                                                "Score": 81
                                            },
                                            {
                                                "Phoneme": "n",
                                                "Score": 72
                                            },
                                            {
                                                "Phoneme": "ə",
                                                "Score": 61
                                            },
                                            {
                                                "Phoneme": "oʊ",
                                                "Score": 50
                                            },
                                            {
                                                "Phoneme": "ŋ",
                                                "Score": 37
                                            }
                                        ]
                                    },
                                    "Offset": 17000000,
                                    "Duration": 3000000
                                }
                            ]
                        }
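
One way to probe the first question is to compare the reported syllable and word scores against simple aggregates of the three phoneme scores. The sketch below is only an exploration, not the service's documented formula: `simple_mean` and `duration_weighted_mean` are hypothetical helpers, and the assumption that the scores are some kind of average of phoneme scores is mine. The scores and durations are copied from the result above.

```python
# Phoneme-level scores and durations copied from the "time" result above.
phonemes = [
    {"phoneme": "t", "score": 92, "duration": 1000000},
    {"phoneme": "aɪ", "score": 81, "duration": 700000},
    {"phoneme": "m", "score": 62, "duration": 3000000},
]

reported_syllable_score = 72
reported_word_score = 92


def simple_mean(items):
    """Unweighted average of phoneme scores (hypothetical aggregation)."""
    return sum(p["score"] for p in items) / len(items)


def duration_weighted_mean(items):
    """Average of phoneme scores weighted by duration (hypothetical aggregation)."""
    total = sum(p["duration"] for p in items)
    return sum(p["score"] * p["duration"] for p in items) / total


print("simple mean:", round(simple_mean(phonemes), 1))
print("duration-weighted mean:", round(duration_weighted_mean(phonemes), 1))
print("reported syllable / word:", reported_syllable_score, "/", reported_word_score)
```

For this sample the unweighted mean comes out near 78.3 and the duration-weighted mean near 71.2, while the service reports 72 for the syllable and 92 for the word. Neither aggregate reproduces the word score, which suggests the word-level score is not a plain average of the phoneme scores, but the sample alone cannot confirm what the service actually does.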
Azure AI Language