In Pronunciation Assessment, the 'insertion' error category appears to report only words that appear in the reference text, not the word that was actually spoken. Is this expected behavior?

Alex Del Giudice 20 Reputation points
2024-02-20T16:38:38.8433333+00:00

When using Pronunciation Assessment, if the speaker inserts a word that is not in the reference text, the word transcribed in its place (with the 'insertion' error type) is always a similar-sounding word from the reference text rather than the word that was actually spoken. If we want to record reading errors (categorizing them, for example, as semantic or phonological errors), this is not very useful, although it appears that we can examine the phonemic errors. So I'm wondering whether there is a way to run Pronunciation Assessment so that insertions are transcribed from the audio input rather than from the reference text.

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

Accepted answer
  1. VasaviLankipalle-MSFT 17,021 Reputation points
    2024-02-28T02:00:16.46+00:00

    Hello @Alex Del Giudice , thank you for your time and patience throughout this issue.

    The product team confirmed that this behavior is by design.

    Also, without a reference text, it would be difficult to determine which words are insertions or omissions. Pronunciation Assessment relies on the reference text: it compares the recognized speech against the expected text.

    I hope this helps.

    Regards,

    Vasavi

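To illustrate why insertions and omissions only make sense relative to a reference text, here is a minimal sketch that aligns a reference against a free transcript at the word level. It is not the service's algorithm and not part of the answer above. It assumes you have separately obtained a plain transcript of the same audio (for example, from an ordinary speech-to-text pass run without a reference text; that step is not shown), and it uses only Python's standard `difflib`. The function name `classify_reading_errors` and the tuple layout are hypothetical.

```python
import string
from difflib import SequenceMatcher


def _words(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    cleaned = (w.strip(string.punctuation).lower() for w in text.split())
    return [w for w in cleaned if w]


def classify_reading_errors(reference, transcript):
    """Return (kind, expected, spoken) tuples from a word-level alignment.

    `transcript` is assumed to come from a separate recognition pass that
    did NOT use the reference text, so inserted words are what was heard.
    """
    ref = _words(reference)
    hyp = _words(transcript)
    errors = []
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "insert":
            # Words present in the audio but absent from the reference.
            errors.extend(("insertion", None, w) for w in hyp[j1:j2])
        elif tag == "delete":
            # Reference words the speaker skipped.
            errors.extend(("omission", w, None) for w in ref[i1:i2])
        elif tag == "replace":
            # Reference span read as something else.
            errors.append(
                ("substitution", " ".join(ref[i1:i2]), " ".join(hyp[j1:j2]))
            )
    return errors


if __name__ == "__main__":
    print(classify_reading_errors(
        "The cat sat on the mat.",
        "the big cat sat on mat",
    ))
```

For this input, the sketch reports "big" as an insertion and "the" as an omission. The inserted word comes from the free transcript rather than from the reference vocabulary, which is the information the question is after; categorizing it as semantic or phonological would still be a separate step.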

