Different scores for the same audio in the Speech Studio Pronunciation Assessment tool and the Python SDK sample

jayenthiran 0 Reputation points
2023-09-21T04:51:04.5833333+00:00

I encountered a technical issue while testing the Pronunciation Assessment Python SDK sample. I used an audio file (temple.wav) that contains the single word "temple" spoken in US English (en-US).

When I tested this audio in your Speech Studio Pronunciation Assessment tool [https://speech.microsoft.com/portal/pronunciationassessmenttool], the JSON result showed a score of 100.

However, when I ran the same audio file through your Pronunciation Assessment Python SDK sample with the same en-US setting, it returned very low scores. You can find the sample here: [https://github.com/Azure-Samples/Cognitive-Speech-TTS/tree/master/PronunciationAssessment/Python].

My question:

  1. Why do the online tool and the Python SDK sample give different evaluation results for the same audio in the same locale (en-US)?

Please find the attached text document with both JSON results for your reference: ResultJson-Onlinetool.txt
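To make the comparison easier to follow, here is a minimal sketch of how the two JSON results can be compared side by side. It assumes both results use the detailed output shape, with the overall scores under NBest[0].PronunciationAssessment (AccuracyScore, FluencyScore, CompletenessScore, PronScore). The helper names and the payloads below are hypothetical placeholders for illustration, not the actual contents of the attachment.

```python
import json

# Overall score fields assumed to be present in both results.
SCORE_KEYS = ("AccuracyScore", "FluencyScore", "CompletenessScore", "PronScore")


def extract_scores(result_json):
    """Return the overall scores from one result payload (assumed shape)."""
    data = json.loads(result_json)
    assessment = data["NBest"][0]["PronunciationAssessment"]
    return {key: assessment.get(key) for key in SCORE_KEYS}


def diff_scores(tool_json, sdk_json):
    """Map each score that differs to a (tool_value, sdk_value) pair."""
    tool = extract_scores(tool_json)
    sdk = extract_scores(sdk_json)
    return {k: (tool[k], sdk[k]) for k in SCORE_KEYS if tool[k] != sdk[k]}


# Placeholder payloads: the values are made up, not the real attached results.
tool_result = json.dumps({"NBest": [{"PronunciationAssessment": {
    "AccuracyScore": 100.0, "FluencyScore": 100.0,
    "CompletenessScore": 100.0, "PronScore": 100.0}}]})
sdk_result = json.dumps({"NBest": [{"PronunciationAssessment": {
    "AccuracyScore": 20.0, "FluencyScore": 100.0,
    "CompletenessScore": 100.0, "PronScore": 36.0}}]})

for name, (tool_value, sdk_value) in diff_scores(tool_result, sdk_result).items():
    print(f"{name}: tool={tool_value} sdk={sdk_value}")
```

Listing only the fields that differ shows whether the gap comes from one score (for example accuracy) or from all of them.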

Azure AI Language