Different scores for the same audio in the Speech Studio Pronunciation Assessment tool and the Python SDK sample
I ran into an issue while testing the Pronunciation Assessment Python SDK sample. I used an audio file (temple.wav) that contains the word "temple" pronounced in US English.
When I tested this audio in your Speech Studio Pronunciation Assessment tool [https://speech.microsoft.com/portal/pronunciationassessmenttool], the JSON result showed a score of 100.
However, when I ran your Pronunciation Assessment Python SDK sample [https://github.com/Azure-Samples/Cognitive-Speech-TTS/tree/master/PronunciationAssessment/Python] on the same audio with the same US English setting, it returned very low scores.
My question is:
- Why do the evaluation results differ for the same audio with the same locale (en-US)?
Please find the attached text document with both JSON results for your reference: ResultJson-Onlinetool.txt
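To help narrow down where the two results diverge, the two JSON payloads could be compared field by field. The following is only a minimal sketch using the Python standard library. It assumes each result has been saved as its own valid JSON string, and the helper names `flatten` and `diff_results` are hypothetical, not part of the Speech SDK. It makes no assumption about the result schema; it simply walks whatever structure is there.

```python
import json

# Placeholder used when a field exists in only one of the two results.
MISSING = "<missing>"


def flatten(node, prefix=""):
    """Yield (path, value) pairs for every scalar leaf of parsed JSON."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{prefix}.{key}" if prefix else key
            yield from flatten(value, child)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from flatten(value, f"{prefix}[{index}]")
    else:
        yield prefix, node


def diff_results(tool_json, sdk_json):
    """Return {path: (tool_value, sdk_value)} for every field that differs.

    Both arguments are JSON strings: one from the online tool,
    one from the SDK sample.
    """
    tool = dict(flatten(json.loads(tool_json)))
    sdk = dict(flatten(json.loads(sdk_json)))
    diffs = {}
    for path in sorted(tool.keys() | sdk.keys()):
        tool_value = tool.get(path, MISSING)
        sdk_value = sdk.get(path, MISSING)
        if tool_value != sdk_value:
            diffs[path] = (tool_value, sdk_value)
    return diffs
```

Calling `diff_results(tool_text, sdk_text)` on the two results and printing the returned dictionary would list every field whose value differs, which should show whether only the scores differ or whether the recognized text or other fields differ as well.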