Difference in evaluation of Audio in Cognitive Pronunciation Assessment tool and same python SDK

jayenthiran 0

I ran into an issue while testing the Pronunciation Assessment Python SDK sample. I used an audio file (temple.wav) that contains the word "temple" pronounced in the US English (en-US) dialect.

When I tested this audio in the Speech Studio Pronunciation Assessment tool (https://speech.microsoft.com/portal/pronunciationassessmenttool), the JSON result gave it a score of 100.

However, when I ran the same audio through the Pronunciation Assessment Python SDK sample, also with the en-US dialect, it returned a very low score. The sample is here: https://github.com/Azure-Samples/Cognitive-Speech-TTS/tree/master/PronunciationAssessment/Python

My question is:

Why do the two produce different evaluation results for the same audio in the same dialect (en-US)?

Please find attached a text document with both JSON results for your reference: ResultJson-Onlinetool.txt

Ramr-msft 17,826 Reputation points

2023-09-22T04:12:39.6166667+00:00

@jayenthiran Thanks for the question. We recommend raising an issue at the following link so that the team can look into it: https://github.com/Azure-Samples/Cognitive-Speech-TTS/issues
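
When raising the issue, a side-by-side view of the scores from the two JSON results may make the discrepancy easier to see. The following is only a minimal sketch: it assumes both results follow the detailed output layout in which the overall scores sit under `NBest[0].PronunciationAssessment` with the keys `AccuracyScore`, `FluencyScore`, `CompletenessScore` and `PronScore`. The attached file is not shown in this thread, so that layout, the helper names `extract_scores` and `compare_scores`, and the numbers in the demo are all assumptions for illustration.

```python
import json

# Overall score keys assumed to be present in each result (an assumption,
# since the attached JSON is not reproduced in the thread).
SCORE_KEYS = ("AccuracyScore", "FluencyScore", "CompletenessScore", "PronScore")


def extract_scores(result_json: str) -> dict:
    """Return the overall pronunciation scores from one JSON result string."""
    result = json.loads(result_json)
    # Assumed layout: scores live under NBest[0].PronunciationAssessment.
    assessment = result["NBest"][0]["PronunciationAssessment"]
    # Missing keys come back as None instead of raising.
    return {key: assessment.get(key) for key in SCORE_KEYS}


def compare_scores(tool_json: str, sdk_json: str) -> dict:
    """Pair each score as (online tool value, SDK value)."""
    tool = extract_scores(tool_json)
    sdk = extract_scores(sdk_json)
    return {key: (tool[key], sdk[key]) for key in SCORE_KEYS}


if __name__ == "__main__":
    # Placeholder values for illustration only; NOT the poster's actual results.
    tool_result = json.dumps(
        {"NBest": [{"PronunciationAssessment": {"AccuracyScore": 100, "PronScore": 100}}]}
    )
    sdk_result = json.dumps(
        {"NBest": [{"PronunciationAssessment": {"AccuracyScore": 20, "PronScore": 20}}]}
    )
    for key, (tool_value, sdk_value) in compare_scores(tool_result, sdk_result).items():
        print(f"{key}: tool={tool_value} sdk={sdk_value}")
```

Pasting the resulting table into the issue, together with the reference text and locale passed to the SDK sample, gives the maintainers the two runs in a directly comparable form.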
