How to get the JSON output that is shown in Speech Studio

Ritvij Sharma 60 Reputation points

User's image

I want this JSON object in my React project when I use the Azure Speech SDK. How can I get it?

The issue I have with the JS SDK is that the word-level timestamps are not accurate: some offset-duration pairs cover 2 words. I need exact word-by-word durations at least, and I only see them here in this JSON object.

What is the right way to get these word-by-word durations?

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

Accepted answer
  1. navba-MSFT 20,530 Reputation points Microsoft Employee

    @Ritvij Sharma Thanks for your reply.
    I'm glad to see you were able to resolve your issue. Thanks for posting your solution so that others experiencing the same thing can easily reference it. The Microsoft Q&A community has a policy that question authors cannot accept their own answers, only answers by others, so I'll repost your solution in case you'd like to accept it.
    You are looking for a way to get word-by-word durations using the Azure Speech SDK. You mentioned that the word-level timestamps are not accurate and that you need exact word-by-word durations.
    The correct answer to this, without having to do batch transcriptions and using the JS SDK itself, is to use pronunciation assessment. On activating pronunciation assessment, the results give the offset and duration for each and every word in the speech.
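    For illustration, here is a minimal sketch of that approach, not a definitive implementation. It assumes the `microsoft-cognitiveservices-speech-sdk` package is imported by the caller and passed in as `sdk`, that the microphone is the audio source, and that an empty reference text (unscripted assessment) is acceptable. The function names `extractWordTimings` and `recognizeWordTimings` are hypothetical helpers, and the JSON shape (`NBest[0].Words` with `Offset` and `Duration` in 100-nanosecond ticks) is the detailed result format as I understand it.

```javascript
// Offsets and durations in the detailed JSON are in 100-nanosecond ticks.
const TICKS_PER_SECOND = 10000000;

// Hypothetical helper: turn the detailed JSON string into one entry per word.
function extractWordTimings(jsonText) {
  const parsed = JSON.parse(jsonText);
  const best = (parsed.NBest && parsed.NBest[0]) || {};
  return (best.Words || []).map((w) => ({
    word: w.Word,
    startSec: w.Offset / TICKS_PER_SECOND,
    durationSec: w.Duration / TICKS_PER_SECOND,
  }));
}

// Hypothetical wrapper. `sdk` is the module imported by the caller, e.g.
//   import * as sdk from "microsoft-cognitiveservices-speech-sdk";
function recognizeWordTimings(sdk, subscriptionKey, region) {
  const speechConfig = sdk.SpeechConfig.fromSubscription(subscriptionKey, region);
  const audioConfig = sdk.AudioConfig.fromDefaultMicrophoneInput();
  const recognizer = new sdk.SpeechRecognizer(speechConfig, audioConfig);

  // Assumption: empty reference text, i.e. no script to compare the speech against.
  const assessment = new sdk.PronunciationAssessmentConfig(
    "",
    sdk.PronunciationAssessmentGradingSystem.HundredMark,
    sdk.PronunciationAssessmentGranularity.Word,
    false
  );
  assessment.applyTo(recognizer);

  return new Promise((resolve, reject) => {
    recognizer.recognizeOnceAsync(
      (result) => {
        recognizer.close();
        // The raw service response carries the per-word Offset and Duration.
        const json = result.properties.getProperty(
          sdk.PropertyId.SpeechServiceResponse_JsonResult
        );
        resolve(extractWordTimings(json));
      },
      (err) => {
        recognizer.close();
        reject(err);
      }
    );
  });
}
```

    In a React component, this could be called from an event handler as `recognizeWordTimings(sdk, key, region).then(setWords)`, where `key`, `region`, and `setWords` are placeholders for your own values and state setter.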

    Related documentation to activate pronunciation assessment is here:

