Word level timestamps for real time transcription

Jade Nameless 1 Reputation point
2021-06-12T01:26:14.533+00:00

My team needs to sync words in the transcript with events from another source (specifically, button presses). The final transcription results include word-level timestamps when we use the appropriate config arguments, but the intermediate results (delivered with Recognizing events) do not. How can we get word-level timestamps during real-time transcription?

Azure AI Speech

An Azure service that integrates speech processing into apps and services.

Developer technologies | C#

An object-oriented and type-safe programming language that has its roots in the C family of languages and includes support for component-oriented programming.

Foundry Tools

Formerly known as Azure AI Services or Azure Cognitive Services, Foundry Tools is a unified collection of prebuilt AI capabilities within the Microsoft Foundry platform.


1 answer

  1. Ramr-msft 17,836 Reputation points
    2021-06-14T07:00:16.2+00:00

    @Jade Nameless Thanks for the question. Could you please share a link to the transcription code and the API you are using, and add more detail about the intermediate results you are getting?

    Please follow the threads on requesting word-level timestamps in the speech config and on generating timestamps in the STT model.
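
As a rough illustration of the syncing step described in the question, here is a minimal sketch of how word timings from a final (Recognized) result might be lined up with button presses. It assumes the detailed JSON payload the Speech service returns when word-level timestamps are requested, where each entry under `NBest[0].Words` carries `Word`, `Offset`, and `Duration` in 100-nanosecond ticks. The `SAMPLE` payload is hand-written for illustration, and the helper names `words_with_times` and `word_at` are hypothetical, not part of the Speech SDK. It also assumes the button-press times are measured on the same clock as the audio stream. It does not address the missing timestamps in Recognizing events.

```python
import json
from bisect import bisect_right

# Offset and Duration in the detailed result are 100-nanosecond ticks.
TICKS_PER_SECOND = 10_000_000

# Hand-written sample payload (an assumption for illustration, not real service output).
SAMPLE = json.dumps({
    "NBest": [{
        "Words": [
            {"Word": "press", "Offset": 5000000, "Duration": 3000000},
            {"Word": "the", "Offset": 8000000, "Duration": 2000000},
            {"Word": "button", "Offset": 10000000, "Duration": 5000000},
        ]
    }]
})


def words_with_times(detailed_json):
    """Return (word, start_seconds, end_seconds) tuples from the top hypothesis."""
    payload = json.loads(detailed_json)
    nbest = payload.get("NBest") or []
    if not nbest:
        return []
    words = nbest[0].get("Words") or []
    return [
        (
            w["Word"],
            w["Offset"] / TICKS_PER_SECOND,
            (w["Offset"] + w["Duration"]) / TICKS_PER_SECOND,
        )
        for w in words
    ]


def word_at(words, event_seconds):
    """Return the latest word that started at or before the event, or None."""
    starts = [start for _, start, _ in words]
    index = bisect_right(starts, event_seconds) - 1
    return words[index][0] if index >= 0 else None


if __name__ == "__main__":
    timed = words_with_times(SAMPLE)
    for press in (0.1, 0.8, 1.2):  # hypothetical button-press times in seconds
        print(press, word_at(timed, press))
```

In the C# SDK the same JSON string should be available from the result's properties via `PropertyId.SpeechServiceResponse_JsonResult`, so the parsing and lookup logic would carry over with a JSON library of your choice.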

