How to segment Speech-to-Text recognized result text?

Kun Wu 146 Reputation points Microsoft Employee
2024-04-01T02:41:29.4966667+00:00

Hello,

I'm using the Speech SDK. How can I segment the Speech-to-Text recognized result text?

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

Accepted answer
  1. Q&A Assist
    2024-04-01T02:41:29.5166667+00:00
    AI-generated content. This question contains an answer created with AI from Q&A Assist.

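    To segment Speech-to-Text recognized result text, you can use the timing information attached to each recognition result. During continuous recognition, the SpeechRecognizer raises a recognized event for every phrase, and each result exposes getText(), getOffset(), and getDuration(). getOffset() returns where the phrase starts in the audio stream and getDuration() returns how long it lasts, both in ticks (1 tick = 100 nanoseconds). Neither value is a character position in the text, so they cannot be passed to substring(). Here's an example in Java:

    // result is the SpeechRecognitionResult for one recognized phrase.
    String recognizedText = result.getText();
    // Offset and duration are BigInteger values in ticks (1 tick = 100 ns).
    long startMs = result.getOffset().longValue() / 10_000;
    long endMs = startMs + result.getDuration().longValue() / 10_000;
    System.out.printf("[%d ms - %d ms] %s%n", startMs, endMs, recognizedText);
    

    In this example, result is the SpeechRecognitionResult delivered by the SpeechRecognizer (for example, e.getResult() inside a recognized event listener). recognizedText is the text of one phrase, and startMs and endMs are the start and end times of that phrase in the original audio, converted from ticks to milliseconds. Each recognized phrase is already one segment, split at the pauses the service detected. If you need finer segments, you can request word-level timestamps with speechConfig.requestWordLevelTimestamps() and read the per-word offsets from the detailed JSON result. If the phrases are split too early or too late, adjusting PropertyId.Speech_SegmentationSilenceTimeoutMs on the SpeechConfig may help.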
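Because the Java Speech SDK reports offsets and durations in ticks (100 ns units) rather than character positions, one way to build larger text segments is to merge consecutive phrases that are separated by only a short pause. The following is a minimal sketch of that idea, not SDK code: Phrase, PhraseSegmenter, groupByPause, and the 500 ms threshold are hypothetical names and values chosen for illustration. In a real application the three fields of each Phrase would be filled from result.getText(), result.getOffset().longValue(), and result.getDuration().longValue().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the Speech SDK.
final class PhraseSegmenter {
    // The Speech SDK reports time in ticks; 1 tick = 100 ns, so 10,000 ticks = 1 ms.
    static final long TICKS_PER_MS = 10_000L;

    // One recognized phrase: its text plus offset and duration in ticks.
    static final class Phrase {
        final String text;
        final long offsetTicks;
        final long durationTicks;

        Phrase(String text, long offsetTicks, long durationTicks) {
            this.text = text;
            this.offsetTicks = offsetTicks;
            this.durationTicks = durationTicks;
        }

        long startMs() { return offsetTicks / TICKS_PER_MS; }

        long endMs() { return (offsetTicks + durationTicks) / TICKS_PER_MS; }
    }

    // Joins consecutive phrases into one segment; a new segment starts
    // whenever the silence before a phrase is longer than maxGapMs.
    static List<String> groupByPause(List<Phrase> phrases, long maxGapMs) {
        List<String> segments = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        long previousEndMs = -1; // -1 means no phrase has been seen yet
        for (Phrase p : phrases) {
            boolean longPause = previousEndMs >= 0 && p.startMs() - previousEndMs > maxGapMs;
            if (longPause && current.length() > 0) {
                segments.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(p.text);
            previousEndMs = p.endMs();
        }
        if (current.length() > 0) {
            segments.add(current.toString());
        }
        return segments;
    }

    public static void main(String[] args) {
        List<Phrase> phrases = new ArrayList<>();
        phrases.add(new Phrase("Hello there.", 0L, 10_000_000L));          // 0 ms to 1000 ms
        phrases.add(new Phrase("How are you?", 12_000_000L, 10_000_000L)); // 1200 ms to 2200 ms
        phrases.add(new Phrase("Next topic.", 50_000_000L, 10_000_000L));  // 5000 ms to 6000 ms
        for (String segment : groupByPause(phrases, 500L)) {
            System.out.println(segment);
        }
    }
}
```

With a 500 ms threshold, the first two sample phrases (200 ms apart) end up in one segment and the third (2800 ms later) starts a new one. The threshold is a tuning choice: a smaller value yields more, shorter segments.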


