How to output transcription on a word-level

Question

Sophie 0

With the provided callback function, the text is output as you described: either after a short pause or after a maximum of 15 seconds. Is it possible to output the text word by word, so that it can be seen while the person is still speaking?


def conversation_transcriber_transcribed_cb(evt: speechsdk.SpeechRecognitionEventArgs):
    if evt.result.reason == speechsdk.ResultReason.RecognizedSpeech:
        print('\tText={}'.format(evt.result.text))
        print('\tSpeaker ID={}'.format(evt.result.speaker_id))

Answer 1

Gowtham CP 6,030 Volunteer Moderator

Hello Sophie,

Thanks for reaching out on Microsoft Q&A!

import azure.cognitiveservices.speech as speechsdk

def conversation_transcriber_transcribed_cb(evt: speechsdk.SpeechRecognitionEventArgs):
    # Called once per finalized phrase, not while the phrase is being spoken
    if evt.result.reason == speechsdk.ResultReason.RecognizedSpeech:
        words = evt.result.text.split()  # Split the recognized text into words
        for word in words:
            print(f"\tWord: {word}")  # Output each word on its own line
        print('\tSpeaker ID={}'.format(evt.result.speaker_id))

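To get word-by-word output, I modified the callback to split the recognized text into individual words and print each one separately. Note that the transcribed callback only fires once a phrase has been finalized (after a short pause or the 15-second maximum mentioned in the question), so the words are still printed together at that point rather than while the speaker is talking. To see text during speech, the intermediate results delivered by the transcriber's transcribing event would need to be handled as well.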
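For output while speech is still in progress, one possible approach is to handle intermediate (partial) results and print only the words that are new since the previous partial. The following is a minimal sketch of that idea, not a tested Azure integration. The names `new_words` and `IncrementalPrinter` are hypothetical helpers introduced for illustration. The wiring to the SDK is shown only in comments and assumes that the ConversationTranscriber exposes a `transcribing` event whose `evt.result.text` holds the current partial hypothesis. Partial hypotheses can be revised by the service, so the sketch re-emits everything from the first word that changed.

```python
from typing import List


def new_words(previous: str, current: str) -> List[str]:
    """Return the words of `current` that follow its common word prefix with `previous`."""
    prev_words = previous.split()
    cur_words = current.split()
    common = 0
    for old, new in zip(prev_words, cur_words):
        if old != new:
            break  # the hypothesis was revised from this word onward
        common += 1
    return cur_words[common:]


class IncrementalPrinter:
    """Hypothetical helper: tracks the last partial text and prints only fresh words."""

    def __init__(self) -> None:
        self._last = ""

    def on_partial(self, text: str) -> List[str]:
        fresh = new_words(self._last, text)
        self._last = text
        for word in fresh:
            print(f"\tWord: {word}", flush=True)
        return fresh

    def on_final(self, text: str) -> None:
        # A finalized phrase starts a new utterance, so reset the tracked state.
        self._last = ""


# Assumed wiring (not executed here; requires a configured transcriber):
#   printer = IncrementalPrinter()
#   conversation_transcriber.transcribing.connect(
#       lambda evt: printer.on_partial(evt.result.text))
#   conversation_transcriber.transcribed.connect(
#       lambda evt: printer.on_final(evt.result.text))
```

The final transcribed callback can still be used to print the complete phrase and the speaker ID once the phrase is finalized.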

If you found this solution helpful, consider accepting it.

Gowtham CP 6,030 Reputation points Volunteer Moderator

2024-05-27T08:53:18.7966667+00:00

Hello Sophie,

We haven't heard back from you. Could you please provide an update on your issue? If you have any questions or concerns, feel free to reach out to us. If the information provided has been helpful, please consider marking it as the accepted answer to close this case. Doing so will assist others with similar questions in finding solutions more easily.
