Reduce latency in text to speech with the Microsoft Speech SDK

Nas 0 Reputation points

I am using microsoft-cognitiveservices-speech-sdk in a React codebase. What is the best way to reduce latency and get output as fast as possible?

When I give it text, it takes too many seconds before the output is played, and I need a way to make this more real-time. Is there a way to start playing the sound while synthesis is still in progress instead of waiting for the entire text to be synthesized?

   speechSynthesizer.synthesizing = () => {
     // Start playing audio
   };

Right now, I am playing the sound with the sample code below. It works, but it takes a long time to start speaking:

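      // The opening call was missing from the post; these callbacks are
      // assumed to be the success and error handlers of speakTextAsync.
      speechSynthesizer.speakTextAsync(
        text,
        (result) => {
          audioContext.current.decodeAudioData(result.audioData, (buffer) => {
            if (result.reason === ResultReason.SynthesizingAudioCompleted) {
              const newBufferSource = audioContext.current.createBufferSource();
              newBufferSource.buffer = buffer;
              // Assumed playback step: route to the speakers and start.
              newBufferSource.connect(audioContext.current.destination);
              newBufferSource.start(0);
            }
          });
        },
        (err) => {
          console.error('Speech synthesis error:', err);
        }
      );
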
Azure AI Speech
An Azure service that integrates speech processing into apps and services.

1 answer

  1. navba-MSFT 16,940 Reputation points Microsoft Employee

    @Nas Welcome to the Microsoft Q&A forum, and thank you for posting your query here!

    Please note, Microsoft does not publish any SLA for latency. Latency is a combination of many factors, including your network and client performance, especially when accessing lesser-used voices in text-to-speech.


    • Try the most recent version of the SDK and check whether you still encounter the same issue.
    • Measure the latency: the Speech SDK provides properties for this. SpeechServiceResponse_SynthesisFirstByteLatencyMs measures the delay between the start of the synthesis task and receipt of the first chunk of audio data. Similarly, SpeechServiceResponse_SynthesisFinishLatencyMs measures the delay between the start of the synthesis task and receipt of the whole synthesized audio data.
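    // In the JS SDK, speakTextAsync reports its result through a callback, not a promise.
    synthesizer.speakTextAsync(text, (result) => {
      // Assumes your SDK version exposes these PropertyId members.
      console.log(`first byte latency: ${result.properties.getProperty(sdk.PropertyId.SpeechServiceResponse_SynthesisFirstByteLatencyMs)} ms`);
      console.log(`finish latency: ${result.properties.getProperty(sdk.PropertyId.SpeechServiceResponse_SynthesisFinishLatencyMs)} ms`);
    });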
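
If those properties are not available in your SDK version, the same two delays can be approximated on the client. The following is a minimal sketch, not an SDK feature: `measureSynthesisLatency` is a hypothetical helper, and it assumes the synthesizer exposes the `synthesizing` event and the callback-style `speakTextAsync(text, onResult, onError)`. Note that it overwrites any `synthesizing` handler already set.

```javascript
// Hypothetical helper: client-side timings for one synthesis request.
// `now` is injectable so the function can be tested with a fake clock.
function measureSynthesisLatency(synthesizer, text, now = () => Date.now()) {
  return new Promise((resolve, reject) => {
    const startedAt = now();
    let firstChunkMs = null;

    // Fires for each audio chunk; only the first one is recorded.
    synthesizer.synthesizing = () => {
      if (firstChunkMs === null) {
        firstChunkMs = now() - startedAt;
      }
    };

    synthesizer.speakTextAsync(
      text,
      (result) => {
        const finishMs = now() - startedAt;
        // If no chunk event was seen, fall back to the finish time.
        resolve({ firstChunkMs: firstChunkMs ?? finishMs, finishMs, result });
      },
      (err) => reject(new Error(String(err)))
    );
  });
}
```

Usage would look like `measureSynthesisLatency(synthesizer, text).then(console.log)`. These numbers include network and client overhead, so they will not match the service-reported values exactly.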

    There are a few best practices for lowering speech synthesis latency with the Speech SDK and bringing the best performance to your end users. Please follow the recommendations in the Microsoft Learn article on lowering speech synthesis latency using the Speech SDK.
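
For the original question about starting playback before the whole text is synthesized, here is a minimal sketch of one possible approach: schedule each audio chunk on the Web Audio clock as it arrives. It assumes the synthesizer is configured for a raw 16-bit mono PCM output format (for example `SpeechSynthesisOutputFormat.Raw24Khz16BitMonoPcm`), that the SDK's own speaker playback is not in use, and that each chunk contains whole samples. `pcm16ToFloat32` and `createChunkPlayer` are hypothetical helper names, not SDK functions.

```javascript
// Convert raw 16-bit little-endian mono PCM to Float32 samples in [-1, 1).
function pcm16ToFloat32(chunk) {
  const view = new DataView(chunk);
  const samples = new Float32Array(Math.floor(chunk.byteLength / 2));
  for (let i = 0; i < samples.length; i++) {
    samples[i] = view.getInt16(i * 2, true) / 32768;
  }
  return samples;
}

// Returns a function that plays chunks back to back on the given AudioContext.
function createChunkPlayer(audioContext, sampleRate) {
  let nextStartTime = 0;
  return function playChunk(chunk) {
    const samples = pcm16ToFloat32(chunk);
    if (samples.length === 0) return;

    const buffer = audioContext.createBuffer(1, samples.length, sampleRate);
    buffer.copyToChannel(samples, 0);

    const source = audioContext.createBufferSource();
    source.buffer = buffer;
    source.connect(audioContext.destination);

    // Never schedule in the past; otherwise queue right after the previous chunk.
    nextStartTime = Math.max(nextStartTime, audioContext.currentTime);
    source.start(nextStartTime);
    nextStartTime += buffer.duration;
  };
}

// Assumed wiring inside the React component from the question:
// const playChunk = createChunkPlayer(audioContext.current, 24000);
// speechSynthesizer.synthesizing = (sender, event) => playChunk(event.result.audioData);
```

Building the `AudioBuffer` by hand avoids `decodeAudioData`, which expects a complete encoded file and so is a poor fit for partial chunks. If chunks arrive more slowly than they play, gaps will be audible, so some initial buffering may still be needed.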

    If the above suggestions don't help, you can enable JS SDK logging as shown below:

    sdk.Diagnostics.SetLoggingLevel(sdk.LogLevel.Debug);
    sdk.Diagnostics.SetLogOutputPath("LogfilePathAndName");

    Hope this helps. If you have any follow-up questions, please let me know. I would be happy to help.
