NodeJS STT ContinuousRecognitionAsync - NoMatch, InitialSilenceTimeout

Question

Jatebo 1

I am trying to set up speech-to-text (STT) in Node.js and am having trouble transcribing longer audio.

With the code below, I can transcribe audio from the Azure Samples data. However, whenever I use my own file, no text is recognised, even though running the same file through short-audio transcription recognises and returns the first utterance (about 15 seconds).

I've tried setting a silence timeout in case that solved the problem, but it has had no effect.

Also, with the code below, the first attempt always returns an error before the transcription starts. What is the best way to avoid this?

(async function () {
    // <code>
    "use strict";

    var sdk = require("microsoft-cognitiveservices-speech-sdk");
    var fs = require("fs");


    var subscriptionKey = "MY_SUBSCRIPTION_KEY";
    var serviceRegion = "service region";
    var filename = "myAudioFile.wav";

    var pushStream = sdk.AudioInputStream.createPushStream();


    fs.createReadStream(filename)
        .on("data", function (arrayBuffer) {
            pushStream.write(arrayBuffer.slice());
        })
        .on("end", function () {
            pushStream.close();
        });


    console.log("Now recognizing from: " + filename);


    var audioConfig = sdk.AudioConfig.fromStreamInput(pushStream);
    var speechConfig = sdk.SpeechConfig.fromSubscription(
        subscriptionKey,
        serviceRegion
    );


    speechConfig.speechRecognitionLanguage = "en-US";

    speechConfig.setProperty(
        sdk.PropertyId.SpeechServiceConnection_InitialSilenceTimeoutMs,
        "10000"
    );

    var recognizer = new sdk.SpeechRecognizer(speechConfig, audioConfig);


    let transcript = { result: [] };

    recognizer.startContinuousRecognitionAsync(
        (recognizer.recognized = (recognizer, event) => {
            try {
                const res = event.result;
                console.log(`RECOGNIZED:AZURE Reason=${sdk.ResultReason[res.reason]}`);
                transcript.result.push(res.text);
                console.log(transcript.result.join(" "));
            } catch (error) {
                console.log(`RECOGNIZED:AZURE ErrorDetails=${error.errorDetails}`);
            }
        })
    );
})();
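
A note on the error on the first attempt: in the code above, the `recognizer.recognized = ...` assignment is itself passed as the first argument of `startContinuousRecognitionAsync`, so the same function doubles as the start-success callback. That callback is invoked without arguments, so `event` is undefined and the `catch` branch logs an error before any audio is processed. The following is a minimal sketch of one way to avoid this, assuming `recognizer` is an `sdk.SpeechRecognizer`. The function name `startContinuousRecognition` and the `log` parameter are hypothetical additions for illustration.

```javascript
"use strict";

// Attach event handlers first, then start recognition.
// `recognizer` is expected to be an sdk.SpeechRecognizer; `log` is a
// hypothetical logging hook added for this sketch.
function startContinuousRecognition(recognizer, log = console.log) {
    const transcript = [];

    // Fired once per finalized result, including NoMatch results.
    recognizer.recognized = (_sender, event) => {
        const res = event.result;
        if (res.text) {
            transcript.push(res.text);
        }
        log(`RECOGNIZED: reason=${res.reason} text=${res.text}`);
    };

    // Fired on errors and when the input stream is exhausted.
    recognizer.canceled = (_sender, event) => {
        log(`CANCELED: reason=${event.reason} details=${event.errorDetails}`);
    };

    // Fired when the session ends; stop recognition so the process can exit.
    recognizer.sessionStopped = () => {
        recognizer.stopContinuousRecognitionAsync();
    };

    // The two callbacks here only report whether recognition started.
    recognizer.startContinuousRecognitionAsync(
        () => log("Recognition started"),
        (err) => log(`Failed to start: ${err}`)
    );

    return transcript;
}
```

With the SDK objects from the question, this would be called as `startContinuousRecognition(new sdk.SpeechRecognizer(speechConfig, audioConfig))`. The returned array fills up as utterances are recognized.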

YutongTie-MSFT 53,966 Reputation points Moderator

2022-03-18T00:22:00.273+00:00

Hello @Jatebo

Thanks for reaching out to us. Could you please share the error details? That will help the community understand the issue.

Regards,
Yutong


Jatebo 1 Reputation point

2022-03-18T10:40:01.76+00:00

Hi,

When I run my file it doesn't raise an error; it just returns 'No Match'. The log of the 'res' object is below:

SpeechRecognitionResult {
  privResultId: <result ID>,
  privReason: 0,
  privText: undefined,
  privDuration: 104900000,
  privOffset: 0,
  privLanguage: undefined,
  privLanguageDetectionConfidence: undefined,
  privErrorDetails: undefined,
  privJson: '{"Id":<result ID>,"RecognitionStatus":"InitialSilenceTimeout","Offset":0,"Duration":104900000}',
  privProperties: PropertyCollection {
    privKeys: [ 'SpeechServiceResponse_JsonResult' ],
    privValues: [
      '{"Id":<result ID>,"RecognitionStatus":"InitialSilenceTimeout","Offset":0,"Duration":104900000}'
    ]
  },
  privSpeakerId: undefined
}
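
A `RecognitionStatus` of `InitialSilenceTimeout` indicates that the service detected no speech at the start of the audio it received. One possible cause, not confirmed anywhere in this thread, is that the file's format differs from what the push stream declares. As far as I know, `createPushStream()` without arguments assumes 16 kHz, 16-bit, mono PCM. The sketch below reads the `fmt ` chunk of a WAV header using only Node.js built-ins so the file can be checked. `readWavFormat` and `isDefaultSpeechFormat` are hypothetical helpers, not part of the Speech SDK.

```javascript
"use strict";

// Parse the "fmt " chunk of a RIFF/WAVE buffer and return the basic format.
function readWavFormat(buffer) {
    if (buffer.length < 12 ||
        buffer.toString("ascii", 0, 4) !== "RIFF" ||
        buffer.toString("ascii", 8, 12) !== "WAVE") {
        throw new Error("Not a RIFF/WAVE file");
    }
    let offset = 12;
    // Walk the chunk list until the "fmt " chunk is found.
    while (offset + 8 <= buffer.length) {
        const id = buffer.toString("ascii", offset, offset + 4);
        const size = buffer.readUInt32LE(offset + 4);
        if (id === "fmt ") {
            if (offset + 8 + 16 > buffer.length) {
                throw new Error("Truncated fmt chunk");
            }
            return {
                audioFormat: buffer.readUInt16LE(offset + 8), // 1 = PCM
                channels: buffer.readUInt16LE(offset + 10),
                sampleRate: buffer.readUInt32LE(offset + 12),
                bitsPerSample: buffer.readUInt16LE(offset + 22),
            };
        }
        // Chunks are padded to an even number of bytes.
        offset += 8 + size + (size % 2);
    }
    throw new Error("No fmt chunk found");
}

// True when the format is 16 kHz, 16-bit, mono PCM (the assumed default).
function isDefaultSpeechFormat(fmt) {
    return fmt.audioFormat === 1 && fmt.channels === 1 &&
        fmt.sampleRate === 16000 && fmt.bitsPerSample === 16;
}
```

For example, `isDefaultSpeechFormat(readWavFormat(fs.readFileSync("myAudioFile.wav")))` would report whether the file matches. If it does not, two options to try are converting the file, or declaring its real format by passing `sdk.AudioStreamFormat.getWaveFormatPCM(sampleRate, bitsPerSample, channels)` to `createPushStream`.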

Jatebo 1 Reputation point

2022-03-18T10:44:53.683+00:00

When successful, the res object looks like this:

SpeechRecognitionResult {
  privResultId: '<result ID>',
  privReason: 3,
  privText: 'The ocelot leopardus Pardalis is a small Wildcat native to the southwestern United States, Mexico, and Central and South America.',
  privDuration: 92500000,
  privOffset: 10300000,
  privLanguage: undefined,
  privLanguageDetectionConfidence: undefined,
  privErrorDetails: undefined,
  privJson: '{"Id":"<result ID>","RecognitionStatus":"Success","DisplayText":"The ocelot leopardus Pardalis is a small Wildcat native to the southwestern United States, Mexico, and Central and South America.","Offset":10300000,"Duration":92500000}',
  privProperties: PropertyCollection {
    privKeys: [ 'SpeechServiceResponse_JsonResult' ],
    privValues: [
      '{"Id":"<result ID>","RecognitionStatus":"Success","DisplayText":"The ocelot leopardus Pardalis is a small Wildcat native to the southwestern United States, Mexico, and Central and South America","Offset":10300000,"Duration":92500000}'
    ]
  },
  privSpeakerId: undefined
}
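
To make the two logs easier to compare: the `Offset` and `Duration` values in the service JSON are in 100-nanosecond ticks, so they can be divided by 10,000,000 to get seconds. The sketch below is only an illustration of that conversion. `summarizeResult` is a hypothetical helper, and the sample string uses a quoted placeholder for `Id` so that it parses as JSON.

```javascript
"use strict";

const TICKS_PER_SECOND = 10000000; // one tick = 100 ns

// Turn the service's JSON result into a small, readable summary.
function summarizeResult(json) {
    const parsed = JSON.parse(json);
    return {
        status: parsed.RecognitionStatus,
        text: parsed.DisplayText || "",
        offsetSeconds: parsed.Offset / TICKS_PER_SECOND,
        durationSeconds: parsed.Duration / TICKS_PER_SECOND,
    };
}

const noMatch = summarizeResult(
    '{"Id":"<result ID>","RecognitionStatus":"InitialSilenceTimeout","Offset":0,"Duration":104900000}'
);
console.log(noMatch.status, noMatch.durationSeconds);
// prints: InitialSilenceTimeout 10.49
```

Read this way, the failing result covers roughly the first 10.49 seconds of audio, close to the 10000 ms silence timeout set in the question. The successful result starts at 1.03 seconds and lasts 9.25 seconds. Inside a `recognized` handler, the JSON string should be available through `event.result.properties.getProperty(sdk.PropertyId.SpeechServiceResponse_JsonResult)`.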
