When I transcribe the wave file linked below with Microsoft's Speech-to-Text service, the word "No" spoken at around the 57-second mark is not detected, although "No" is detected at about 1:12 and in other places.
The recognition output is as follows:
RECOGNIZED: {"Confidence":0.86026245,"Lexical":"yeah i need an appointment","ITN":"yeah i need an appointment","MaskedITN":"yeah i need an appointment","Display":"Yeah, I need an appointment.","Words":[{"Word":"yeah","Offset":121200000,"Duration":3600000},{"Word":"i","Offset":124800000,"Duration":400000},{"Word":"need","Offset":125200000,"Duration":2000000},{"Word":"an","Offset":127200000,"Duration":1200000},{"Word":"appointment","Offset":128400000,"Duration":6000000}]}
RECOGNIZED: {"Confidence":0.702717,"Lexical":"yes","ITN":"yes","MaskedITN":"yes","Display":"Yes.","Words":[{"Word":"yes","Offset":228900000,"Duration":5600000}]}
RECOGNIZED: {"Confidence":0.4998704,"Lexical":"morning","ITN":"morning","MaskedITN":"morning","Display":"Morning.","Words":[{"Word":"morning","Offset":355500000,"Duration":7200000}]}
2024-05-16T05:59:30.827Z [debug] microsoft-stt :: No speech could be recognized
2024-05-16T05:59:30.829Z [debug] microsoft-stt :: No speech could be recognized
RECOGNIZED: {"Confidence":0.797473,"Lexical":"no","ITN":"no","MaskedITN":"no","Display":"No.","Words":[{"Word":"no","Offset":707600000,"Duration":4000000}]}
RECOGNIZED: {"Confidence":0.76081145,"Lexical":"yes","ITN":"yes","MaskedITN":"yes","Display":"Yes.","Words":[{"Word":"yes","Offset":812000000,"Duration":6400000}]}
RECOGNIZED: {"Confidence":0.54089016,"Lexical":"yes","ITN":"yes","MaskedITN":"yes","Display":"Yes.","Words":[{"Word":"yes","Offset":944700000,"Duration":6800000}]}
RECOGNIZED: {"Confidence":0.38486534,"Lexical":"no","ITN":"no","MaskedITN":"no","Display":"No.","Words":[{"Word":"no","Offset":1121500000,"Duration":5600000}]}
2024-05-16T06:00:02.350Z [debug] microsoft-stt :: CANCELED: Reason=1
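To line the log up with the audio timeline, the `Offset` and `Duration` values can be converted to seconds. This is a minimal sketch, assuming the values are in 100-nanosecond ticks (which is how the Speech SDK documents them) and that each result line has the form `RECOGNIZED: {json}` as in the log above. The helper name `word_times` is hypothetical and not part of any SDK.

```python
import json

# Assumption: Offset and Duration are expressed in 100-nanosecond ticks.
TICKS_PER_SECOND = 10_000_000


def word_times(log_line):
    """Return (word, start_seconds, end_seconds) for one 'RECOGNIZED: {...}' line.

    Lines that do not start with the 'RECOGNIZED: ' prefix yield an empty list.
    """
    prefix = "RECOGNIZED: "
    if not log_line.startswith(prefix):
        return []
    result = json.loads(log_line[len(prefix):])
    times = []
    for w in result.get("Words", []):
        start = w["Offset"] / TICKS_PER_SECOND
        end = (w["Offset"] + w["Duration"]) / TICKS_PER_SECOND
        times.append((w["Word"], start, end))
    return times


# One line copied from the log above.
line = (
    'RECOGNIZED: {"Confidence":0.797473,"Lexical":"no","ITN":"no",'
    '"MaskedITN":"no","Display":"No.",'
    '"Words":[{"Word":"no","Offset":707600000,"Duration":4000000}]}'
)
for word, start, end in word_times(line):
    print(f"{word}: {start:.2f}s - {end:.2f}s")
```

Under that assumption, the recognized words in the log start at about 12.1 s, 22.9 s, 35.5 s, 70.8 s, 81.2 s, 94.5 s and 112.2 s. Nothing is reported between roughly 36 s and 71 s, which is the window that contains the missing "No" at 57 s and where the two "No speech could be recognized" messages appear in the log.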
Input wave file:
https://meeamitech-my.sharepoint.com/:u:/p/vidyadhar_busam/EdmkpIY-zDlCuZFhzQRq0qYBr7PmG73wJaT0hQYW4hZdxg?nav=eyJyZWZlcnJhbEluZm8iOnsicmVmZXJyYWxBcHAiOiJPbmVEcml2ZUZvckJ1c2luZXNzIiwicmVmZXJyYWxBcHBQbGF0Zm9ybSI6IldlYiIsInJlZmVycmFsTW9kZSI6InZpZXciLCJyZWZlcnJhbFZpZXciOiJNeUZpbGVzTGlua0NvcHkifX0&e=Div3Z2
Please fix this issue.
Thanks.