How to send an in-memory WAV file to Azure Speech-to-Text
I've been trying to get speech-to-text working in a .NET API. I receive a WAV file as an IFormFile from an API and then send it to Azure Speech using the Cognitive Services SDK. When I send the file, it returns this error: "Result = {ResultId:Reason:NoMatch Recognized text:<>.
Json:{"Id":"","RecognitionStatus":"InitialSilenceTimeout",
"DisplayText":"","Offset":15600000,"Duration":75100000,"Channel":0}}"
If I save the file to a local directory and then read it back, creating the recognizer like this:
using var audioConfig = AudioConfig.FromWavFileInput(newFile);
using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);
as shown in the docs (https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-recognize-speech?pivots=programming-language-csharp), it works. How can I make this work without saving the file locally?
I tried this:
private async void ReadFromStream(SpeechConfig speechConfig, IFormFile audioFile)
{
    speechConfig.SetProperty(PropertyId.SpeechServiceConnection_InitialSilenceTimeoutMs, "10000");
    using var audioFormat = AudioStreamFormat.GetWaveFormatPCM(16000, 16, 1);
    using (var audioConfigStream = new PushAudioInputStream(audioFormat))
    {
        using (var audioConfig = AudioConfig.FromStreamInput(audioConfigStream))
        {
            using (var speechRecognizer = new SpeechRecognizer(speechConfig, audioConfig))
            {
                audioFile.Headers.Clear();
                var bytes = ConvertToByteArrayContent(audioFile);
                audioConfigStream.Write(bytes);
                audioConfigStream.Close();
                var speechRecognitionResult = speechRecognizer.RecognizeOnceAsync().Result;
                OutputSpeechRecognitionResult(speechRecognitionResult);
            }
        }
    }
}
private byte[] ConvertToByteArrayContent(IFormFile audiofile)
{
    byte[] data;
    using (var br = new BinaryReader(audiofile.OpenReadStream()))
    {
        data = br.ReadBytes((int)audiofile.OpenReadStream().Length);
    }
    return data;
}
I also tried many other approaches.