
How to send an in-memory WAV file to Speech to text

Larissa de Araújo Barros 0 Reputation points
2023-09-05T19:11:27.3266667+00:00

I've been trying to get speech to text working in a .NET API. I receive a WAV file as an IFormFile from an API and then send it to Azure Speech using the Cognitive Services SDK. When I send the file, recognition returns this result:

"Result = {ResultId:Reason:NoMatch Recognized text:<>. Json:{"Id":"","RecognitionStatus":"InitialSilenceTimeout","DisplayText":"","Offset":15600000,"Duration":75100000,"Channel":0}}"

If I save the file to a local directory and then read it back, creating the recognizer like this:

using var audioConfig = AudioConfig.FromWavFileInput(newFile);
using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);

as shown in the docs (https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-recognize-speech?pivots=programming-language-csharp), it works. How can I make this work without saving the file locally?

I tried this:

private async void ReadFromStream(SpeechConfig speechConfig, IFormFile audioFile)
        {
            speechConfig.SetProperty(PropertyId.SpeechServiceConnection_InitialSilenceTimeoutMs, "10000");

            using var audioFormat = AudioStreamFormat.GetWaveFormatPCM(16000, 16, 1);
            using (var audioConfigStream = new PushAudioInputStream(audioFormat))
            {
                using (var audioConfig = AudioConfig.FromStreamInput(audioConfigStream))
                {

                    using (var speechRecognizer = new SpeechRecognizer(speechConfig, audioConfig))
                    {
                        audioFile.Headers.Clear();

                        var bytes = ConvertToByteArrayContent(audioFile);

                        audioConfigStream.Write(bytes);
                        audioConfigStream.Close();

                        var speechRecognitionResult = speechRecognizer.RecognizeOnceAsync().Result;
                        OutputSpeechRecognitionResult(speechRecognitionResult);

                    }
                }
            }

        }

private byte[] ConvertToByteArrayContent(IFormFile audiofile)
        {
            byte[] data;

            using (var br = new BinaryReader(audiofile.OpenReadStream()))
            {
                data = br.ReadBytes((int)audiofile.OpenReadStream().Length);
            }

            return data;
        }

I also tried many other approaches.
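
One thing worth checking, offered as an assumption rather than a confirmed diagnosis: the attempt above writes the entire uploaded file, including its RIFF/WAV header, into a stream declared as raw 16 kHz, 16-bit, mono PCM. If the upload uses a different sample rate or channel count, the declared format will not match the audio. The sketch below reads the header from an in-memory stream and returns the real format together with the raw PCM bytes. WavInfo and WavParser are hypothetical names introduced for illustration; they are not part of the Speech SDK, and the parser only handles uncompressed PCM files with a known data chunk size.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper types for illustration; not part of the Speech SDK.
public sealed record WavInfo(int SampleRate, short BitsPerSample, short Channels, byte[] PcmData);

public static class WavParser
{
    // Walks the RIFF chunks of a WAV stream and returns the format fields
    // plus the contents of the "data" chunk (raw PCM without any header).
    public static WavInfo Parse(Stream wav)
    {
        using var reader = new BinaryReader(wav, Encoding.ASCII, leaveOpen: true);

        if (ReadTag(reader) != "RIFF")
            throw new InvalidDataException("Not a RIFF file.");
        reader.ReadInt32(); // total RIFF size, not needed here
        if (ReadTag(reader) != "WAVE")
            throw new InvalidDataException("Not a WAVE file.");

        int sampleRate = 0;
        short bitsPerSample = 0, channels = 0;
        bool haveFormat = false;

        while (true)
        {
            string chunkId = ReadTag(reader);
            int chunkSize = reader.ReadInt32(); // throws EndOfStreamException if no data chunk exists
            if (chunkSize < 0)
                throw new InvalidDataException("Unsupported chunk size.");

            if (chunkId == "fmt ")
            {
                if (chunkSize < 16)
                    throw new InvalidDataException("Truncated fmt chunk.");
                short formatTag = reader.ReadInt16(); // 1 means uncompressed PCM
                channels = reader.ReadInt16();
                sampleRate = reader.ReadInt32();
                reader.ReadInt32(); // byte rate, derivable from the other fields
                reader.ReadInt16(); // block align
                bitsPerSample = reader.ReadInt16();
                reader.ReadBytes(chunkSize - 16 + (chunkSize & 1)); // skip extension and padding
                if (formatTag != 1)
                    throw new InvalidDataException("Only uncompressed PCM is handled in this sketch.");
                haveFormat = true;
            }
            else if (chunkId == "data")
            {
                if (!haveFormat)
                    throw new InvalidDataException("data chunk found before fmt chunk.");
                byte[] pcm = reader.ReadBytes(chunkSize);
                return new WavInfo(sampleRate, bitsPerSample, channels, pcm);
            }
            else
            {
                // Unknown chunk (for example LIST); chunks are padded to an even size.
                reader.ReadBytes(chunkSize + (chunkSize & 1));
            }
        }
    }

    private static string ReadTag(BinaryReader reader) =>
        Encoding.ASCII.GetString(reader.ReadBytes(4));
}
```

With the question's code, this could be called as WavParser.Parse(audioFile.OpenReadStream()). The parsed values, cast to the parameter types of AudioStreamFormat.GetWaveFormatPCM, would replace the hard-coded 16000, 16, 1, and PcmData would be written to the PushAudioInputStream instead of the whole file. Whether this resolves the InitialSilenceTimeout depends on the actual upload, so treat it as a diagnostic step.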

Azure Speech in Foundry Tools
