Microsoft.CognitiveServices.Speech Text to Speech save to Azure Blob or Convert to Wav Byte Array

Question

I am trying to convert the Microsoft.CognitiveServices.Speech Text to Speech output to a WAV file byte array, or to save it as a WAV file in Azure Blob storage.

I have read the documentation, and the only available options are to save a WAV file on the local machine or to get a byte array that is in raw PCM format rather than WAV file format.

Any direction on converting PCM to WAV, or on saving the file directly to Azure Blob storage, would be helpful.

Accepted Answer

I found my answer here. Using the REST API, I was able to receive a stream and save that stream as a file in Azure Blob storage:

https://stackoverflow.com/questions/57915170/how-to-use-azures-text-to-speech-to-create-an-audio-file-instead-of-live-text-t
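
For anyone who wants the last step spelled out, here is a minimal sketch of saving an audio stream to a blob. It assumes the Azure.Storage.Blobs NuGet package; the class name BlobAudioUploader and the parameters connectionString, containerName and blobName are placeholders of mine, not something taken from the linked answer.

```csharp
using System.IO;
using System.Threading.Tasks;
using Azure.Storage.Blobs;

public static class BlobAudioUploader
{
    // Uploads the given audio stream to a blob, replacing any existing blob with the same name.
    // connectionString, containerName and blobName are hypothetical placeholders.
    public static async Task UploadWavAsync(
        Stream audio, string connectionString, string containerName, string blobName)
    {
        var container = new BlobContainerClient(connectionString, containerName);
        await container.CreateIfNotExistsAsync();

        BlobClient blob = container.GetBlobClient(blobName);
        await blob.UploadAsync(audio, overwrite: true);
    }
}
```

If the stream is a MemoryStream that was just written to, its Position probably needs to be reset to 0 before the upload, otherwise an empty blob may be stored.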

Answer

@Ramr-msft thanks for your reply.

public async Task<byte[]> AzureSynthesisToBytesAsync(string audioText, string Id, ILogger _logger)
{
    var config = SpeechConfig.FromSubscription("key", "westus");
    config.SpeechSynthesisVoiceName = "en-US-AriaNeural";
    // Raw* formats return headerless PCM samples, so result.AudioData carries no WAV (RIFF) header.
    config.SetSpeechSynthesisOutputFormat(SpeechSynthesisOutputFormat.Raw8Khz16BitMonoPcm);

    // Passing null as the audio config keeps the audio in memory instead of playing it on the default speaker.
    using (var synthesizer = new SpeechSynthesizer(config, null))
    using (SpeechSynthesisResult result = await synthesizer.SpeakSsmlAsync(audioText))
    {
        if (result.Reason == ResultReason.SynthesizingAudioCompleted)
        {
            // Saving through AudioDataStream produces a playable WAV file on local disk.
            using (AudioDataStream stream = AudioDataStream.FromResult(result))
            {
                await stream.SaveToWaveFileAsync("c://temp/TestAudio_" + DateTime.Now.Hour.ToString() + "_" + DateTime.Now.Minute.ToString() + ".wav");
            }

            // These bytes are what I tried to save to the blob; they play back as noise.
            return result.AudioData;
        }
        else if (result.Reason == ResultReason.Canceled)
        {
            var cancellation = SpeechSynthesisCancellationDetails.FromResult(result);
            _logger.LogInformation("AzureSynthesisController_AzureSynthesisToBytesAsync", new Dictionary<string, string> { { "Id", Id }, { "message", $"CANCELED: Reason={cancellation.Reason}" }, { "result.Reason", "ResultReason.Canceled" } });

            if (cancellation.Reason == CancellationReason.Error)
            {
                _logger.LogError("ResultReason.Canceled", new Dictionary<string, string>
                {
                    { "Id", Id },
                    { "ErrorMethod", "AzureSynthesisController_AzureSynthesisToBytesAsync" },
                    { "message", $"CANCELED: ErrorCode={cancellation.ErrorCode}" },
                    { "messageDetails", $"CANCELED: ErrorDetails=[{cancellation.ErrorDetails}]" },
                    { "messageDetails2", "CANCELED: Did you update the subscription info?" },
                    { "result.Reason", "CancellationReason.Error" }
                });
            }
        }

        // No audio was produced (canceled or unexpected result); return null instead of retrying forever.
        return null;
    }
}

I am using the Azure Cognitive Services Speech SDK for .NET Core, and I am trying a few different angles here.

The issue I am having is that when I try to save result.AudioData as a WAV file, it sounds like noise.

I used SaveToWaveFileAsync and the file saves perfectly, but I want to save the file to Azure Blob storage. The only thing I can think of is writing the byte array to a MemoryStream and saving that stream to Azure Blob storage, but when I do this the file is not playable.
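
One angle that may explain the noise: with a Raw* output format the byte array holds only PCM samples, and a WAV player expects a 44-byte RIFF header in front of them. Below is a rough sketch of adding that header by hand. WavHelper and PcmToWav are names I made up for illustration, and the defaults (8000 Hz, 16-bit, mono) are assumed to match Raw8Khz16BitMonoPcm.

```csharp
using System;
using System.IO;
using System.Text;

public static class WavHelper
{
    // Wraps headerless PCM samples in a canonical 44-byte RIFF/WAVE header.
    // Defaults are an assumption matching Raw8Khz16BitMonoPcm.
    public static byte[] PcmToWav(byte[] pcm, int sampleRate = 8000, short bitsPerSample = 16, short channels = 1)
    {
        short blockAlign = (short)(channels * bitsPerSample / 8); // bytes per sample frame
        int byteRate = sampleRate * blockAlign;                   // bytes per second

        using (var ms = new MemoryStream(44 + pcm.Length))
        using (var writer = new BinaryWriter(ms, Encoding.ASCII))
        {
            writer.Write(Encoding.ASCII.GetBytes("RIFF"));
            writer.Write(36 + pcm.Length);                 // size of everything after this field
            writer.Write(Encoding.ASCII.GetBytes("WAVE"));
            writer.Write(Encoding.ASCII.GetBytes("fmt "));
            writer.Write(16);                              // fmt chunk size for PCM
            writer.Write((short)1);                        // audio format 1 = uncompressed PCM
            writer.Write(channels);
            writer.Write(sampleRate);
            writer.Write(byteRate);
            writer.Write(blockAlign);
            writer.Write(bitsPerSample);
            writer.Write(Encoding.ASCII.GetBytes("data"));
            writer.Write(pcm.Length);                      // size of the sample data
            writer.Write(pcm);
            writer.Flush();
            return ms.ToArray();
        }
    }
}
```

Usage would be something like byte[] wav = WavHelper.PcmToWav(result.AudioData); followed by new MemoryStream(wav) for the blob upload. Alternatively, the SDK's Riff* output formats (for example SpeechSynthesisOutputFormat.Riff8Khz16BitMonoPcm) should, as far as I can tell, make result.AudioData include the header already, which would make this helper unnecessary.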
