Microsoft.CognitiveServices.Speech Text to Speech save to Azure Blob or Convert to Wav Byte Array

Ahmed Bravo 101 Reputation points
2021-06-03T01:26:37.197+00:00

I am trying to convert Microsoft.CognitiveServices.Speech text-to-speech output to a WAV file byte array, or to save a WAV file to Azure Blob.

I have read the documentation, and the only available methods are to save to a WAV file on the local machine or to get a byte array that is in PCM format rather than WAV file format.

Any direction on converting PCM to WAV or saving the file directly to Azure Blob would be helpful.
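
For the PCM-to-WAV part, one possible approach is to prepend a canonical 44-byte RIFF/WAVE header to the raw samples. The sketch below is illustrative only: `WavHelper` and `PcmToWav` are hypothetical names, not part of the Speech SDK, and the code assumes uncompressed linear PCM whose sample rate, bit depth, and channel count are known to the caller.

```csharp
using System;
using System.IO;
using System.Text;

public static class WavHelper
{
    // Hypothetical helper: wraps headerless linear PCM samples in a 44-byte RIFF/WAVE header.
    public static byte[] PcmToWav(byte[] pcm, int sampleRate, short bitsPerSample, short channels)
    {
        int byteRate = sampleRate * channels * bitsPerSample / 8;
        short blockAlign = (short)(channels * bitsPerSample / 8);

        using (var ms = new MemoryStream(44 + pcm.Length))
        using (var writer = new BinaryWriter(ms, Encoding.ASCII))
        {
            // BinaryWriter emits little-endian integers, which is what the WAV format expects.
            writer.Write(Encoding.ASCII.GetBytes("RIFF"));
            writer.Write(36 + pcm.Length);      // RIFF chunk size: total file size minus 8 bytes
            writer.Write(Encoding.ASCII.GetBytes("WAVE"));
            writer.Write(Encoding.ASCII.GetBytes("fmt "));
            writer.Write(16);                   // size of the fmt chunk for PCM
            writer.Write((short)1);             // audio format 1 = linear PCM
            writer.Write(channels);
            writer.Write(sampleRate);
            writer.Write(byteRate);
            writer.Write(blockAlign);
            writer.Write(bitsPerSample);
            writer.Write(Encoding.ASCII.GetBytes("data"));
            writer.Write(pcm.Length);           // size of the sample data in bytes
            writer.Write(pcm);                  // raw samples, written without a length prefix
            writer.Flush();
            return ms.ToArray();
        }
    }
}
```

For 8 kHz, 16-bit, mono PCM the call would look like `byte[] wav = WavHelper.PcmToWav(pcmBytes, 8000, 16, 1);`, where `pcmBytes` stands for the headerless audio data.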

Azure AI services

1 additional answer

  1. Ahmed Bravo 101 Reputation points
    2021-06-03T16:00:45.777+00:00

    @Ramr-msft thanks for your reply.

    public async Task<byte[]> AzureSynthesisToBytesAsync(string audioText, string Id, ILogger<AzureSynthesisController> _logger)
    {
        var config = SpeechConfig.FromSubscription("key", "westus");
        config.SpeechSynthesisVoiceName = "en-US-AriaNeural";
        config.SetSpeechSynthesisOutputFormat(SpeechSynthesisOutputFormat.Raw8Khz16BitMonoPcm);

        // Passing null as the AudioConfig keeps the audio in memory instead of playing it on the default speaker.
        using (var synthesizer = new SpeechSynthesizer(config, null))
        using (SpeechSynthesisResult result = await synthesizer.SpeakSsmlAsync(audioText))
        {
            if (result.Reason == ResultReason.SynthesizingAudioCompleted)
            {
                // Save a copy to local disk through AudioDataStream.
                using (AudioDataStream stream = AudioDataStream.FromResult(result))
                {
                    await stream.SaveToWaveFileAsync("c://temp/TestAudio_" + DateTime.Now.Hour.ToString() + "_" + DateTime.Now.Minute.ToString() + ".wav");
                }

                //_logger.LogInformation("AzureSynthesisController_AzureSynthesisToBytesAsync", new Dictionary<string, string> { { "Id", Id }, { "ResultReasonMessage", "SynthesizingAudioCompleted" }, { "OriginalText", audioText } });

                // Return the synthesized audio bytes to the caller.
                return result.AudioData;
            }

            if (result.Reason == ResultReason.Canceled)
            {
                var cancellation = SpeechSynthesisCancellationDetails.FromResult(result);
                _logger.LogInformation("AzureSynthesisController_AzureSynthesisToBytesAsync", new Dictionary<string, string> { { "Id", Id }, { "message", $"CANCELED: Reason={cancellation.Reason}" }, { "result.Reason", "ResultReason.Canceled" } });

                if (cancellation.Reason == CancellationReason.Error)
                {
                    _logger.LogError("ResultReason.Canceled", new Dictionary<string, string>
                    {
                        { "Id", Id },
                        { "ErrorMethod", "AzureSynthesisController_AzureSynthesisToBytesAsync" },
                        { "message", $"CANCELED: ErrorCode={cancellation.ErrorCode}" },
                        { "messageDetails", $"CANCELED: ErrorDetails=[{cancellation.ErrorDetails}]" },
                        { "messageDetails2", "CANCELED: Did you update the subscription info?" },
                        { "result.Reason", "CancellationReason.Error" }
                    });
                }
            }

            // Nothing was synthesized; return an empty array instead of retrying in an endless loop.
            return Array.Empty<byte>();
        }
    }
    

    I am using the Azure Cognitive Services Speech SDK for .NET Core, and I am trying a few different approaches here.

    The issue I am having is that when I try to save result.AudioData as a WAV file, it sounds like noise.

    I used SaveToWaveFileAsync and the file saves perfectly, but I want to save the file to Azure Blob. The only thing I can think of is wrapping the byte array in a MemoryStream and uploading that stream to Azure Blob, but when I do this the file is not playable.
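
A possible explanation for the noise is that the `Raw*` output formats carry no audio header, so a player cannot tell how to interpret the bytes. The sketch below shows one way the MemoryStream idea could be made to work, assuming the `Azure.Storage.Blobs` package is referenced: it switches to `Riff8Khz16BitMonoPcm` so that `result.AudioData` should already begin with a WAV header, then uploads the bytes. `SpeechToBlob`, `UploadSpeechAsync`, and the `connectionString`, `containerName`, and `blobName` parameters are hypothetical placeholders, and this has not been run against a live subscription.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Microsoft.CognitiveServices.Speech;

public static class SpeechToBlob
{
    // Sketch only: all string parameters are placeholders supplied by the caller.
    public static async Task UploadSpeechAsync(SpeechConfig config, string ssml,
        string connectionString, string containerName, string blobName)
    {
        // Assumption: a Riff* format makes the returned bytes a complete WAV file.
        config.SetSpeechSynthesisOutputFormat(SpeechSynthesisOutputFormat.Riff8Khz16BitMonoPcm);

        // Null AudioConfig: keep the audio in memory rather than playing it.
        using (var synthesizer = new SpeechSynthesizer(config, null))
        using (SpeechSynthesisResult result = await synthesizer.SpeakSsmlAsync(ssml))
        {
            if (result.Reason != ResultReason.SynthesizingAudioCompleted)
            {
                throw new InvalidOperationException($"Synthesis did not complete: {result.Reason}");
            }

            var container = new BlobContainerClient(connectionString, containerName);
            await container.CreateIfNotExistsAsync();
            BlobClient blob = container.GetBlobClient(blobName);

            // Wrap the bytes in a stream and upload, replacing any existing blob of the same name.
            using (var stream = new MemoryStream(result.AudioData))
            {
                await blob.UploadAsync(stream, overwrite: true);
            }
        }
    }
}
```

Keeping the format change next to the upload makes the dependency explicit: the upload step itself does not alter the bytes, so whether the blob is playable depends on whether the synthesized data already includes a header.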
