How can I obtain the audio file from a text-to-speech resource in Azure Speech Services?


Cristian Camilo Bonilla Tellez 25

Good evening,

I would like to know whether there is a way to obtain an audio file, such as .wav or .mp3, from the text-to-speech service using Python or C# code. When I call the text-to-speech API, the text is played on my PC with the voice selected in the request, but I need the file.

Thank you for your help.

Accepted answer

Answer 1

@Cristian Camilo Bonilla Tellez Yes, you can synthesize the text to an audio file by passing an AudioOutputConfig with a file name to the SpeechSynthesizer class. Here is a Python sample that writes the output to a .wav file.

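    import azure.cognitiveservices.speech as speechsdk

    # Performs speech synthesis to a wave file.
    # Replace these placeholders with your own subscription key and service region.
    speech_key, service_region = "YourSubscriptionKey", "YourServiceRegion"
    # Creates an instance of a speech config with specified subscription key and service region.
    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)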
    # Creates a speech synthesizer using file as audio output.
    # Replace with your own audio file name.
    file_name = "outputaudio.wav"
    file_config = speechsdk.audio.AudioOutputConfig(filename=file_name)
    speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=file_config)

    # Receives a text from console input and synthesizes it to wave file.
    while True:
        print("Enter some text that you want to synthesize, Ctrl-Z to exit")
        try:
            text = input()
        except EOFError:
            break
        result = speech_synthesizer.speak_text_async(text).get()
        # Check result
        if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
            print("Speech synthesized for text [{}], and the audio was saved to [{}]".format(text, file_name))
        elif result.reason == speechsdk.ResultReason.Canceled:
            cancellation_details = result.cancellation_details
            print("Speech synthesis canceled: {}".format(cancellation_details.reason))
            if cancellation_details.reason == speechsdk.CancellationReason.Error:
                print("Error details: {}".format(cancellation_details.error_details))

You can find the complete sample for this snippet, along with other possible scenarios, in the Speech SDK GitHub repo.
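Once the sample has finished and outputaudio.wav exists, you may want to confirm that the file really contains audio. The following is a minimal sketch using only the Python standard library; it assumes the file is a standard PCM .wav, and the helper name describe_wav is hypothetical (it is not part of the Speech SDK).

```python
import os
import wave


def describe_wav(path):
    """Read basic properties of a PCM .wav file from its header."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_width_bytes": wav_file.getsampwidth(),
            "sample_rate_hz": rate,
            # Duration is the number of frames divided by frames per second.
            "duration_seconds": frames / rate if rate else 0.0,
        }


if __name__ == "__main__":
    file_name = "outputaudio.wav"  # same file name as in the sample above
    if os.path.exists(file_name):
        print(describe_wav(file_name))
    else:
        print("File not found: {}".format(file_name))
```

A duration of zero, or an error when opening the file, would suggest that synthesis did not complete or that the file is not PCM .wav data.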

To use a particular voice, set it on the speech config before creating the synthesizer, choosing from the voices available in your region. You can then use the same audio config to synthesize to a file.

    voice = "Microsoft Server Speech Text to Speech Voice (en-US, JennyNeural)"
    speech_config.speech_synthesis_voice_name = voice
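Since the question also mentions MP3, here is a rough sketch of how the same approach might produce an .mp3 file by setting the output format on the speech config. The function name synthesize_to_mp3 and its parameters are hypothetical, and the chosen format (Audio16Khz32KBitRateMonoMp3) is just one of the MP3 options in the SDK; treat this as an untested outline and check it against the SDK version you have installed.

```python
def synthesize_to_mp3(text, file_name, speech_key, service_region, voice=None):
    """Synthesize text to an MP3 file; return True if synthesis completed."""
    # Imported here so the function can be defined without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
    # Request MP3 audio instead of the WAV output used in the sample above.
    speech_config.set_speech_synthesis_output_format(
        speechsdk.SpeechSynthesisOutputFormat.Audio16Khz32KBitRateMonoMp3)
    # Optionally pick a voice; this is set before the synthesizer is created.
    if voice:
        speech_config.speech_synthesis_voice_name = voice

    # Write the synthesized audio to the given file instead of the speaker.
    file_config = speechsdk.audio.AudioOutputConfig(filename=file_name)
    synthesizer = speechsdk.SpeechSynthesizer(
        speech_config=speech_config, audio_config=file_config)

    result = synthesizer.speak_text_async(text).get()
    if result.reason == speechsdk.ResultReason.Canceled:
        details = result.cancellation_details
        print("Speech synthesis canceled: {}".format(details.reason))
        if details.reason == speechsdk.CancellationReason.Error:
            print("Error details: {}".format(details.error_details))
    return result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted
```

It could be called as synthesize_to_mp3("Hello", "outputaudio.mp3", speech_key, service_region), with your own key and region in place of the placeholders.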

If this answers your question, please click Accept Answer and Yes for "Was this answer helpful". If you have any further questions, do let us know.
