error in the azure.cognitiveservices.speech python module

Question

kyle foley 0 Reputation points

The only documentation I can find for Azure's text to speech covers the case where you want the computer to speak the text aloud while it is being processed. I'm more interested in converting a book to speech so that I can listen to it. In that case, I can't have the computer talking to me the whole time. I was recommended the following code on GitHub, but it does not work. Here is the original GitHub question: https://github.com/MicrosoftDocs/azure-docs/issues/119434

This line

        stream = speechsdk.AudioDataStream(
            format=speechsdk.AudioStreamFormat(
                pcm_data_format=speechsdk.PcmDataFormat.Pcm16Bit,
                sample_rate_hertz=16000, channel_count=1))

threw an error saying there was no 'AudioStreamFormat' in speechsdk and asking whether I meant AudioStreamWaveFormat, so I then used:

        stream = speechsdk.AudioDataStream(
            format=speechsdk.AudioStreamWaveFormat(
                pcm_data_format=speechsdk.PcmDataFormat.Pcm16Bit,
                sample_rate_hertz=16000, channel_count=1))

Now, I'm getting: AttributeError: module 'azure.cognitiveservices.speech' has no attribute 'PcmDataFormat'
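Note for readers who hit the same AttributeError: as far as I can tell, the Python SDK's AudioDataStream is built from a finished synthesis result rather than from format arguments, and the error above confirms that PcmDataFormat is not exposed by the Python module. A minimal sketch of that pattern follows. The function name synthesize_to_wav and its parameters are my own, chosen for illustration, and the import sits inside the function only so the file loads where the SDK is not installed.

```python
def synthesize_to_wav(text, wav_path, subscription, region):
    """Synthesize `text` without playback and save the audio to `wav_path`."""
    # Imported lazily (an illustration choice) so this file loads without the SDK.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=subscription, region=region)

    # audio_config=None keeps the audio in memory instead of sending it
    # to the default speaker.
    synthesizer = speechsdk.SpeechSynthesizer(
        speech_config=speech_config, audio_config=None)

    # .get() blocks until the service has returned the complete result.
    result = synthesizer.speak_text_async(text).get()
    if result.reason != speechsdk.ResultReason.SynthesizingAudioCompleted:
        raise RuntimeError(f"Synthesis did not complete: {result.reason}")

    # AudioDataStream wraps the synthesis result; no format object is passed.
    stream = speechsdk.AudioDataStream(result)
    stream.save_to_wav_file(wav_path)
    return wav_path
```

Called as synthesize_to_wav("Hello", "hello.wav", "<Your_Subscription>", "<Your_region>"), this should leave a WAV file on disk without any audio being played.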

santoshkc 15,355 Reputation points Microsoft External Staff Moderator

2024-02-05T10:55:39.72+00:00

Hi @kyle foley, Thank you for reaching out to the Microsoft Q&A forum! I tried to reproduce this using the code below, which converts the text to speech without the computer speaking while it processes the text.

import azure.cognitiveservices.speech as speechsdk

# Set up the speech configuration
speech_config = speechsdk.SpeechConfig(subscription="<Your_Subscription>", region="<Your_region>")
speech_config.speech_synthesis_language = "en-US"
speech_config.speech_synthesis_voice_name = "en-US-AriaNeural"

# Set up the audio output format
audio_config = speechsdk.audio.AudioOutputConfig(filename="<output-file-name>.mp3")

# Create a synthesizer object
synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)

# Read the input text from a file
with open(r"<input-file-name>.txt", "r") as f:
    input_text = f.read()

# Convert the text to speech; .get() waits until synthesis has finished
synthesizer.speak_text_async(input_text).get()

I hope you understand! Thank you.
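One caveat on the code above, offered as a hedged sketch rather than the official recipe: as far as I know, naming the output file .mp3 does not by itself change the encoding, so the file may contain WAV data unless an MP3 output format is requested on the speech config. The version below requests MP3 explicitly, waits for the result, and reports a cancellation (for example a bad key or region). The function name text_file_to_mp3, its parameters, and the particular bitrate chosen are my assumptions for illustration.

```python
def text_file_to_mp3(input_path, output_path, subscription, region,
                     voice="en-US-AriaNeural"):
    """Synthesize a text file to an MP3 file without playing it aloud."""
    # Imported lazily (an illustration choice) so this file loads without the SDK.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=subscription, region=region)
    speech_config.speech_synthesis_voice_name = voice
    # Ask the service for MP3 audio; the file name alone does not do this.
    speech_config.set_speech_synthesis_output_format(
        speechsdk.SpeechSynthesisOutputFormat.Audio16Khz32KBitRateMonoMp3)

    # Writing to a file means nothing is sent to the speakers.
    audio_config = speechsdk.audio.AudioOutputConfig(filename=output_path)
    synthesizer = speechsdk.SpeechSynthesizer(
        speech_config=speech_config, audio_config=audio_config)

    with open(input_path, "r", encoding="utf-8") as f:
        input_text = f.read()

    # .get() blocks until synthesis has finished.
    result = synthesizer.speak_text_async(input_text).get()
    if result.reason == speechsdk.ResultReason.Canceled:
        details = result.cancellation_details
        raise RuntimeError(
            f"Synthesis canceled: {details.reason}: {details.error_details}")
    return result
```

Checking result.reason is what surfaces authentication or quota problems; without it a failed request can look like a silent success with an empty output file.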

santoshkc 15,355 Reputation points Microsoft External Staff Moderator

2024-02-06T10:21:03.67+00:00

Hi @kyle foley, Following up to see whether you had a chance to check if the above response was helpful. Thank you!
kyle foley 0 Reputation points

2024-02-06T10:23:05.3766667+00:00

Thanks for the follow up, I should have time for it within the next 24 hours. Haven't looked at it yet.
kyle foley 0 Reputation points

2024-02-06T11:13:41.6033333+00:00

Thanks, that did it.


Answer 1

Hi @kyle foley,

I'm glad I could help, and thanks for the response, which might be beneficial to other community members reading this thread. Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others", I'll repost your solution in case you'd like to accept the answer.

Question: error in the azure.cognitiveservices.speech python module

Answer:

import azure.cognitiveservices.speech as speechsdk

# Set up the speech configuration
speech_config = speechsdk.SpeechConfig(subscription="<Your_Subscription>", region="<Your_region>")
speech_config.speech_synthesis_language = "en-US"
speech_config.speech_synthesis_voice_name = "en-US-AriaNeural"

# Set up the audio output format
audio_config = speechsdk.audio.AudioOutputConfig(filename="<output-file-name>.mp3")

# Create a synthesizer object
synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)

# Read the input text from a file
with open(r"<input-file-name>.txt", "r") as f:
    input_text = f.read()

# Convert the text to speech; .get() waits until synthesis has finished
synthesizer.speak_text_async(input_text).get()

If this answers your query, please click Accept Answer and Yes for "Was this answer helpful".
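Since the original goal was a whole book, one more hedged sketch: I believe the service caps how much audio a single request can produce (please check the current service limits), so passing an entire book in one call may fail. One workaround is to split the text on paragraph boundaries and write one audio file per chunk. The helper names split_into_chunks and synthesize_chunks, the 3000-character default, and the part_0001.wav naming are all my assumptions, not values from the documentation.

```python
import os


def split_into_chunks(text, max_chars=3000):
    """Split text into chunks of at most max_chars, preferring paragraph breaks."""
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        # Hard-split any single paragraph that is itself too long.
        while len(paragraph) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(paragraph[:max_chars])
            paragraph = paragraph[max_chars:]
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks


def synthesize_chunks(chunks, out_dir, subscription, region):
    """Write one WAV file per chunk (part_0001.wav, ...) and return the paths."""
    # Imported lazily (an illustration choice) so this file loads without the SDK.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=subscription, region=region)
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for index, chunk in enumerate(chunks, start=1):
        path = os.path.join(out_dir, f"part_{index:04d}.wav")
        audio_config = speechsdk.audio.AudioOutputConfig(filename=path)
        synthesizer = speechsdk.SpeechSynthesizer(
            speech_config=speech_config, audio_config=audio_config)
        result = synthesizer.speak_text_async(chunk).get()
        if result.reason != speechsdk.ResultReason.SynthesizingAudioCompleted:
            raise RuntimeError(f"Chunk {index} failed: {result.reason}")
        paths.append(path)
    return paths
```

The numbered file names sort in reading order, so most audio players will play the parts back as a continuous book.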
