error in the azure.cognitiveservices.speech python module

kyle foley 0 Reputation points
2024-02-05T04:10:00.2166667+00:00

The documentation on Azure's text to speech seems to cover only the case where you want the computer to speak the text aloud while it is being processed. I'm more interested in converting a book to speech so that I can listen to it later. In that case, I can't have the computer talking to me the whole time. The following code was recommended to me on GitHub, but it does not work. Here is the original GitHub question: https://github.com/MicrosoftDocs/azure-docs/issues/119434

This line

        stream = speechsdk.AudioDataStream(
            format=speechsdk.AudioStreamFormat(
                pcm_data_format=speechsdk.PcmDataFormat.Pcm16Bit,
                sample_rate_hertz=16000, channel_count=1))

threw an error saying that speechsdk has no attribute 'AudioStreamFormat' and suggesting AudioStreamWaveFormat instead, so I then used:

        stream = speechsdk.AudioDataStream(
            format=speechsdk.AudioStreamWaveFormat(
                pcm_data_format=speechsdk.PcmDataFormat.Pcm16Bit,
                sample_rate_hertz=16000, channel_count=1))

Now, I'm getting: AttributeError: module 'azure.cognitiveservices.speech' has no attribute 'PcmDataFormat'

Azure AI services

1 answer

  1. santoshkc 15,355 Reputation points Microsoft External Staff Moderator
    2024-02-06T11:43:49.4966667+00:00

    Hi @kyle foley,

    I'm glad I could help, and thanks for the response, which might be beneficial to other community members reading this thread. Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others," I'll repost your solution in case you'd like to accept the answer.

    Question: error in the azure.cognitiveservices.speech python module

    Answer:

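    import azure.cognitiveservices.speech as speechsdk

    # Set up the speech configuration
    speech_config = speechsdk.SpeechConfig(subscription="<Your_Subscription>", region="<Your_region>")
    speech_config.speech_synthesis_language = "en-US"
    speech_config.speech_synthesis_voice_name = "en-US-AriaNeural"
    # Request MP3 data so the content matches the .mp3 extension (the default output is WAV)
    speech_config.set_speech_synthesis_output_format(
        speechsdk.SpeechSynthesisOutputFormat.Audio16Khz32KBitRateMonoMp3)

    # Write the audio to a file instead of playing it through the speaker
    audio_config = speechsdk.audio.AudioOutputConfig(filename="<output-file-name>.mp3")

    # Create a synthesizer object
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)

    # Read the input text from a file
    with open(r"<input-file-name>.txt", "r") as f:
        input_text = f.read()

    # Convert the text to speech; .get() blocks until synthesis has finished
    result = synthesizer.speak_text_async(input_text).get()
    if result.reason != speechsdk.ResultReason.SynthesizingAudioCompleted:
        print(f"Synthesis did not complete: {result.reason}")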
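
For a whole book, a single request may be too large, since the service can limit how much audio one synthesis call produces. Below is a minimal sketch of one way to split the input text on paragraph boundaries before synthesizing it piece by piece. The helper name `split_into_chunks` and the 3000-character default are hypothetical choices for illustration, not part of the Speech SDK or a documented limit.

```python
def split_into_chunks(text: str, max_chars: int = 3000) -> list[str]:
    """Group paragraphs into chunks of at most max_chars characters.

    Paragraphs are assumed to be separated by blank lines. A single
    paragraph longer than max_chars is kept whole rather than cut
    mid-sentence. The default of 3000 is an arbitrary illustrative value.
    """
    chunks = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue  # skip empty paragraphs produced by extra blank lines
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars or not current:
            # Still fits, or this is the first paragraph of a new chunk
            current = candidate
        else:
            # Adding this paragraph would exceed the limit: start a new chunk
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be passed to `speak_text_async(chunk).get()` in turn, using a separate synthesizer and output file per chunk (for example, numbered file names), so that no single call has to handle the entire book.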
    

    If this answers your query, please click Accept Answer and Yes for "Was this answer helpful".
