Text-to-speech in the Speech service: generated audio has silence at the end

Neunit AI 5 Reputation points
2023-12-19T01:34:14.31+00:00

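We are running on CentOS 8 and using the Python SDK to request English speech synthesis through SSML markup. We found that the synthesized audio has a stretch of silence at the end. Has anyone else run into this, and is there a way to fix it?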

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

1 answer

  1. navba-MSFT 27,550 Reputation points Microsoft Employee Moderator
    2023-12-19T05:34:46.0133333+00:00

    @Neunit AI Welcome to the Microsoft Q&A forum, and thank you for posting your query here! Could you please update your azure.cognitiveservices.speech Python package to the most recent version and test again?

    I used the sample below and couldn't hear any noise or extra sound at the end. Could you please try this sample code and check whether it helps?

    import azure.cognitiveservices.speech as speechsdk
    
    # Replace with your own subscription key and region identifier from Azure.
    speech_key = "027cd6XXXXXXXXXXXXXde65e"
    service_region = "westeurope"
    
    # Creates an instance of a speech config with specified subscription key and service region.
    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
    
    # Creates a speech synthesizer using the default speaker as audio output.
    speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)
    
    # Text to synthesize (hard-coded here rather than read from console input).
    text = "Hello world, Microsoft Cognitive Services Speech service is awesome!"
    
    # Wrap the text in speak and voice tags to specify the language, the voice, and the text to be spoken.
    ssml_string = f"""
    <speak version='1.0' xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='en-US'>
      <voice name='en-US-JennyNeural'>
        {text}
      </voice>
    </speak>
    """
    
    # Performs the text-to-speech process
    result = speech_synthesizer.speak_ssml_async(ssml_string).get()
    
    # Checks result.
    if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
        print("Speech synthesized to speaker for text [{}]".format(text))
    elif result.reason == speechsdk.ResultReason.Canceled:
        cancellation_details = result.cancellation_details
        print("Speech synthesis canceled: {}".format(cancellation_details.reason))
        if cancellation_details.reason == speechsdk.CancellationReason.Error:
            if cancellation_details.error_details:
                print("Error details: {}".format(cancellation_details.error_details))
        print("Did you update the subscription info?")
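
When testing again, it may help to measure the trailing silence instead of judging it by ear. The following is a minimal sketch that uses only the Python standard library. It assumes the synthesized audio is available as 16-bit PCM WAV bytes (for example, read from a saved .wav file). The function name `trailing_silence_ms` and the amplitude threshold of 200 are illustrative choices, not part of the Speech SDK.

```python
import array
import io
import sys
import wave


def trailing_silence_ms(wav_bytes, threshold=200):
    """Return the milliseconds of near-silence at the end of 16-bit PCM WAV data.

    `threshold` is a hypothetical amplitude cutoff; samples whose absolute
    value is at or below it are treated as silence.
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM audio")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        samples = array.array("h", wav.readframes(wav.getnframes()))

    if sys.byteorder == "big":
        samples.byteswap()  # WAV samples are stored little-endian

    # Walk backwards until a sample louder than the threshold is found.
    last_loud = len(samples) - 1
    while last_loud >= 0 and abs(samples[last_loud]) <= threshold:
        last_loud -= 1

    silent_frames = (len(samples) - 1 - last_loud) // channels
    return 1000.0 * silent_frames / rate
```

Running this on the audio produced before and after the package update gives a concrete number to compare, and the same number makes it easier to report the issue if the silence persists.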
    

    Hope this helps. If you have any follow-up questions, please let me know. I would be happy to help.
