How to improve PersonalVoice's Chinese speech synthesis results?

Muye 20 Reputation points
2024-02-28T03:34:19.4766667+00:00

Here is the text: translated_zh-CN.txt

And the synthesized audio: https://cnvb-tmp.oss-cn-hangzhou.aliyuncs.com/translated_zh-CN.wav

I encountered two kinds of problems when using PersonalVoice to synthesize Chinese speech:

  1. Repeated text: in the sentence "所以我决定去做", "决定" was pronounced twice.
  2. Missing text: in the sentence "因为我的家庭是有着许多年航空背景的家庭", "的" is missing from the audio.

What could I do to improve the synthesized result?

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

Accepted answer
  1. navba-MSFT 23,625 Reputation points Microsoft Employee
    2024-02-29T07:57:00.8433333+00:00

    @Muye I'm glad to see you were able to resolve your issue. Thanks for posting your solution so that others experiencing the same thing can easily reference it. The Microsoft Q&A community has a policy that question authors cannot accept their own answers, only answers by others, so I'll repost your solution in case you'd like to accept it.

    Issue:

    You wanted to improve PersonalVoice's Chinese speech synthesis results. Some text was pronounced incorrectly (repeated), and some text was missing from the audio.

    Resolution:

    The "speak-lang" was not set correctly in the SSML. After you changed the language from en-US to zh-CN, the results were much better.
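    To illustrate the fix, here is a minimal Python sketch that builds a personal voice SSML string with the language set to zh-CN. It is only an assumption about what the original SSML looked like: the thread does not show it, so the helper name `build_ssml`, the voice name `DragonLatestNeural`, and the speaker profile ID are placeholders. Because the thread does not say which attribute "speak-lang" refers to, the sketch sets `xml:lang` on both the `<speak>` element and the inner `<lang>` element.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(
    text,
    locale="zh-CN",                          # must match the language of the text
    voice_name="DragonLatestNeural",         # placeholder base model name (assumption)
    speaker_profile_id="YOUR-SPEAKER-PROFILE-ID",  # placeholder, not a real ID
):
    """Return an SSML document whose language attributes all use `locale`."""
    return (
        "<speak version='1.0' "
        "xmlns='http://www.w3.org/2001/10/synthesis' "
        "xmlns:mstts='http://www.w3.org/2001/mstts' "
        f"xml:lang={quoteattr(locale)}>"
        f"<voice name={quoteattr(voice_name)}>"
        f"<mstts:ttsembedding speakerProfileId={quoteattr(speaker_profile_id)}>"
        # The text is escaped so characters such as & or < cannot break the XML.
        f"<lang xml:lang={quoteattr(locale)}>{escape(text)}</lang>"
        "</mstts:ttsembedding>"
        "</voice>"
        "</speak>"
    )


if __name__ == "__main__":
    # One of the sentences from the question, synthesized as zh-CN instead of en-US.
    print(build_ssml("所以我决定去做"))
```

    The resulting string could then be passed to the Speech SDK, for example through `SpeechSynthesizer.speak_ssml_async`. If the text is tagged as en-US while the content is Chinese, repeated or dropped characters like those described in the question are a plausible outcome.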
