The timbre of the "zh-CN-XiaochenNeural" voice has changed. Why does it sound like a different person?

Trulon 0 Reputation points
2023-11-20T06:41:26.2633333+00:00

The timbre of the "zh-CN-XiaochenNeural" voice has changed: it now sounds completely different from a few weeks ago. Both the "Batch synthesis API (Preview) for text to speech" and "Speech Studio" now produce output that differs from what they produced then. My SSML input looks something like this:

<speak version='1.0' xml:lang='en-US'><voice xml:lang='zh-CN' name='zh-CN-XiaochenNeural'><prosody rate='+60.00%'><break time="750ms"/> 这天的最后,是辅导员带我去校医务室做了伤口处理。她叹着气坐在病床旁边,看我的眼神充满了怜悯。你们的事我都知道了,老师相信,这次也不是你的错......</prosody></voice></speak>
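For anyone reproducing this, the SSML above can be assembled programmatically so that the text is escaped and the document stays well-formed. This is only a minimal sketch: `build_ssml` and its parameter names are hypothetical helpers, not part of any Azure SDK, and the defaults simply mirror the values in the SSML above.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, voice="zh-CN-XiaochenNeural", rate="+60.00%", break_ms=750):
    """Return an SSML string shaped like the one in the question (hypothetical helper)."""
    break_time = f"{break_ms}ms"
    return (
        "<speak version='1.0' xml:lang='en-US'>"
        f"<voice xml:lang='zh-CN' name={quoteattr(voice)}>"
        f"<prosody rate={quoteattr(rate)}>"
        f"<break time={quoteattr(break_time)}/> "
        f"{escape(text)}"  # escape &, <, > in the spoken text
        "</prosody></voice></speak>"
    )


if __name__ == "__main__":
    ssml = build_ssml("这天的最后,是辅导员带我去校医务室做了伤口处理。")
    ET.fromstring(ssml)  # raises ParseError if the SSML is not well-formed XML
    print(ssml)
```

Parsing the result back with `ET.fromstring` is a cheap local check that rules out a malformed request as the cause before comparing voices across services.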

I still have an MP3 file that I generated a few weeks ago, and I can no longer reproduce that voice. Has the "zh-CN-XiaochenNeural" voice been changed?

Batch synthesis API (Preview):

https://eastasia.customvoice.api.speech.microsoft.com/api/texttospeech/3.1-preview1/batchsynthesis/

2023/11/22

New finding: the Microsoft Speech API still produces the correct (original) voice, but it is a free demo service and is unstable.

Microsoft Speech API:

https://southeastasia.api.speech.microsoft.com/accfreetrial/texttospeech/acc/v3.0-beta1/vcg/speak

So why has zh-CN-XiaochenNeural in the Azure Speech API (Batch synthesis API (Preview)) suddenly become a different voice from the one in the Microsoft Speech API?

Azure AI Speech
An Azure service that integrates speech processing into apps and services.
Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

1 answer

  1. Trulon 0 Reputation points
    2023-11-23T10:10:17.9566667+00:00

    This was resolved by technical support. Here is their response:

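    Problem description:
    You are using the Azure text to speech service. Your Speech resource is deployed in the East Asia region, and when using the "zh-CN-XiaochenNeural" model you found that its timbre differs from before. You also provided, for reference, a sample of the model's correct output from a few weeks ago (good case) and a sample of its current incorrect output (bad case).

    Investigation findings:
    After testing, we confirmed that the timbre of the "zh-CN-XiaochenNeural" model has changed for Speech resources in the East Asia region. We are actively discussing a fix for Speech resources in this region with the product team.

    Suggested mitigation:
    You can switch your Speech resource to the Southeast Asia region, where the timbre of the "zh-CN-XiaochenNeural" model matches the good case you provided.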
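For batch synthesis callers, the suggested region switch amounts to pointing requests at a different regional host. Here is a minimal sketch, assuming the region is the first label of the host name (as in the `eastasia` URL in the question) and that the Southeast Asia endpoint follows the same pattern. `switch_region` is a hypothetical helper, not an SDK function, and a Speech resource and key created in the new region would presumably be needed as well.

```python
from urllib.parse import urlsplit, urlunsplit

# Endpoint copied from the question; the region is the first host label.
EAST_ASIA_ENDPOINT = (
    "https://eastasia.customvoice.api.speech.microsoft.com"
    "/api/texttospeech/3.1-preview1/batchsynthesis/"
)


def switch_region(endpoint, new_region):
    """Return `endpoint` with the first host label replaced by `new_region`."""
    parts = urlsplit(endpoint)
    labels = parts.netloc.split(".")
    labels[0] = new_region  # assumption: region identifier comes first
    return urlunsplit(parts._replace(netloc=".".join(labels)))


if __name__ == "__main__":
    print(switch_region(EAST_ASIA_ENDPOINT, "southeastasia"))
```

Keeping the region in one place like this makes it easy to switch back once the East Asia voice is fixed.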