I use the Chinese voice "Yun Xi" and have run into the same problem. When it reads the sentence "没想到小三竟自导自演了一场戏", it always pauses at the third word.
I also need the novel read aloud quickly and compactly, without extra emotional or human-like pauses. Is there a solution that meets this requirement?
Undesired Pause in Neural Voice Synthesis for Long Sentences
I have many places in the text I need to synthesize (using Neural Voices) that contain very long sentences. (And no, I can't change them, unfortunately.) I've noticed that around 25-28 seconds into such a sentence, and apparently always at 500 characters into it, the voice synthesizer pauses for about as long as it does at the end of a sentence. I can't have a pause like this. I can't find anything in the documentation about a sentence length limit or about any way to avoid that pause (perhaps using some SSML tag?).
Can anyone provide some guidance on what can be done to avoid these gaps / pauses?
I know it's possible to edit the generated MP3 files, but my workflow just isn't suited to that: there are 40,000+ MP3 files, any one of which I may need to resynthesize as mispronunciations and the like are found, and over 1,200 of them contain these long sentences with gaps in synthesis. I really need to be able to synthesize the sentences into MP3 files without making manual fixes.
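To find which inputs are affected across a large batch, a minimal Python sketch like the following could flag sentences longer than the roughly 500 characters observed above. The threshold is only that observation, not a documented Azure limit, and `find_long_sentences` and its naive punctuation-based splitter are hypothetical helpers that may not match how the service segments sentences.

```python
import re

# Assumption: 500 comes from the behavior observed in this question,
# not from any documented Azure AI Speech limit.
PAUSE_THRESHOLD = 500

# Naive, hypothetical sentence splitter: split after Latin end punctuation
# followed by whitespace, or directly after CJK end punctuation.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+|(?<=[。！？])")


def find_long_sentences(text, threshold=PAUSE_THRESHOLD):
    """Return the sentences in `text` that are longer than `threshold` characters."""
    sentences = [s for s in _SENTENCE_END.split(text) if s]
    return [s for s in sentences if len(s) > threshold]


if __name__ == "__main__":
    sample = "Short one. " + "a" * 600 + ". Another short."
    for sentence in find_long_sentences(sample):
        print(len(sentence), sentence[:40] + "...")
```

This only locates candidate sentences; it does not remove the pause itself.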
Azure AI Speech
Azure AI services
1 answer
佳鑫 朱 0 Reputation points
2023-12-14T15:59:21.05+00:00