speakSSML mstts:audioduration: the audio does not seem to be generated strictly according to the specified value

Quill Zhou 25 Reputation points
2023-09-24T17:02:27.8833333+00:00

I generate speech from SSML that was converted from an SRT subtitle file. The expected total duration from the SRT is 43 seconds, but the audio produced by speakSSML is 47 seconds long.

This is my SSML file.

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
      xmlns:mstts="http://www.w3.org/2001/mstts" xml:lang="en-US">
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1840ms" /> Look at all the money-saving tricks I've learned </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1200ms" /> You don't actually need to clean the oil pot </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1320ms" /> Just pour some rice inside </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="880ms" /> Squeeze some dishwashing liquid </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1040ms" /> And add a tissue </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1360ms" /> Finally, pour in some warm water and stir </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="960ms" /> Rinse it with water after that </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1720ms" /> Look, the grease has dissolved completely </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="760ms" /> It's really clean </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1360ms" /> Take a look at your electric kettle </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="840ms" /> Does it have any limescale? </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="880ms" /> Put in some potato peel </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="920ms" /> Then pour in some white vinegar </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1120ms" /> Boil it after adding water </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="480ms" /> Look at this </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1280ms" /> The limescale is dissolving </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="960ms" /> If your knife isn't sharp anymore </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1040ms" /> Take a plate </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1080ms" /> And turn it upside down </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1200ms" /> Hold the knife at a 45-degree angle </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1480ms" /> Rub it back and forth for about ten times </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1040ms" /> This way, it will become sharp again </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1680ms" /> We all know that the bottom of a pot is hard to
            clean </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1560ms" /> But all you need is to sprinkle some salt and
            baking soda </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1680ms" /> Then soak a piece of paper in soapy water </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1400ms" /> Place it on the bottom and wait for half an hour </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1320ms" /> The dirt will be easy to scrub off </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1280ms" /> If the sink in your kitchen is clogged </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1160ms" /> Squeeze some dishwashing liquid </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1880ms" /> Then pour about 20 milliliters of white vinegar </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1320ms" /> Add two tablespoons of baking soda </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1840ms" /> And flush it with hot water three times </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1080ms" /> The dirt in the drain </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1040ms" /> Will be dissolved </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="600ms" /> Have you learned all of these? </voice>
</speak>
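
To check the requested total independently of the synthesizer, the `mstts:audioduration` values can be summed directly from the SSML. The sketch below is only an illustration: the helper names `parse_duration_ms` and `requested_durations_ms` are hypothetical, the sample contains just the first three `voice` elements from the file above, and it assumes every value ends in `ms` or `s`. By hand, the 35 values in the full file add up to 42,600 ms, which matches the expected 43 seconds, so the extra seconds appear to come from synthesis rather than from the requested values.

```python
import xml.etree.ElementTree as ET

# Namespace URI bound to the mstts prefix in the SSML above.
MSTTS_NS = "http://www.w3.org/2001/mstts"

# Shortened sample: the first three voice elements from the question.
SAMPLE_SSML = """
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
      xmlns:mstts="http://www.w3.org/2001/mstts" xml:lang="en-US">
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1840ms" /> Look at all the money-saving tricks I've learned </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1200ms" /> You don't actually need to clean the oil pot </voice>
      <voice name="en-US-JennyNeural">
            <mstts:audioduration value="1320ms" /> Just pour some rice inside </voice>
</speak>
"""


def parse_duration_ms(value):
    """Convert a value such as '1840ms' or '1.5s' to milliseconds (assumed units)."""
    value = value.strip()
    if value.endswith("ms"):
        return float(value[:-2])
    if value.endswith("s"):
        return float(value[:-1]) * 1000.0
    raise ValueError("unsupported duration value: " + repr(value))


def requested_durations_ms(ssml):
    """Return the requested duration of every mstts:audioduration element, in order."""
    root = ET.fromstring(ssml.strip())
    tag = "{" + MSTTS_NS + "}audioduration"
    return [parse_duration_ms(el.get("value", "")) for el in root.iter(tag)]


durations = requested_durations_ms(SAMPLE_SSML)
print("segments:", len(durations))
print("requested total (ms):", sum(durations))
```

Running the same function over the full SSML file and comparing the result with the length of the generated audio shows how far the output drifts in total. Synthesizing each `voice` element separately and measuring each clip would show which segments overshoot.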
Azure AI Speech