Text to speech - time synchronization

sam husson 1 Reputation point
2021-02-10T10:20:10.16+00:00

Hello,

I'm using the Speech Studio > Audio Content Creation tool to produce audio files with SSML. I need the audio files to be synchronized across different languages. In an SSML document with two sentences, is there a way to set the start time of the second sentence relative to the first sentence?

Example:
<speak>
<par>
<media xml:id="test" begin="0.5s">
<speak>This is the first sentence</speak>
</media>
<media xml:id="answer" begin="test.end+2.0s">
<speak>This second sentence starts 2 seconds after the beginning of the first sentence.</speak>
</media>
</par>
</speak>

If this is not possible, is there another option that works with neural voices?

Many thanks for your help.

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

2 answers

  1. romungi-MSFT 43,696 Reputation points Microsoft Employee
    2021-02-10T11:52:56.307+00:00

    @sam husson The tool has a break option, i.e. you can set a pause in milliseconds before the next sentence. I tried this in text mode with two different voices, and it waits until the end of the first sentence, plus the break, before speaking the second one.

    [Attached image: 66492-break-sentence.gif]
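
    For reference, the break described above corresponds to the SSML `<break>` element. The following is a minimal sketch, assuming the `en-US-AriaNeural` voice (any available neural voice could be substituted); it is not taken from the screenshot. Note that the pause is measured from the end of the preceding speech, not from an absolute position in the file.

    ```xml
    <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
      <!-- Voice name is an assumption; replace it with the voice you selected in the tool. -->
      <voice name="en-US-AriaNeural">
        This is the first sentence.
        <!-- Wait 2 seconds after the first sentence ends. -->
        <break time="2000ms"/>
        This second sentence starts 2 seconds after the end of the first sentence.
      </voice>
    </speak>
    ```

    Because the break is relative, the absolute start time of the second sentence still depends on how long the first sentence takes to speak in each language.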


  2. sam husson 1 Reputation point
    2021-02-10T14:37:49.31+00:00

    Yes, exactly. The idea is to create two different audio files in two different languages. Time synchronization is needed to align them with other content, such as music or video.
    Again, thanks!
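
    One way this could be approximated with breaks alone is to choose a different break length per language, so that the second sentence starts at the same offset in both files: break = target start time minus the measured duration of the first sentence. The sketch below assumes a target start of 4.0 s and purely hypothetical first-sentence durations of 1.8 s (English) and 2.4 s (French); the real durations would have to be measured from the generated audio, and the voice names are assumptions.

    ```xml
    <!-- English file: hypothetical first-sentence duration 1.8 s, so break = 4.0 s - 1.8 s = 2200 ms -->
    <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
      <voice name="en-US-AriaNeural">
        This is the first sentence.
        <break time="2200ms"/>
        This is the second sentence.
      </voice>
    </speak>
    ```

    ```xml
    <!-- French file: hypothetical first-sentence duration 2.4 s, so break = 4.0 s - 2.4 s = 1600 ms -->
    <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="fr-FR">
      <voice name="fr-FR-DeniseNeural">
        Ceci est la première phrase.
        <break time="1600ms"/>
        Ceci est la deuxième phrase.
      </voice>
    </speak>
    ```

    Any leading or trailing silence added by the service would shift these offsets, so the result should be checked against the actual output files.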
