Azure TTS: Incorrect response when using multilingual voices

musienkov 25 Reputation points
2024-01-17T13:01:01.1433333+00:00

I use the cognitiveservices/v1 REST API of the Azure AI Speech service to convert text to speech. Monolingual voices work without problems, but multilingual voices occasionally return an incorrect response: the speech is accelerated and the voice sounds different. If I change some words in the text, I get normal output, but I don't know what exactly triggers the problem. The same problem occurs with many multilingual voices.

The endpoint I use: https://eastus.tts.speech.microsoft.com/cognitiveservices/v1

The body of the request I'm sending:

<speak version='1.0' xml:lang='en-US'>
    <voice xml:gender='Female' name='en-US-EmmaMultilingualNeural'>
        <lang xml:lang='en-US'>
            Welcome to customer support! How can I help you?
        </lang>
    </voice>
</speak>
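
For reproduction, the request described above could be sketched in Python roughly as follows. This is only an illustration under assumptions: the function names `build_tts_request` and `synthesize`, the `User-Agent` value, and the default output format are not from the original post, and the subscription key is a placeholder you would supply yourself.

```python
import urllib.request

# Endpoint quoted in the question.
ENDPOINT = "https://eastus.tts.speech.microsoft.com/cognitiveservices/v1"

# SSML body quoted in the question.
SSML = """<speak version='1.0' xml:lang='en-US'>
    <voice xml:gender='Female' name='en-US-EmmaMultilingualNeural'>
        <lang xml:lang='en-US'>
            Welcome to customer support! How can I help you?
        </lang>
    </voice>
</speak>"""


def build_tts_request(ssml, subscription_key,
                      output_format="riff-24khz-16bit-mono-pcm"):
    """Build (but do not send) a POST request for the TTS REST endpoint.

    The default output format is an assumption chosen for this sketch;
    the original post does not say which format was requested.
    """
    headers = {
        "Ocp-Apim-Subscription-Key": subscription_key,
        "Content-Type": "application/ssml+xml",
        "X-Microsoft-OutputFormat": output_format,
        "User-Agent": "tts-repro-sketch",  # hypothetical client name
    }
    return urllib.request.Request(
        ENDPOINT, data=ssml.encode("utf-8"), headers=headers, method="POST"
    )


def synthesize(ssml, subscription_key, output_format="riff-24khz-16bit-mono-pcm"):
    """Send the request and return the raw audio bytes (needs a valid key)."""
    request = build_tts_request(ssml, subscription_key, output_format)
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Calling `synthesize(SSML, "<your key>")` with different `output_format` values makes it easy to compare the audio returned for the same text.
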

I would be grateful for any tips.

Azure AI Speech

Accepted answer
  1. VasaviLankipalle-MSFT 18,561 Reputation points
    2024-01-23T06:49:48.82+00:00

    Hello @musienkov, thank you for your patience.

    It looks like there was a similar issue, and the product team fixed the bug last week. The issue caused an abnormal speaking rate on multilingual voices when the output format was not 24 kHz.

    Could you please check on your end and confirm?

    Regards,
    Vasavi
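
One possible way to check this is to synthesize the same text once in a 24 kHz format and once in another format, then compare the durations: accelerated audio should come out noticeably shorter. The sketch below is an assumption-laden illustration, not an official diagnostic. The helper names `wav_info` and `durations_match` and the 5% tolerance are made up for this example, and it assumes the responses are RIFF/WAV files with a complete header.

```python
import io
import wave


def wav_info(wav_bytes):
    """Return (sample_rate_hz, duration_seconds) read from RIFF/WAV bytes."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
    return rate, frames / rate


def durations_match(wav_a, wav_b, tolerance=0.05):
    """True if the two clips differ in length by at most `tolerance` (relative).

    The 5% default is an arbitrary threshold picked for this sketch.
    """
    _, duration_a = wav_info(wav_a)
    _, duration_b = wav_info(wav_b)
    longest = max(duration_a, duration_b)
    return abs(duration_a - duration_b) <= tolerance * longest
```

If the clip in the non-24 kHz format is clearly shorter than the 24 kHz clip for the same SSML, that would be consistent with the speed problem described above.
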
