I use the cognitiveservices/v1 REST API of the Azure AI Speech Service to convert text to speech. Monolingual voices work without any problems, but multilingual voices occasionally return incorrect audio: the speech is sped up and the voice sounds different. If I change some words in the text, I get a normal result, but I don't know what exactly triggers the problem. The same problem occurs with many multilingual voices.
The endpoint I use:
https://eastus.tts.speech.microsoft.com/cognitiveservices/v1
The request body I'm sending:
<speak version='1.0' xml:lang='en-US'>
<voice xml:gender='Female' name='en-US-EmmaMultilingualNeural'>
<lang xml:lang='en-US'>
Welcome to customer support! How can I help you?
</lang>
</voice>
</speak>
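For anyone who wants to reproduce this, the request can be sent roughly like the sketch below, using only the Python standard library. The endpoint and SSML are the ones shown above. The subscription-key header, the output format, the User-Agent value, and the helper names build_request and synthesize are assumptions for illustration, not my exact client code.

```python
import urllib.request

ENDPOINT = "https://eastus.tts.speech.microsoft.com/cognitiveservices/v1"

# Same SSML body as in the question.
SSML = """<speak version='1.0' xml:lang='en-US'>
<voice xml:gender='Female' name='en-US-EmmaMultilingualNeural'>
<lang xml:lang='en-US'>
Welcome to customer support! How can I help you?
</lang>
</voice>
</speak>"""


def build_request(ssml: str, key: str) -> urllib.request.Request:
    """Build the POST request without sending it (hypothetical helper)."""
    headers = {
        # Assumption: key-based auth; a bearer token would use "Authorization" instead.
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/ssml+xml",
        # Assumption: any supported output format can be used here.
        "X-Microsoft-OutputFormat": "riff-24khz-16bit-mono-pcm",
        # Assumption: arbitrary application name.
        "User-Agent": "tts-repro",
    }
    return urllib.request.Request(
        ENDPOINT, data=ssml.encode("utf-8"), headers=headers, method="POST"
    )


def synthesize(ssml: str, key: str) -> bytes:
    """Send the request and return the raw audio bytes (hypothetical helper)."""
    with urllib.request.urlopen(build_request(ssml, key)) as resp:
        return resp.read()
```

Calling synthesize(SSML, key) with a valid Speech resource key and writing the returned bytes to a .wav file makes it easy to compare the audio for different texts and voices.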
I would be grateful for any tips.