Limitation on Text-to-Speech Audio Length in Azure Cognitive Services

Question

How can I generate audio files longer than 10 minutes using Azure Cognitive Services' Text-to-Speech API?
PS - Based on common issues that we have seen from customers and other sources, we are posting these questions to help the Azure community.

Answer

Greetings!

The real-time Text-to-Speech (TTS) API in Azure Cognitive Services limits each request to a maximum of 10 minutes of audio. This is a known product limitation. To generate audio files longer than 10 minutes, use the Text-to-Speech batch synthesis API, which is intended for asynchronous applications. It accepts longer text inputs and produces longer audio outputs, but it is not designed for real-time processing: you submit a synthesis job, wait for it to complete, and then retrieve the result.
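As a rough illustration of that asynchronous flow, here is a minimal Python sketch using only the standard library. It is not official sample code. The host pattern, the `api-version` value, the request body fields, the voice name, and the status strings are assumptions based on the batch synthesis REST API as I understand it, so please check them against the current documentation. `build_create_request` and `wait_for_job` are hypothetical helper names made up for this example.

```python
import json
import time
import urllib.request

# Assumption: verify the api-version against the current batch synthesis docs.
API_VERSION = "2024-04-01"


def build_create_request(region, key, job_id, text, voice="en-US-JennyNeural"):
    """Build (but do not send) the PUT request that creates a batch synthesis job."""
    # Assumption: regional host pattern and path for the batch synthesis API.
    url = (
        f"https://{region}.api.cognitive.microsoft.com"
        f"/texttospeech/batchsyntheses/{job_id}?api-version={API_VERSION}"
    )
    # Assumption: body shape for plain-text input with a single voice.
    body = {
        "inputKind": "PlainText",
        "inputs": [{"content": text}],
        "synthesisConfig": {"voice": voice},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
    )


def wait_for_job(get_status, poll_seconds=10.0, max_polls=360):
    """Poll get_status() until the job reaches a terminal state.

    get_status is any zero-argument callable that returns the job's current
    status string, for example by sending a GET to the job URL and reading
    the "status" field of the JSON response.
    """
    for _ in range(max_polls):
        status = get_status()
        # Assumption: "Succeeded" and "Failed" are the terminal status values.
        if status in ("Succeeded", "Failed"):
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("batch synthesis job did not finish in time")
```

To use it, you would send the request with `urllib.request.urlopen(build_create_request(...))`, then pass `wait_for_job` a small function that fetches the job and returns its status. Once the job succeeds, the job details should tell you where to download the generated audio. Keeping the polling separate from the HTTP call makes it easy to swap in whichever client library you prefer.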

Resources:

For more information, please see the following links:

Hope this helps. If you have any follow-up questions, please let me know. I would be happy to help. Please do not forget to "up-vote" wherever the information provided helps you, as this can be beneficial to other community members.
