Limitation on Text-to-Speech Audio Length in Azure Cognitive Services

santoshkc 8,940 Reputation points Microsoft Vendor
2024-07-31T06:44:42.8933333+00:00

How can I generate audio files longer than 10 minutes using Azure Cognitive Services' Text-to-Speech API?
PS - Based on common issues that we have seen from customers and other sources, we are posting these questions to help the Azure community.

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

1 answer

  1. santoshkc 8,940 Reputation points Microsoft Vendor
    2024-07-31T06:47:45.64+00:00

    Greetings!

    The real-time Text-to-Speech (TTS) API in Azure Cognitive Services returns at most 10 minutes of audio per synthesis request. This is a known product limitation. To generate audio files longer than 10 minutes, use the Text-to-Speech batch synthesis API instead. Batch synthesis accepts longer text inputs and can produce extended audio output, but it is asynchronous rather than real-time: you submit a synthesis job, check its status until it completes, and then download the result.

    Resources:

    For more information, please see the following links:

    Hope this helps. If you have any follow-up questions, please let me know. I would be happy to help. Please do not forget to "up-vote" wherever the information provided helps you, as this can be beneficial to other community members.
