Long delay on TTS first response

Question

Flynn Lauridsen 1

Hi,

I have been using Azure TTS in my project on an AWS server. Sometimes when I run the project, TTS takes around 30 seconds to respond to the first request on the server, while every following request is quite fast. This has never occurred when using TTS on my local machine. On the ronin it is fairly infrequent, although some days it occurs more than others, and it more typically occurs after TTS has not been used for a while. I am currently on a free subscription and make requests with speak_text_async(text).get().
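
To narrow down which requests are slow, it can help to log the latency of every call rather than only noticing the long ones. The following is a minimal sketch of such a timing helper. The name `timed_call` and its `clock` parameter are hypothetical and not part of the Speech SDK; the commented usage line assumes a `synthesizer` object that has already been created.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed_call(
    fn: Callable[[], T],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[T, float]:
    """Run fn() and return (result, elapsed_seconds).

    `clock` is injectable so the helper can be tested without real delays.
    """
    start = clock()
    result = fn()
    return result, clock() - start


# Hypothetical usage (assumes `synthesizer` already exists):
# result, seconds = timed_call(lambda: synthesizer.speak_text_async("Hello").get())
# print(f"TTS request took {seconds:.2f}s")
```

Logging the elapsed time together with a timestamp for each request would show whether the slow calls line up with idle periods.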

Has anyone experienced anything similar to this?

Cheers

Answer 1

romungi-MSFT 48,911 Microsoft Employee Moderator

@Flynn Lauridsen The free and standard tiers are governed by limits on the number of transactions per second depending on the API being called. These tiers are also limited by the quotas set for the same. If you are experiencing throttling issues during certain times, then it is most likely that the speech resource is trying to scale the compute that it depends on to support the higher rate of transactions. You can look at the quotas and limits page on how these values could affect your API calls.

From the General best practices section of the document, the following explains the behavior you are seeing:

For example, let's say your application is using text-to-speech, and your current workload is 5 TPS. The next second, you increase the load to 20 TPS (that is, four times more). Speech service immediately starts scaling up to fulfill the new load, but is unable to scale as needed within one second. Some of the requests will get response code 429 (too many requests).
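
If requests do come back with a 429, a common way to handle it is to retry with exponential backoff. The sketch below is a generic illustration and not code from the Speech service documentation. `call_with_backoff`, the `send` callable, and the default delays are all assumptions; `send` stands in for whatever function issues the REST request and returns a status code and body.

```python
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, bytes]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, bytes]:
    """Call send() and retry while it returns HTTP 429.

    The wait doubles after each throttled attempt: base_delay, 2x, 4x, ...
    Returns the last (status, body) pair, which may still be a 429.
    """
    status, body = 0, b""
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        # Do not sleep after the final attempt; there is nothing left to retry.
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))
    return status, body
```

Note that this only helps when the service actually answers with 429; it does nothing for a request that is simply slow.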

Flynn Lauridsen 1 Reputation point

2022-10-13T05:28:08.367+00:00

Thanks for your response.
I just want to clarify, though: how long does this scaling usually take? The delay I've been experiencing is usually 30 seconds long, sometimes less. Also, I am typically only sending a single short request at the beginning, about a sentence long, maybe two.
There was nothing else in the quotas and limits that really fit what has been happening. The delay is often preceded by TTS not being used for some time and followed by TTS working perfectly, so it's almost certainly not too many requests or exceeded quotas.
Is there perhaps some hidden limit on the free tier that would explain this issue, whether it be resource scaling or something else?

Cheers

romungi-MSFT 48,911 Reputation points Microsoft Employee Moderator

2022-10-21T05:15:23.17+00:00

There aren't any hidden limits that might be causing this. The scaling is typically based on the tier, region and availability, so a longer delay might indicate that the service is waiting for backend compute to become available before it processes your request.

Flynn Lauridsen 1 Reputation point

2022-11-07T05:26:09.027+00:00

Hey, sorry for the late reply, but I've found some more information on this issue. My requests were all within the limits and quotas specified, and I still wasn't getting a 429 response, which a few pages suggested would happen in the case of a scaling issue. I tried using eastus as a region rather than australiaeast, and this seems to have eliminated the delay (although at the cost of a constant 2 second delay). Another thing I noticed was that it wasn't always occurring on the first request; sometimes it would occur a few requests after the first, and this was consistent in both the Python SDK and the REST API. The delay also only occurs on the first few requests after I haven't made any requests for an hour or two.
While it seems like a resource scaling issue in many senses, the lack of a 429 response and the fact that this doesn't always happen on the first request suggest it may well be something else.
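
Since the delay seems tied to idle periods, one possible workaround is to send a small warm-up request whenever TTS has been idle for too long. The sketch below only tracks idle time and decides when to warm up. `IdleWarmer`, its method names, and the 30 minute default are hypothetical, and nothing in this thread confirms that a warm-up actually removes the delay.

```python
import time
from typing import Callable, Optional


class IdleWarmer:
    """Trigger a warm-up call when no request has been made for a while."""

    def __init__(
        self,
        warm_up: Callable[[], None],
        max_idle_seconds: float = 1800.0,  # assumed threshold, tune as needed
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._warm_up = warm_up
        self._max_idle = max_idle_seconds
        self._clock = clock
        self._last_used: Optional[float] = None  # None means never used

    def mark_used(self) -> None:
        """Record that a real TTS request was just made."""
        self._last_used = self._clock()

    def warm_if_idle(self) -> bool:
        """Run the warm-up if idle for too long; return True if it ran."""
        now = self._clock()
        if self._last_used is None or now - self._last_used >= self._max_idle:
            self._warm_up()
            self._last_used = self._clock()
            return True
        return False


# Hypothetical usage (assumes `synthesizer` already exists):
# warmer = IdleWarmer(lambda: synthesizer.speak_text_async("warm up").get())
# Call warmer.warm_if_idle() from a periodic job, and warmer.mark_used()
# after each real request.
```

Keep in mind that warm-up requests count toward the subscription's quota, and that the delay sometimes appearing a few requests in suggests a single warm-up call may not always be enough.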
