Anonymous-5707222, greetings and welcome to the Microsoft Q&A forum!
> Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2024-04-01-preview have exceeded token rate limit of your current OpenAI S0 pricing tier. Please retry after 86400 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.
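When the service returns a 429, it includes a Retry-After header telling the client how long to wait before the per-minute counter resets. A minimal sketch of a client-side retry loop is below; the `send` callable is a hypothetical stand-in for whatever function actually POSTs to your Azure OpenAI chat-completions endpoint, so the retry logic can be shown without hard-coding an endpoint or key.

```python
import time

def call_with_retry(send, max_retries=3, default_wait=1.0):
    """Call send() and retry on HTTP 429, honoring the Retry-After header.

    `send` is any callable returning (status_code, headers, body); in real
    use it would POST to your Azure OpenAI chat-completions endpoint.
    """
    for attempt in range(max_retries + 1):
        status, headers, body = send()
        if status != 429:
            return body
        if attempt == max_retries:
            raise RuntimeError("rate limited: retries exhausted")
        # The service indicates how long to wait before the counter resets.
        wait = float(headers.get("Retry-After", default_wait))
        time.sleep(wait)
```

In production you would usually cap the wait time and add jitter, but the core idea is simply to respect the Retry-After value rather than hammering the endpoint.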
To give more context: as each request is received, Azure OpenAI computes an estimated max processed-token count that includes the following:
- Prompt text and count
- The max_tokens parameter setting
- The best_of parameter setting
As requests come into the deployment endpoint, the estimated max processed-token count is added to a running token count across all requests, which is reset each minute. If the TPM (tokens-per-minute) rate limit is reached at any point during that minute, further requests receive a 429 response code until the counter resets. For more details, see Understanding rate limits.
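The estimate described above can be sketched as a small helper. Note that the exact formula Azure uses is not published in detail; this is an illustrative assumption that the prompt tokens plus `max_tokens` for each of the `best_of` candidate completions all count toward the per-minute budget.

```python
def estimated_max_tokens(prompt_tokens: int, max_tokens: int, best_of: int = 1) -> int:
    """Rough sketch of the estimated max processed-token count.

    Assumption (not an official formula): the prompt tokens plus the
    max_tokens allowance for each of the best_of candidate completions
    are charged against the running per-minute counter.
    """
    return prompt_tokens + max_tokens * best_of
```

For example, a 100-token prompt with `max_tokens=256` and `best_of=2` would reserve roughly 612 tokens of the minute's budget under this assumption, which is why lowering `max_tokens` and `best_of` can noticeably reduce how quickly you hit the limit.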
> The endpoint seemingly has a limit of 1k tokens per minute, which is fine for me, but I wouldn't mind paying a bit more to explore the image analysis; that does not seem to be possible either. How can I increase any limits on the endpoint?
To increase the quota:
- In Azure OpenAI Studio, under Shared resources -> select Quota.
- Check whether the quota is already full.
- If it is, choose the Request quota link to increase the current model quota.
- You can also increase the token limit under Deployments -> Edit -> Update deployments tab.
See Manage Azure OpenAI Service quota for more details.
I hope this helps. Do let me know if that resolves the issue or if you have any further queries.