Issue with Token Rate Limit when Uploading Files to OpenAI Playground

Jason 20 Reputation points
2025-01-24T07:45:17.3766667+00:00

I am encountering an issue when using the Azure OpenAI API. When I send a prompt without attaching any file, the model responds as expected with information. However, when I attach a file to my prompt, I receive the following error message:

"Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2024-10-01-preview have exceeded token rate limit of your current AIServices S0 pricing tier. Please retry after 86400 seconds. Please contact Azure support service if you would like to further increase the default rate limit."

This issue occurs while using the free trial account. Could you please assist me with the following:

  1. Why am I receiving this error when attaching a file, and how is it related to the token rate limit?
  2. How can I resolve this issue without upgrading my account or increasing my current rate limit while still using the free trial account?
  3. Are there any optimizations I can apply to prevent exceeding the token rate limit when uploading files, while still maintaining the ability to query with files?

I would appreciate any guidance or suggestions on how to overcome this limitation.

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

Accepted answer
  1. kothapally Snigdha 3,020 Reputation points Microsoft External Staff Moderator
    2025-02-04T06:16:19.1766667+00:00

    Hi Jason,

    Sorry for the delay.

    To raise the token rate limit for your Azure OpenAI resource, you need to request a quota increase. Here are the points to keep in mind:

    • Your maximum quota values may be lower if your Azure subscription is linked to certain offer types. For example, if you're on a free trial or a student subscription, your limit might be 1,000 tokens per minute.
    • You can submit a quota increase request via the quota increase request form. Note that due to high demand, requests are filled in the order they are received, and priority is given to customers who are actively consuming their existing quota.
    • If you have the ability to modify your deployment settings, you can adjust the Tokens-Per-Minute (TPM) allocation. This can be done in the Azure AI Foundry portal under the Deployments section.
    • If you are unable to modify the rate limit due to your current subscription type, you may need to upgrade your subscription or change your offer type to access higher limits.
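
    To see why attaching a file can exhaust a small Tokens-Per-Minute allocation, it can help to estimate how many tokens the file adds to the request. The following is a minimal sketch, not an official method. It assumes the file's text is sent to the model as part of the prompt, and it uses a rough rule of thumb of about 4 characters per token rather than the service's real tokenizer. The names `estimate_tokens`, `split_to_budget`, and `reserve_for_reply` are hypothetical, and the 1,000 TPM figure is only the example mentioned above.

```python
import math
from typing import List

# Assumption: roughly 4 characters per token for English text. This is a
# rule of thumb, not the tokenizer the service actually uses.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate for a piece of text."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def split_to_budget(text: str, max_tokens: int) -> List[str]:
    """Split text into pieces whose estimated size fits within max_tokens."""
    max_chars = max_tokens * CHARS_PER_TOKEN
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


if __name__ == "__main__":
    tpm_limit = 1000         # example figure from the answer, not a guarantee
    reserve_for_reply = 200  # hypothetical head-room for the question and reply
    document = "lorem ipsum " * 2000  # stand-in for text extracted from a file

    print("estimated tokens in file:", estimate_tokens(document))
    pieces = split_to_budget(document, tpm_limit - reserve_for_reply)
    print("pieces needed to stay under the budget:", len(pieces))
```

    If the estimate for a file is well above the per-minute allocation, a single request containing the whole file is likely to be rejected, so sending a shorter excerpt or one piece per minute is one way to stay within the limit on a trial subscription.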

    Please refer to this document: https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/quota#understanding-rate-limits
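
    The error text quoted in the question also states how long to wait before retrying (86400 seconds, which is 24 hours). As a small sketch, assuming the message keeps the wording "retry after N seconds", client code could read that delay from the message instead of retrying immediately. The helper name `retry_after_seconds` is hypothetical and not part of any SDK.

```python
import re
from typing import Optional


def retry_after_seconds(message: str) -> Optional[int]:
    """Extract the wait time from a rate-limit message, or None if absent."""
    match = re.search(r"retry after (\d+) seconds", message, re.IGNORECASE)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    error_text = (
        "Requests to the ChatCompletions_Create Operation have exceeded "
        "token rate limit. Please retry after 86400 seconds."
    )
    delay = retry_after_seconds(error_text)
    if delay is not None:
        print(f"wait {delay} seconds ({delay / 3600:.0f} hours) before retrying")
```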

    I hope this helps. Thank you!

    1 person found this answer helpful.
