Heet Sarju Shah, greetings and welcome to the Microsoft Q&A forum!
I understand that you are hitting quota limits.
To minimize issues related to rate limits, it's a good idea to use the following techniques:
- Implement retry logic in your application.
- Avoid sharp changes in the workload. Increase the workload gradually.
- Test different load increase patterns.
- Increase the quota assigned to your deployment. Move quota from another deployment, if necessary.
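As a rough illustration of the first technique, retry logic for 429 responses could look like the sketch below. It is not tied to any particular SDK: `RateLimitError` and `call_with_retry` are hypothetical names made up for this example, and `send_request` stands for whatever function in your application makes the call. Replace `RateLimitError` with the exception your client library actually raises on a 429.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for whatever your client raises on HTTP 429."""

    def __init__(self, retry_after=None):
        super().__init__("429 Too Many Requests")
        # Seconds suggested by the server (e.g. a Retry-After header), if any.
        self.retry_after = retry_after


def call_with_retry(send_request, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call send_request(), retrying with exponential backoff on RateLimitError."""
    for attempt in range(max_retries + 1):
        try:
            return send_request()
        except RateLimitError as err:
            if attempt == max_retries:
                raise  # out of retries, surface the error to the caller
            # Prefer the server's hint; otherwise wait 1x, 2x, 4x, ... base_delay.
            if err.retry_after is not None:
                delay = err.retry_after
            else:
                delay = base_delay * 2 ** attempt
            # Add up to 10% random jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, 0.1 * delay))
```

You would wrap each call, for example `call_with_retry(lambda: my_client_call(prompt))`, where `my_client_call` is your own function. The default of 5 retries and a 1-second base delay are arbitrary starting values, not recommendations from the service.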
RPM rate limits are based on the number of requests received over time. The rate limit expects requests to be evenly distributed over a one-minute period. If this average flow isn't maintained, requests may receive a 429 response even though the limit isn't met when measured over the course of a minute. To implement this behavior, Azure OpenAI Service evaluates the rate of incoming requests over a small period of time, typically 1 or 10 seconds. If the number of requests received during that time exceeds what would be expected at the set RPM limit, new requests receive a 429 response code until the next evaluation period. For example, if Azure OpenAI is monitoring request rate on 1-second intervals, rate limiting will occur for a 600-RPM deployment if more than 10 requests are received during any 1-second period (600 requests per minute = 10 requests per second).
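To make the arithmetic above concrete, here is a small sketch that computes the per-window allowance and checks whether a planned send schedule would exceed it. The function names are made up for this example, and the check assumes fixed evaluation windows aligned to multiples of the window length, which is a simplification; the service's exact windowing is not described here.

```python
from collections import Counter


def requests_per_window(rpm, window_seconds):
    """Requests expected in one evaluation window at the given RPM limit."""
    return rpm * window_seconds / 60


def min_spacing_seconds(rpm):
    """Gap between requests that spreads them evenly over a minute."""
    return 60 / rpm


def would_be_throttled(send_times, rpm, window_seconds=1.0):
    """True if any window holds more requests than the allowance.

    send_times are planned send times in seconds. Windows are assumed to be
    aligned to multiples of window_seconds (an assumption of this sketch).
    """
    allowed = requests_per_window(rpm, window_seconds)
    per_window = Counter(int(t // window_seconds) for t in send_times)
    return any(count > allowed for count in per_window.values())


print(requests_per_window(600, 1))   # 10.0, matching the example above
print(requests_per_window(600, 10))  # 100.0 for a 10-second window
print(min_spacing_seconds(600))      # 0.1 seconds between requests
```

With this model, 600 requests spaced 0.1 seconds apart pass the check, while the same 600 requests sent in one burst at the start of the minute do not, even though both total 600 in the minute.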
See Understanding rate limits for more details.
Azure for Students subscriptions have limited quota on resources.
Also, if you are not able to increase the quota on a student subscription, please contact customer service so that your limits can be reviewed and adjusted where possible.
To increase your quotas or limits, you must upgrade your Azure for Students Starter subscription to a Pay-As-You-Go subscription. For more information, see "Upgrade your Azure Free Trial subscription to a Pay-As-You-Go subscription".
Do let me know if that helps or if you have any other queries.