In the Azure OpenAI Studio, go to Shared resources -> Deployments and select the deployment you want to check. After clicking on the deployment, the rate-limit details appear under the [Details] tab. Be aware that there are limits on both tokens per minute and request count, so you can be throttled by either one.
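When either limit is exceeded, the service responds with HTTP 429 and typically includes a Retry-After header. As a minimal sketch (not Azure-specific code — just a generic helper you could wire into any client), one way to pick a retry delay is to honor Retry-After when present and otherwise fall back to capped exponential backoff with jitter:

```python
import random

def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait before retrying a throttled (HTTP 429) request.

    Prefers the server-supplied Retry-After value when available;
    otherwise uses capped exponential backoff plus up to 10% jitter
    so concurrent clients do not retry in lockstep.
    """
    if retry_after is not None:
        return float(retry_after)
    delay = min(cap, base * (2 ** attempt))  # 1s, 2s, 4s, ... up to cap
    return delay + random.uniform(0, delay * 0.1)
```

You would call this in a retry loop around your completion request, sleeping for `backoff_delay(attempt, retry_after)` after each 429 before trying again.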
What is the requests-per-minute (RPM) rate limit for Azure OpenAI gpt-4o models?
Dennis W · 20 Reputation points
I wasn't able to locate the RPM limit for Azure OpenAI. I did find the TPM limits for a few regions at the link below, but my endpoint/resource seems to hit 429 errors well before reaching those limits.
https://learn.microsoft.com/en-us/azure/ai-services/openai/quotas-limits
Despite searching through Microsoft's documentation, I couldn't find any information on the RPM limit. How do I check the limit for my deployed model?