@JAIN Saumya RPM rate limits are based on the number of requests received over time. Azure OpenAI evaluates the rate of incoming requests over a short period, typically 1 or 10 seconds, and then determines whether the rate limit is being exceeded. If it estimates that the rate would exceed the limit, it returns error 429. See this section of the documentation for a better understanding of how this works.
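To make the short-window idea concrete, here is a minimal sketch. It assumes the per-minute limit is scaled proportionally to the evaluation window, which is an illustration only and not the service's actual algorithm. The function names `window_allowance` and `exceeds_limit` are hypothetical.

```python
def window_allowance(rpm: int, window_seconds: float) -> float:
    """Scale a per-minute limit down to a shorter window (assumed proportional)."""
    return rpm * window_seconds / 60.0


def exceeds_limit(request_times, rpm: int, window_seconds: float = 1.0) -> bool:
    """Return True if any sliding window holds more requests than its allowance.

    request_times are arrival times in seconds. This only illustrates why a
    burst can be throttled; the real service-side logic may differ.
    """
    allowance = window_allowance(rpm, window_seconds)
    times = sorted(request_times)
    start = 0
    for end, t in enumerate(times):
        # Drop requests that are no longer inside the window ending at t.
        while t - times[start] >= window_seconds:
            start += 1
        if end - start + 1 > allowance:
            return True
    return False


# 20 requests sent within a fraction of a second against a 600 RPM limit.
burst = [i * 0.01 for i in range(20)]
# The same 20 requests spread out at 4 per second.
steady = [i * 0.25 for i in range(20)]

print(exceeds_limit(burst, rpm=600))
print(exceeds_limit(steady, rpm=600))
```

Under this assumption, 600 RPM works out to about 10 requests per 1-second window, so the burst is flagged even though 20 requests is far below 600 for the minute, while the evenly spaced requests are not.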
To summarize, the rate limit is estimated over a short time window; it is not the sum of actual requests received over a full minute. This is true for all Azure Cognitive Services, and error 429 is reported when the service sees the limit being breached. Follow the best practices to avoid this error and stay within your allocated quota.
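One commonly recommended practice is to retry after a 429 with a growing delay. The sketch below assumes your client code raises a custom exception when it sees a 429. `RateLimited`, `call_with_backoff`, and `send` are hypothetical names for illustration, not part of any Azure SDK. If the response carried a retry hint, the sketch waits that long; otherwise it falls back to exponential backoff with jitter.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised by your own client code on an HTTP 429."""

    def __init__(self, retry_after=None):
        super().__init__("HTTP 429: rate limit exceeded")
        self.retry_after = retry_after  # seconds suggested by the service, if any


def call_with_backoff(send, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call send() and retry on RateLimited, waiting longer after each failure."""
    for attempt in range(max_retries + 1):
        try:
            return send()
        except RateLimited as err:
            if attempt == max_retries:
                raise  # give up after the final attempt
            if err.retry_after is not None:
                delay = err.retry_after
            else:
                # Exponential backoff plus jitter so clients do not retry in lockstep.
                delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

Here `send` would wrap your actual request and raise `RateLimited` when the status code is 429. Spacing requests evenly instead of sending them in bursts reduces how often this path is hit at all.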
If this answers your query, please click Accept Answer and Yes for "Was this answer helpful". If you have any further questions, do let us know.