It looks like you're hitting the rate limit on your Azure OpenAI resource, which is on the S0 pricing tier. Rate limits are enforced per minute based on token usage, so once the TPM (tokens per minute) threshold is reached, any additional requests within that minute receive the rate limit error.
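To make the per-minute behavior concrete, here is a minimal sketch of a fixed one-minute token window. It is a simplified illustration, not how Azure enforces limits internally; the class name `MinuteWindow`, the `tpm_limit` value of 1000, and the token counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MinuteWindow:
    """Simplified fixed one-minute window (illustrative only, not Azure's algorithm)."""
    tpm_limit: int              # hypothetical tokens-per-minute limit
    window_start: float = 0.0   # time (seconds) at which the current window began
    used: int = 0               # tokens consumed in the current window

    def try_request(self, tokens: int, now: float) -> bool:
        # Start a fresh window once 60 seconds have passed.
        if now - self.window_start >= 60:
            self.window_start = now
            self.used = 0
        # Reject the request if it would push usage past the limit.
        if self.used + tokens > self.tpm_limit:
            return False  # the service would return a rate limit error here
        self.used += tokens
        return True


window = MinuteWindow(tpm_limit=1000)
print(window.try_request(600, now=0))    # True: 600 of 1000 tokens used
print(window.try_request(600, now=10))   # False: would exceed 1000 in the same minute
print(window.try_request(600, now=61))   # True: a new minute has started
```

In this model, the second request fails only because it arrives in the same minute as the first; the same request succeeds once the window resets, which is why a higher TPM quota removes the error.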
To increase your limits in Azure OpenAI Studio:
- Check Your Current Quota
  - In Azure OpenAI Studio, go to Shared Resources > Quota and select your subscription from the drop-down.
  - Click the Request quota link and fill out the form to request a quota increase.
- Request a Higher Limit
  - Since you cannot change the pricing tier of an existing resource, create a new resource and request a higher quota allocation for it.
  - You could also try increasing the TPM limit on your deployment.
For more details, check: Manage and Increase Quotas in Azure OpenAI Studio.
I hope this helps! Thank you.