Rate limit exceeded error with Azure AI Search

Ravi Rama 5 Reputation points
2024-09-02T22:46:16.0633333+00:00

I am getting the rate limit exceeded error below when using Azure AI Search from Python code.

'An error occurred when calling Azure OpenAI: Server responded with status 429. Error message: {"error":{"code":"429","message": "Rate limit is exceeded. Try again in 60 seconds.

Azure AI Search

2 answers

  1. SnehaAgrawal-MSFT 22,706 Reputation points Moderator
    2024-09-04T12:36:17.0433333+00:00

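    @Ravi Rama Thanks for reaching out! The error message indicates that you have exceeded the rate limit for Azure OpenAI. Error code 429 means that the number of requests has reached the limit of managed online endpoints. Note that the message says the 429 was returned while calling Azure OpenAI, so the limit in question applies to that call rather than to the search index itself.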

    To resolve this issue, you can try the following steps:

    1. Wait for 60 seconds and try again. The error message suggests that the rate limit was exceeded only for a short period, so waiting a minute should resolve the issue.
    2. Check whether you are making too many requests in a short period of time. If so, reduce the frequency of your requests or optimize your code to make fewer of them.
    3. Check whether you have reached the limit of managed online endpoints. If so, you may need to increase the limit or use a different service. See Understanding rate limits.
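Steps 1 and 2 above can be automated with a retry loop that waits before resending. The following is a minimal sketch, not an official pattern: `RateLimitedError` and `call_with_backoff` are hypothetical names introduced here, and the delays are illustrative. With the `openai` Python package, the exception to catch would typically be `openai.RateLimitError` instead of the stand-in class.

```python
import random
import time


class RateLimitedError(Exception):
    """Hypothetical stand-in for the 429 error raised by a client library."""

    def __init__(self, retry_after=None):
        super().__init__("Rate limit is exceeded.")
        # Seconds the server asked us to wait, if it said so (assumption).
        self.retry_after = retry_after


def call_with_backoff(request, max_attempts=5, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call request(); on a rate-limit error, wait and try again."""
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimitedError as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            if err.retry_after is not None:
                # Honor the wait time suggested by the server.
                delay = err.retry_after
            else:
                # Otherwise back off exponentially, with a little jitter.
                delay = min(max_delay, base_delay * 2 ** attempt)
                delay += random.uniform(0, delay / 10)
            sleep(delay)
```

To use it, wrap the call that fails, for example `call_with_backoff(lambda: client.chat.completions.create(...))`, after adapting the `except` clause to the exception your client actually raises.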

    To minimize issues related to rate limits:

    • Set max_tokens and best_of to the minimum values that serve the needs of your scenario. For example, don’t set a large max_tokens value if you expect your responses to be small. Also, see A Guide to Limits, Quotas, and Best Practices for more details.
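To see why smaller max_tokens and best_of values help, here is a rough back-of-the-envelope sketch. It assumes the service counts roughly prompt tokens plus max_tokens × best_of against a tokens-per-minute quota; that formula, the function names, and the quota of 10,000 are illustrative assumptions, not the service's documented accounting.

```python
def estimated_tokens(prompt_tokens, max_tokens, best_of=1):
    """Rough per-request token estimate (assumed formula, for illustration)."""
    return prompt_tokens + max_tokens * best_of


def requests_per_minute(tpm_quota, prompt_tokens, max_tokens, best_of=1):
    """How many such requests fit into a tokens-per-minute quota."""
    return tpm_quota // estimated_tokens(prompt_tokens, max_tokens, best_of)


# Hypothetical quota of 10,000 tokens per minute and a 300-token prompt.
print(requests_per_minute(10_000, 300, max_tokens=4000))  # 2
print(requests_per_minute(10_000, 300, max_tokens=200))   # 20
```

Under these assumptions, lowering max_tokens from 4000 to 200 lets ten times as many requests fit in the same quota, even if the actual responses are short in both cases.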

  2. Mehdi Memar 10 Reputation points
    2025-04-28T03:50:47.81+00:00

    I added the max_tokens parameter to the chat completions call, and the issue was resolved.

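    import os

    # `client` (an Azure OpenAI client), `deployment`, and `text` are assumed to be defined earlier.
    completion = client.chat.completions.create(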
        model=deployment,
        messages=[
            {
                "role": "user",
                "content": text,
            },
        ],
        max_tokens=200,  # cap the response length so each request counts fewer tokens
        extra_body={
            "data_sources":[
                {
                    "type": "azure_search",
                    "parameters": {
                        "endpoint": os.environ["AZURE_SEARCH_ENDPOINT"],
                        "index_name": os.environ["AZURE_SEARCH_INDEX"],
                        "authentication": {
                            "type": "api_key",
                            "key": os.environ["AZURE_SEARCH_KEY"],
                        }
                    }
                }
            ],
        }
    )
    
