Azure AI Search: Throttling (Error 429) when calling list_indexes in Python

Vishal Bhalabar 20 Reputation points
2026-02-04T11:21:32.0533333+00:00

I am running a Python-based Azure Function App (App Service Plan: P2v3) that interacts with Azure AI Search (Pricing Tier: Standard, Partitions: 1, Replicas: 1 (No SLA), Search Units: 1). My application frequently needs to retrieve a list of available search indexes using the following code.

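from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient

# This is called multiple times across the application
def get_index_names():
    # A new client is constructed on every call
    search_index_client = SearchIndexClient(
        endpoint=vector_store_address,
        credential=AzureKeyCredential(vector_store_password)
    )
    # Iterating list_indexes() sends a List Indexes request to the search service
    index_names = [index.name for index in search_index_client.list_indexes()]
    return index_names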

The Issue: Intermittently, the application throws an HttpResponseError: (You are sending too many requests. Please try again later.). This HTTP 429 error suggests that I am hitting a rate limit or quota on the Search Service management side.

Questions:

  1. Is there a specific documented rate limit or "Transactions Per Second" (TPS) quota for the list_indexes() method or general management operations?
  2. Since this is an "App Service Plan" (P2v3) environment, could the throttling be related to the Search Service tier (e.g., Free vs. Basic vs. Standard)?
  3. What is the recommended way to handle this? Should I implement manual retry logic, or is there a built-in mechanism in the SDK?
Azure AI Search

4 answers

  1. Shree Hima Bindu Maganti 6,660 Reputation points Microsoft External Staff Moderator
    2026-02-05T05:58:58.3833333+00:00

    Hi @Vishal Bhalabar
    Apologies for the inconvenience.
    Azure AI Search documents static request rate limits for index APIs and service operations, expressed per search unit. The service limits page linked below lists List Indexes (GET /indexes), the REST call behind list_indexes(), at 3 requests per second per search unit. Requests that exceed these limits, or that arrive while the service is approaching peak capacity, are throttled with HTTP 429.

    This behavior depends on the Azure AI Search service tier and the allocated search units (replicas × partitions), not on the hosting environment, so the App Service Plan (P2v3) is not the cause of the throttling. A service with low capacity, such as a Standard tier service with a single search unit, is more likely to be throttled under frequent management calls.

    Microsoft recommends the following:
    • Implement retry logic with exponential backoff and honor the Retry-After header.
    • Reduce the frequency of management operations such as listing indexes, for example by caching the results.
    • Scale the search service if throttling continues under normal workload conditions.
    https://learn.microsoft.com/en-us/azure/search/search-limits-quotas-capacity
    https://learn.microsoft.com/en-us/azure/search/search-performance-tips
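The caching suggestion above could look roughly like the sketch below. It is a minimal illustration, not an official pattern: the class name IndexNameCache, the fetch callable, the clock parameter, and the 300-second default lifetime are all assumptions made for this example.

```python
import threading
import time
from typing import Callable, List, Optional


class IndexNameCache:
    """Keeps the last list of index names for ttl_seconds (hypothetical helper)."""

    def __init__(
        self,
        fetch: Callable[[], List[str]],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch          # callable that actually queries the service
        self._ttl = ttl_seconds      # how long a fetched list stays valid
        self._clock = clock          # injectable so the cache can be tested
        self._lock = threading.Lock()
        self._names: Optional[List[str]] = None
        self._expires_at = 0.0

    def get(self) -> List[str]:
        """Return cached names, calling fetch only when the cache is stale."""
        with self._lock:
            now = self._clock()
            if self._names is None or now >= self._expires_at:
                # Only this branch reaches the search service.
                self._names = list(self._fetch())
                self._expires_at = now + self._ttl
            return list(self._names)

    def invalidate(self) -> None:
        """Force the next get() to refetch, e.g. after creating or deleting an index."""
        with self._lock:
            self._names = None
```

In the question's code, fetch would be a function that runs the existing list comprehension over list_indexes() on a single SearchIndexClient created once at module level, so repeated calls within the lifetime are served from memory instead of counting against the service's request rate.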
    Let me know if you need any further assistance.


  2. Vishal Bhalabar 20 Reputation points
    2026-02-05T05:00:40.2933333+00:00

    Hi Shikha Ghildiyal

    Thank you for the initial guidance, but the response appears to focus on Azure OpenAI rate limits (TPM/RPM and max_tokens), whereas my issue specifically concerns Azure AI Search management operations.

    As mentioned, the error occurs when calling list_indexes() from the SearchIndexClient in a Python Azure Function. To help me resolve this for the search service, could you please clarify the following:

    Questions:

    1. Is there a specific rate limit (maximum number of calls per second) for the search service list_indexes() method of SearchIndexClient in a Python Azure Function?
    2. Since this is an "App Service Plan" (P2v3) environment, could the throttling be related to the Search Service tier (e.g., Free vs. Basic vs. Standard)?
    3. What is the recommended way to handle this? Is there a built-in mechanism in the SDK?

  3. Shikha Ghildiyal 7,160 Reputation points Microsoft Employee Moderator
    2026-02-04T14:43:00.4033333+00:00

    Hi Vishal Bhalabar

    Thanks for reaching out to Microsoft Q&A.

    You may encounter a 429 error (“Too Many Requests”) when your usage exceeds the allowed limits or when the system is experiencing high demand.

    Please refer to the rate limit best practices.

    To minimize issues related to rate limits, it's a good idea to use the following techniques:

    • Set max_tokens and best_of to the minimum values that serve the needs of your scenario. For example, don’t set a large max_tokens value if you expect your responses to be small.
    • Use quota management to increase TPM on deployments with high traffic, and to reduce TPM on deployments with limited needs.
    • Implement retry logic in your application.
    • Avoid sharp changes in the workload. Increase the workload gradually.
    • Test different load increase patterns.

    Refer to: https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/quota?view=foundry-classic&tabs=rest#understanding-rate-limits

    If this answers your query, please click Accept Answer and select Yes for "Was this answer helpful". If you have any further queries, do let us know.


  4. Q&A Assist
    2026-02-04T11:21:44.7766667+00:00
    AI-generated content. This question contains an answer created with AI from Q&A Assist.
    1. Rate Limits for list_indexes(): The Azure AI Search service limits documentation lists static request rate limits for index operations, expressed per search unit. List Indexes (GET /indexes), the operation behind list_indexes(), is limited to 3 requests per second per search unit. Because the limit scales with search units, the quota that applies depends on the replicas and partitions allocated to your Search Service.
    2. Throttling Related to Service Tier: Yes, the throttling you are experiencing could be related to the Search Service tier. In your case, using the Standard tier with only 1 partition and 1 replica (which has no SLA) may lead to limitations on the number of concurrent requests that can be processed. If the service is under heavy load or if your application is making frequent requests, this could trigger HTTP 429 errors.
    3. Recommended Handling: It is advisable to retry when encountering HTTP 429 errors. The Azure SDK for Python sends requests through the azure-core retry policy by default, which retries throttled (429) responses with exponential backoff and honors the Retry-After header; it can be tuned with keyword arguments such as retry_total and retry_backoff_factor when constructing SearchIndexClient. If the error still surfaces after those built-in retries are exhausted, you can catch the HttpResponseError and wait for a specified duration before retrying the request. Additionally, you might consider increasing the number of replicas and partitions if your application requires higher throughput to avoid hitting the rate limits.
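Where explicit control over retries is wanted, the catch-and-wait approach from point 3 might be sketched as follows. This is an illustration under assumptions rather than SDK code: call_with_backoff and retry_after_seconds are hypothetical helper names, the delays are arbitrary defaults, and the error is only assumed to expose status_code and response.headers, as azure-core's HttpResponseError does.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_after_seconds(error: Exception) -> Optional[float]:
    """Return the Retry-After delay in seconds carried by an error, if any."""
    response = getattr(error, "response", None)
    headers = getattr(response, "headers", None) or {}
    value = headers.get("Retry-After")
    try:
        return float(value) if value is not None else None
    except (TypeError, ValueError):
        return None  # e.g. an HTTP-date value; fall back to computed backoff


def call_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying only when the raised error has status_code 429."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as error:
            throttled = getattr(error, "status_code", None) == 429
            if not throttled or attempt == max_attempts:
                raise  # not a throttling error, or out of attempts
            delay = retry_after_seconds(error)
            if delay is None:
                # Exponential backoff: base, 2x base, 4x base, ... capped at max_delay,
                # plus up to 10% jitter so parallel workers do not retry in lockstep.
                delay = min(max_delay, base_delay * 2 ** (attempt - 1))
                delay += random.uniform(0, delay / 10)
            sleep(delay)
    raise ValueError("max_attempts must be at least 1")
```

Because list_indexes() returns a lazily evaluated pager, the operation passed in should include the iteration itself, for example a function that returns the full list comprehension from the question, so that a 429 raised while paging is caught by the wrapper.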

