Hi @Vishal Bhalabar
Apologies for the inconvenience.
Azure AI Search does not publish a fixed transactions-per-second limit specifically for the list_indexes() method, but Microsoft confirms that index APIs and service operations have static request rate limits, and requests can be throttled with HTTP 429 when the service approaches peak capacity. This behavior is related to the Azure AI Search service tier and allocated search units (replicas × partitions) rather than the hosting environment, so the App Service Plan (P2v3) is not the cause of the throttling. Services running with lower capacity (for example, Standard tier with a single search unit) are more likely to experience throttling under frequent management calls. Microsoft recommends handling such scenarios by implementing retry logic with exponential backoff, honoring the Retry-After header, reducing the frequency of management operations such as repeatedly listing indexes (for example, by caching results), and scaling the search service if throttling continues during normal workload conditions.
https://learn.microsoft.com/en-us/azure/search/search-limits-quotas-capacity
https://learn.microsoft.com/en-us/azure/search/search-performance-tips
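The suggestion to cache the index list rather than call list_indexes() on every request could look roughly like the sketch below. It is only an illustration: the class name CachedIndexNames, the 300-second default lifetime, and the injectable clock are assumptions made for this example and are not part of the Azure SDK. The helper is not thread-safe as written.

```python
import time
from typing import Callable, List, Optional


class CachedIndexNames:
    """Hypothetical helper: remembers a list of index names for ttl_seconds."""

    def __init__(
        self,
        fetch: Callable[[], List[str]],
        ttl_seconds: float = 300.0,  # assumed default, tune to how often indexes change
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._names: Optional[List[str]] = None
        self._expires_at = 0.0

    def get(self) -> List[str]:
        """Return cached names, calling fetch only when the cache is empty or expired."""
        now = self._clock()
        if self._names is None or now >= self._expires_at:
            self._names = list(self._fetch())
            self._expires_at = now + self._ttl
        return list(self._names)  # copy so callers cannot mutate the cache

    def invalidate(self) -> None:
        """Drop the cached value, e.g. right after creating or deleting an index."""
        self._names = None
```

With the function from the question, this would be used as `index_cache = CachedIndexNames(get_index_names)` at module level, with `index_cache.get()` replacing the repeated direct calls. Creating the SearchIndexClient once and reusing it, instead of constructing it on every call, is a related saving.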
Let me know if you need any further assistance.
Azure AI Search: Throttling (Error 429) when calling list_indexes in Python
I am running a Python-based Azure Function App (App Service Plan: P2v3) that interacts with Azure AI Search (Pricing Tier: Standard, Partitions: 1, Replicas: 1 (No SLA), Search Units: 1). My application frequently needs to retrieve a list of available search indexes using the following code.
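from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient

# This is called multiple times across the application
def get_index_names():
    # vector_store_address and vector_store_password are defined elsewhere in the application
    search_index_client = SearchIndexClient(
        endpoint=vector_store_address,
        credential=AzureKeyCredential(vector_store_password)
    )
    index_names = [index.name for index in search_index_client.list_indexes()]
    return index_names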
The Issue: Intermittently, the application raises an HttpResponseError: (You are sending too many requests. Please try again later.). This HTTP 429 error suggests that I am hitting a rate limit or quota on the Search Service management side.
Questions:
- Is there a specific documented rate limit or "Transactions Per Second" (TPS) quota for the list_indexes() method or general management operations?
- Since this is an "App Service Plan" (P2v3) environment, could the throttling be related to the Search Service tier (e.g., Free vs. Basic vs. Standard)?
- What is the recommended way to handle this? Should I implement manual retry logic, or is there a built-in mechanism in the SDK?
Azure AI Search
4 answers
-
Shree Hima Bindu Maganti 6,660 Reputation points Microsoft External Staff Moderator
2026-02-05T05:58:58.3833333+00:00
-
Vishal Bhalabar 20 Reputation points
2026-02-05T05:00:40.2933333+00:00
Thank you for the initial guidance, but the response appears to focus on Azure OpenAI rate limits (TPM/RPM and max_tokens), whereas my issue specifically concerns Azure AI Search management operations.
As mentioned, the error occurs when calling list_indexes() from the SearchIndexClient in a Python Azure Function. To help me resolve this for the search service, could you please clarify the following:
Questions:
- Is there a specific rate limit (maximum number of calls per second) for the search service list_indexes() method from the SearchIndexClient in a Python Azure Function?
- Since this is an "App Service Plan" (P2v3) environment, could the throttling be related to the Search Service tier (e.g., Free vs. Basic vs. Standard)?
- What is the recommended way to handle this? Is there a built-in mechanism in the SDK?
-
Shikha Ghildiyal 7,160 Reputation points Microsoft Employee Moderator
2026-02-04T14:43:00.4033333+00:00
Thanks for reaching out to Microsoft Q&A.
You may encounter a 429 error (“Too Many Requests”) when your usage exceeds the allowed limits or when the system is experiencing high demand.
Please refer to the rate limit best practices.
To minimize issues related to rate limits, it's a good idea to use the following techniques:
- Set max_tokens and best_of to the minimum values that serve the needs of your scenario. For example, don’t set a large max_tokens value if you expect your responses to be small.
- Use quota management to increase TPM on deployments with high traffic, and to reduce TPM on deployments with limited needs.
- Implement retry logic in your application.
- Avoid sharp changes in the workload. Increase the workload gradually.
- Test different load increase patterns.
If this answers your query, please click Accept Answer and Yes for "Was this answer helpful". If you have any further queries, do let us know.
-
Q&A Assist
2026-02-04T11:21:44.7766667+00:00
AI generated content. This question contains an answer created with AI from Q&A Assist.
- Rate Limits for list_indexes(): The Azure documentation does not specify a distinct rate limit for the list_indexes() method itself. However, it is important to note that the overall management operations in Azure AI Search can be subject to throttling based on the service tier and the number of requests being sent. The general guidance is to be aware of the quotas associated with your Search Service tier.
- Throttling Related to Service Tier: Yes, the throttling you are experiencing could be related to the Search Service tier. In your case, using the Standard tier with only 1 partition and 1 replica (which has no SLA) may lead to limitations on the number of concurrent requests that can be processed. If the service is under heavy load or if your application is making frequent requests, this could trigger HTTP 429 errors.
- Recommended Handling: It is advisable to implement retry logic in your application when encountering HTTP 429 errors. This can be done by catching the HttpResponseError and waiting for a specified duration before retrying the request. Additionally, you might consider increasing the number of replicas and partitions if your application requires higher throughput to avoid hitting the rate limits. The Azure SDK does not have a built-in automatic retry mechanism for management operations, so manual implementation is necessary.
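The catch-and-wait approach described above might be sketched as follows. This is a hedged illustration, not the SDK's own mechanism: call_with_backoff and its parameters are hypothetical names, and the code assumes the raised exception exposes status_code and response.headers attributes, as azure.core.exceptions.HttpResponseError does. It waits for the Retry-After value when the header is present and numeric, and otherwise falls back to exponential backoff with jitter.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def _retry_after_seconds(exc: Exception) -> Optional[float]:
    """Read a numeric Retry-After header from the failed response, if there is one."""
    response = getattr(exc, "response", None)
    headers = getattr(response, "headers", None) or {}
    value = headers.get("Retry-After")
    try:
        return float(value) if value is not None else None
    except (TypeError, ValueError):
        return None  # non-numeric value (e.g. an HTTP date): fall back to backoff


def call_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,    # assumed defaults, adjust to your workload
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation(), retrying only when the error carries status_code 429."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as exc:
            throttled = getattr(exc, "status_code", None) == 429
            if not throttled or attempt == max_attempts:
                raise  # not a throttling error, or retries exhausted
            delay = _retry_after_seconds(exc)
            if delay is None:
                # Exponential backoff: base, 2x base, 4x base, ... capped at max_delay
                delay = min(max_delay, base_delay * 2 ** (attempt - 1))
                delay += random.uniform(0, delay / 2)  # jitter to spread out retries
            sleep(delay)
    raise RuntimeError("unreachable")
```

For the code in the question, the call would be wrapped as `call_with_backoff(get_index_names)`. Catching the broad Exception type keeps the sketch free of an Azure import; in an application it would be more precise to catch HttpResponseError directly.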
References:
- Rate Limits for