Request to increase GPT-4o TPM quota for internal assistant using Azure OpenAI + AI Search (France region)
Hello,
We are building an internal coding assistant for our company using Azure OpenAI in the France Central region. Our setup includes:
- Azure OpenAI (GPT-4o)
- Azure AI Search (vector store)
- Azure Cosmos DB (metadata)
We are currently blocked by the 50,000 TPM / 10 RPM quota. A single developer interaction, including RAG context and source code input, often exceeds 8,000 tokens per prompt. At that size the token budget covers only about six prompts per minute, so the system becomes unusable after just a few requests.
✅ Our use case is 100% internal: there is no public API and there are no public users. Only our internal developers use the assistant.
We would like to request a quota increase to:
- 300,000 TPM
- 100 RPM
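For reference, here is a rough sketch of the arithmetic behind these numbers. It assumes every request consumes about 8,000 prompt tokens and ignores completion tokens, so real throughput would be somewhat lower. The function name is only illustrative and is not part of any Azure SDK.

```python
def max_requests_per_minute(tpm_quota: int, rpm_quota: int, tokens_per_request: int) -> int:
    """Requests per minute that fit under both the token limit and the request limit."""
    by_tokens = tpm_quota // tokens_per_request  # how many prompts the token budget covers
    return min(by_tokens, rpm_quota)             # the tighter of the two limits applies


# Assumption: a typical prompt (RAG context + source code) is about 8,000 tokens.
TOKENS_PER_REQUEST = 8_000

current = max_requests_per_minute(50_000, 10, TOKENS_PER_REQUEST)
requested = max_requests_per_minute(300_000, 100, TOKENS_PER_REQUEST)

print(f"current quota:   {current} requests/minute")
print(f"requested quota: {requested} requests/minute")
```

Under these assumptions the current quota is limited by tokens rather than by the request count, which is why the TPM increase matters most to us.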
If necessary, we’re also open to GPT-4 32k instead of GPT-4o.
This assistant is used exclusively for source code assistance, debugging, refactoring, and understanding files.
Is there a way to request this quota increase from the Basic plan, or is there an alternative path we should use?
Thanks in advance,
Jonathan Miezin
Technical lead – iSonic