Hi L P
It seems that you are encountering an "Insufficient quota" error despite having a remaining quota of 30K. A few factors could be causing this:
- Model-Specific Quota Limits: Each model has its own maximum Tokens-Per-Minute (TPM) allocation. For the GPT-4.1 model, the default quota limit is 1M TPM, and for the GPT-4.1-mini, it is also 1M TPM. If the combined TPM of your existing deployments exceeds your total quota, you may not be able to create additional deployments.
- Requests-Per-Minute (RPM): The RPM is also a limiting factor. For GPT-4.1, the RPM is set at a specific ratio to the TPM. If your current deployments are consuming too much of your RPM allocation, it could prevent new deployments.
- Quota Allocation: When you assign TPM to a deployment, it reduces the available quota for that model. If you have already allocated a significant amount of your quota to the GPT-4.1 deployment, it may limit your ability to deploy the GPT-4.1-mini.
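To make the quota arithmetic concrete, here is a minimal sketch in Python. The deployment names and the allocated figures are hypothetical, chosen only so that the remaining quota comes out to the 30K you mentioned; substitute the values shown on the quota page for your own subscription and region.

```python
# Illustrative quota arithmetic only. This does not call any Azure API.
# All deployment names and TPM figures below are hypothetical.

def remaining_tpm(quota_limit_tpm: int, deployments: dict[str, int]) -> int:
    """Return the TPM still unallocated after summing existing deployments."""
    return quota_limit_tpm - sum(deployments.values())


def can_deploy(quota_limit_tpm: int, deployments: dict[str, int], requested_tpm: int) -> bool:
    """A new deployment fits only if its requested TPM is within the remainder."""
    return requested_tpm <= remaining_tpm(quota_limit_tpm, deployments)


quota_limit = 1_000_000  # default limit mentioned above; check your own value
existing = {"deployment-a": 600_000, "deployment-b": 370_000}  # hypothetical

print(remaining_tpm(quota_limit, existing))       # 30000
print(can_deploy(quota_limit, existing, 50_000))  # False: request exceeds the remainder
print(can_deploy(quota_limit, existing, 30_000))  # True: request fits exactly
```

In other words, if the TPM requested for the new deployment is higher than the 30K remaining, the deployment can fail with an "Insufficient quota" error even though some quota is still available. Lowering the requested TPM to 30K or less, or reducing the TPM assigned to an existing deployment, may resolve it.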
If the current quota is not enough, you can request a quota increase for the GPT-4.1-mini deployment by following these steps:
- Go to the Azure portal.
- Select Help + support.
- Choose New support request.
- Provide the necessary information, such as the resource type (GPT-4.1-mini), the subscription, and the specific quota you need to increase.
- Submit the request for a quota increase.
For more details, please refer to the link below: quota
Thank you.