How to scale the Azure OpenAI service for multiple users?

ADM-Akash.Yadav 0 Reputation points
2023-11-22T10:09:59.7766667+00:00

Hi Team,

I need to know some information about OpenAI service scaling. I need to scale my OpenAI service to accommodate more users because there are multiple users who have access to it and there are access token limitations. Please let me know how to solve this.

Thanks
Akash Yadav

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

1 answer

  1. AshokPeddakotla-MSFT 35,971 Reputation points Moderator
    2023-11-22T10:50:56.9466667+00:00

    ADM-Akash.Yadav, greetings and welcome to the Microsoft Q&A forum!

    I need to know some information about OpenAI service scaling. I need to scale my OpenAI service to accommodate more users because there are multiple users who have access to it and there are access token limitations.

    Can you add more details? Are you looking for suggestions to increase the quota?

    Scaling the Azure OpenAI service for multiple users depends on the level of isolation and performance that you want to achieve. I would suggest you check Multitenancy and Azure OpenAI Service for more details.

    To scale your OpenAI service to accommodate more users, you can apply for a quota increase. See Azure OpenAI Service quotas and limits for more details.

    Quota increase requests can be submitted from the Quotas page of Azure OpenAI Studio. Please note that due to overwhelming demand, quota increase requests are being accepted and will be filled in the order they are received. Priority will be given to customers who generate traffic that consumes the existing quota allocation, and your request may be denied if this condition is not met.

    For other rate limits, please submit a service request.

    To minimize issues related to rate limits, it's a good idea to use the following techniques:

    • Implement retry logic in your application.
    • Avoid sharp changes in the workload. Increase the workload gradually.
    • Test different load increase patterns.
    • Increase the quota assigned to your deployment. Move quota from another deployment, if necessary.
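As an illustration of the first technique, here is a minimal sketch of retry logic with exponential backoff and jitter. It is not taken from the Azure documentation: `RateLimitError` and `call_with_backoff` are hypothetical names, and in a real application you would catch whatever exception your client library raises for an HTTP 429 response.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the error a client raises on HTTP 429."""


def call_with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with exponential backoff.

    `sleep` is injectable so the function can be tested without waiting.
    """
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # retries exhausted; surface the error to the caller
            # The delay doubles on each attempt (capped at max_delay), with
            # random jitter so many clients do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * random.uniform(0.5, 1.0))
```

To use it, wrap the request to your deployment in a zero-argument callable, for example `call_with_backoff(lambda: send_request(prompt))`, where `send_request` is a placeholder for your own function. If the service response tells you how long to wait before retrying, honoring that value is generally preferable to a computed delay.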

    Also, see Manage Azure OpenAI Service quota for more details.

    Do let me know if that helps or if you have any other queries.


    If the response helped, please click Accept Answer and select Yes for "Was this answer helpful".

    Doing so helps other community members with a similar issue identify the solution. I highly appreciate your contribution to the community.
