Thanks for reaching out to us. Please be aware that Azure OpenAI is operated separately from OpenAI, so retirement dates may differ due to operational considerations.
- Model Retirement Dates: The retirement dates listed on Azure's website refer specifically to Azure OpenAI models. While Azure OpenAI and OpenAI work closely together, they may operate on different schedules. For the most accurate information about OpenAI model retirement dates, consult OpenAI's documentation or contact OpenAI directly.
- Streamed Responses: The Azure OpenAI API supports streamed responses in the same way as the original OpenAI API: set the `stream` parameter to true in the request, and the response is returned incrementally as the model generates it, instead of in one block once processing has finished. If this feature doesn't behave as expected in your case, please share more details about your scenario and we can check with the product team.
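Whether the chunks arrive from a true server stream or are assembled on the client side, the accumulation logic is the same. Here is a minimal Python sketch; the chunk/delta field names below mirror the shape used by the OpenAI SDK's streaming responses but are mocked locally, so treat them as assumptions rather than the exact SDK types:

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Delta:
    content: Optional[str]  # incremental text; may be None for some chunks


@dataclass
class Choice:
    delta: Delta


@dataclass
class Chunk:
    choices: List[Choice]


def collect_stream(chunks: Iterable[Chunk]) -> str:
    """Concatenate the incremental delta contents into the full reply."""
    parts = []
    for chunk in chunks:
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)


# Simulated stream; with the real SDK you would iterate the response object instead.
stream = [
    Chunk([Choice(Delta("Hello"))]),
    Chunk([Choice(Delta(", "))]),
    Chunk([Choice(Delta(None))]),  # some chunks carry no content
    Chunk([Choice(Delta("world"))]),
]
print(collect_stream(stream))  # -> Hello, world
```

With a real streamed call you would loop over the response object returned by the SDK and display each piece as it arrives, rather than waiting for the full string.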
- Using Azure OpenAI with SaaS Applications: Azure OpenAI can certainly be integrated into SaaS applications. If you're running into issues with TPM (tokens-per-minute) limits or quotas, there are a few potential solutions. First, you could batch requests together to reduce the total number of calls made. Second, you could request a quota increase from Azure; increases are often granted based on your needs and usage patterns. Finally, you could put a queue or another form of load balancing in front of the service, so that your application distributes requests across multiple instances of Azure OpenAI. Review Azure's documentation and support resources for more detailed guidance.
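To make the batching and load-balancing ideas above concrete, here is a small Python sketch: a helper that chunks work into batches, and a round-robin picker that spreads requests across several Azure OpenAI instances. The endpoint URLs are hypothetical placeholders; wire in your real deployments and SDK calls where indicated:

```python
import itertools
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield successive batches of at most `size` items, reducing request count."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


class RoundRobin:
    """Cycle through multiple endpoints so no single instance absorbs all traffic."""

    def __init__(self, endpoints: Sequence[str]):
        self._cycle = itertools.cycle(endpoints)

    def next_endpoint(self) -> str:
        return next(self._cycle)


# Example: two hypothetical regional deployments (placeholder URLs).
rr = RoundRobin(["https://east.example.openai.azure.com",
                 "https://west.example.openai.azure.com"])
prompts = ["p1", "p2", "p3", "p4", "p5"]
for batch in batched(prompts, 2):
    endpoint = rr.next_endpoint()
    # Send `batch` to `endpoint` here; e.g., an embeddings call accepts a list input.
    print(endpoint, batch)
```

In production you would typically also add retry-with-backoff on 429 responses, but the two helpers above cover the core batching and distribution logic.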
I hope this helps! Please let us know if you have any further questions.
Regards,
Yutong
- Please kindly accept the answer if you found it helpful, to support the community. Thanks a lot.