Frequently asked questions about using Azure AI services for startups.
Check out the Generative AI for beginners course on GitHub. It's an 18-lesson course that introduces the main Azure OpenAI features and shows you how to build applications with them.
Use Azure AI Studio to test a variety of AI capabilities, including deploying Azure OpenAI models and applying content moderation services.
Different Azure OpenAI models are available in different regions. See the model availability table for a complete list.
The impact is minimal unless you're using the streaming feature. The model's own response time has a much greater effect on overall latency than regional differences do.
The choice between a dedicated Azure OpenAI server and the pay-as-you-go plan also has a larger impact on performance.
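If you want to measure this for your own workload, timing the first streamed token is a quick way to see how much of the delay comes from the model versus the network. Here's a minimal sketch, assuming the openai Python SDK (v1 or later) and environment variables named AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_API_KEY, and AZURE_OPENAI_DEPLOYMENT; adjust the API version to one your resource supports.

```python
# Minimal sketch: stream a chat completion and time the first token.
import os
import time
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

start = time.perf_counter()
stream = client.chat.completions.create(
    model=os.environ["AZURE_OPENAI_DEPLOYMENT"],  # deployment name, not the base model name
    messages=[{"role": "user", "content": "Summarize what Azure OpenAI offers."}],
    stream=True,  # tokens arrive incrementally, so time to first token is observable
)

first_token_latency = None
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        if first_token_latency is None:
            first_token_latency = time.perf_counter() - start
        print(chunk.choices[0].delta.content, end="", flush=True)

if first_token_latency is not None:
    print(f"\nTime to first token: {first_token_latency:.2f} seconds")
```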
See Manage Azure OpenAI Service quota to understand how quota limits work and how to manage them.
For customers using the pay-as-you-go model (most common), see the Manage Azure OpenAI Service quota page. For customers using a dedicated Azure OpenAI server, see the quota section of the related guide.
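Quota limits show up in your application as HTTP 429 responses, so it's worth handling them gracefully. The following is a minimal sketch, assuming the openai Python SDK; the endpoint, key, deployment name, and retry settings are placeholders, not recommendations.

```python
# Minimal sketch: retry with exponential backoff when a deployment hits its quota (HTTP 429).
import time
from openai import AzureOpenAI, RateLimitError

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",  # placeholder
    api_key="<your-api-key>",                                   # placeholder
    api_version="2024-02-01",
)

def complete_with_backoff(messages, max_attempts=5):
    """Retry rate-limited requests with exponential backoff instead of failing immediately."""
    for attempt in range(max_attempts):
        try:
            return client.chat.completions.create(
                model="<your-deployment-name>",  # placeholder deployment name
                messages=messages,
            )
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(2 ** attempt)  # 1s, 2s, 4s, ...
```

The SDK also performs a couple of automatic retries by default; explicit backoff like this is mainly useful when you want more control over the retry budget.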
Consider combining multiple Azure OpenAI deployments in an advanced architecture to build a system that delivers more tokens-per-minute to more users.
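A simple client-side version of this is a round-robin rotation over several deployments, each of which contributes its own tokens-per-minute quota. The sketch below assumes the openai Python SDK; the resource names, keys, and deployment names are placeholders.

```python
# Minimal sketch: rotate requests across several Azure OpenAI deployments
# (possibly in different regions) to raise the aggregate tokens-per-minute available.
import itertools
from openai import AzureOpenAI

# One client per deployment; each deployment has its own quota.
backends = [
    {
        "client": AzureOpenAI(
            azure_endpoint="https://<resource-eastus>.openai.azure.com",  # placeholder
            api_key="<key-1>", api_version="2024-02-01",
        ),
        "deployment": "<deployment-eastus>",  # placeholder
    },
    {
        "client": AzureOpenAI(
            azure_endpoint="https://<resource-westus>.openai.azure.com",  # placeholder
            api_key="<key-2>", api_version="2024-02-01",
        ),
        "deployment": "<deployment-westus>",  # placeholder
    },
]
rotation = itertools.cycle(backends)

def complete(messages):
    """Send each request to the next deployment in the rotation."""
    backend = next(rotation)
    return backend["client"].chat.completions.create(
        model=backend["deployment"],
        messages=messages,
    )
```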
You should consider switching from pay-as-you-go to provisioned throughput when you have well-defined, predictable throughput requirements. Typically, this is the case when the application is ready for production or is already deployed in production and you understand the expected traffic. This lets you accurately forecast the required capacity and avoid unexpected billing.
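A rough forecast only needs your expected request rate and average token counts per request. The numbers in this sketch are illustrative assumptions, not guidance:

```python
# Back-of-the-envelope capacity forecast; the traffic figures are illustrative assumptions.
requests_per_minute = 300        # expected peak requests per minute
avg_prompt_tokens = 1_500        # average tokens sent per request
avg_completion_tokens = 500      # average tokens generated per request

tokens_per_minute = requests_per_minute * (avg_prompt_tokens + avg_completion_tokens)
print(f"Estimated peak throughput: {tokens_per_minute:,} tokens per minute")
# -> Estimated peak throughput: 600,000 tokens per minute
```

If an estimate like this is stable and predictable, provisioned throughput is worth evaluating; if traffic is still spiky or unknown, pay-as-you-go remains the simpler choice.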
Create a load balancer for your application.
See the Load balancing sample if you're using the pay-as-you-go model. If you're using a dedicated Azure OpenAI server, see the PTU guide for information on load balancing.
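As a lightweight stopgap before adopting one of those samples, you can approximate a load balancer in the client by failing over to a secondary deployment when the primary is throttled. This is a simplified sketch, assuming the openai Python SDK; the endpoints, keys, and deployment names are placeholders.

```python
# Minimal sketch: try the primary deployment first and fail over to a secondary
# deployment when the primary returns HTTP 429.
from openai import AzureOpenAI, RateLimitError

primary = AzureOpenAI(
    azure_endpoint="https://<primary-resource>.openai.azure.com",  # placeholder
    api_key="<key-1>", api_version="2024-02-01",
)
secondary = AzureOpenAI(
    azure_endpoint="https://<secondary-resource>.openai.azure.com",  # placeholder
    api_key="<key-2>", api_version="2024-02-01",
)

def complete_with_failover(messages):
    """Route around a throttled deployment instead of surfacing the error."""
    try:
        return primary.chat.completions.create(
            model="<primary-deployment>", messages=messages,  # placeholder
        )
    except RateLimitError:
        return secondary.chat.completions.create(
            model="<secondary-deployment>", messages=messages,  # placeholder
        )
```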
Create an online deployment using prompt flow in Azure AI Studio. Then, test it out by inputting values in the form editor or JSON editor.
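Once the deployment is up, you can also call it from code the same way your application will. The sketch below assumes the Python requests library; the scoring URL, key, and input field name are placeholders that you'd replace with the values shown for your deployment in Azure AI Studio.

```python
# Minimal sketch: call a prompt flow online deployment over REST.
import requests

SCORING_URL = "https://<your-endpoint>.<region>.inference.ml.azure.com/score"  # placeholder
API_KEY = "<your-endpoint-key>"                                                # placeholder

# The payload keys must match the input names defined in your prompt flow.
payload = {"question": "What can this flow answer?"}  # placeholder input name and value

response = requests.post(
    SCORING_URL,
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json=payload,
    timeout=30,
)
response.raise_for_status()
print(response.json())
```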
See the Evaluation and monitoring metrics guide for information on tracking risk and safety metrics as well as a number of response quality metrics.
Use the monitoring feature of Azure OpenAI Studio. It provides dashboards that track the performance metrics of your models over time.
See the Azure OpenAI chat reference architecture for best practices for deploying a standard chat application.
See the Artificial Intelligence and Machine Learning tech community forum.
To learn more, see Microsoft for Startups.