This article provides a summary of the latest releases and major documentation updates for Azure OpenAI Service.
In addition to the deployment-level content filtering configuration, we now also provide a request header that allows you to specify your custom configuration at request time for every API call. For more information, see Use content filters (preview).
The latest GPT model that excels at diverse text and image tasks is now available on Azure OpenAI.
For access to `gpt-4.5-preview`, registration is required, and access will be granted based on Microsoft's eligibility criteria. Customers who have access to other limited access models will still need to request access for this model.
Request access: GPT-4.5-preview limited access model application
For more information on model capabilities and region availability, see the models documentation.
Stored completions allow you to capture the conversation history from chat completions sessions to use as datasets for evaluations and fine-tuning.
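As a minimal sketch, the documented `store` flag on a chat completions request is what opts an exchange into stored completions; the model name and `metadata` tags below are illustrative values, not prescribed ones:

```python
import json

# Chat completions request body with stored completions enabled.
payload = {
    "model": "gpt-4o",  # any supported chat model deployment name
    "store": True,      # persist this exchange as a stored completion
    "metadata": {"app": "demo"},  # optional tags for filtering stored data later
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is Azure OpenAI?"},
    ],
}

body = json.dumps(payload)
```

Stored exchanges can then be reviewed in the portal and exported as evaluation or fine-tuning datasets.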
`o3-mini` is now available for global standard and data zone standard deployments for registered limited access customers.
For more information, see our reasoning model guide.
The `gpt-4o-mini-audio-preview` (2024-12-17) model is the latest audio completions model. For more information, see the audio generation quickstart.
The `gpt-4o-mini-realtime-preview` (2024-12-17) model is the latest real-time audio model. The real-time models use the same underlying GPT-4o audio model as the completions API, but are optimized for low-latency, real-time audio interactions. For more information, see the real-time audio quickstart.
For more information about available models, see the models and versions documentation.
`o3-mini` (2025-01-31) is the latest reasoning model, offering enhanced reasoning abilities. For more information, see our reasoning model guide.
The `gpt-4o-audio-preview` model is now available for global deployments in the East US 2 and Sweden Central regions. Use the `gpt-4o-audio-preview` model for audio generation.
The `gpt-4o-audio-preview` model introduces the audio modality into the existing `/chat/completions` API. The audio model expands the potential for AI applications in text and voice-based interactions and audio analysis. Modalities supported in the `gpt-4o-audio-preview` model include text, audio, and text + audio. For more information, see the audio generation quickstart.
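A request that asks for both text and spoken output can be sketched as below; the `modalities` and `audio` fields follow the documented chat completions audio surface, while the specific voice and format values are assumptions you should check against the quickstart:

```python
import json

# /chat/completions request body using the audio modality.
request_body = {
    "model": "gpt-4o-audio-preview",
    "modalities": ["text", "audio"],               # request text + audio output
    "audio": {"voice": "alloy", "format": "wav"},  # assumed voice/format values
    "messages": [
        {"role": "user", "content": "Tell me a short joke."},
    ],
}

encoded = json.dumps(request_body)
```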
Note
The Realtime API uses the same underlying GPT-4o audio model as the completions API, but is optimized for low-latency, real-time audio interactions.
The `gpt-4o-realtime-preview` model version 2024-12-17 is available for global deployments in the East US 2 and Sweden Central regions. Use the `gpt-4o-realtime-preview` version 2024-12-17 model instead of the `gpt-4o-realtime-preview` version 2024-10-01-preview model for real-time audio interactions.
- The `gpt-4o-realtime-preview` models now support the following voices: "alloy", "ash", "ballad", "coral", "echo", "sage", "shimmer", and "verse".
- The rate limits for each `gpt-4o-realtime-preview` model deployment are 100K TPM and 1K RPM. During the preview, the Azure AI Foundry portal and APIs might inaccurately show different rate limits. Even if you try to set a different rate limit, the actual rate limit will be 100K TPM and 1K RPM.

For more information, see the GPT-4o real-time audio quickstart and the how-to guide.
The latest `o1` model is now available for API access and model deployment. Registration is required, and access will be granted based on Microsoft's eligibility criteria. Customers who previously applied and received access to `o1-preview` don't need to reapply, as they are automatically on the wait-list for the latest model.
Request access: limited access model application
To learn more about the advanced `o1` series models, see Getting started with o1 series reasoning models.
| Model | Region |
|---|---|
| `o1` (Version: 2024-12-17) | East US2 (Global Standard), Sweden Central (Global Standard) |
Direct preference optimization (DPO) is a new alignment technique for large language models, designed to adjust model weights based on human preferences. Unlike reinforcement learning from human feedback (RLHF), DPO does not require fitting a reward model and uses simpler data (binary preferences) for training. This method is computationally lighter and faster, making it equally effective at alignment while being more efficient. DPO is especially useful in scenarios where subjective elements like tone, style, or specific content preferences are important. We're excited to announce the public preview of DPO in Azure OpenAI Service, starting with the `gpt-4o-2024-08-06` model.
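A DPO training file is JSONL where each line pairs one prompt with a preferred and a non-preferred response. The sketch below follows the preference-pair field names OpenAI publishes for DPO fine-tuning (`input`, `preferred_output`, `non_preferred_output`); verify them against the current fine-tuning documentation, and treat the message contents as placeholders:

```python
import json

# One JSONL line of DPO training data: a prompt plus a binary preference
# between two candidate assistant responses.
example = {
    "input": {
        "messages": [
            {"role": "user", "content": "Summarize this release note in one line."}
        ]
    },
    "preferred_output": [
        {"role": "assistant", "content": "A concise, neutral one-line summary."}
    ],
    "non_preferred_output": [
        {"role": "assistant", "content": "A rambling, off-topic reply."}
    ],
}

jsonl_line = json.dumps(example)
```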
For fine-tuning model region availability, see the models page.
Stored completions allow you to capture the conversation history from chat completions sessions to use as datasets for evaluations and fine-tuning.
`gpt-4o-2024-11-20` is now available for global standard deployment in:
Data zone provisioned deployments are available in the same Azure OpenAI resource as all other Azure OpenAI deployment types, but allow you to leverage Azure global infrastructure to dynamically route traffic to the data center within the Microsoft-defined data zone with the best availability for each request. Data zone provisioned deployments provide reserved model processing capacity for high and predictable throughput using Azure infrastructure within Microsoft-specified data zones. Data zone provisioned deployments are supported on the `gpt-4o-2024-08-06`, `gpt-4o-2024-05-13`, and `gpt-4o-mini-2024-07-18` models.
For more information, see the deployment types guide.
Vision fine-tuning with GPT-4o (2024-08-06) is now Generally Available (GA).
Vision fine-tuning allows you to add images to your JSONL training data. Just as you can send one or many image inputs to chat completions, you can include those same message types within your training data. Images can be provided either as URLs or as base64 encoded images.
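One training example with an image can be sketched as below; it reuses the chat completions message format the paragraph describes, and the URL is a placeholder (a base64 `data:` URL would also work):

```python
import json

# One JSONL vision fine-tuning example mixing text and image content parts.
training_example = {
    "messages": [
        {"role": "system", "content": "You identify landmarks in photos."},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What landmark is shown here?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        },
        {"role": "assistant", "content": "This is the Eiffel Tower."},
    ]
}

line = json.dumps(training_example)
```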
For fine-tuning model region availability, see the models page.
We are introducing new forms of abuse monitoring that leverage LLMs to improve the efficiency of detecting potentially abusive use of the Azure OpenAI Service and to enable abuse monitoring without the need for human review of prompts and completions. To learn more, see Abuse monitoring.
Prompts and completions that are flagged through content classification and/or identified to be part of a potentially abusive pattern of use are subjected to an additional review process to help confirm the system's analysis and inform actioning decisions. Our abuse monitoring systems have been expanded to enable review by LLM by default and by humans when necessary and appropriate.
Data zone standard deployments are available in the same Azure OpenAI resource as all other Azure OpenAI deployment types, but allow you to leverage Azure global infrastructure to dynamically route traffic to the data center within the Microsoft-defined data zone with the best availability for each request. Data zone standard provides higher default quotas than our Azure geography-based deployment types. Data zone standard deployments are supported on the `gpt-4o-2024-08-06`, `gpt-4o-2024-05-13`, and `gpt-4o-mini-2024-07-18` models.
For more information, see the deployment types guide.
Azure OpenAI global batch is now generally available.
The Azure OpenAI Batch API is designed to handle large-scale and high-volume processing tasks efficiently. Process asynchronous groups of requests with separate quota, with a 24-hour target turnaround, at 50% less cost than global standard. With batch processing, rather than sending one request at a time, you send a large number of requests in a single file. Global batch requests have a separate enqueued token quota, avoiding any disruption of your online workloads.
Key use cases include:
Large-Scale Data Processing: Quickly analyze extensive datasets in parallel.
Content Generation: Create large volumes of text, such as product descriptions or articles.
Document Review and Summarization: Automate the review and summarization of lengthy documents.
Customer Support Automation: Handle numerous queries simultaneously for faster responses.
Data Extraction and Analysis: Extract and analyze information from vast amounts of unstructured data.
Natural Language Processing (NLP) Tasks: Perform tasks like sentiment analysis or translation on large datasets.
Marketing and Personalization: Generate personalized content and recommendations at scale.
For more information, see getting started with global batch deployments.
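Each line of a batch input file is a self-contained request; the sketch below mirrors the batch input format from the getting-started guide, with an illustrative `custom_id` and deployment name:

```python
import json

# One line of a global batch input (.jsonl) file. The custom_id lets you
# match each result in the output file back to its request.
batch_request = {
    "custom_id": "task-001",                 # your own correlation ID
    "method": "POST",
    "url": "/chat/completions",
    "body": {
        "model": "gpt-4o-batch",             # assumed batch deployment name
        "messages": [
            {"role": "user", "content": "Classify the sentiment: 'Great service!'"}
        ],
    },
}

line = json.dumps(batch_request)
```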
The `o1-preview` and `o1-mini` models are now available for API access and model deployment. Registration is required, and access will be granted based on Microsoft's eligibility criteria.
Request access: limited access model application
Customers who were already approved and have access to the model through the early access playground don't need to apply again; you'll automatically be granted API access. Once access has been granted, you'll need to create a deployment for each model.
API support:
Support for the o1 series models was added in API version `2024-09-01-preview`.

The `max_tokens` parameter has been deprecated and replaced with the new `max_completion_tokens` parameter. o1 series models only work with the `max_completion_tokens` parameter.
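The parameter change can be sketched as follows; the deployment name is illustrative:

```python
import json

# o1-series chat completions request body: use max_completion_tokens,
# not the deprecated max_tokens parameter.
request_body = {
    "model": "o1-preview",
    "messages": [
        {"role": "user", "content": "Prove that the square root of 2 is irrational."}
    ],
    "max_completion_tokens": 2000,  # replaces the deprecated max_tokens
}

assert "max_tokens" not in request_body  # o1 models reject the old parameter
serialized = json.dumps(request_body)
```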
Region availability:
Models are available for standard and global standard deployment in East US2 and Sweden Central for approved customers.
Azure OpenAI GPT-4o audio is part of the GPT-4o model family that supports low-latency, "speech in, speech out" conversational interactions. The GPT-4o audio `realtime` API is designed to handle real-time, low-latency conversational interactions, making it a great fit for use cases involving live interactions between a user and a model, such as customer support agents, voice assistants, and real-time translators.
The `gpt-4o-realtime-preview` model is available for global deployments in the East US 2 and Sweden Central regions.
For more information, see the GPT-4o real-time audio quickstart.
Global batch now supports GPT-4o (2024-08-06). See the global batch getting started guide for more information.
As of September 19, 2024, when you go to Azure OpenAI Studio, you no longer see the legacy Azure OpenAI Studio by default. If needed, you can still go back to the previous experience by using the Switch to the old look toggle in the top bar of the UI for the next couple of weeks. If you switch back to the legacy experience, it helps if you fill out the feedback form to let us know why. We're actively monitoring this feedback to improve the new experience.
GPT-4o 2024-08-06 is now available for provisioned deployments in East US, East US 2, North Central US, and Sweden Central. It's also available for global provisioned deployments.
For the latest information on model availability, see the models page.
Global deployments are available in the same Azure OpenAI resources as non-global deployment types, but allow you to leverage Azure's global infrastructure to dynamically route traffic to the data center with the best availability for each request. Global provisioned deployments provide reserved model processing capacity for high and predictable throughput using Azure global infrastructure. Global provisioned deployments are supported on the `gpt-4o-2024-08-06` and `gpt-4o-mini-2024-07-18` models.
For more information, see the deployment types guide.
The Azure OpenAI `o1-preview` and `o1-mini` models are designed to tackle reasoning and problem-solving tasks with increased focus and capability. These models spend more time processing and understanding the user's request, making them exceptionally strong in areas like science, coding, and math compared to previous iterations.
- `o1-preview`: `o1-preview` is the more capable of the `o1` series models.
- `o1-mini`: `o1-mini` is the faster and cheaper of the `o1` series models.

Model version: `2024-09-12`
Request access: limited access model application
The `o1` series models are currently in preview and don't include some features available in other models, such as image understanding and structured outputs, which are available in the latest GPT-4o model. For many tasks, the generally available GPT-4o models might still be more suitable.
OpenAI has incorporated additional safety measures into the `o1` models, including new techniques to help the models refuse unsafe requests. These advancements make the `o1` series some of the most robust models available.
The `o1-preview` and `o1-mini` models are available in the East US2 region for limited access through the Azure AI Foundry portal early access playground. Data processing for the `o1` models might occur in a different region than where they are available for use.
To try the `o1-preview` and `o1-mini` models in the early access playground, registration is required, and access will be granted based on Microsoft's eligibility criteria.
Request access: limited access model application
Once access has been granted, you need to:

- Use the `eastus2` region. If you don't have an Azure OpenAI resource in this region, you'll need to create one.
- Once the `eastus2` Azure OpenAI resource is selected, in the upper left-hand panel under Playgrounds, select Early access playground (preview).

GPT-4o mini is now available for provisioned deployments in Canada East, East US, East US2, North Central US, and Sweden Central.
For the latest information on model availability, see the models page.
GPT-4o fine-tuning is now available for Azure OpenAI in public preview in North Central US and Sweden Central.
For more information, see our blog post.
API version `2024-07-01-preview` is the latest data plane authoring & inference API release. It replaces API version `2024-05-01-preview` and adds support for:

- The `max_num_results` parameter, which sets the maximum number of results the file search tool should output.

For more information, see our reference documentation.
On August 6, 2024, OpenAI announced the latest version of their flagship GPT-4o model, version `2024-08-06`. GPT-4o `2024-08-06` has all the capabilities of the previous version as well as:

Azure customers can test out GPT-4o `2024-08-06` today in the new Azure AI Foundry early access playground (preview).
Unlike the previous early access playground, the Azure AI Foundry portal early access playground (preview) doesn't require you to have a resource in a specific region.
Note
Prompts and completions made through the early access playground (preview) might be processed in any Azure OpenAI region, and are currently subject to a 10 request per minute per Azure subscription limit. This limit might change in the future.
Azure OpenAI Service abuse monitoring is enabled for all early access playground users even if approved for modification; default content filters are enabled and cannot be modified.
To test out GPT-4o `2024-08-06`, sign in to the Azure AI early access playground (preview) using this link.
The Azure OpenAI Batch API is designed to handle large-scale and high-volume processing tasks efficiently. Process asynchronous groups of requests with separate quota, with a 24-hour target turnaround, at 50% less cost than global standard. With batch processing, rather than sending one request at a time, you send a large number of requests in a single file. Global batch requests have a separate enqueued token quota, avoiding any disruption of your online workloads.
Key use cases include:
Large-Scale Data Processing: Quickly analyze extensive datasets in parallel.
Content Generation: Create large volumes of text, such as product descriptions or articles.
Document Review and Summarization: Automate the review and summarization of lengthy documents.
Customer Support Automation: Handle numerous queries simultaneously for faster responses.
Data Extraction and Analysis: Extract and analyze information from vast amounts of unstructured data.
Natural Language Processing (NLP) Tasks: Perform tasks like sentiment analysis or translation on large datasets.
Marketing and Personalization: Generate personalized content and recommendations at scale.
For more information, see getting started with global batch deployments.
GPT-4o mini fine-tuning is now available in public preview in Sweden Central and in North Central US.
The file search tool for Assistants now has additional charges for usage. See the pricing page for more information.
GPT-4o mini is the latest Azure OpenAI model first announced on July 18, 2024:
"GPT-4o mini allows customers to deliver stunning applications at a lower cost with blazing speed. GPT-4o mini is significantly smarter than GPT-3.5 Turbo—scoring 82% on Measuring Massive Multitask Language Understanding (MMLU) compared to 70%—and is more than 60% cheaper.1 The model delivers an expanded 128K context window and integrates the improved multilingual capabilities of GPT-4o, bringing greater quality to languages from around the world."
The model is currently available for both standard and global standard deployment in the East US region.
For information on model quota, consult the quota and limits page and for the latest info on model availability refer to the models page.
The new default content filtering policy `DefaultV2` delivers the latest safety and security mitigations for the GPT model series (text), including:

While there are no changes to content filters for existing resources and deployments (default or custom content filtering configurations remain unchanged), new resources and GPT deployments will automatically inherit the new content filtering policy `DefaultV2`. Customers have the option to switch between safety defaults and create custom content filtering configurations.
Refer to our Default safety policy documentation for more information.
API version `2024-06-01` is the latest GA data plane inference API release. It replaces API version `2024-02-01` and adds support for:

- `encoding_format` & `dimensions` embeddings parameters.
- `logprobs` & `top_logprobs` chat completions parameters.

Refer to our data plane inference reference documentation for more information.
GPT-4o is now available for global standard deployments in:
For information on global standard quota, consult the quota and limits page.
- `gpt-35-turbo` 0301 retirement date: no earlier than October 1, 2024.
- `gpt-35-turbo` & `gpt-35-turbo-16k` 0613 retirement date: October 1, 2024.
- `gpt-4` & `gpt-4-32k` 0314 deprecation date: October 1, 2024; retirement date: June 6, 2025.

Refer to our model retirement guide for the latest information on model deprecation and retirement.
For the latest information on model availability, see the models page.
Threads and Files in Assistants now supports CMK in the following region:
`gpt-4o` Version: `2024-05-13` is available for both standard and provisioned deployments. Provisioned and standard model deployments accept both text and image/vision inference requests.
For information on model regional availability, consult the model matrix for provisioned deployments.
A refresh of the Assistants API is now publicly available. It contains the following updates:

- A `tool_choice` parameter for forcing the Assistant to use a specified tool. You can now create messages with the assistant role to create custom conversation histories in Threads.
- `temperature`, `top_p`, and `response_format` parameters.
- `GPTAssistantAgent`, a new experimental agent that lets you seamlessly add Assistants into AutoGen-based multi-agent workflows. This enables multiple Azure OpenAI assistants that could be task- or domain-specialized to collaborate and tackle complex tasks.
- Support for `gpt-3.5-turbo-0125` models in the following regions:
For more information, see the blog post about assistants.
GPT-4o ("o" is for "omni") is the latest model from OpenAI, launched on May 13, 2024.
For information on model regional availability, see the models page.
Global deployments are available in the same Azure OpenAI resources as non-global offers but allow you to leverage Azure's global infrastructure to dynamically route traffic to the data center with best availability for each request. Global standard provides the highest default quota for new models and eliminates the need to load balance across multiple resources.
For more information, see the deployment types guide.
- `2024-05-01-preview` API release.
- Create custom content filters for your DALL-E 2 and 3, GPT-4 Turbo with Vision GA (`turbo-2024-04-09`), and GPT-4o deployments. For more information, see Content filtering.
- Running filters asynchronously for improved latency in streaming scenarios is now available for all Azure OpenAI customers. For more information, see Content filtering.
- Prompt Shields protect applications powered by Azure OpenAI models from two types of attacks: direct (jailbreak) and indirect attacks. Indirect attacks (also known as indirect prompt attacks or cross-domain prompt injection attacks) are attacks on systems powered by generative AI models that might occur when an application processes information that wasn't directly authored by either the developer of the application or the user. For more information, see Content filtering.
The latest GA release of GPT-4 Turbo is:

- `gpt-4` Version: `turbo-2024-04-09`

This is the replacement for the following preview models:

- `gpt-4` Version: `1106-Preview`
- `gpt-4` Version: `0125-Preview`
- `gpt-4` Version: `vision-preview`
- The `0409` turbo model supports JSON mode and function calling for all inference requests.
- `turbo-2024-04-09` currently doesn't support the use of JSON mode and function calling when making inference requests with image (vision) input. Text-based input requests (requests without `image_url` and inline images) do support JSON mode and function calling.
- Vision enhancements aren't supported by `gpt-4` Version: `turbo-2024-04-09`. This includes Optical Character Recognition (OCR), object grounding, video prompts, and improved handling of your data with images.

Important
Vision enhancements preview features, including Optical Character Recognition (OCR), object grounding, and video prompts, will be retired and no longer available once `gpt-4` Version: `vision-preview` is upgraded to `turbo-2024-04-09`. If you are currently relying on any of these preview features, this automatic model upgrade will be a breaking change.
`gpt-4` Version: `turbo-2024-04-09` is available for both standard and provisioned deployments. Currently the provisioned version of this model doesn't support image/vision inference requests. Provisioned deployments of this model only accept text input. Standard model deployments accept both text and image/vision inference requests.

To deploy the GA model from the Azure AI Foundry portal, select GPT-4 and then choose the `turbo-2024-04-09` version from the dropdown menu. The default quota for the `gpt-4-turbo-2024-04-09` model will be the same as the current quota for GPT-4-Turbo. See the regional quota limits.
Fine-tuning is now available with support for:

- `gpt-35-turbo` (0613)
- `gpt-35-turbo` (1106)
- `gpt-35-turbo` (0125)
- `babbage-002`
- `davinci-002`

Check the models page for the latest information on model availability and fine-tuning support in each region.
Fine-tuning now supports multi-turn chat training examples.
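A multi-turn training example puts a whole conversation on one JSONL line. The optional `weight` field on assistant messages (shown below) follows the published chat fine-tuning format, where a weight of 0 excludes a turn from the training loss; treat it as an assumption to verify against the fine-tuning docs:

```python
import json

# One JSONL line containing a multi-turn chat fine-tuning example.
multi_turn = {
    "messages": [
        {"role": "system", "content": "You are a terse assistant."},
        {"role": "user", "content": "What's the capital of France?"},
        {"role": "assistant", "content": "Paris.", "weight": 1},
        {"role": "user", "content": "And of Spain?"},
        {"role": "assistant", "content": "Madrid.", "weight": 1},
    ]
}

line = json.dumps(multi_turn)
```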
You can now use the GPT-4 (0125) model in available regions with Azure OpenAI On Your Data.
Azure OpenAI Studio now provides a Risks & Safety dashboard for each of your deployments that uses a content filter configuration. Use it to check the results of the filtering activity. Then you can adjust your filter configuration to better serve your business needs and meet Responsible AI principles.
This is the latest GA API release and is the replacement for the previous `2023-05-15` GA release. This release adds support for the latest Azure OpenAI GA features like Whisper, DALL-E 3, fine-tuning, on your data, and more.
Features that are in preview such as Assistants, text to speech (TTS), and some of the "on your data" datasources, require a preview API version. For more information, check out our API version lifecycle guide.
The Whisper speech to text model is now GA for both REST and Python. Client library SDKs are currently still in public preview.
Try out Whisper by following a quickstart.
DALL-E 3 image generation model is now GA for both REST and Python. Client library SDKs are currently still in public preview.
Try out DALL-E 3 by following a quickstart.
You can now access DALL-E 3 with an Azure OpenAI resource in the `East US` or `AustraliaEast` Azure region, in addition to `SwedenCentral`.
We have added a page to track model deprecations and retirements in Azure OpenAI Service. This page provides information about the models that are currently available, deprecated, and retired.
`2024-03-01-preview` has all the same functionality as `2024-02-15-preview` and adds two new parameters for embeddings:

- `encoding_format` allows you to specify the format to generate embeddings in: `float` or `base64`. The default is `float`.
- `dimensions` allows you to set the number of output embeddings. This parameter is only supported with the new third generation embeddings models: `text-embedding-3-large` and `text-embedding-3-small`. Typically, larger embeddings are more expensive from a compute, memory, and storage perspective. Being able to adjust the number of dimensions allows more control over overall cost and performance. The `dimensions` parameter isn't supported in all versions of the OpenAI 1.x Python library; to take advantage of this parameter, we recommend upgrading to the latest version: `pip install openai --upgrade`.
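An embeddings request using both new parameters can be sketched as below; the input text and the dimension count of 256 are illustrative:

```python
import json

# Embeddings request body using the new encoding_format and dimensions
# parameters (dimensions is only valid on text-embedding-3-* models).
embedding_request = {
    "model": "text-embedding-3-large",
    "input": "Azure OpenAI release notes",
    "encoding_format": "float",  # or "base64"
    "dimensions": 256,           # shorten the output vector to cut cost/storage
}

serialized = json.dumps(embedding_request)
```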
.If you're currently using a preview API version to take advantage of the latest features, we recommend consulting the API version lifecycle article to track how long your current API version will be supported.
The deployment upgrade of `gpt-4` 1106-Preview to `gpt-4` 0125-Preview scheduled for March 8, 2024 is no longer taking place. Deployments of `gpt-4` versions 1106-Preview and 0125-Preview set to "Auto-update to default" and "Upgrade when expired" will start to be upgraded after a stable version of the model is released.
For more information on the upgrade process refer to the models page.
This model has various improvements, including higher accuracy at responding in requested formats and a fix for a bug that caused a text encoding issue for non-English language function calls.
For information on model regional availability and upgrades refer to the models page.
- `text-embedding-3-large`
- `text-embedding-3-small`
In testing, OpenAI reports both the large and small third generation embeddings models offer better average multi-language retrieval performance with the MIRACL benchmark while still maintaining better performance for English tasks with the MTEB benchmark than the second generation text-embedding-ada-002 model.
For information on model regional availability and upgrades refer to the models page.
To simplify migration between different versions of the GPT-3.5-Turbo models (including 16k), we'll be consolidating all GPT-3.5-Turbo quota into a single quota value.
Any customers who have increased quota approved will have combined total quota that reflects the previous increases.
Any customer whose current total usage across model versions is less than the default will get a new combined total quota by default.
The `gpt-4` model version `0125-preview` is now available on Azure OpenAI Service in the East US, North Central US, and South Central US regions. Customers with deployments of `gpt-4` version `1106-preview` will be automatically upgraded to `0125-preview` in the coming weeks.
For information on model regional availability and upgrades refer to the models page.
Azure OpenAI now supports the API that powers OpenAI's GPTs. Azure OpenAI Assistants (Preview) allows you to create AI assistants tailored to your needs through custom instructions and advanced tools like code interpreter and custom functions. To learn more, see:
Azure OpenAI Service now supports text to speech APIs with OpenAI's voices. Get AI-generated speech from the text you provide. To learn more, see the overview guide and try the quickstart.
Note
Azure AI Speech also supports OpenAI text to speech voices. To learn more, see OpenAI text to speech voices via Azure OpenAI Service or via Azure AI Speech guide.
You can now use Azure OpenAI On Your Data in the following Azure region:
GPT-4 Turbo with Vision on Azure OpenAI service is now in public preview. GPT-4 Turbo with Vision is a large multimodal model (LMM) developed by OpenAI that can analyze images and provide textual responses to questions about them. It incorporates both natural language processing and visual understanding. With enhanced mode, you can use the Azure AI Vision features to generate additional insights from the images.
`SwitzerlandNorth`, `SwedenCentral`, `WestUS`, and `AustraliaEast`
Both models are the latest release from OpenAI with improved instruction following, JSON mode, reproducible output, and parallel function calling.
GPT-4 Turbo Preview has a max context window of 128,000 tokens and can generate 4,096 output tokens. It has the latest training data with knowledge up to April 2023. This model is in preview and isn't recommended for production use. All deployments of this preview model will be automatically updated in place once the stable release becomes available.
GPT-3.5-Turbo-1106 has a max context window of 16,385 tokens and can generate 4,096 output tokens.
For information on model regional availability consult the models page.
The models have their own unique per region quota allocations.
DALL-E 3 is the latest image generation model from OpenAI. It features enhanced image quality, more complex scenes, and improved performance when rendering text in images. It also comes with more aspect ratio options. DALL-E 3 is available through Azure OpenAI Studio and through the REST API. Your Azure OpenAI resource must be in the `SwedenCentral` Azure region.
DALL-E 3 includes built-in prompt rewriting to enhance images, reduce bias, and increase natural variation.
Try out DALL-E 3 by following a quickstart.
Expanded customer configurability: All Azure OpenAI customers can now configure all severity levels (low, medium, high) for the categories hate, violence, sexual and self-harm, including filtering only high severity content. Configure content filters
Content Credentials in all DALL-E models: AI-generated images from all DALL-E models now include a digital credential that discloses the content as AI-generated. Applications that display image assets can leverage the open source Content Authenticity Initiative SDK to display credentials in their AI generated images. Content Credentials in Azure OpenAI
New RAI models
Blocklists: Customers can now quickly customize content filter behavior for prompts and completions further by creating a custom blocklist in their filters. The custom blocklist allows the filter to take action on a customized list of patterns, such as specific terms or regex patterns. In addition to custom blocklists, we provide a Microsoft profanity blocklist (English). Use blocklists
`gpt-35-turbo-0613` is now available for fine-tuning.

`babbage-002` and `davinci-002` are now available for fine-tuning. These models replace the legacy ada, babbage, curie, and davinci base models that were previously available for fine-tuning.

Fine-tuning availability is limited to certain regions. Check the models page for the latest information on model availability in each region.
Fine-tuned models have different quota limits than regular models.
GPT-4 and GPT-4-32k are now available to all Azure OpenAI Service customers. Customers no longer need to apply for the waitlist to use GPT-4 and GPT-4-32k (the Limited Access registration requirements continue to apply for all Azure OpenAI models). Availability might vary by region. Check the models page for the latest information on model availability in each region.
Azure OpenAI Service now supports the GPT-3.5 Turbo Instruct model. This model has performance comparable to text-davinci-003 and is available to use with the Completions API. Check the models page for the latest information on model availability in each region.
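As a hedged sketch of calling such a model through the Completions API (the deployment name, endpoint variables, and API version below are illustrative placeholders, not values from this article):

```python
import os

def build_completion_request(deployment: str, prompt: str, max_tokens: int = 64) -> dict:
    """Assemble a Completions API request body; all values are placeholders."""
    return {"model": deployment, "prompt": prompt, "max_tokens": max_tokens}

payload = build_completion_request("gpt-35-turbo-instruct", "Say hello.")

# Only call the service when Azure OpenAI credentials are configured.
if os.getenv("AZURE_OPENAI_API_KEY") and os.getenv("AZURE_OPENAI_ENDPOINT"):
    from openai import AzureOpenAI  # requires the openai package
    client = AzureOpenAI(
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        api_version="2024-02-01",  # illustrative GA API version
    )
    completion = client.completions.create(**payload)
    print(completion.choices[0].text)
```

Note that `model` here takes your Azure OpenAI deployment name rather than a raw model ID.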
Azure OpenAI Service now supports speech to text APIs powered by OpenAI's Whisper model. Get AI-generated text based on the speech audio you provide. To learn more, check out the quickstart.
Note
Azure AI Speech also supports OpenAI's Whisper model via the batch transcription API. To learn more, check out the Create a batch transcription guide. Check out What is the Whisper model? to learn more about when to use Azure AI Speech vs. Azure OpenAI Service.
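A minimal sketch of a Whisper transcription call with the openai Python SDK, assuming a deployment named `whisper` and a local audio file (both placeholders):

```python
import os

def build_transcription_request(deployment: str, audio_path: str) -> dict:
    """Describe a speech-to-text request; deployment name and path are placeholders."""
    return {"model": deployment, "audio_path": audio_path}

request = build_transcription_request("whisper", "speech.wav")

# Only call the service when credentials and the audio file exist.
if os.getenv("AZURE_OPENAI_API_KEY") and os.path.exists(request["audio_path"]):
    from openai import AzureOpenAI  # requires the openai package
    client = AzureOpenAI(
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        api_version="2024-02-01",  # illustrative API version
    )
    with open(request["audio_path"], "rb") as audio:
        result = client.audio.transcriptions.create(model=request["model"], file=audio)
    print(result.text)  # AI-generated transcript of the audio
```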
New GA API release: 2023-05-15. If you're currently using the 2023-03-15-preview API, we recommend migrating to the GA 2023-05-15 API. If you're currently using API version 2022-12-01, this API remains GA, but it doesn't include the latest Chat Completion capabilities.
Important
Using the current versions of the GPT-35-Turbo models with the completion endpoint remains in preview.
DALL-E 2 public preview. Azure OpenAI Service now supports image generation APIs powered by OpenAI's DALL-E 2 model. Get AI-generated images based on the descriptive text you provide. To learn more, check out the quickstart.
Inactive deployments of customized models will now be deleted after 15 days; models will remain available for redeployment. If a customized (fine-tuned) model is deployed for more than fifteen (15) days during which no completions or chat completions calls are made to it, the deployment will automatically be deleted (and no further hosting charges will be incurred for that deployment). The underlying customized model will remain available and can be redeployed at any time. To learn more, check out the how-to article.
GPT-4 series models are now available in preview on Azure OpenAI. To request access, existing Azure OpenAI customers can apply by filling out this form. These models are currently available in the East US and South Central US regions.
New Chat Completion API for GPT-35-Turbo and GPT-4 models released in preview on 3/21. To learn more, check out the updated quickstarts and how-to article.
GPT-35-Turbo preview. To learn more, check out the how-to article.
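A hedged sketch of the messages-based Chat Completion format these models introduced (deployment name, endpoint variables, and API version are illustrative placeholders):

```python
import os

def build_chat_request(deployment: str, user_message: str) -> dict:
    """Assemble a Chat Completion request body using the messages format."""
    return {
        "model": deployment,  # an Azure OpenAI *deployment* name, not a raw model ID
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
    }

payload = build_chat_request("gpt-35-turbo", "What is Azure OpenAI?")

# Only call the service when credentials are configured.
if os.getenv("AZURE_OPENAI_API_KEY") and os.getenv("AZURE_OPENAI_ENDPOINT"):
    from openai import AzureOpenAI  # requires the openai package
    client = AzureOpenAI(
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        api_version="2024-02-01",  # illustrative API version
    )
    response = client.chat.completions.create(**payload)
    print(response.choices[0].message.content)
```

Unlike the Completions API, the prompt is structured as a list of role-tagged messages, which is what lets the model distinguish system instructions from user turns.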
Increased training limits for fine-tuning: The max training job size (tokens in training file) x (# of epochs) is 2 billion tokens for all models. We have also increased the max training job duration from 120 to 720 hours.
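The job-size cap above is simple arithmetic; a small sketch (function names are illustrative, not part of any API):

```python
# Fine-tuning job limits described above:
# (tokens in training file) x (number of epochs) must not exceed 2 billion,
# and the job duration must not exceed 720 hours (raised from 120).
MAX_JOB_TOKENS = 2_000_000_000
MAX_JOB_HOURS = 720

def job_tokens(training_file_tokens: int, n_epochs: int) -> int:
    """Effective token count of a fine-tuning job."""
    return training_file_tokens * n_epochs

def within_limits(training_file_tokens: int, n_epochs: int, hours: int) -> bool:
    """Check a planned job against both caps."""
    return (job_tokens(training_file_tokens, n_epochs) <= MAX_JOB_TOKENS
            and hours <= MAX_JOB_HOURS)

# Example: a 400M-token file trained for 4 epochs is 1.6B tokens, inside the 2B cap.
```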
Adding additional use cases to your existing access. Previously, the process for adding new use cases required customers to reapply to the service. Now, we're releasing a new process that allows you to quickly add new use cases to your use of the service. This process follows the established Limited Access process within Azure AI services. Existing customers can attest to any and all new use cases here. Please note that this is required anytime you would like to use the service for a new use case you didn't originally apply for.
suffix parameter.
New articles on:
New training course:
Service GA. Azure OpenAI Service is now generally available.
New models: Addition of the latest text model, text-davinci-003 (East US, West Europe), text-ada-embeddings-002 (East US, South Central US, West Europe)
The latest models from OpenAI. Azure OpenAI provides access to all the latest models including the GPT-3.5 series.
New API version (2022-12-01). This update includes several requested enhancements including token usage information in the API response, improved error messages for files, alignment with OpenAI on fine-tuning creation data structure, and support for the suffix parameter to allow custom naming of fine-tuned jobs.
Higher request per second limits. 50 for non-Davinci models. 20 for Davinci models.
Faster fine-tune deployments. Deploy Ada and Curie fine-tuned models in under 10 minutes.
Higher training limits: 40M training tokens for Ada, Babbage, and Curie. 10M for Davinci.
Process for requesting modifications to the abuse & misuse data logging & human review. Today, the service logs request/response data for the purposes of abuse and misuse detection to ensure that these powerful models aren't abused. However, many customers have strict data privacy and security requirements that require greater control over their data. To support these use cases, we're releasing a new process for customers to modify the content filtering policies or turn off the abuse logging for low-risk use cases. This process follows the established Limited Access process within Azure AI services, and existing OpenAI customers can apply here.
Customer managed key (CMK) encryption. CMK provides customers greater control over managing their data in Azure OpenAI by providing their own encryption keys used for storing training data and customized models. Customer-managed keys (CMK), also known as bring your own key (BYOK), offer greater flexibility to create, rotate, disable, and revoke access controls. You can also audit the encryption keys used to protect your data. Learn more from our encryption at rest documentation.
Lockbox support
SOC-2 compliance
Logging and diagnostics through Azure Resource Health, Cost Analysis, and Metrics & Diagnostic settings.
Studio improvements. Numerous usability improvements to the Studio workflow, including Azure AD role support to control who on the team has access to create fine-tuned models and deploy them.
Fine-tuning create API request has been updated to match OpenAI’s schema.
Preview API versions:
{
  "training_file": "file-XGinujblHPwGLSztz8cPS8XY",
  "hyperparams": {
    "batch_size": 4,
    "learning_rate_multiplier": 0.1,
    "n_epochs": 4,
    "prompt_loss_weight": 0.1
  }
}
API version 2022-12-01:
{
  "training_file": "file-XGinujblHPwGLSztz8cPS8XY",
  "batch_size": 4,
  "learning_rate_multiplier": 0.1,
  "n_epochs": 4,
  "prompt_loss_weight": 0.1
}
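The schema change amounts to lifting the fields of the preview `hyperparams` object to the top level of the request body. A minimal sketch of migrating a preview-style body (the helper function name is illustrative):

```python
def flatten_preview_body(preview_body: dict) -> dict:
    """Convert a preview-style fine-tune create body (nested "hyperparams")
    to the flattened 2022-12-01 shape."""
    flat = {k: v for k, v in preview_body.items() if k != "hyperparams"}
    flat.update(preview_body.get("hyperparams", {}))
    return flat

preview = {
    "training_file": "file-XGinujblHPwGLSztz8cPS8XY",
    "hyperparams": {
        "batch_size": 4,
        "learning_rate_multiplier": 0.1,
        "n_epochs": 4,
        "prompt_loss_weight": 0.1,
    },
}
flattened = flatten_preview_body(preview)
# flattened now has training_file plus the four hyperparameters at the top level.
```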
Content filtering is temporarily off by default. Azure content moderation works differently than Azure OpenAI. Azure OpenAI runs content filters during the generation call to detect harmful or abusive content and filters them from the response. Learn More
The content filters will be re-enabled in Q1 2023 and will be on by default.
Customer actions
Learn more about the underlying models that power Azure OpenAI.
Training
Module
Apply prompt engineering with Azure OpenAI Service - Training
In this module, learn how prompt engineering can help to create and fine-tune prompts for natural language processing models. Prompt engineering involves designing and testing various prompts to optimize the performance of the model in generating accurate and relevant responses.