Azure OpenAI Service is powered by a diverse set of models with different capabilities and price points. Model availability varies by region and cloud. For Azure Government model availability, please refer to Azure Government OpenAI Service.
GPT-4.5 Preview
Availability
For access to gpt-4.5-preview, registration is required, and access will be granted based on Microsoft's eligibility criteria. Customers who have access to other limited access models still need to request access for this model.
Once access has been granted, you will need to create a deployment for the model.
Region Availability
| Model | Region |
|---|---|
| gpt-4.5-preview | East US 2 (Global Standard), Sweden Central (Global Standard) |
Capabilities
| Model ID | Description | Context Window | Max Output Tokens | Training Data (up to) |
|---|---|---|---|---|
| gpt-4.5-preview (2025-02-27) GPT-4.5 Preview | The latest GPT model that excels at diverse text and image tasks. - Structured outputs - Prompt caching - Tools - Streaming - Text (input/output) - Image (input) | 128,000 | 16,384 | Oct 2023 |
Note
It is expected behavior that the model cannot answer questions about itself. If you want to know the knowledge cutoff for the model's training data, or other details about the model, refer to the model documentation above.
o-series models
The Azure OpenAI o-series models are specifically designed to tackle reasoning and problem-solving tasks with increased focus and capability. These models spend more time processing and understanding the user's request, making them exceptionally strong in areas like science, coding, and math compared to previous iterations.
| Model ID | Description | Max Request (tokens) | Training Data (up to) |
|---|---|---|---|
| o1 (2024-12-17) | The most capable model in the o1 series, offering enhanced reasoning abilities. - Structured outputs - Text, image processing - Functions/Tools | Input: 200,000 Output: 100,000 | Oct 2023 |
| o1-preview (2024-09-12) | Older preview version | Input: 128,000 Output: 32,768 | Oct 2023 |
| o1-mini (2024-09-12) | A faster and more cost-efficient option in the o1 series, ideal for coding tasks requiring speed and lower resource consumption. | | |
Global standard deployment available by default.
Standard (regional) deployments are currently only available for select customers who received access as part of the o1-preview limited access release.
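The o-series models are called through the same Chat Completions surface as other chat models. The following is a minimal sketch, assuming an openai 1.x Python client, a hypothetical deployment named o1-mini, and placeholder endpoint and API version values; reasoning models bound output with max_completion_tokens rather than max_tokens.

```python
import os
from openai import AzureOpenAI

# Assumed values: replace with your resource endpoint, API version, and deployment name.
client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com/",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-12-01-preview",  # assumption; use a version that supports o-series models
)

response = client.chat.completions.create(
    model="o1-mini",  # your deployment name (assumption)
    messages=[{"role": "user", "content": "Plan an algorithm to detect cycles in a directed graph."}],
    max_completion_tokens=4000,  # o-series models use max_completion_tokens instead of max_tokens
)
print(response.choices[0].message.content)
```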
The GPT 4o audio models are part of the GPT-4o model family and support either low-latency, "speech in, speech out" conversational interactions or audio generation.
GPT-4o real-time audio is designed to handle real-time, low-latency conversational interactions, making it a great fit for support agents, assistants, translators, and other use cases that need highly responsive back-and-forth with a user. For more information on how to use GPT-4o real-time audio, see the GPT-4o real-time audio quickstart and how to use GPT-4o audio.
GPT-4o audio completion is designed to generate audio from audio or text prompts, making it a great fit for generating audio books, audio content, and other use cases that require audio generation. The GPT-4o audio completions model introduces the audio modality into the existing /chat/completions API. For more information on how to use GPT-4o audio completions, see the audio generation quickstart.
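As a rough illustration of the audio modality in /chat/completions, the sketch below assumes an openai 1.x Python client, a deployment named gpt-4o-audio-preview, and a preview API version; the modalities and audio parameters and the base64-encoded audio in the response follow the chat completions audio options, and the voice and format values are illustrative choices.

```python
import base64
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com/",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2025-01-01-preview",  # assumption; use a preview version that supports audio completions
)

completion = client.chat.completions.create(
    model="gpt-4o-audio-preview",               # your deployment name (assumption)
    modalities=["text", "audio"],               # request both a transcript and audio output
    audio={"voice": "alloy", "format": "wav"},  # illustrative voice/format choices
    messages=[{"role": "user", "content": "Read a short weather report for Seattle."}],
)

# The audio payload is returned base64-encoded alongside the text transcript.
with open("weather.wav", "wb") as f:
    f.write(base64.b64decode(completion.choices[0].message.audio.data))
```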
Warning
We don't recommend using preview models in production. We will upgrade all deployments of preview models to either future preview versions or to the latest stable GA version. Models that are designated preview don't follow the standard Azure OpenAI model lifecycle.
| Model | Region |
|---|---|
| gpt-4o-audio-preview | East US2 (Global Standard), Sweden Central (Global Standard) |
| gpt-4o-realtime-preview | East US2 (Global Standard), Sweden Central (Global Standard) |
To compare the availability of GPT-4o audio models across all regions, see the models table.
GPT-4o and GPT-4 Turbo
GPT-4o integrates text and images in a single model, enabling it to handle multiple data types simultaneously. This multimodal approach enhances accuracy and responsiveness in human-computer interactions. GPT-4o matches GPT-4 Turbo in English text and coding tasks while offering superior performance in non-English languages and vision tasks, setting new benchmarks for AI capabilities.
How do I access the GPT-4o and GPT-4o mini models?
GPT-4o and GPT-4o mini are available for standard and global-standard model deployment.
When your resource is created, you can deploy the GPT-4o models. If you are performing a programmatic deployment, the model names are:
- gpt-4o Version 2024-11-20
- gpt-4o Version 2024-08-06
- gpt-4o Version 2024-05-13
- gpt-4o-mini Version 2024-07-18
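Once a deployment exists, inference goes through the Chat Completions API, with the deployment name passed as the model. A minimal sketch, assuming an openai 1.x Python client and a deployment named gpt-4o; the endpoint and API version are placeholders:

```python
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com/",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-21",  # assumption; any version that supports chat completions
)

response = client.chat.completions.create(
    model="gpt-4o",  # the deployment name you chose, not the underlying model ID (assumption)
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the difference between GPT-4o and GPT-4 Turbo."},
    ],
)
print(response.choices[0].message.content)
```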
GPT-4 Turbo
GPT-4 Turbo is a large multimodal model (accepting text or image inputs and generating text) that can solve difficult problems with greater accuracy than any of OpenAI's previous models. Like GPT-3.5 Turbo and older GPT-4 models, GPT-4 Turbo is optimized for chat and works well for traditional completions tasks.
The latest GA release of GPT-4 Turbo is:
gpt-4 Version: turbo-2024-04-09
This is the replacement for the following preview models:
- gpt-4 Version: 1106-Preview
- gpt-4 Version: 0125-Preview
- gpt-4 Version: vision-preview
Differences between OpenAI and Azure OpenAI GPT-4 Turbo GA Models
OpenAI's version of the latest 0409 turbo model supports JSON mode and function calling for all inference requests.
Azure OpenAI's version of the latest turbo-2024-04-09 currently doesn't support JSON mode and function calling when making inference requests with image (vision) input. Text-based input requests (requests without image_url and inline images) do support JSON mode and function calling.
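For example, a text-only request against a turbo-2024-04-09 deployment can enable JSON mode, which isn't available when an image is included in the request. A minimal sketch, assuming an openai 1.x Python client and a hypothetical deployment named gpt-4-turbo:

```python
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com/",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-21",  # assumption
)

# Text-only request: JSON mode is supported because no image_url content is included.
response = client.chat.completions.create(
    model="gpt-4-turbo",  # your turbo-2024-04-09 deployment name (assumption)
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "Return the answer as a JSON object."},
        {"role": "user", "content": "List three EU regions that offer Azure OpenAI."},
    ],
)
print(response.choices[0].message.content)
```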
Differences from gpt-4 vision-preview
Azure AI specific Vision enhancements integration with GPT-4 Turbo with Vision isn't supported for gpt-4 Version: turbo-2024-04-09. This includes Optical Character Recognition (OCR), object grounding, video prompts, and improved handling of your data with images.
Important
Vision enhancements preview features, including Optical Character Recognition (OCR), object grounding, and video prompts, will be retired and no longer available once gpt-4 Version: vision-preview is upgraded to turbo-2024-04-09. If you are currently relying on any of these preview features, this automatic model upgrade will be a breaking change.
GPT-4 Turbo provisioned managed availability
gpt-4 Version: turbo-2024-04-09 is available for both standard and provisioned deployments. Currently, the provisioned version of this model doesn't support image/vision inference requests. Provisioned deployments of this model only accept text input. Standard model deployments accept both text and image/vision inference requests.
Deploying GPT-4 Turbo with Vision GA
To deploy the GA model from the Azure AI Foundry portal, select GPT-4 and then choose the turbo-2024-04-09 version from the dropdown menu. The default quota for the gpt-4-turbo-2024-04-09 model is the same as the current quota for GPT-4-Turbo. See the regional quota limits.
GPT-4
GPT-4 is the predecessor to GPT-4 Turbo. Both the GPT-4 and GPT-4 Turbo models have a base model name of gpt-4. You can distinguish between the GPT-4 and Turbo models by examining the model version.
- gpt-4 Version 0314
- gpt-4 Version 0613
- gpt-4-32k Version 0613
You can see the token context length supported by each model in the model summary table.
GPT-4 and GPT-4 Turbo models
These models can only be used with the Chat Completions API.
See model versions to learn about how Azure OpenAI Service handles model version upgrades, and working with models to learn how to view and configure the model version settings of your GPT-4 deployments.
| Model ID | Description | Max Request (tokens) | Training Data (up to) |
|---|---|---|---|
| gpt-4o (2024-11-20) GPT-4o (Omni) | Latest large GA model - Structured outputs - Text, image processing - JSON Mode - parallel function calling - Enhanced accuracy and responsiveness - Parity with English text and coding tasks compared to GPT-4 Turbo with Vision - Superior performance in non-English languages and in vision tasks - Enhanced creative writing ability | Input: 128,000 Output: 16,384 | Oct 2023 |
| gpt-4o (2024-08-06) GPT-4o (Omni) | - Structured outputs - Text, image processing - JSON Mode - parallel function calling - Enhanced accuracy and responsiveness - Parity with English text and coding tasks compared to GPT-4 Turbo with Vision - Superior performance in non-English languages and in vision tasks | Input: 128,000 Output: 16,384 | Oct 2023 |
| gpt-4o-mini (2024-07-18) GPT-4o mini | Latest small GA model - Fast, inexpensive, capable model ideal for replacing GPT-3.5 Turbo series models. - Text, image processing - JSON Mode - parallel function calling | Input: 128,000 Output: 16,384 | Oct 2023 |
| gpt-4o (2024-05-13) GPT-4o (Omni) | Text, image processing - JSON Mode - parallel function calling - Enhanced accuracy and responsiveness - Parity with English text and coding tasks compared to GPT-4 Turbo with Vision - Superior performance in non-English languages and in vision tasks | Input: 128,000 Output: 4,096 | Oct 2023 |
| gpt-4 (turbo-2024-04-09) GPT-4 Turbo with Vision | New GA model - Replacement for all previous GPT-4 preview models (vision-preview, 1106-Preview, 0125-Preview). - Feature availability is currently different depending on method of input and deployment type. | Input: 128,000 Output: 4,096 | Dec 2023 |
| gpt-4 (0125-Preview)* GPT-4 Turbo Preview | Preview Model - Replaces 1106-Preview - Better code generation performance - Reduces cases where the model doesn't complete a task - JSON Mode - parallel function calling - reproducible output (preview) | Input: 128,000 Output: 4,096 | Dec 2023 |
| gpt-4 (vision-preview) GPT-4 Turbo with Vision Preview | Preview model - Accepts text and image input. - Supports enhancements - JSON Mode - parallel function calling - reproducible output (preview) | Input: 128,000 Output: 4,096 | Apr 2023 |
| gpt-4 (1106-Preview) GPT-4 Turbo Preview | Preview Model - JSON Mode - parallel function calling - reproducible output (preview) | Input: 128,000 Output: 4,096 | Apr 2023 |
| gpt-4-32k (0613) | Older GA model - Basic function calling with tools | 32,768 | Sep 2021 |
| gpt-4 (0613) | Older GA model - Basic function calling with tools | 8,192 | Sep 2021 |
We don't recommend using preview models in production. We will upgrade all deployments of preview models to either future preview versions or to the latest stable GA version. Models that are designated preview don't follow the standard Azure OpenAI model lifecycle.
* GPT-4 version 0125-preview is an updated version of the GPT-4 Turbo preview previously released as version 1106-preview.
GPT-4 version 0125-preview completes tasks such as code generation more completely compared to gpt-4-1106-preview. Because of this, depending on the task, customers may find that GPT-4-0125-preview generates more output compared to gpt-4-1106-preview. We recommend customers compare the outputs of the new model. GPT-4-0125-preview also addresses bugs in gpt-4-1106-preview with UTF-8 handling for non-English languages.
GPT-4 version turbo-2024-04-09 is the latest GA release and replaces 0125-Preview, 1106-preview, and vision-preview.
GPT-3.5
GPT-3.5 models can understand and generate natural language or code. The most capable and cost-effective model in the GPT-3.5 family is GPT-3.5 Turbo, which has been optimized for chat and also works well for traditional completions tasks. GPT-3.5 Turbo is available for use with the Chat Completions API. GPT-3.5 Turbo Instruct has similar capabilities to text-davinci-003 using the Completions API instead of the Chat Completions API. We recommend using GPT-3.5 Turbo and GPT-3.5 Turbo Instruct over legacy GPT-3.5 and GPT-3 models.
| Model ID | Description | Max Request (tokens) | Training Data (up to) |
|---|---|---|---|
| gpt-35-turbo (0125) NEW | Latest GA Model - JSON Mode - parallel function calling - reproducible output (preview) - Higher accuracy at responding in requested formats. - Fix for a bug which caused a text encoding issue for non-English language function calls. | Input: 16,385 Output: 4,096 | Sep 2021 |
| gpt-35-turbo (1106) | Older GA Model - JSON Mode - parallel function calling - reproducible output (preview) | Input: 16,385 Output: 4,096 | Sep 2021 |
To learn more about how to interact with GPT-3.5 Turbo and the Chat Completions API check out our in-depth how-to.
1 This model will accept requests larger than 4,096 tokens. Exceeding the 4,096 input token limit isn't recommended, because newer versions of the model are capped at 4,096 tokens. If you encounter issues when exceeding 4,096 input tokens with this model, this configuration is not officially supported.
Embeddings
text-embedding-3-large is the latest and most capable embedding model. Upgrading between embeddings models is not possible. In order to move from using text-embedding-ada-002 to text-embedding-3-large, you need to generate new embeddings.
- text-embedding-3-large
- text-embedding-3-small
- text-embedding-ada-002
In testing, OpenAI reports that both the large and small third generation embeddings models offer better average multi-language retrieval performance on the MIRACL benchmark, while still maintaining performance for English tasks on the MTEB benchmark.
| Evaluation Benchmark | text-embedding-ada-002 | text-embedding-3-small | text-embedding-3-large |
|---|---|---|---|
| MIRACL average | 31.4 | 44.0 | 54.9 |
| MTEB average | 61.0 | 62.3 | 64.6 |
The third generation embeddings models support reducing the size of the embedding via a new dimensions parameter. Typically, larger embeddings are more expensive from a compute, memory, and storage perspective. Being able to adjust the number of dimensions allows more control over overall cost and performance. The dimensions parameter is not supported in all versions of the OpenAI 1.x Python library; to take advantage of this parameter, we recommend upgrading to the latest version: pip install openai --upgrade.
OpenAI's MTEB benchmark testing found that even when the third generation models' dimensions are reduced to fewer than the 1,536 dimensions of text-embedding-ada-002, performance remains slightly better.
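A minimal sketch of requesting reduced-dimension embeddings, assuming an openai 1.x Python client and a third generation embedding deployment named text-embedding-3-large; the endpoint and API version are placeholders:

```python
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com/",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-21",  # assumption
)

# The dimensions parameter is only honored by the third generation embedding models.
response = client.embeddings.create(
    model="text-embedding-3-large",  # your deployment name (assumption)
    input="Azure OpenAI model availability varies by region.",
    dimensions=1024,                 # shrink from the default 3,072 to trade quality for cost
)
print(len(response.data[0].embedding))  # 1024
```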
DALL-E
The DALL-E models generate images from text prompts that the user provides. DALL-E 3 is generally available for use with the REST APIs. DALL-E 2 and DALL-E 3 with client SDKs are in preview.
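A minimal image generation sketch, assuming an openai 1.x Python client and a DALL-E 3 deployment named dall-e-3; the endpoint, API version, and image size are placeholder choices:

```python
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com/",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-21",  # assumption
)

result = client.images.generate(
    model="dall-e-3",  # your deployment name (assumption)
    prompt="A watercolor painting of a lighthouse at dawn",
    n=1,
    size="1024x1024",
)
print(result.data[0].url)  # temporary URL of the generated image
```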
Whisper
The Whisper models can be used for speech to text.
You can also use the Whisper model via Azure AI Speech batch transcription API. Check out What is the Whisper model? to learn more about when to use Azure AI Speech vs. Azure OpenAI Service.
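A minimal transcription sketch via the Azure OpenAI Whisper model, assuming an openai 1.x Python client, a deployment named whisper, and a local audio file under the 25 MB limit:

```python
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com/",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-21",  # assumption
)

with open("meeting.mp3", "rb") as audio_file:  # must be 25 MB or smaller
    transcript = client.audio.transcriptions.create(
        model="whisper",  # your deployment name (assumption)
        file=audio_file,
    )
print(transcript.text)
```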
Text to speech (Preview)
The OpenAI text to speech models, currently in preview, can be used to synthesize text to speech.
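A minimal text to speech sketch, assuming an openai 1.x Python client and a deployment of the tts model named tts; the voice name and output file format are illustrative choices:

```python
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com/",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-21",  # assumption
)

speech = client.audio.speech.create(
    model="tts",    # your deployment name (assumption)
    voice="alloy",  # illustrative voice choice
    input="Azure OpenAI text to speech is currently in preview.",
)

# The response body is the binary audio stream.
with open("speech.mp3", "wb") as f:
    f.write(speech.read())
```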
Azure OpenAI provides customers with choices on the hosting structure that fits their business and usage patterns. The service offers two main types of deployment:
Standard is offered with a global deployment option, routing traffic globally to provide higher throughput.
Provisioned is also offered with a global deployment option, allowing customers to purchase and deploy provisioned throughput units across Azure global infrastructure.
All deployments can perform the exact same inference operations; however, the billing, scale, and performance are substantially different. To learn more about Azure OpenAI deployment types, see our deployment types guide.
o1-mini is currently available to all customers for global standard deployment.
Select customers were granted standard (regional) deployment access to o1-mini as part of the o1-preview limited access release. At this time access to o1-mini standard (regional) deployments is not being expanded.
Global provisioned managed model availability
| Region | o1, 2024-12-17 | gpt-4o, 2024-05-13 | gpt-4o, 2024-08-06 | gpt-4o, 2024-11-20 | gpt-4o-mini, 2024-07-18 |
|---|---|---|---|---|---|
| australiaeast | - | ✅ | ✅ | ✅ | ✅ |
| brazilsouth | - | ✅ | ✅ | ✅ | ✅ |
| canadaeast | - | ✅ | ✅ | ✅ | ✅ |
| eastus | ✅ | ✅ | ✅ | ✅ | ✅ |
| eastus2 | - | ✅ | ✅ | ✅ | ✅ |
| francecentral | ✅ | ✅ | ✅ | ✅ | ✅ |
| germanywestcentral | - | ✅ | ✅ | ✅ | ✅ |
| italynorth | - | ✅ | ✅ | ✅ | ✅ |
| japaneast | - | ✅ | ✅ | ✅ | ✅ |
| koreacentral | - | ✅ | ✅ | ✅ | ✅ |
| northcentralus | - | ✅ | ✅ | ✅ | ✅ |
| norwayeast | - | ✅ | ✅ | ✅ | ✅ |
| polandcentral | - | ✅ | ✅ | ✅ | ✅ |
| southafricanorth | ✅ | ✅ | ✅ | ✅ | ✅ |
| southcentralus | ✅ | ✅ | ✅ | ✅ | ✅ |
| southeastasia | - | ✅ | ✅ | ✅ | ✅ |
| southindia | ✅ | ✅ | ✅ | ✅ | ✅ |
| spaincentral | ✅ | ✅ | ✅ | ✅ | ✅ |
| swedencentral | ✅ | ✅ | ✅ | ✅ | ✅ |
| switzerlandnorth | ✅ | ✅ | ✅ | ✅ | ✅ |
| switzerlandwest | ✅ | ✅ | ✅ | ✅ | ✅ |
| uaenorth | - | ✅ | ✅ | ✅ | ✅ |
| uksouth | ✅ | ✅ | ✅ | ✅ | ✅ |
| westeurope | ✅ | ✅ | ✅ | ✅ | ✅ |
| westus | ✅ | ✅ | ✅ | ✅ | ✅ |
| westus3 | ✅ | ✅ | ✅ | ✅ | ✅ |
Global batch model availability
| Region | o3-mini, 2025-01-31 | gpt-4o, 2024-05-13 | gpt-4o, 2024-08-06 | gpt-4o, 2024-11-20 | gpt-4o-mini, 2024-07-18 | gpt-4, 0613 | gpt-4, turbo-2024-04-09 | gpt-35-turbo, 0613 | gpt-35-turbo, 1106 | gpt-35-turbo, 0125 |
|---|---|---|---|---|---|---|---|---|---|---|
| australiaeast | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| brazilsouth | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| canadaeast | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| eastus | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| eastus2 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| francecentral | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| germanywestcentral | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| japaneast | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| koreacentral | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| northcentralus | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| norwayeast | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| polandcentral | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| southafricanorth | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| southcentralus | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| southindia | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| swedencentral | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| switzerlandnorth | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| uksouth | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| westeurope | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| westus | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| westus3 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
Data zone standard model availability
| Region | o3-mini, 2025-01-31 | gpt-4o, 2024-05-13 | gpt-4o, 2024-08-06 | gpt-4o-mini, 2024-07-18 |
|---|---|---|---|---|
| eastus | ✅ | ✅ | ✅ | ✅ |
| eastus2 | ✅ | ✅ | ✅ | ✅ |
| francecentral | ✅ | ✅ | ✅ | ✅ |
| germanywestcentral | ✅ | ✅ | ✅ | ✅ |
| northcentralus | ✅ | ✅ | ✅ | ✅ |
| polandcentral | ✅ | ✅ | ✅ | ✅ |
| southcentralus | ✅ | ✅ | ✅ | ✅ |
| spaincentral | ✅ | ✅ | ✅ | ✅ |
| swedencentral | ✅ | ✅ | ✅ | ✅ |
| westeurope | ✅ | ✅ | ✅ | ✅ |
| westus | ✅ | ✅ | ✅ | ✅ |
| westus3 | ✅ | ✅ | ✅ | ✅ |
Note
o1-mini is currently available to all customers for global standard deployment.
Select customers were granted standard (regional) deployment access to o1-mini as part of the o1-preview limited access release. At this time access to o1-mini standard (regional) deployments is not being expanded.
Data zone provisioned managed model availability
| Region | gpt-4o, 2024-05-13 | gpt-4o, 2024-08-06 | gpt-4o-mini, 2024-07-18 |
|---|---|---|---|
| eastus | ✅ | ✅ | ✅ |
| eastus2 | ✅ | ✅ | ✅ |
| francecentral | ✅ | ✅ | ✅ |
| germanywestcentral | ✅ | ✅ | ✅ |
| northcentralus | ✅ | ✅ | ✅ |
| polandcentral | ✅ | ✅ | ✅ |
| southcentralus | ✅ | ✅ | ✅ |
| spaincentral | ✅ | ✅ | ✅ |
| swedencentral | ✅ | ✅ | ✅ |
| westeurope | ✅ | ✅ | ✅ |
| westus | ✅ | ✅ | ✅ |
| westus3 | ✅ | ✅ | ✅ |
Data zone batch model availability
| Region | o3-mini, 2025-01-31 | gpt-4o, 2024-08-06 | gpt-4o-mini, 2024-07-18 |
|---|---|---|---|
| eastus | ✅ | ✅ | ✅ |
| eastus2 | ✅ | ✅ | ✅ |
| francecentral | - | ✅ | ✅ |
| germanywestcentral | - | ✅ | ✅ |
| northcentralus | ✅ | ✅ | ✅ |
| polandcentral | - | ✅ | ✅ |
| southcentralus | ✅ | ✅ | ✅ |
| swedencentral | - | ✅ | ✅ |
| westeurope | - | ✅ | ✅ |
| westus | ✅ | ✅ | ✅ |
| westus3 | ✅ | ✅ | ✅ |
Standard deployment model availability
| Region | o1-preview, 2024-09-12 | o1-mini, 2024-09-12 | gpt-4o, 2024-05-13 | gpt-4o, 2024-08-06 | gpt-4o-mini, 2024-07-18 | gpt-4, 0613 | gpt-4, 1106-Preview | gpt-4, 0125-Preview | gpt-4, vision-preview | gpt-4, turbo-2024-04-09 | gpt-4-32k, 0613 | gpt-35-turbo, 0301 | gpt-35-turbo, 0613 | gpt-35-turbo, 1106 | gpt-35-turbo, 0125 | gpt-35-turbo-16k, 0613 | gpt-35-turbo-instruct, 0914 | text-embedding-3-small, 1 | text-embedding-3-large, 1 | text-embedding-ada-002, 1 | text-embedding-ada-002, 2 | dall-e-3, 3.0 | tts, 001 | tts-hd, 001 | whisper, 001 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| australiaeast | - | - | - | - | - | ✅ | ✅ | - | ✅ | - | ✅ | - | ✅ | ✅ | ✅ | ✅ | - | ✅ | ✅ | - | ✅ | ✅ | - | - | - |
| brazilsouth | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ✅ | - | - | - | - |
| canadaeast | - | - | - | - | - | ✅ | ✅ | - | - | - | ✅ | - | ✅ | ✅ | ✅ | ✅ | - | ✅ | ✅ | - | ✅ | - | - | - | - |
| eastus | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | - | ✅ | - | ✅ | - | ✅ | ✅ | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | - | - | - |
| eastus2 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | - | - | ✅ | - | - | ✅ | - | ✅ | ✅ | - | ✅ | ✅ | - | ✅ | - | - | - | ✅ |
| francecentral | - | - | - | - | - | ✅ | ✅ | - | - | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | - | - | ✅ | - | ✅ | - | - | - | - |
| japaneast | - | - | - | - | - | - | - | - | ✅ | - | - | - | ✅ | - | ✅ | ✅ | - | ✅ | ✅ | - | ✅ | - | - | - | - |
| northcentralus | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | - | ✅ | - | ✅ | - | - | ✅ | - | ✅ | ✅ | - | - | - | - | ✅ | - | ✅ | ✅ | ✅ |
| norwayeast | - | - | - | - | - | - | ✅ | - | - | - | - | - | - | - | - | - | - | - | ✅ | - | ✅ | - | - | - | ✅ |
| polandcentral | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ✅ | - | - | - | - | - | - |
| southafricanorth | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ✅ | - | - | - | - |
| southcentralus | ✅ | ✅ | ✅ | ✅ | ✅ | - | - | ✅ | - | ✅ | - | ✅ | - | - | ✅ | - | - | - | - | ✅ | ✅ | - | - | - | - |
| southindia | - | - | - | - | - | - | ✅ | - | - | - | - | - | - | ✅ | ✅ | - | - | - | ✅ | - | ✅ | - | - | - | ✅ |
| swedencentral | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | - | ✅ | ✅ | ✅ | - | ✅ | ✅ | ✅ | ✅ | ✅ | - | ✅ | - | ✅ | ✅ | ✅ | ✅ | ✅ |
| switzerlandnorth | - | - | - | - | - | ✅ | - | - | ✅ | - | ✅ | - | ✅ | - | ✅ | ✅ | - | ✅ | ✅ | - | ✅ | - | - | - | ✅ |
| uaenorth | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ✅ | - | ✅ | - | - | - | ✅ |
| uksouth | - | - | - | - | - | - | ✅ | ✅ | - | - | - | ✅ | ✅ | ✅ | ✅ | ✅ | - | - | ✅ | - | ✅ | - | - | - | - |
| westeurope | - | - | - | - | - | - | - | - | - | - | - | ✅ | - | - | - | - | - | - | - | - | ✅ | - | - | - | ✅ |
| westus | ✅ | ✅ | ✅ | ✅ | ✅ | - | ✅ | - | ✅ | ✅ | - | - | - | ✅ | ✅ | - | - | ✅ | - | - | ✅ | - | - | - | - |
| westus3 | ✅ | ✅ | ✅ | ✅ | ✅ | - | ✅ | - | - | ✅ | - | - | - | - | ✅ | - | - | - | ✅ | - | ✅ | - | - | - | - |
Note
o1-mini is currently available to all customers for global standard deployment.
Select customers were granted standard (regional) deployment access to o1-mini as part of the o1-preview limited access release. At this time access to o1-mini standard (regional) deployments is not being expanded.
Provisioned deployment model availability
| Region | gpt-4o, 2024-05-13 | gpt-4o, 2024-08-06 | gpt-4o, 2024-11-20 | gpt-4o-mini, 2024-07-18 | gpt-4, 0613 | gpt-4, 1106-Preview | gpt-4, 0125-Preview | gpt-4, turbo-2024-04-09 | gpt-4-32k, 0613 | gpt-35-turbo, 1106 | gpt-35-turbo, 0125 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| australiaeast | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| brazilsouth | ✅ | - | - | ✅ | ✅ | ✅ | ✅ | - | ✅ | ✅ | - |
| canadaeast | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | - | ✅ | - | ✅ | - |
| eastus | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| eastus2 | ✅ | ✅ | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| francecentral | ✅ | ✅ | - | ✅ | ✅ | ✅ | ✅ | - | ✅ | - | ✅ |
| germanywestcentral | ✅ | ✅ | - | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | - |
| japaneast | ✅ | ✅ | ✅ | ✅ | - | ✅ | ✅ | ✅ | - | - | ✅ |
| koreacentral | ✅ | ✅ | - | ✅ | ✅ | - | - | ✅ | ✅ | ✅ | - |
| northcentralus | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| norwayeast | ✅ | ✅ | - | ✅ | ✅ | - | ✅ | - | ✅ | - | - |
| polandcentral | ✅ | - | - | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| southafricanorth | ✅ | - | - | - | ✅ | ✅ | - | ✅ | ✅ | ✅ | - |
| southcentralus | ✅ | ✅ | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| southeastasia | - | ✅ | ✅ | ✅ | - | - | - | - | - | - | - |
| southindia | ✅ | ✅ | - | ✅ | ✅ | ✅ | ✅ | - | ✅ | ✅ | ✅ |
| swedencentral | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| switzerlandnorth | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| switzerlandwest | - | - | - | - | - | - | - | - | - | - | ✅ |
| uaenorth | ✅ | ✅ | ✅ | - | - | ✅ | - | - | - | ✅ | ✅ |
| uksouth | ✅ | ✅ | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| westus | ✅ | ✅ | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| westus3 | ✅ | ✅ | - | - | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
Note
The provisioned version of gpt-4 Version: turbo-2024-04-09 is currently limited to text only.
o1-mini is currently available to all customers for global standard deployment.
Select customers were granted standard (regional) deployment access to o1-mini as part of the o1-preview limited access release. At this time access to o1-mini standard (regional) deployments is not being expanded.
GPT-4 and GPT-4 Turbo model availability
Select customer access
In addition to the regions above, which are available to all Azure OpenAI customers, some select preexisting customers have been granted access to versions of GPT-4 in additional regions:
| Model | Region |
|---|---|
| gpt-4 (0314), gpt-4-32k (0314) | East US, France Central, South Central US, UK South |
| gpt-4 (0613), gpt-4-32k (0613) | East US, East US 2, Japan East, UK South |
GPT-3.5 models
See model versions to learn about how Azure OpenAI Service handles model version upgrades, and working with models to learn how to view and configure the model version settings of your GPT-3.5 Turbo deployments.
Embeddings models
| Region | text-embedding-3-small, 1 | text-embedding-3-large, 1 | text-embedding-ada-002, 1 | text-embedding-ada-002, 2 |
|---|---|---|---|---|
| australiaeast | ✅ | ✅ | - | ✅ |
| brazilsouth | - | - | - | ✅ |
| canadaeast | ✅ | ✅ | - | ✅ |
| eastus | ✅ | ✅ | ✅ | ✅ |
| eastus2 | ✅ | ✅ | - | ✅ |
| francecentral | - | ✅ | - | ✅ |
| japaneast | ✅ | ✅ | - | ✅ |
| northcentralus | - | - | - | ✅ |
| norwayeast | - | ✅ | - | ✅ |
| polandcentral | - | ✅ | - | - |
| southafricanorth | - | - | - | ✅ |
| southcentralus | - | - | ✅ | ✅ |
| southindia | - | ✅ | - | ✅ |
| swedencentral | - | ✅ | - | ✅ |
| switzerlandnorth | ✅ | ✅ | - | ✅ |
| uaenorth | - | - | - | ✅ |
| uksouth | - | ✅ | - | ✅ |
| westeurope | - | - | - | ✅ |
| westus | ✅ | - | - | ✅ |
| westus3 | - | ✅ | - | ✅ |
These models can only be used with Embedding API requests.
Note
text-embedding-3-large is the latest and most capable embedding model. Upgrading between embedding models is not possible. In order to migrate from using text-embedding-ada-002 to text-embedding-3-large, you need to generate new embeddings.
| Model ID | Max Request (tokens) | Output Dimensions | Training Data (up to) |
|---|---|---|---|
| text-embedding-ada-002 (version 2) | 8,192 | 1,536 | Sep 2021 |
| text-embedding-ada-002 (version 1) | 2,046 | 1,536 | Sep 2021 |
| text-embedding-3-large | 8,192 | 3,072 | Sep 2021 |
| text-embedding-3-small | 8,192 | 1,536 | Sep 2021 |
Note
When sending an array of inputs for embedding, the max number of input items in the array per call to the embedding endpoint is 2048.
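If you have more than 2,048 inputs, you can chunk the array client-side and make one call per chunk. A minimal sketch, assuming an openai 1.x Python client (constructed as in the earlier examples) and a deployment named text-embedding-3-small:

```python
def embed_all(client, texts, deployment="text-embedding-3-small", batch_size=2048):
    """Embed an arbitrarily long list of strings in batches of at most 2,048 items."""
    vectors = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        response = client.embeddings.create(model=deployment, input=batch)
        # Results come back in the same order as the submitted batch.
        vectors.extend(item.embedding for item in response.data)
    return vectors
```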
Image generation models
| Region | dall-e-3, 3.0 |
|---|---|
| australiaeast | ✅ |
| eastus | ✅ |
| swedencentral | ✅ |
DALL-E models
| Model ID | Max Request (characters) |
|---|---|
| dall-e-3 | 4,000 |
Audio models
| Region | tts, 001 | tts-hd, 001 | whisper, 001 |
|---|---|---|---|
| eastus2 | - | - | ✅ |
| northcentralus | ✅ | ✅ | ✅ |
| norwayeast | - | - | ✅ |
| southindia | - | - | ✅ |
| swedencentral | ✅ | ✅ | ✅ |
| switzerlandnorth | - | - | ✅ |
| uaenorth | - | - | ✅ |
| westeurope | - | - | ✅ |
Whisper models
| Model ID | Max Request (audio file size) |
|---|---|
| whisper | 25 MB |
Text to speech models (Preview)
| Model ID | Description |
|---|---|
| tts | The latest Azure OpenAI text to speech model, optimized for speed. |
| tts-hd | The latest Azure OpenAI text to speech model, optimized for quality. |
Completions models
| Region | gpt-35-turbo-instruct, 0914 |
|---|---|
| eastus | ✅ |
| swedencentral | ✅ |
Fine-tuning models
Note
gpt-35-turbo - Fine-tuning of this model is limited to a subset of regions, and isn't available in every region where the base model is available.
The supported regions for fine-tuning might vary if you use Azure OpenAI models in an Azure AI Foundry project versus outside a project.
| Model ID | Fine-tuning regions | Max request (tokens) | Training Data (up to) |
|---|---|---|---|
| gpt-35-turbo (0613) | East US2, North Central US, Sweden Central, Switzerland West | 4,096 | Sep 2021 |
| gpt-35-turbo (1106) | East US2, North Central US, Sweden Central, Switzerland West | Input: 16,385 Output: 4,096 | Sep 2021 |
| gpt-35-turbo (0125) | East US2, North Central US, Sweden Central, Switzerland West | 16,385 | Sep 2021 |
| gpt-4 (0613) 1 | North Central US, Sweden Central | 8,192 | Sep 2021 |
| gpt-4o-mini (2024-07-18) | North Central US, Sweden Central | Input: 128,000 Output: 16,384 Training example context length: 64,536 | Oct 2023 |
| gpt-4o (2024-08-06) | East US2, North Central US, Sweden Central | Input: 128,000 Output: 16,384 Training example context length: 64,536 | Oct 2023 |
1 GPT-4 fine-tuning is currently in public preview.
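A minimal sketch of submitting a fine-tuning job, assuming an openai 1.x Python client, a JSONL training file in the chat format, and a base model identifier such as gpt-35-turbo-0125; the exact identifier accepted by your resource is an assumption, so check the fine-tuning guide:

```python
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com/",  # resource in a fine-tuning region
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-21",  # assumption
)

# Upload the training data (JSONL, one chat example per line).
training_file = client.files.create(
    file=open("training_data.jsonl", "rb"),
    purpose="fine-tune",
)

# Kick off the fine-tuning job against a supported base model.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-35-turbo-0125",  # assumed base model identifier
)
print(job.id, job.status)
```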
Assistants (Preview)
For Assistants you need a combination of a supported model, and a supported region. Certain tools and capabilities require the latest models. The following models are available in the Assistants API, SDK, and Azure AI Foundry. The following table is for pay-as-you-go. For information on Provisioned Throughput Unit (PTU) availability, see provisioned throughput. The listed models and regions can be used with both Assistants v1 and v2. You can use global standard models if they are supported in the regions listed below.