Vision-enabled chat model concepts

Raksts
21.02.2025

Vision-enabled chat models are large multimodal models (LMM) developed by OpenAI that can analyze images and provide textual responses to questions about them. They incorporate both natural language processing and visual understanding. The current vision-enabled models are GPT-4 Turbo with Vision, GPT-4o, and GPT-4o-mini. This guide provides details on their capabilities and limitations.

To try out vision-enabled chat models, see the quickstart.

Vision-enabled chats

The vision-enabled models answer general questions about what's present in the images you upload.

Special pricing information

Svarīgi

Pricing details are subject to change in the future.

Vision-enabled models accrue charges like other Azure OpenAI chat models. You pay a per-token rate for the prompts and completions, detailed on the Pricing page. The base charges and additional features are outlined here:

Base Pricing for GPT-4 Turbo with Vision is:

Input: $0.01 per 1000 tokens
Output: $0.03 per 1000 tokens

See the Tokens section of the overview for information on how text and images translate to tokens.

Example image price calculation

Svarīgi

The following content is an example only, and prices are subject to change in the future.

For a typical use case, take an image with both visible objects and text and a 100-token prompt input. When the service processes the prompt, it generates 100 tokens of output. In the image, both text and objects can be detected. The price of this transaction would be:

Item	Detail	Cost
Text prompt input	100 text tokens	$0.001
Example image input (see Image tokens)	170 + 85 image tokens	$0.00255
Enhanced add-on features for OCR	$1.50 / 1000 transactions	$0.0015
Enhanced add-on features for Object Grounding	$1.50 / 1000 transactions	$0.0015
Output Tokens	100 tokens (assumed)	$0.003
Total		$0.00955

Input limitations

This section describes the limitations of vision-enabled chat models.

Image support

Maximum input image size: The maximum size for input images is restricted to 20 MB.
Low resolution accuracy: When images are analyzed using the "low resolution" setting, it allows for faster responses and uses fewer input tokens for certain use cases. However, this could impact the accuracy of object and text recognition within the image.
Image chat restriction: When you upload images in Azure AI Foundry portal or the API, there is a limit of 10 images per chat call.

Next steps

Get started using vision-enabled models by following the quickstart.
For a more in-depth look at the APIs, follow the how-to guide.
See the completions and embeddings API reference

Papildu resursi

Dokumentācija

How to use vision-enabled chat models - Azure OpenAI Service

Learn how to use vision-enabled chat models in Azure OpenAI Service, including how to call the Chat Completion API and process images.
Quickstart: Use vision-enabled chats with the Azure OpenAI Service - Azure OpenAI

Use this article to get started using Azure OpenAI to deploy and use the GPT-4 Turbo with Vision model or other vision-enabled models.
What is Azure OpenAI Service? - Azure AI services

Apply advanced language models to variety of use cases with Azure OpenAI
Azure OpenAI Service documentation - Quickstarts, Tutorials, API Reference - Azure AI services

Learn how to use Azure OpenAI's powerful models including the GPT-4o, GPT-4o mini, GPT-4, GPT-4 Turbo with Vision, GPT-3.5-Turbo, DALL-E 3 and Embeddings model series
Azure OpenAI Service models - Azure OpenAI

Learn about the different model capabilities that are available with Azure OpenAI.
Work with the GPT-35-Turbo and GPT-4 models - Azure OpenAI Service

Learn about the options for how to use the GPT-35-Turbo and GPT-4 models.

Apmācība

Mācību ceļš

Create vision models with Azure AI Custom Vision - Training

How to create custom computer vision solutions with Azure AI Custom Vision

Sertifikācija

Microsoft Certified: Azure AI Fundamentals - Certifications

Demonstrate fundamental AI concepts related to the development of software and services of Microsoft Azure to create AI solutions.

Notikumi

Veidojiet inteliģentas lietotnes

17. marts 21 - 21. marts 10

Pievienojieties meetup sērijai, lai kopā ar citiem izstrādātājiem un ekspertiem izveidotu mērogojamus AI risinājumus, kuru pamatā ir reālas lietošanas gadījumi.

Reģistrēties tūlīt

Kopīgot, izmantojot