Quickstart: Get started using GPT-4 Turbo with Vision on your images and videos in Azure AI Studio

Note

Azure AI Studio is currently in public preview. This preview is provided without a service-level agreement, and we don't recommend it for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

Use this article to get started using Azure AI Studio to deploy and test the GPT-4 Turbo with Vision model.

GPT-4 Turbo with Vision and Azure AI Vision offer advanced functionality including:

  • Optical Character Recognition (OCR): Extracts text from images and combines it with the user's prompt and image to expand the context.
  • Object grounding: Complements the GPT-4 Turbo with Vision text response with object grounding and outlines salient objects in the input images.
  • Video prompts: GPT-4 Turbo with Vision can answer questions by retrieving the video frames most relevant to the user's prompt.

Extra usage fees might apply for using GPT-4 Turbo with Vision and Azure AI Vision functionality.

Prerequisites

Note

This feature isn't available if you created an Azure AI hub resource together with an existing Azure OpenAI Service resource. You must create an AI hub with an Azure AI services provider. We're gradually rolling out this feature to all customers. If you don't see it yet, check back later.

  • An Azure subscription - Create one for free.

  • Access granted to Azure OpenAI in the desired Azure subscription.

    Currently, access to this service is granted only by application. You can apply for access to Azure OpenAI by completing the form at https://aka.ms/oai/access. If you run into a problem, open an issue on this repo to contact us.

  • An Azure AI hub resource with a GPT-4 Turbo with Vision model deployed in one of the regions that support GPT-4 Turbo with Vision. When you deploy from your Azure AI project's Deployments page, select gpt-4 as the model name and vision-preview as the model version.

  • An Azure AI project in Azure AI Studio.

Start a chat session to analyze images or video

You need an image to complete the image quickstarts. You can use the following image or any other image you have available.

Photo of a car accident that can be used to complete the quickstart.

You need a video up to three minutes in length to complete the video quickstart.

In this chat session, you instruct the assistant to aid in understanding images that you input.

  1. Sign in to Azure AI Studio.

  2. Go to your project or create a new project in Azure AI Studio.

  3. Select Build from the top menu and then select Playground from the collapsible left menu.

  4. Make sure that Chat is selected from the Mode dropdown. Select your deployed GPT-4 Turbo with Vision model from the Deployment dropdown. Under the chat session text box, you should now see the option to select a file.

    Screenshot of the chat playground with mode and deployment highlighted.

  5. In the System message text box on the Assistant setup pane, provide this prompt to guide the assistant: "You're an AI assistant that helps people find information." You can tailor the prompt to the image or scenario that you're uploading.

  6. Select Apply changes to save your changes. When prompted to confirm that you want to update the system message, select Continue.

  7. In the chat session pane, select an image file and then select the right arrow icon to upload the image.

    Screenshot of the chat playground with the selected image highlighted.

  8. Enter the following question: "Describe this image", and then select the right arrow icon to send.

    Screenshot of the chat playground with the image and prompt selected.

  9. The square icon replaces the right arrow icon. If you select the square icon, the assistant stops processing your request. For this quickstart, let the assistant finish its reply. Don't select the square icon.

    Screenshot of the chat playground with the square stop button highlighted.

  10. The assistant should reply with a description of the image.

    Screenshot of the chat playground with the assistant's reply for basic image analysis.

  11. Ask a follow-up question related to the analysis of your image. Enter "What should I highlight about this image to my insurance company" and then select the right arrow icon to send.

  12. You should receive a relevant response similar to what's shown here:

    Screenshot of the chat playground with the assistant's follow-up reply for basic image analysis.
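
If you later want to reproduce this playground conversation from code, the following is a minimal sketch of how the same system message, question, and image could be assembled into a request body. It assumes the chat completions message format in which a user turn carries a text part and an `image_url` part; the playground may represent the conversation differently. The helper names `image_to_data_url` and `build_messages` are hypothetical and exist only for this illustration. The sketch makes no network call and doesn't depend on any Azure SDK.

```python
import base64
import json
import mimetypes


def image_to_data_url(image_bytes: bytes, filename: str) -> str:
    """Encode raw image bytes as a base64 data URL (hypothetical helper)."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Not a recognized image file name: {filename}")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_messages(system_message: str, question: str, image_url: str) -> list:
    """Build a system turn plus one user turn holding text and an image.

    Assumption: the target API accepts user content as a list of
    {"type": "text"} and {"type": "image_url"} parts.
    """
    return [
        {"role": "system", "content": system_message},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        },
    ]


if __name__ == "__main__":
    # Placeholder bytes stand in for a real image file read from disk.
    url = image_to_data_url(b"\x89PNG placeholder", "car-accident.png")
    messages = build_messages(
        "You're an AI assistant that helps people find information.",
        "Describe this image",
        url,
    )
    print(json.dumps(messages, indent=2))
```

A follow-up question such as the insurance prompt in step 11 would be appended as another user turn after the assistant's reply, so the model keeps the earlier image and description in context.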

At any point in the chat session, you can select the Show raw JSON option to see the conversation formatted as JSON. Here's what it looks like at the beginning of the quickstart chat session:

Screenshot of the chat session with show raw json selected.

[
	{
		"role": "system",
		"content": [
			"You are an AI assistant that helps people find information."
		]
	}
]
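
If you copy the raw JSON out of the playground, a short script can list each turn's role and text. The sketch below is illustrative only: it assumes every turn has a `role` and a `content` that is either a string or a list of strings, as in the system turn shown above. Turns that carry images may have a different shape that this sketch doesn't handle. The function name `summarize_conversation` is hypothetical.

```python
import json

# The system turn from the start of the quickstart chat session.
RAW_JSON = """
[
    {
        "role": "system",
        "content": [
            "You are an AI assistant that helps people find information."
        ]
    }
]
"""


def summarize_conversation(raw: str) -> list:
    """Return a (role, text) pair for each turn in the raw JSON text."""
    turns = json.loads(raw)
    summary = []
    for turn in turns:
        content = turn["content"]
        if isinstance(content, str):
            text = content
        else:
            # Join only the plain-string parts; other part types are skipped.
            text = " ".join(part for part in content if isinstance(part, str))
        summary.append((turn["role"], text))
    return summary


if __name__ == "__main__":
    for role, text in summarize_conversation(RAW_JSON):
        print(f"{role}: {text}")
```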

This has been a walkthrough of GPT-4 Turbo with Vision in the Azure AI Studio chat playground experience.

Clean up resources

To avoid incurring unnecessary Azure costs, you should delete the resources you created in this quickstart if they're no longer needed. To manage resources, you can use the Azure portal.

Next steps