Quickstart: Get started using GPT-4 Turbo with Vision on your images and videos in Azure AI Studio

Article
05/21/2024

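Important

Some of the features described in this article might only be available in preview. This preview is provided without a service-level agreement, and we don't recommend it for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

Use this article to get started using Azure AI Studio to deploy and test the GPT-4 Turbo with Vision model.

GPT-4 Turbo with Vision and Azure AI Vision offer advanced functionality, including:

Optical character recognition (OCR): Extracts text from images and combines it with the user's prompt and image to expand the context.
Object grounding: Complements the GPT-4 Turbo with Vision text response with object grounding and outlines the salient objects in the input image.
Video prompts: GPT-4 Turbo with Vision answers questions by retrieving the video frames most relevant to the user's prompt.

Extra usage fees might apply when you use GPT-4 Turbo with Vision together with Azure AI Vision functionality.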

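Prerequisites

An Azure subscription - Create one for free.
Access granted to Azure OpenAI in the desired Azure subscription. Currently, access to this service is granted only by application. You can apply for access to Azure OpenAI by completing the form at https://aka.ms/oai/access. If you run into a problem, open an issue on this repo to contact us.
Once you have your Azure subscription, create an Azure OpenAI resource.
An AI Studio hub with your Azure OpenAI resource added as a connection.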

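Prepare your media

You need an image to complete the image quickstarts. You can use this sample image or any other image you have available.

For video prompts, you need a video that's under three minutes in length.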
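The playground uploads media for you, so no code is needed for the steps in this quickstart. If you later send the same image through the API, one common approach is to embed it as a base64 data URL. The following is a minimal sketch of that encoding step only. The function name `image_to_data_url` is hypothetical, and whether your deployment accepts data URLs is an assumption to verify against the API reference.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(path: str) -> str:
    """Return the image file at `path` as a base64 data URL.

    Hypothetical helper for illustration; not part of any Azure SDK.
    """
    # Guess the MIME type from the file extension (for example, image/png).
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"{path} does not look like an image file")

    # Read the raw bytes and encode them as ASCII-safe base64 text.
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

The helper only checks the file extension, not the file contents, so it's a convenience check rather than real validation.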

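Deploy a GPT-4 Turbo with Vision model

Sign in to Azure AI Studio and select the hub you want to work in.
On the left navigation menu, select [AI Services]. Select the [Try out GPT-4 Turbo] panel.
On the gpt-4 page, select [Deploy]. In the window that appears, select your Azure OpenAI resource. Select vision-preview as the model version.
Select [Deploy].
Next, go to your new model's page and select [Open in playground]. In the chat playground, make sure the GPT-4 deployment you created is selected in the [Deployment] dropdown.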

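In this chat session, you instruct the assistant to help you understand images that you input.

In the [System message] text box on the [System message] tab, provide this prompt to guide the assistant: "You're an AI assistant that helps people find information." You can tailor the prompt to your image or scenario.
Select [Apply changes] to save your changes.
In the chat session pane, select the [Attachment] button and then [Upload image]. Choose your image.
Add the following question in the chat field: "Describe this image", and then select the right-arrow icon to send it.
The right-arrow icon is replaced by a [Stop] button. If you select it, the assistant stops processing your request. For this quickstart, let the assistant finish its reply.
The assistant replies with a description of the image.
Ask a follow-up question related to the analysis of your image. For example, you could enter: "What should I highlight about this image to my insurance company?"

You should receive a relevant response similar to what's shown here: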

When reporting the incident to your insurance company, you should highlight the following key points from the image:  

1. **Location of Damage**: Clearly state that the front end of the car, particularly the driver's side, is damaged. Mention the crumpled hood, broken front bumper, and the damaged left headlight.  

2. **Point of Impact**: Indicate that the car has collided with a guardrail, which may suggest that no other vehicles were involved in the accident.  

3. **Condition of the Car**: Note that the damage seems to be concentrated on the front end, and there is no visible damage to the windshield or rear of the car from this perspective.  

4. **License Plate Visibility**: Mention that the license plate is intact and can be used for identification purposes.  

5. **Environment**: Report that the accident occurred near a roadside with a guardrail, possibly in a rural or semi-rural area, which might help in establishing the accident location and context.  

6. **Other Observations**: If there were any other circumstances or details not visible in the image that may have contributed to the accident, such as weather conditions, road conditions, or any other relevant information, be sure to include those as well.  

Remember to be factual and descriptive, avoiding speculation about the cause of the accident, as the insurance company will conduct its own investigation.
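For readers who want to reproduce this exchange outside the playground, the conversation can be pictured as a list of chat messages. The sketch below only builds that list and sends nothing. It assumes the OpenAI-style chat completions layout for image input (a `text` part plus an `image_url` part). The function names and the example URL are hypothetical, so check the message format against the API reference for your deployment.

```python
import json

SYSTEM_PROMPT = "You're an AI assistant that helps people find information."


def build_image_chat(image_url: str, question: str) -> list[dict]:
    """Build the opening messages: a system prompt and one image question."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                # Assumed layout for image input; verify for your API version.
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        },
    ]


def add_follow_up(messages: list[dict], assistant_reply: str, question: str) -> list[dict]:
    """Return a new list with the assistant's reply and a follow-up question."""
    return messages + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": question},
    ]


if __name__ == "__main__":
    # Placeholder URL; replace it with your own image location.
    chat = build_image_chat("https://example.com/car.jpg", "Describe this image")
    chat = add_follow_up(
        chat,
        "(assistant's description of the image)",
        "What should I highlight about this image to my insurance company?",
    )
    print(json.dumps(chat, indent=2))
```

Because `add_follow_up` returns a new list, the earlier turns stay unchanged, which mirrors how the playground keeps the full history for each follow-up question.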

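In this chat session, you instruct the assistant to help you understand images that you input. Try out the capabilities of the augmented vision model.

In the [Enhancements] pane on the left side of the chat window, turn on the [Vision] option. In the window that appears, select your Azure Computer Vision resource.
In the [System message] text box on the [System message] tab, provide this prompt to guide the assistant: "You're an AI assistant that helps people find information." You can tailor the prompt to your image or scenario. Select [Apply changes] to save your changes.
In the chat session pane, select the [Attachment] button and then [Upload image]. Choose your image.
Add the following question in the chat field: "Describe this image", and then select the right-arrow icon to send it.
The right-arrow icon is replaced by a [Stop] button. If you select it, the assistant stops processing your request. For this quickstart, let the assistant finish its reply.
The assistant replies with a description of the image. It uses the Azure AI Vision service to extract more details from the image you uploaded.
Ask a follow-up question related to the analysis of your image. Enter "What should I highlight about this image to my insurance company?" and then select the right-arrow icon to send it.

You should receive a relevant response similar to what's shown here: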

When reporting the incident to your insurance company, you should highlight the following key points from the image:  

1. **Location of Damage**: Clearly state that the front end of the car, particularly the driver's side, is damaged. Mention the crumpled hood, broken front bumper, and the damaged left headlight.  

2. **Point of Impact**: Indicate that the car has collided with a guardrail, which may suggest that no other vehicles were involved in the accident.  

3. **Condition of the Car**: Note that the damage seems to be concentrated on the front end, and there is no visible damage to the windshield or rear of the car from this perspective.  

4. **License Plate Visibility**: Mention that the license plate is intact and can be used for identification purposes.  

5. **Environment**: Report that the accident occurred near a roadside with a guardrail, possibly in a rural or semi-rural area, which might help in establishing the accident location and context.  

6. **Other Observations**: If there were any other circumstances or details not visible in the image that may have contributed to the accident, such as weather conditions, road conditions, or any other relevant information, be sure to include those as well.  

Remember to be factual and descriptive, avoiding speculation about the cause of the accident, as the insurance company will conduct its own investigation.

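In this chat session, you instruct the assistant to help you understand videos that you input. The assistant extracts several frames from the video and uses them to answer your questions.

In the [Enhancements] pane on the left side of the chat window, turn on the [Vision] option. In the window that appears, select your Azure Computer Vision resource.
In the [System message] text box on the [System message] tab, provide this prompt to guide the assistant: "You're an AI assistant that helps people find information." You can tailor the prompt to your video or scenario.
Select [Apply changes] to save your changes.
In the chat session pane, select the [Attachment] button and then [Upload video]. Choose your video.
Enter a text prompt, such as "Provide details about this video", and then select the right-arrow icon to send it.
The right-arrow icon is replaced by a [Stop] button. If you select it, the assistant stops processing your request. For this quickstart, let the assistant finish its reply.
The assistant should reply with a description of the video.
Feel free to ask any follow-up questions related to the analysis of your video.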

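Limitations

The following are known limitations of the video prompt enhancement.

Low resolution: Frames are analyzed with the "low resolution" setting of GPT-4 Turbo with Vision, which might reduce the accuracy of small object and text recognition in the video.
Video file limits: MP4 and MOV file types are supported. In the Azure AI Studio playground, videos must be less than 3 minutes long. This length limit doesn't apply when you use the API.
Prompt limits: A video prompt contains only one video and no images. In the playground, you can clear the session to try another video or image.
Limited frame selection: Currently, the system selects 20 frames from the entire video, which might not capture all the critical moments or details. Depending on the prompt, frame selection is either spread evenly through the video or focused by a specific video retrieval query.
Language support: Currently, the system primarily supports English for grounding with transcripts. Transcripts don't provide accurate information about song lyrics.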
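Two of these limits are easy to check before you upload a file, and the frame budget is easier to reason about with numbers. The sketch below is an illustration only. The function names are hypothetical, the duration is assumed to be known already, and the evenly spaced timestamps are a simple model of "spread evenly", not the service's actual selection logic.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".mp4", ".mov"}  # file types listed above
PLAYGROUND_MAX_SECONDS = 180             # playground limit: less than 3 minutes
FRAME_COUNT = 20                         # frames selected per video


def check_playground_video(filename: str, duration_seconds: float) -> list[str]:
    """Return a list of problems; an empty list means the file passes both checks."""
    problems = []
    if Path(filename).suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append("unsupported file type")
    if duration_seconds >= PLAYGROUND_MAX_SECONDS:
        problems.append("video is not shorter than 3 minutes")
    return problems


def evenly_spaced_timestamps(duration_seconds: float, count: int = FRAME_COUNT) -> list[float]:
    """Return the midpoints of `count` equal segments of the video, in seconds.

    Illustrative model only; the real frame selection might differ.
    """
    step = duration_seconds / count
    return [round(step * (index + 0.5), 3) for index in range(count)]
```

Under this model, a 100-second video is sampled about once every 5 seconds, so an event shorter than that can fall between selected frames.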

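View and export code

At any point in the chat session, you can turn on the [Show raw JSON] switch at the top of the chat window to see the conversation formatted as JSON. Here's what it looks like at the beginning of the quickstart chat session: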

[
	{
		"role": "system",
		"content": [
			"You are an AI assistant that helps people find information."
		]
	}
]
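If you copy the raw JSON out of the playground, a few lines of Python can turn it into a readable list of roles and text. This is a minimal sketch under the assumption that each message has a `role` and a `content` that is either a string or a list of strings, as in the sample above. Later messages in a real session might contain other shapes, which this sketch ignores. The function name is hypothetical.

```python
import json

# Same conversation as the sample above, written without a trailing comma,
# because json.loads rejects a comma after the last array element.
RAW_JSON = """
[
    {
        "role": "system",
        "content": [
            "You are an AI assistant that helps people find information."
        ]
    }
]
"""


def summarize_conversation(raw: str) -> list[tuple[str, str]]:
    """Return (role, text) pairs from a raw JSON conversation."""
    summary = []
    for message in json.loads(raw):
        content = message["content"]
        if isinstance(content, str):
            parts = [content]
        else:
            # Keep only plain-string parts; other part types are skipped.
            parts = [part for part in content if isinstance(part, str)]
        summary.append((message["role"], " ".join(parts)))
    return summary


if __name__ == "__main__":
    for role, text in summarize_conversation(RAW_JSON):
        print(f"{role}: {text}")
```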

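Clean up resources

To avoid incurring unnecessary Azure costs, delete the resources you created in this quickstart if you no longer need them. You can manage resources in the Azure portal.

Next steps

Create a project
Learn more about Azure AI Vision.
Learn more about Azure OpenAI models.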
