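Work with chat completion models

Chat models are language models optimized for conversational interfaces. They behave differently from the older completion API models. Earlier models were text-in and text-out: they accepted a prompt string and returned a completion to append to the prompt. The latest models are conversation-in and message-out. They expect input in a specific chat-like transcript format and return a completion that represents a model-written message in the chat. This format was designed for multi-turn conversations, but it also works well for nonchat scenarios.

This article walks you through getting started with chat completion models. To get the best results, use the techniques described here. Don't interact with these models the way you did with the older model series, because doing so often produces verbose and less useful responses.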

Work with chat completion models

The following code snippet shows the most basic way to interact with models that use the Chat Completion API.

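Note

The Responses API uses the same chat style of interaction, but it supports the latest features that the older Chat Completion API doesn't support.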

The first example that follows authenticates with Microsoft Entra ID. The second uses an API key.

from openai import OpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = OpenAI(
    base_url="https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
    api_key=token_provider,
)

response = client.chat.completions.create(
    model="gpt-4o",  # Replace with your model deployment name.
    messages=[
        {"role": "system", "content": "Assistant is a large language model trained by OpenAI."},
        {"role": "user", "content": "Who were the founders of Microsoft?"}
    ]
)

#print(response)
print(response.model_dump_json(indent=2))
print(response.choices[0].message.content)

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    base_url="https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
)

response = client.chat.completions.create(
    model="gpt-4o",  # Replace with your model deployment name.
    messages=[
        {"role": "system", "content": "Assistant is a large language model trained by OpenAI."},
        {"role": "user", "content": "Who were the founders of Microsoft?"}
    ]
)

#print(response)
print(response.model_dump_json(indent=2))
print(response.choices[0].message.content)

{
  "id": "chatcmpl-8GHoQAJ3zN2DJYqOFiVysrMQJfe1P",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "Microsoft was founded by Bill Gates and Paul Allen. They established the company on April 4, 1975. Bill Gates served as the CEO of Microsoft until 2000 and later as Chairman and Chief Software Architect until his retirement in 2008, while Paul Allen left the company in 1983 but remained on the board of directors until 2000.",
        "role": "assistant",
        "function_call": null
      },
      "content_filter_results": {
        "hate": {
          "filtered": false,
          "severity": "safe"
        },
        "self_harm": {
          "filtered": false,
          "severity": "safe"
        },
        "sexual": {
          "filtered": false,
          "severity": "safe"
        },
        "violence": {
          "filtered": false,
          "severity": "safe"
        }
      }
    }
  ],
  "created": 1698892410,
  "model": "gpt-4o",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 73,
    "prompt_tokens": 29,
    "total_tokens": 102
  },
  "prompt_filter_results": [
    {
      "prompt_index": 0,
      "content_filter_results": {
        "hate": {
          "filtered": false,
          "severity": "safe"
        },
        "self_harm": {
          "filtered": false,
          "severity": "safe"
        },
        "sexual": {
          "filtered": false,
          "severity": "safe"
        },
        "violence": {
          "filtered": false,
          "severity": "safe"
        }
      }
    }
  ]
}
Microsoft was founded by Bill Gates and Paul Allen. They established the company on April 4, 1975. Bill Gates served as the CEO of Microsoft until 2000 and later as Chairman and Chief Software Architect until his retirement in 2008, while Paul Allen left the company in 1983 but remained on the board of directors until 2000.

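Every response includes finish_reason. The possible values for finish_reason are:

stop: The API returned the complete model output.
length: The model output is incomplete because of the max_tokens parameter or the token limit.
content_filter: Content was omitted because of a flag from the content filters.
null: The API response is still in progress or incomplete.

Consider setting max_tokens to a slightly higher value than normal. A higher value helps ensure that the model doesn't stop generating text before it reaches the end of the message.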
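
The finish_reason values listed above can be handled in code. The following is a minimal sketch; the helper names describe_finish_reason and is_truncated are hypothetical and aren't part of the OpenAI library.

```python
def describe_finish_reason(finish_reason):
    """Map a finish_reason value to a short, human-readable explanation."""
    explanations = {
        "stop": "The model returned its complete output.",
        "length": "The output was cut off by max_tokens or the token limit.",
        "content_filter": "Content was omitted because a content filter flagged it.",
        None: "The response is still in progress or incomplete.",
    }
    # Unknown values are reported rather than silently ignored.
    return explanations.get(
        finish_reason, f"Unrecognized finish_reason: {finish_reason!r}"
    )


def is_truncated(finish_reason):
    """Return True when retrying with a larger max_tokens might help."""
    return finish_reason == "length"


print(describe_finish_reason("length"))
```

With a real response object, you would pass response.choices[0].finish_reason to these helpers.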

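Work with the Chat Completion API

OpenAI trained chat completion models to accept input formatted as a conversation. The messages parameter takes an array of message objects that represent a conversation organized by role. When you use the Python API, you pass a list of dictionaries.

The format of a basic chat completion is: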

{"role": "system", "content": "Provide some context and/or instructions to the model"},
{"role": "user", "content": "The user's message goes here"}

A conversation with one example answer followed by a question looks like this:

{"role": "system", "content": "Provide some context and/or instructions to the model."},
{"role": "user", "content": "Example question goes here."},
{"role": "assistant", "content": "Example answer goes here."},
{"role": "user", "content": "First question/message for the model to actually respond to."}

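System role

The system role, also known as the system message, is included at the beginning of the array. This message provides the initial instructions to the model. You can provide various kinds of information in the system role, such as:

A brief description of the assistant.
Personality traits of the assistant.
Instructions or rules that you want the assistant to follow.
Data or information that the model needs, such as relevant questions from an FAQ.

You can customize the system role for your use case or include only basic instructions. The system role/message is optional, but we recommend that you include at least a basic one to get the best results.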
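
As an illustration of the kinds of information listed above, the following sketch assembles a system message from a description, a set of rules, and optional reference data. The function build_system_message and the "Instructions:" and "Context:" labels are assumptions chosen to mirror the examples in this article, not a required format.

```python
def build_system_message(description, rules=(), reference_data=()):
    """Assemble a system message dict from a description, rules, and data."""
    parts = [description.strip()]
    if rules:
        parts.append("Instructions:")
        parts.extend(f"- {rule}" for rule in rules)
    if reference_data:
        parts.append("Context:")
        parts.extend(f"- {item}" for item in reference_data)
    return {"role": "system", "content": "\n".join(parts)}


system_message = build_system_message(
    "Assistant is an intelligent chatbot designed to help users answer their tax related questions.",
    rules=["Only answer questions related to taxes."],
)
print(system_message["content"])
```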

Messages

After the system role, you can include a series of messages between the user and the assistant.

 {"role": "user", "content": "What is thermodynamics?"}

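To trigger a response from the model, end with a user message to indicate that it's the assistant's turn to respond. You can also include a series of example messages between the user and the assistant as a way to do few-shot learning.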
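
A quick structural check can catch a messages list that doesn't end with a user message before you send it. This is a sketch only: check_messages is a hypothetical helper, and VALID_ROLES covers just the three roles discussed in this article, although the API accepts other roles too.

```python
# Assumption: only the roles discussed in this article are considered valid.
VALID_ROLES = {"system", "user", "assistant"}


def check_messages(messages):
    """Return a list of problems found in a chat messages list."""
    problems = []
    for index, message in enumerate(messages):
        role = message.get("role")
        if role not in VALID_ROLES:
            problems.append(f"message {index}: unexpected role {role!r}")
        if role == "system" and index != 0:
            problems.append(f"message {index}: system message is not first")
    # The model responds when the transcript ends with a user message.
    if not messages or messages[-1].get("role") != "user":
        problems.append("last message is not from the user")
    return problems


print(check_messages([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is thermodynamics?"},
]))
```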

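Message prompt examples

The following section shows examples of different prompt styles that you can use with chat completion models. These examples are only a starting point. You can experiment with different prompts to customize the behavior for your own use cases.

Basic example

If you want the chat completion model to behave similarly to chatgpt.com, you can use a basic system message such as Assistant is a large language model trained by OpenAI.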

{"role": "system", "content": "Assistant is a large language model trained by OpenAI."},
{"role": "user", "content": "Who were the founders of Microsoft?"}

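Example with instructions

For some scenarios, you might want to give the model more instructions to define guardrails for what it's able to do.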

{"role": "system", "content": "Assistant is an intelligent chatbot designed to help users answer their tax related questions.
Instructions: 
- Only answer questions related to taxes. 
- If you're unsure of an answer, you can say "I don't know" or "I'm not sure" and recommend users go to the IRS website for more information. "},
{"role": "user", "content": "When are my taxes due?"}

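Use data for grounding

You can also include relevant data or information in the system message to give the model extra context for the conversation. If you need to include only a small amount of information, you can hard code it in the system message. If you have a large amount of data that the model should be aware of, you can use embeddings or a product like Azure AI Search to retrieve the most relevant information at query time.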

{"role": "system", "content": "Assistant is an intelligent chatbot designed to help users answer technical questions about Azure OpenAI in Microsoft Foundry Models. Only answer questions using the context below and if you're not sure of an answer, you can say 'I don't know'.

Context:
- Azure OpenAI provides REST API access to OpenAI's powerful language models including the GPT-3, Codex and Embeddings model series.
- Azure OpenAI gives customers advanced language AI with OpenAI GPT-3, Codex, and DALL-E models with the security and enterprise promise of Azure. Azure OpenAI co-develops the APIs with OpenAI, ensuring compatibility and a smooth transition from one to the other.
- At Microsoft, we're committed to the advancement of AI driven by principles that put people first. Microsoft has made significant investments to help guard against abuse and unintended harm, which includes requiring applicants to show well-defined use cases, incorporating Microsoft’s principles for responsible AI use."
},
{"role": "user", "content": "What is Azure OpenAI?"}
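
The idea of retrieving only the most relevant information at query time can be sketched without any external service. The example below ranks snippets by simple word overlap with the question, which is a stand-in for embeddings or Azure AI Search and not how those products work. The function names are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def top_snippets(question, snippets, k=2):
    """Rank snippets by how many words they share with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        snippets,
        key=lambda snippet: len(question_words & tokenize(snippet)),
        reverse=True,
    )
    return ranked[:k]


def grounded_messages(question, snippets, k=2):
    """Build a messages list whose system message carries the top snippets."""
    context = "\n".join(f"- {s}" for s in top_snippets(question, snippets, k))
    system = (
        "Only answer questions using the context below. "
        "If you're not sure of an answer, say 'I don't know'.\n\n"
        f"Context:\n{context}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]
```

Keyword overlap is crude, so for real workloads you would likely swap top_snippets for a proper retrieval step while keeping the same message layout.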

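Few-shot learning with chat completion

You can also give few-shot examples to the model. The approach to few-shot learning has changed slightly because of the new prompt format. You can now include a series of messages between the user and the assistant in the prompt as few-shot examples. By using these examples, you can seed answers to common questions to prime the model or teach it particular behaviors.

This example shows how you can use few-shot learning with GPT-35-Turbo and GPT-4. You can experiment with different approaches to see what works best for your use case.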

{"role": "system", "content": "Assistant is an intelligent chatbot designed to help users answer their tax related questions. "},
{"role": "user", "content": "When do I need to file my taxes by?"},
{"role": "assistant", "content": "In 2023, you will need to file your taxes by April 18th. The date falls after the usual April 15th deadline because April 15th falls on a Saturday in 2023. For more details, see https://www.irs.gov/filing/individuals/when-to-file."},
{"role": "user", "content": "How can I check the status of my tax refund?"},
{"role": "assistant", "content": "You can check the status of your tax refund by visiting https://www.irs.gov/refunds"}
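
When the examples live in a list of question and answer pairs, a small helper can expand them into the alternating user and assistant messages shown above and then append the real question. This is a sketch; few_shot_messages is a hypothetical name.

```python
def few_shot_messages(system_prompt, examples, question):
    """Build a messages list from (question, answer) example pairs."""
    messages = [{"role": "system", "content": system_prompt}]
    for example_question, example_answer in examples:
        messages.append({"role": "user", "content": example_question})
        messages.append({"role": "assistant", "content": example_answer})
    # End with the user message that the model should actually answer.
    messages.append({"role": "user", "content": question})
    return messages


messages = few_shot_messages(
    "Assistant is an intelligent chatbot designed to help users answer their tax related questions.",
    [("How can I check the status of my tax refund?",
      "You can check the status of your tax refund by visiting https://www.irs.gov/refunds")],
    "When are my taxes due?",
)
print([m["role"] for m in messages])
```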

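Use chat completion for nonchat scenarios

The Chat Completion API is designed for multi-turn conversations, but it also works well for nonchat scenarios.

For example, for an entity extraction scenario, you might use the following prompt: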

{"role": "system", "content": "You are an assistant designed to extract entities from text. Users will paste in a string of text and you will respond with entities you've extracted from the text as a JSON object. Here's an example of your output format:
{
   "name": "",
   "company": "",
   "phone_number": ""
}"},
{"role": "user", "content": "Hello. My name is Robert Smith. I'm calling from Contoso Insurance, Delaware. My colleague mentioned that you are interested in learning about our comprehensive benefits policy. Could you give me a call back at (555) 346-9322 when you get a chance so we can go over the benefits?"}
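
Because the prompt asks for a JSON object, the reply can be parsed rather than read as free text. The sketch below assumes the model returns bare JSON with the three keys from the prompt. Models don't always comply, so the hypothetical helper parse_entities returns None when parsing fails.

```python
import json

# Keys taken from the output format in the system message above.
EXPECTED_KEYS = ("name", "company", "phone_number")


def parse_entities(reply_text):
    """Parse the model reply as JSON and keep only the expected keys."""
    try:
        data = json.loads(reply_text)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    # Missing keys default to an empty string, matching the prompt's template.
    return {key: data.get(key, "") for key in EXPECTED_KEYS}
```

With a real response, you would pass response.choices[0].message.content to parse_entities.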

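Create a basic conversation loop

The examples so far show the basic mechanics of interacting with the Chat Completion API. This example shows how to create a conversation loop that performs the following actions:

Continuously takes console input and formats it as user role content in the messages list.
Prints each response to the console, then formats it and adds it to the messages list as assistant role content.

Every time a new question is asked, a running transcript of the conversation so far is sent along with the latest question. Because the model has no memory, you must send an updated transcript with each new question. Otherwise, the model loses the context of the previous questions and answers.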

The first example that follows authenticates with Microsoft Entra ID. The second uses an API key.

from openai import OpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

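client = OpenAI(
    base_url="https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
    api_key=token_provider,
)

# The system message stays at index 0 for the whole conversation.
conversation = [{"role": "system", "content": "You are a helpful assistant."}]

while True:
    # Read the next question and add it to the transcript as a user message.
    user_input = input("Q:")
    conversation.append({"role": "user", "content": user_input})

    # Send the full transcript so the model sees the earlier turns.
    response = client.chat.completions.create(
        model="gpt-4o",  # Replace with your model deployment name.
        messages=conversation
    )

    # Store the reply as an assistant message, then print it.
    conversation.append({"role": "assistant", "content": response.choices[0].message.content})
    print("\n" + response.choices[0].message.content + "\n")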

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    base_url="https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
)

# The system message stays at index 0 for the whole conversation.
conversation = [{"role": "system", "content": "You are a helpful assistant."}]

while True:
    # Read the next question and add it to the transcript as a user message.
    user_input = input("Q:")
    conversation.append({"role": "user", "content": user_input})

    # Send the full transcript so the model sees the earlier turns.
    response = client.chat.completions.create(
        model="gpt-4o",  # Replace with your model deployment name.
        messages=conversation
    )

    # Store the reply as an assistant message, then print it.
    conversation.append({"role": "assistant", "content": response.choices[0].message.content})
    print("\n" + response.choices[0].message.content + "\n")

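When you run the preceding code, you get a blank console window. Enter your first question in the window and then press Enter. After the response is returned, you can repeat the process and keep asking questions.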
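
The loop above runs until you interrupt it. One possible variation, sketched here, adds an exit command and takes the model call as a parameter so the loop logic can be exercised without a network connection. The names run_turn, chat_loop, and complete are hypothetical, and the exit words are an assumption.

```python
def run_turn(conversation, user_input, complete):
    """Append the user message, call the model, and record the reply."""
    conversation.append({"role": "user", "content": user_input})
    reply = complete(conversation)
    conversation.append({"role": "assistant", "content": reply})
    return reply


def chat_loop(complete, read_input=input, write=print, exit_words=("exit", "quit")):
    """Run a console chat until the user types one of the exit words."""
    conversation = [{"role": "system", "content": "You are a helpful assistant."}]
    while True:
        user_input = read_input("Q:")
        if user_input.strip().lower() in exit_words:
            break
        write("\n" + run_turn(conversation, user_input, complete) + "\n")
    return conversation
```

With the client from the preceding examples, complete could be a function that calls client.chat.completions.create with the conversation and returns response.choices[0].message.content.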

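Manage conversations

The previous example runs until you hit the model's token limit. With each question asked and answer received, the messages list grows. The token limit for chat completion models varies across models and versions. The token limits for gpt-4 and gpt-4-32k are 8,192 and 32,768, respectively. These limits include the token count from both the messages list that you send and the model response. The number of tokens in the messages list combined with the value of the max_tokens parameter must stay under these limits, or you receive an error. Consult the models page for each model's token limit and context window.

It's your responsibility to ensure that the prompt and completion fall within the token limit. For longer conversations, you need to keep track of the token count and send the model only a prompt that falls within the limit. Alternatively, with the Responses API, you can have the API handle truncation and management of the conversation history for you.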
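
The budget rule described above is simple arithmetic: the prompt tokens plus max_tokens must stay under the model's limit. The following sketch uses a strict comparison to match the wording "stay under". The function name fits_token_limit is hypothetical, and the prompt sizes are made-up numbers for illustration.

```python
def fits_token_limit(prompt_tokens, max_tokens, token_limit):
    """Return True when the prompt plus the reserved response stays under the limit."""
    return prompt_tokens + max_tokens < token_limit


# gpt-4 limit from this article: 8,192 tokens.
print(fits_token_limit(prompt_tokens=7900, max_tokens=250, token_limit=8192))
print(fits_token_limit(prompt_tokens=8000, max_tokens=250, token_limit=8192))
```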

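The following code sample shows a simple chat loop with a technique for staying within a 4,096-token limit by using OpenAI's tiktoken library.

You might need to upgrade your version of tiktoken by running pip install tiktoken --upgrade.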

The first example that follows authenticates with Microsoft Entra ID. The second uses an API key.

import tiktoken
from openai import OpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = OpenAI(
    base_url="https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
    api_key=token_provider,
)

system_message = {"role": "system", "content": "You are a helpful assistant."}
max_response_tokens = 250
token_limit = 4096
conversation = []
conversation.append(system_message)

def num_tokens_from_messages(messages, model="gpt-4o"):
    """Return the number of tokens used by a list of messages."""
    try:
        encoding = tiktoken.encoding_for_model(model)
    except KeyError:
        print("Warning: model not found. Using o200k_base encoding.")
        encoding = tiktoken.get_encoding("o200k_base")
    
    if model in {
        "gpt-4o",
        "gpt-4o-mini",
        "gpt-5",
        "gpt-4.1",
        "o1",
        "o1-mini",
        "o3",
        "o3-mini",
        "o4-mini",
    }:
        tokens_per_message = 3
        tokens_per_name = 1

    elif any(model.startswith(prefix) for prefix in [
        "gpt-4o-", 
        "gpt-5-", 
        "gpt-4.1-",
        "o1-", 
        "o3-", 
        "o4-mini-",
    ]):
        tokens_per_message = 3
        tokens_per_name = 1
    else:
        raise NotImplementedError(
            f"""num_tokens_from_messages() is not implemented for model {model}. """
        )
    
    num_tokens = 0
    for message in messages:
        num_tokens += tokens_per_message
        for key, value in message.items():
            num_tokens += len(encoding.encode(value))
            if key == "name":
                num_tokens += tokens_per_name
    num_tokens += 3  
    return num_tokens


while True:
    user_input = input("Q:")      
    conversation.append({"role": "user", "content": user_input})
    conv_history_tokens = num_tokens_from_messages(conversation, model="gpt-4o")

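    # Drop the oldest non-system messages until the request fits the budget.
    # The length check avoids an IndexError when only the system message remains.
    while conv_history_tokens + max_response_tokens >= token_limit and len(conversation) > 1:
        del conversation[1]
        conv_history_tokens = num_tokens_from_messages(conversation, model="gpt-4o")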

    response = client.chat.completions.create(
        model="gpt-4o",  
        messages=conversation,
        temperature=0.7,
        max_tokens=max_response_tokens
    )

    conversation.append({"role": "assistant", "content": response.choices[0].message.content})
    print("\n" + response.choices[0].message.content + "\n")

import tiktoken
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    base_url="https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
)

system_message = {"role": "system", "content": "You are a helpful assistant."}
max_response_tokens = 250
token_limit = 4096
conversation = []
conversation.append(system_message)

def num_tokens_from_messages(messages, model="gpt-4o"):
    """Return the number of tokens used by a list of messages."""
    try:
        encoding = tiktoken.encoding_for_model(model)
    except KeyError:
        print("Warning: model not found. Using o200k_base encoding.")
        encoding = tiktoken.get_encoding("o200k_base")
    
    if model in {
        "gpt-4o",
        "gpt-4o-mini",
        "gpt-5",
        "gpt-4.1",
        "o1",
        "o1-mini",
        "o3",
        "o3-mini",
        "o4-mini",
    }:
        tokens_per_message = 3
        tokens_per_name = 1

    elif any(model.startswith(prefix) for prefix in [
        "gpt-4o-", 
        "gpt-5-", 
        "gpt-4.1-",
        "o1-", 
        "o3-", 
        "o4-mini-",
    ]):
        tokens_per_message = 3
        tokens_per_name = 1
    else:
        raise NotImplementedError(
            f"""num_tokens_from_messages() is not implemented for model {model}. """
        )
    
    num_tokens = 0
    for message in messages:
        num_tokens += tokens_per_message
        for key, value in message.items():
            num_tokens += len(encoding.encode(value))
            if key == "name":
                num_tokens += tokens_per_name
    num_tokens += 3  
    return num_tokens


while True:
    user_input = input("Q:")      
    conversation.append({"role": "user", "content": user_input})
    conv_history_tokens = num_tokens_from_messages(conversation, model="gpt-4o")

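    # Drop the oldest non-system messages until the request fits the budget.
    # The length check avoids an IndexError when only the system message remains.
    while conv_history_tokens + max_response_tokens >= token_limit and len(conversation) > 1:
        del conversation[1]
        conv_history_tokens = num_tokens_from_messages(conversation, model="gpt-4o")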

    response = client.chat.completions.create(
        model="gpt-4o",  
        messages=conversation,
        temperature=0.7,
        max_tokens=max_response_tokens
    )

    conversation.append({"role": "assistant", "content": response.choices[0].message.content})
    print("\n" + response.choices[0].message.content + "\n")

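In this example, after the token count is reached, the oldest messages in the conversation transcript are removed. For efficiency, del is used instead of pop(). Removal starts at index 1 so that the system message is always preserved and only user or assistant messages are removed. Over time, this method of managing the conversation can degrade conversation quality, because the model gradually loses the context of the earlier parts of the conversation.

An alternative approach is to limit the conversation duration to the maximum token length or to a specific number of turns. After the maximum token limit is reached, the model loses context if you allow the conversation to continue. You can prompt the user to begin a new conversation and clear the messages list so that the new conversation starts with the full token limit available.

The token counting portion of the code shown previously is a simplified version of one of OpenAI's cookbook examples.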
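
The turn-limit alternative can be sketched as follows. A turn is counted here as one user message, which is an assumption, and the function names count_turns and reset_if_too_long are hypothetical.

```python
def count_turns(conversation):
    """Count turns as the number of user messages in the transcript."""
    return sum(1 for message in conversation if message["role"] == "user")


def reset_if_too_long(conversation, max_turns):
    """Return (conversation, was_reset); a reset keeps only system messages."""
    if count_turns(conversation) < max_turns:
        return conversation, False
    system_messages = [m for m in conversation if m["role"] == "system"]
    return system_messages, True


history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "First question"},
    {"role": "assistant", "content": "First answer"},
    {"role": "user", "content": "Second question"},
    {"role": "assistant", "content": "Second answer"},
]
history, was_reset = reset_if_too_long(history, max_turns=2)
if was_reset:
    print("Turn limit reached. Starting a new conversation.")
```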

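Troubleshooting

Failed to create completion as the model generated invalid Unicode output

Error code	Error message	Workaround
500	500 - InternalServerError: Error code: 500 - {'message': 'Failed to create completion as the model generated invalid Unicode output'}.	You can minimize the occurrence of these errors by reducing the temperature of your prompts to less than 1 and by using a client with retry logic. Reattempting the request often results in a successful response.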
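
Retry logic of the kind mentioned in the workaround can be sketched as a small wrapper with exponential backoff. The helper call_with_retries is hypothetical. It catches every exception for brevity, which is an assumption, and in practice you would likely narrow it to the server error type that your client raises.

```python
import time


def call_with_retries(operation, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call operation(), retrying with exponential backoff when it raises."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            # Give up after the last attempt and surface the original error.
            if attempt == max_attempts:
                raise
            # Wait base_delay, then 2x, then 4x, and so on.
            sleep(base_delay * 2 ** (attempt - 1))
```

To use it with the earlier examples, wrap the request in a function with no arguments, such as a lambda that calls client.chat.completions.create, and pass that function as operation.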

Last updated on 2025-11-26
