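Query a deployed Mosaic AI agent

Learn how to send requests to agents deployed to a Model Serving endpoint. Databricks provides multiple query methods to fit different use cases and integration needs.

Select the query method that best fits your use case: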

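Method	Key benefits
Databricks OpenAI Client (recommended)	Native integration, full feature support, streaming capabilities
MLflow deployments client	Existing MLflow patterns, established ML pipelines
REST API	OpenAI-compatible, language-agnostic, works with existing tools
AI Functions: `ai_query`	OpenAI-compatible, works with existing tools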

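Databricks recommends the Databricks OpenAI Client for new applications. Choose the REST API when integrating with platforms that expect OpenAI-compatible endpoints.

Databricks OpenAI Client (recommended)

Databricks recommends that you use the Databricks OpenAI Client to query a deployed agent. Depending on the API of your deployed agent, you use either the responses client or the chat completions client.

ResponsesAgent endpoints

Use the following examples for agents created with the ResponsesAgent interface, which is the recommended approach for building agents.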

from databricks.sdk import WorkspaceClient

input_msgs = [{"role": "user", "content": "What does Databricks do?"}]
endpoint = "<agent-endpoint-name>" # TODO: update this with your endpoint name

w = WorkspaceClient()
client = w.serving_endpoints.get_open_ai_client()

## Run for non-streaming responses. Invokes `predict`
response = client.responses.create(model=endpoint, input=input_msgs)
print(response)

## Include stream=True for streaming responses. Invokes `predict_stream`
streaming_response = client.responses.create(model=endpoint, input=input_msgs, stream=True)
for chunk in streaming_response:
  print(chunk)
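
The streaming loop above prints each raw event. As a minimal sketch, assuming the endpoint emits OpenAI Responses-style streaming events in which text fragments arrive as events of type `response.output_text.delta` carrying a `delta` string, the hypothetical helper `collect_output_text` below joins those fragments into the full reply. Check the event types your agent actually emits before relying on it.

```python
from typing import Any, Iterable


def collect_output_text(events: Iterable[Any]) -> str:
    """Join the text deltas from a Responses-style event stream.

    Assumption: text fragments arrive as events whose `type` is
    "response.output_text.delta" and whose `delta` attribute is a string.
    Events of any other type are ignored.
    """
    parts = []
    for event in events:
        if getattr(event, "type", None) == "response.output_text.delta":
            parts.append(event.delta)
    return "".join(parts)


# Usage with the client defined earlier (requires a live endpoint):
# stream = client.responses.create(model=endpoint, input=input_msgs, stream=True)
# print(collect_output_text(stream))
```

Because the helper only reads `type` and `delta`, it can be exercised with stand-in objects before pointing it at a real endpoint.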

To pass custom_inputs or databricks_options, add them with the extra_body parameter:

streaming_response = client.responses.create(
    model=endpoint,
    input=input_msgs,
    stream=True,
    extra_body={
        "custom_inputs": {"id": 5},
        "databricks_options": {"return_trace": True},
    },
)
for chunk in streaming_response:
    print(chunk)
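
When only some of these fields are needed, the `extra_body` dictionary can be assembled conditionally. The following is a small sketch; `build_extra_body` is a hypothetical helper name, and the only keys it produces are the `custom_inputs` and `databricks_options` keys shown above.

```python
from typing import Any, Dict, Optional


def build_extra_body(
    custom_inputs: Optional[Dict[str, Any]] = None,
    return_trace: bool = False,
) -> Dict[str, Any]:
    """Build the extra_body dict, leaving out fields that are not needed."""
    extra_body: Dict[str, Any] = {}
    if custom_inputs:
        extra_body["custom_inputs"] = custom_inputs
    if return_trace:
        extra_body["databricks_options"] = {"return_trace": True}
    return extra_body


# Usage with the client defined earlier (requires a live endpoint):
# response = client.responses.create(
#     model=endpoint,
#     input=input_msgs,
#     extra_body=build_extra_body(custom_inputs={"id": 5}, return_trace=True),
# )
```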

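ChatAgent or ChatModel endpoints

Use the following examples for agents created with the legacy ChatAgent or ChatModel interfaces, which are still supported but not recommended for new agents.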

from databricks.sdk import WorkspaceClient

messages = [{"role": "user", "content": "What does Databricks do?"}]
endpoint = "<agent-endpoint-name>" # TODO: update this with your endpoint name

w = WorkspaceClient()
client = w.serving_endpoints.get_open_ai_client()

## Run for non-streaming responses. Invokes `predict`
response = client.chat.completions.create(model=endpoint, messages=messages)
print(response)

## Include stream=True for streaming responses. Invokes `predict_stream`
streaming_response = client.chat.completions.create(model=endpoint, messages=messages, stream=True)
for chunk in streaming_response:
  print(chunk)
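
The loop above prints whole chunk objects. As a sketch, assuming the chunks follow the OpenAI chat completions streaming shape (each chunk has `choices`, and `choices[0].delta.content` holds a text fragment or `None`), the hypothetical helper `collect_chat_content` below assembles the full message text.

```python
from typing import Any, Iterable


def collect_chat_content(chunks: Iterable[Any]) -> str:
    """Join the content fragments from a chat-completions-style stream.

    Assumption: each chunk exposes `choices[0].delta.content`, which is
    either a string fragment or None. Chunks without choices are skipped.
    """
    parts = []
    for chunk in chunks:
        choices = getattr(chunk, "choices", None)
        if not choices:
            continue
        content = getattr(choices[0].delta, "content", None)
        if content:
            parts.append(content)
    return "".join(parts)


# Usage with the client defined earlier (requires a live endpoint):
# stream = client.chat.completions.create(model=endpoint, messages=messages, stream=True)
# print(collect_chat_content(stream))
```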

To pass custom_inputs or databricks_options, add them with the extra_body parameter:

streaming_response = client.chat.completions.create(
    model=endpoint,
    messages=messages,
    stream=True,
    extra_body={
        "custom_inputs": {"id": 5},
        "databricks_options": {"return_trace": True},
    },
)
for chunk in streaming_response:
    print(chunk)

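MLflow deployments client

Use the MLflow deployments client when working within existing MLflow workflows and pipelines. This approach integrates naturally with MLflow tracking and experiment management.

The following examples show how to query an agent using the MLflow deployments client. For new applications, Databricks recommends the Databricks OpenAI Client because of its enhanced capabilities and native integration.

Depending on the API of your deployed agent, you use either the ResponsesAgent or the ChatAgent format.

ResponsesAgent endpoints

Use the following examples for agents created with the ResponsesAgent interface, which is the recommended approach for building agents.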

from mlflow.deployments import get_deploy_client

client = get_deploy_client()
input_example = {
    "input": [{"role": "user", "content": "What does Databricks do?"}],
    ## Optional: Include any custom inputs
    ## "custom_inputs": {"id": 5},
    "databricks_options": {"return_trace": True},
}
endpoint = "<agent-endpoint-name>" # TODO: update this with your endpoint name

## Call predict for non-streaming responses
response = client.predict(endpoint=endpoint, inputs=input_example)

## Call predict_stream for streaming responses
streaming_response = client.predict_stream(endpoint=endpoint, inputs=input_example)

ChatAgent or ChatModel endpoints

Use this option for agents created with the legacy ChatAgent or ChatModel interfaces, which are still supported but not recommended for new agents.

from mlflow.deployments import get_deploy_client

client = get_deploy_client()
input_example = {
    "messages": [{"role": "user", "content": "What does Databricks do?"}],
    ## Optional: Include any custom inputs
    ## "custom_inputs": {"id": 5},
    "databricks_options": {"return_trace": True},
}
endpoint = "<agent-endpoint-name>" # TODO: update this with your endpoint name

## Call predict for non-streaming responses
response = client.predict(endpoint=endpoint, inputs=input_example)

## Call predict_stream for streaming responses
streaming_response = client.predict_stream(endpoint=endpoint, inputs=input_example)

client.predict() and client.predict_stream() call the agent functions that you defined when authoring the agent. See Streaming responses.
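
The two MLflow payloads above differ only in the top-level key: `input` for ResponsesAgent and `messages` for ChatAgent or ChatModel. A minimal sketch of a builder that covers both follows; `build_agent_inputs` and its `api` parameter are hypothetical names introduced for illustration, not part of MLflow.

```python
from typing import Any, Dict, List, Optional


def build_agent_inputs(
    messages: List[Dict[str, str]],
    api: str = "responses",
    custom_inputs: Optional[Dict[str, Any]] = None,
    return_trace: bool = False,
) -> Dict[str, Any]:
    """Build the `inputs` payload for client.predict / client.predict_stream.

    api="responses" puts the messages under "input" (ResponsesAgent);
    api="chat" puts them under "messages" (ChatAgent or ChatModel).
    """
    if api not in ("responses", "chat"):
        raise ValueError(f"api must be 'responses' or 'chat', got {api!r}")
    payload: Dict[str, Any] = {"input" if api == "responses" else "messages": messages}
    if custom_inputs:
        payload["custom_inputs"] = custom_inputs
    if return_trace:
        payload["databricks_options"] = {"return_trace": True}
    return payload


# Usage with the client defined earlier (requires a live endpoint):
# inputs = build_agent_inputs([{"role": "user", "content": "hi"}], api="chat")
# for chunk in client.predict_stream(endpoint=endpoint, inputs=inputs):
#     print(chunk)
```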

REST API

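The Databricks REST API provides endpoints for OpenAI-compatible models. This lets you use Databricks agents to serve applications that require an OpenAI interface.

This approach is ideal for:

Language-agnostic applications that use HTTP requests
Integrating with third-party platforms that expect OpenAI-compatible APIs
Migrating from OpenAI to Databricks with minimal code changes

Authenticate to the REST API using a Databricks OAuth token or a personal access token (PAT). The following examples use a Databricks OAuth token. For more options and information, see the Databricks authentication documentation.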

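ResponsesAgent endpoints

Use the following examples for agents created with the ResponsesAgent interface, which is the recommended approach for building agents. The REST API call is equivalent to:

Using the Databricks OpenAI Client with responses.create.
Sending a POST request to the specific endpoint's URL (for example, https://<host.databricks.com>/serving-endpoints/<model-name>/invocations). For more details, see your endpoint's Model Serving page and the Model Serving documentation.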

curl --request POST \
  --url https://<host.databricks.com>/serving-endpoints/responses \
  --header 'Authorization: Bearer <OAuth token>' \
  --header 'content-type: application/json' \
  --data '{
    "model": "<model-name>",
    "input": [{ "role": "user", "content": "hi" }],
    "stream": true
  }'
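
The same request can be assembled from Python with only the standard library. This is a sketch under assumptions: `build_responses_request` is a hypothetical helper, `host` is the workspace hostname without a scheme, and `token` is an OAuth token or PAT obtained separately. The function only constructs the request; sending it is left commented out because it needs a live endpoint.

```python
import json
import urllib.request


def build_responses_request(
    host: str,
    token: str,
    endpoint_name: str,
    text: str,
    stream: bool = False,
) -> urllib.request.Request:
    """Build a POST request for the /serving-endpoints/responses route."""
    body = {
        "model": endpoint_name,
        "input": [{"role": "user", "content": text}],
        "stream": stream,
    }
    return urllib.request.Request(
        url=f"https://{host}/serving-endpoints/responses",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# To send a non-streaming request (requires a live endpoint):
# req = build_responses_request("<host.databricks.com>", "<OAuth token>", "<model-name>", "hi")
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```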

To pass custom_inputs or databricks_options, add them with the extra_body parameter:

curl --request POST \
  --url https://<host.databricks.com>/serving-endpoints/responses \
  --header 'Authorization: Bearer <OAuth token>' \
  --header 'content-type: application/json' \
  --data '{
    "model": "<model-name>",
    "input": [{ "role": "user", "content": "hi" }],
    "stream": true,
    "extra_body": {
      "custom_inputs": { "id": 5 },
      "databricks_options": { "return_trace": true }
    }
  }'

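ChatAgent or ChatModel endpoints

Use this option for agents created with the legacy ChatAgent or ChatModel interfaces, which are still supported but not recommended for new agents. This is equivalent to:

Using the Databricks OpenAI Client with chat.completions.create.
Sending a POST request to the specific endpoint's URL (for example, https://<host.databricks.com>/serving-endpoints/<model-name>/invocations). For more details, see your endpoint's Model Serving page and the Model Serving documentation.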

curl --request POST \
  --url https://<host.databricks.com>/serving-endpoints/chat/completions \
  --header 'Authorization: Bearer <OAuth token>' \
  --header 'content-type: application/json' \
  --data '{
    "model": "<model-name>",
    "messages": [{ "role": "user", "content": "hi" }],
    "stream": true
  }'
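
With `"stream": true`, the response body arrives incrementally rather than as one JSON document. The sketch below assumes the stream uses the server-sent events convention common to OpenAI-compatible APIs: each payload is on a line beginning with `data:`, and a literal `[DONE]` marks the end. Both points are assumptions to confirm against your endpoint's actual output, and `iter_sse_json` is a hypothetical helper name.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def iter_sse_json(lines: Iterable[bytes]) -> Iterator[Dict[str, Any]]:
    """Yield the JSON payload of each `data:` line in an SSE byte stream.

    Assumptions: one JSON object per `data:` line, and a `[DONE]` sentinel
    ends the stream. Blank lines and non-data lines are skipped.
    """
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        yield json.loads(data)


# Usage with an HTTP response object that iterates over lines, such as the
# one returned by urllib.request.urlopen (requires a live endpoint):
# for event in iter_sse_json(resp):
#     print(event)
```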

To pass custom_inputs or databricks_options, add them with the extra_body parameter:

curl --request POST \
  --url https://<host.databricks.com>/serving-endpoints/chat/completions \
  --header 'Authorization: Bearer <OAuth token>' \
  --header 'content-type: application/json' \
  --data '{
    "model": "<model-name>",
    "messages": [{ "role": "user", "content": "hi" }],
    "stream": true,
    "extra_body": {
      "custom_inputs": { "id": 5 },
      "databricks_options": { "return_trace": true }
    }
  }'

AI Functions: `ai_query`

You can use ai_query to query a deployed AI agent using SQL. For SQL syntax and parameter definitions, see the ai_query function.

SELECT ai_query(
  "<model name>", question
) FROM (VALUES ('what is MLflow?'), ('how does MLflow work?')) AS t(question);

Next steps

Monitor generative AI in production

Last updated on 2025-11-13
