教學課程：建立外部模型端點以查詢 OpenAI 模型

文章
10/30/2024

本文提供設定及查詢外部模型端點的逐步指示，這些端點提供OpenAI模型，以使用 MLflow部署SDK進行完成、聊天和內嵌。深入瞭解外部模型。

如果您想要使用服務 UI 來完成這項工作，請參閱建立外部模型服務端點。

需求

Databricks Runtime 13.0 ML 或更新版本。
MLflow 2.9 或更新版本。
OpenAI API 金鑰。
安裝 Databricks CLI 0.205 版或更新版本。

（選擇性）步驟 0：使用 Databricks Secrets CLI 儲存 OpenAI API 金鑰

您可以在步驟 3 或使用 Azure Databricks 秘密，提供 API 金鑰做為純文字字串。

若要將 OpenAI API 金鑰儲存為秘密，您可以使用 Databricks Secrets CLI（0.205 版和更新版本）。您也可以使用 REST API 進行秘密。

下列會建立名為的秘密範圍， my_openai_secret_scope然後在該範圍中建立秘密 openai_api_key 。

databricks secrets create-scope my_openai_secret_scope
databricks secrets put-secret my_openai_secret_scope openai_api_key

步驟 1：使用外部模型支持安裝 MLflow

使用下列項目來安裝具有外部模型支援的 MLflow 版本：

%pip install mlflow[genai]>=2.9.0

步驟 2：建立和管理外部模型端點

重要

本節中的程式代碼範例示範公開預覽 MLflow 部署 CRUD SDK 的使用方式。

若要建立大型語言模型的外部模型端點（LLM），請使用 create_endpoint() MLflow 部署 SDK 中的方法。您也可以在服務 UI 中建立外部模型端點。

下列代碼段會建立 OpenAI gpt-3.5-turbo-instruct的完成端點，如組態的 served_entities 區段所指定。針對您的端點，請務必為每個字段填入 name 和 openai_api_key 的唯一值。

import mlflow.deployments

client = mlflow.deployments.get_deploy_client("databricks")
client.create_endpoint(
    name="openai-completions-endpoint",
    config={
        "served_entities": [{
            "name": "openai-completions",
            "external_model": {
                "name": "gpt-3.5-turbo-instruct",
                "provider": "openai",
                "task": "llm/v1/completions",
                "openai_config": {
                    "openai_api_key": "{{secrets/my_openai_secret_scope/openai_api_key}}"
                }
            }
        }]
    }
)

下列代碼段示範如何提供 OpenAI API 金鑰做為純文字字串，以替代方式建立與上述相同的完成端點。

import mlflow.deployments

client = mlflow.deployments.get_deploy_client("databricks")
client.create_endpoint(
    name="openai-completions-endpoint",
    config={
        "served_entities": [{
            "name": "openai-completions",
            "external_model": {
                "name": "gpt-3.5-turbo-instruct",
                "provider": "openai",
                "task": "llm/v1/completions",
                "openai_config": {
                    "openai_api_key_plaintext": "sk-yourApiKey"
                }
            }
        }]
    }
)

如果您使用 Azure OpenAI，您也可以在 openai_config 組態的區段中指定 Azure OpenAI 部署名稱、端點 URL 和 API 版本。

client.create_endpoint(
    name="openai-completions-endpoint",
    config={
        "served_entities": [
          {
            "name": "openai-completions",
            "external_model": {
                "name": "gpt-3.5-turbo-instruct",
                "provider": "openai",
                "task": "llm/v1/completions",
                "openai_config": {
                    "openai_api_type": "azure",
                    "openai_api_key": "{{secrets/my_openai_secret_scope/openai_api_key}}",
                    "openai_api_base": "https://my-azure-openai-endpoint.openai.azure.com",
                    "openai_deployment_name": "my-gpt-35-turbo-deployment",
                    "openai_api_version": "2023-05-15"
                },
            },
          }
        ],
    },
)

若要更新端點，請使用 update_endpoint()。下列代碼段示範如何將端點的速率限制更新為每個使用者每分鐘 20 個呼叫。

client.update_endpoint(
    endpoint="openai-completions-endpoint",
    config={
        "rate_limits": [
            {
                "key": "user",
                "renewal_period": "minute",
                "calls": 20
            }
        ],
    },
)

步驟 3：將要求傳送至外部模型端點

重要

本節中的程序代碼範例示範 MLflow 部署 SDK 方法的使用 predict() 方式。

您可以使用 MLflow 部署 SDK predict() 的方法，將聊天、完成和內嵌要求傳送至外部模型端點。

下列命令會將要求傳送給 gpt-3.5-turbo-instruct OpenAI所裝載的要求。

completions_response = client.predict(
    endpoint="openai-completions-endpoint",
    inputs={
        "prompt": "What is the capital of France?",
        "temperature": 0.1,
        "max_tokens": 10,
        "n": 2
    }
)
completions_response == {
    "id": "cmpl-8QW0hdtUesKmhB3a1Vel6X25j2MDJ",
    "object": "text_completion",
    "created": 1701330267,
    "model": "gpt-3.5-turbo-instruct",
    "choices": [
        {
            "text": "The capital of France is Paris.",
            "index": 0,
            "finish_reason": "stop",
            "logprobs": None
        },
        {
            "text": "Paris is the capital of France",
            "index": 1,
            "finish_reason": "stop",
            "logprobs": None
        },
    ],
    "usage": {
        "prompt_tokens": 7,
        "completion_tokens": 16,
        "total_tokens": 23
    }
}

步驟 4：比較不同提供者的模型

模型服務支援許多外部模型提供者，包括 Open AI、Anthropic、Cohere、Amazon Bedrock、Google Cloud Vertex AI 等等。您可以比較跨提供者的 LLM，協助您使用 AI 遊樂場將應用程式的正確性、速度和成本優化。

下列範例會建立 Anthropic claude-2 的端點，並將其回應與使用 OpenAI gpt-3.5-turbo-instruct的問題進行比較。這兩個回應都有相同的標準格式，因此易於比較。

建立人類 claude-2 的端點

import mlflow.deployments

client = mlflow.deployments.get_deploy_client("databricks")

client.create_endpoint(
    name="anthropic-completions-endpoint",
    config={
        "served_entities": [
            {
                "name": "claude-completions",
                "external_model": {
                    "name": "claude-2",
                    "provider": "anthropic",
                    "task": "llm/v1/completions",
                    "anthropic_config": {
                        "anthropic_api_key": "{{secrets/my_anthropic_secret_scope/anthropic_api_key}}"
                    },
                },
            }
        ],
    },
)

比較每個端點的回應


openai_response = client.predict(
    endpoint="openai-completions-endpoint",
    inputs={
        "prompt": "How is Pi calculated? Be very concise."
    }
)
anthropic_response = client.predict(
    endpoint="anthropic-completions-endpoint",
    inputs={
        "prompt": "How is Pi calculated? Be very concise."
    }
)
openai_response["choices"] == [
    {
        "text": "Pi is calculated by dividing the circumference of a circle by its diameter."
                " This constant ratio of 3.14159... is then used to represent the relationship"
                " between a circle's circumference and its diameter, regardless of the size of the"
                " circle.",
        "index": 0,
        "finish_reason": "stop",
        "logprobs": None
    }
]
anthropic_response["choices"] == [
    {
        "text": "Pi is calculated by approximating the ratio of a circle's circumference to"
                " its diameter. Common approximation methods include infinite series, infinite"
                " products, and computing the perimeters of polygons with more and more sides"
                " inscribed in or around a circle.",
        "index": 0,
        "finish_reason": "stop",
        "logprobs": None
    }
]

其他資源

Mosaic AI Model Serving 中的外部模型。

分享方式：