将 SQL MCP Server 与本地模型配合使用

Important

数据 API 生成器版本 1.7 中提供了 SQL 模型上下文协议（MCP）服务器。有关最新功能和 bug 修复，请使用 2.0 预览版。

SQL 模型上下文协议（MCP）服务器适用于任何与 MCP 兼容的客户端，而不仅仅是云托管的 AI 服务。如果你的环境限制云大型语言模型（LLM）访问（在医疗保健、国防、金融、能源和海事行业中很常见），则可以连接通过 Ollama 或类似工具提供服务的本地模型。本指南介绍设置、字段元数据配置和提示模式，这些模式使小型本地模型可靠。

先决条件

安装并配置了至少一个实体的数据 API 生成器 CLI。安装 CLI。
Ollama 具有支持工具调用的模型（例如，qwen3:8bllama3.1:8b）。
Python 3.10+，并安装了 mcp 和 ollama 包。
包含数据的正在运行SQL Server实例。

步骤 1：配置字段元数据

字段元数据是本地模型准确性最重要的配置步骤。如果没有字段名称和描述，代理只能看到实体名称，因此会错误地猜测列名。

Warning

跳过此步骤会生成一个 MCP 服务器，它在技术上可以运行，但对于任何会读取工具响应的模型来说，在功能上都无法使用。该模型不了解您的各列。

为您的实体添加说明，然后为受限列添加包含有效值的字段描述：

dab add ServerInventory \
  --source dbo.ServerInventory \
  --permissions "anonymous:read" \
  --description "SQL Server instance inventory with version, environment, and sizing data"

dab update ServerInventory \
  --fields.name InstanceName --fields.primary-key true \
  --fields.description "SQL Server instance name (e.g., YOURSERVER01)"

dab update ServerInventory \
  --fields.name Environment \
  --fields.description "Deployment environment. Valid values: Prod, Dev, Test, UAT"

有关完整的 CLI 参考和最佳做法（包括约束值、参数说明和脚本模式），请参阅向实体添加说明。

注释

dab update CLI 将逗号视为参数分隔符。如果说明包含逗号，请直接编辑 dab-config.json 。

步骤 2：启动 SQL MCP 服务器

dab start

默认情况下，SQL MCP Server 使用流式 HTTP 传输在 http://localhost:5000/mcp 上侦听。任何实现 MCP 协议的客户端都可以连接到此终结点。

步骤 3：连接本地模型

生成将 Ollama 模型连接到 SQL MCP Server 的 MCP 客户端。以下Python示例使用 MCP Python SDK 和 ollama 包。

安装依赖项

pip install mcp ollama

最小Python线束

import asyncio
import json
from mcp import ClientSession
from mcp.client.streamable_http import streamable_http_client
import ollama

MCP_URL = "http://localhost:5000/mcp"
MODEL = "qwen3:8b"

async def get_schema(session: ClientSession) -> str:
    """Call describe_entities and format results for the system prompt."""
    result = await session.call_tool("describe_entities", arguments={})
    entities = json.loads(result.content[0].text)
    lines = []
    for entity in entities.get("entities", []):
        fields = ", ".join(
            f"{f['name']} ({f.get('description', 'no description')})"
            for f in entity.get("fields", [])
        )
        lines.append(f"- {entity['name']}: {entity.get('description', '')}")
        if fields:
            lines.append(f"  Fields: {fields}")
    return "\n".join(lines)

async def run(user_question: str):
    async with streamable_http_client(MCP_URL) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Preinject schema into the system prompt
            schema_text = await get_schema(session)
            system_prompt = f"""You query a SQL database through MCP tools.

Available entities:
{schema_text}

Rules:
- Use the exact field names shown above.
- Answer count questions with the count only.
- Do not produce summaries unless asked.
- Do not invent example data. Only return data from tool responses.
- If no results, say "No results found" and stop.
"""
            # Get available tools for Ollama
            tools_result = await session.list_tools()
            ollama_tools = [
                {
                    "type": "function",
                    "function": {
                        "name": t.name,
                        "description": t.description or "",
                        "parameters": t.inputSchema,
                    },
                }
                for t in tools_result.tools
            ]

            messages = [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_question},
            ]

            # Chat loop: let the model call tools until it produces a final answer
            while True:
                response = ollama.chat(
                    model=MODEL, messages=messages, tools=ollama_tools
                )
                msg = response["message"]
                messages.append(msg)

                if not msg.get("tool_calls"):
                    print(msg["content"])
                    break

                for tc in msg["tool_calls"]:
                    result = await session.call_tool(
                        tc["function"]["name"],
                        arguments=tc["function"]["arguments"],
                    )
                    messages.append(
                        {
                            "role": "tool",
                            "content": result.content[0].text,
                        }
                    )

asyncio.run(run("How many SQL 2019 servers are in production?"))

该框架处理完整流程：模式预注入、工具发现、多轮工具调用以及最终答案提取。根据你的环境调整 MODEL 和 MCP_URL。

启动时预注入架构

当架构元数据在聊天开始前的系统提示符中时，小型本地模型（低于 14B 参数）会生成更可靠的工具调用。与其依赖模型在对话过程中自行调用 describe_entities，不如在 harness 启动时调用它并注入结果。

为什么预注入很重要

Approach	小型模型的行为
动态发现	模型必须首先决定调用 `describe_entities` ，然后解释结果，然后使用正确的字段名称调用正确的工具。多个故障点。
预注入	模型立即看到实体名称、字段名称和说明。在第一次尝试时正确调用工具。

上一部分中的 harness 示例演示了此模式。函数 get_schema() 在启动时调用 describe_entities 一次，并将结果格式化为系统提示符。

Tip

较大的云端模型（GPT-4o、Claude）通常无需预先注入，就能在对话过程中识别出 schema。此模式对于 14B 参数以下的模型最有价值。

限制模型响应

模型可以进行正确的工具调用、检索正确的数据，并仍然生成错误的答案。例如，当一个模型被问到“有多少台生产服务器？”时，它可能会正确检索出 16 条记录，却返回一段长达 40 行的高管摘要，里面还包含凭空编造的示例，而不是直接给出数字 16。

在系统提示词中添加明确的否定性规则：

Rules:
- Answer count questions with the count only.
- Do not produce summaries unless the user asks for one.
- Do not invent example data. Only return data from tool responses.
- If a tool returns no results, say "No results found" and stop.

工具调用保真度和回答约束性是不同的问题。 DAB 确保通过工具层准确检索数据。你的提示框架决定模型如何呈现结果。

Considerations

主题	详细信息
Hardware	工具调用可在配置不高的硬件上运行。在消费级 Nvidia GPU（8 GB 显存）上运行的 8B 参数模型能够产生有用的结果。每个问题的延迟可能达到数十秒，这适合批处理工作负载。
批处理与交互式	小型模型非常适合批处理（性能报告、库存查询），其中延迟容忍度更高。
工具可用性	`aggregate_records` 仅在版本 2.0 预览版及更高版本中可用。在版本 1.7.x 上，计数和聚合查询强制模型读取所有匹配行。按版本查看工具可用性。
运输	本地模型通过可流式传输的 HTTP 连接到 `/mcp`。标准输入/输出（stdio）传输是单进程设置的替代方法。
身份验证	对于本地开发，请使用 `anonymous` 权限。对于生产环境，请配置适合环境的身份验证。

反馈

此页面是否有帮助？

Last updated on 2026-05-20