MCP Server API reference

Expose RAG retrieval capabilities as tools over MCP (Model Context Protocol) for any MCP-compatible AI agent or client.

Important

Agentic Retrieval in Foundry Local is currently in PREVIEW. See the Supplemental Terms of Use for Microsoft Azure Previews for legal terms that apply to Azure features that are in beta, preview, or otherwise not yet released into general availability.

API information

| Property | Value |
| --- | --- |
| Protocol | MCP over Streamable HTTP |
| Service | indexed-sources-mcp-server |
| Port | 8080 |
| MCP endpoint | /edgeai/mcp |

Unlike the other Agentic RAG APIs, the MCP server uses a single endpoint for all operations. Every request is an HTTP POST to /edgeai/mcp with a JSON-RPC 2.0 body specifying the method (initialize, tools/list, tools/call). Responses are plain JSON.
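
As a minimal sketch of the single-endpoint pattern described above, the following Python helper assembles a JSON-RPC 2.0 body for any of the three methods. The name `build_rpc_request` is hypothetical and not part of the service; only the envelope fields (`jsonrpc`, `id`, `method`, `params`) come from JSON-RPC 2.0.

```python
import json
from typing import Any, Optional

MCP_PATH = "/edgeai/mcp"  # single endpoint used for every operation


def build_rpc_request(method: str,
                      params: Optional[dict[str, Any]] = None,
                      request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 request body for an HTTP POST to MCP_PATH."""
    body: dict[str, Any] = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        # "params" is left out when the method takes none (for example, tools/list).
        body["params"] = params
    return json.dumps(body)


# The same endpoint serves discovery and tool calls; only the body differs.
list_body = build_rpc_request("tools/list", request_id=2)
call_body = build_rpc_request(
    "tools/call",
    {"name": "search_hybrid", "arguments": {"query": "How do I reset my device?"}},
    request_id=3,
)
```

Either string can be sent as the POST body to /edgeai/mcp.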

Access methods

External access (via ingress)

https://<cluster-domain>/edgeai/mcp

Requires a valid Bearer token (see Authentication).

Port forwarding (for development and testing)

kubectl port-forward deployment/indexed-sources-mcp-server-deployment 8080:8080 -n arc-rag

Then call http://localhost:8080/edgeai/mcp.


Authentication

Authentication is required when IS_AUTH_ENABLED=true (the default). It applies only to tools/call requests; the discovery methods (initialize, tools/list) don't require a token.

Pass the token as a Bearer token in the Authorization header. Dapr forwards the token to downstream services for RBAC enforcement on collection access.
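
The rule above (a token for tools/call only) could be encoded in a small header builder. This is a sketch under stated assumptions: `build_headers` and `METHODS_REQUIRING_TOKEN` are hypothetical names, and the `auth_enabled` flag simply mirrors the IS_AUTH_ENABLED setting on the client side.

```python
from typing import Optional

# Per the text above, only tools/call needs a token; discovery methods do not.
METHODS_REQUIRING_TOKEN = {"tools/call"}


def build_headers(method: str,
                  token: Optional[str] = None,
                  auth_enabled: bool = True) -> dict[str, str]:
    """Return HTTP headers for a request, adding Authorization only when needed."""
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
    }
    if auth_enabled and method in METHODS_REQUIRING_TOKEN:
        if not token:
            raise ValueError(f"{method} needs a Bearer token when IS_AUTH_ENABLED=true")
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

Failing early on a missing token keeps the error local instead of waiting for the server to reject the call.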

Getting a token via Azure CLI

az account clear
az login --tenant <your-tenant-id> --output none

TOKEN=$(az account get-access-token \
  --resource "api://<your-app-client-id>" \
  --query accessToken -o tsv)

Tokens expire after about one hour. If you receive an authentication error, acquire a fresh token.
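
To avoid discovering expiry through a failed call, a client could check the remaining lifetime locally. The sketch below assumes the access token is a JWT carrying a standard `exp` claim, which this page does not state; `token_expires_in` is a hypothetical helper, and it reads the payload without verifying the signature.

```python
import base64
import json
import time
from typing import Optional


def token_expires_in(token: str, now: Optional[float] = None) -> float:
    """Return seconds until the JWT's `exp` claim; negative if already expired.

    Assumption: the token is a three-part JWT. The signature is NOT verified.
    """
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    return claims["exp"] - (time.time() if now is None else now)
```

A caller might acquire a fresh token whenever the returned value drops below a small margin, such as 60 seconds.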

Connection

MCP requires a session handshake before tool calls: send an initialize request, capture the Mcp-Session-Id response header, then include that header in all subsequent requests. MCP-compatible clients handle this automatically. For raw curl testing, follow these manual steps:

# Step 1 — Initialize
# The response body contains server capabilities. The Mcp-Session-Id is
# returned in the response headers — extract it and pass it to all
# subsequent requests.
curl -si -X POST http://localhost:8080/edgeai/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-03-26","capabilities":{},"clientInfo":{"name":"curl","version":"1.0"}}}'

# Example response headers:
#   HTTP/1.1 200 OK
#   Mcp-Session-Id: a1b2c3d4-e5f6-7890-abcd-ef1234567890
#   Content-Type: application/json

# Step 2 — List tools (pass the Mcp-Session-Id from Step 1)
curl -s -X POST http://localhost:8080/edgeai/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "Mcp-Session-Id: a1b2c3d4-e5f6-7890-abcd-ef1234567890" \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/list"}'
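
The same two steps can be sketched in Python with only the standard library. `McpSession`, `http_post`, and the injectable `post` transport are hypothetical names introduced for illustration, and the sketch assumes the server answers with a plain JSON body.

```python
import json
import urllib.request
from typing import Callable, Optional

# A transport takes (url, headers, body) and returns (response_headers, response_text).
Transport = Callable[[str, dict, bytes], tuple[dict, str]]


def http_post(url: str, headers: dict, body: bytes) -> tuple[dict, str]:
    """Default transport: a plain HTTP POST using the standard library."""
    req = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(req) as resp:
        return dict(resp.headers.items()), resp.read().decode("utf-8")


class McpSession:
    """Tracks the Mcp-Session-Id and attaches it to every later request."""

    def __init__(self, url: str, post: Transport = http_post):
        self.url, self.post = url, post
        self.session_id: Optional[str] = None
        self._next_id = 1

    def request(self, method: str, params: Optional[dict] = None) -> dict:
        headers = {
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        }
        if self.session_id:
            headers["Mcp-Session-Id"] = self.session_id
        body = {"jsonrpc": "2.0", "id": self._next_id, "method": method}
        if params is not None:
            body["params"] = params
        self._next_id += 1
        resp_headers, text = self.post(self.url, headers, json.dumps(body).encode("utf-8"))
        # Header names are case-insensitive, so match on the lowercased name.
        for name, value in resp_headers.items():
            if name.lower() == "mcp-session-id":
                self.session_id = value
        return json.loads(text)

    def initialize(self) -> dict:
        return self.request("initialize", {
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            "clientInfo": {"name": "python-sketch", "version": "1.0"},
        })
```

With the port-forward from earlier running, `session = McpSession("http://localhost:8080/edgeai/mcp")`, then `session.initialize()` followed by `session.request("tools/list")`, mirrors Steps 1 and 2. Passing a custom `post` function makes the class testable without a server.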

Tools

search_hybrid

Hybrid retrieval combines dense (semantic) and sparse (learned keyword) vector search, with results merged and reranked. Both vectors are generated by the BGE-M3 model. This method is the recommended default search mode.

Request

| Parameter | Type | Required | Default | Constraints | Description |
| --- | --- | --- | --- | --- | --- |
| query | string | Yes | | | Search query text |
| collection_names | list[string] | No | ["edgeragapp"] | | Index name prefixes to search |
| top_n | integer | No | 5 | 1–50 | Number of results to return |
| strictness | integer | No | 1 | 0–5 | Relevance threshold. 0 returns all results; higher values filter out lower-scoring chunks |
| filters | string | No | null | | Milvus boolean filter expression (e.g., file_path like "%manual%") |

Example

Request

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_hybrid",
    "arguments": {
      "query": "How do I reset my device?",
      "collection_names": ["edgeragapp"],
      "top_n": 5,
      "strictness": 1
    }
  }
}

Response

Tool responses are returned inside the standard MCP JSON-RPC envelope as a text content block. The tool output is a JSON string in the text field:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "{\"edgeragapp\": {\"results\": [...], \"metadata\": {...}}}"
      }
    ]
  }
}

Parsed tool output:

{
  "edgeragapp": {
    "results": [
      {
        "content": "To reset your device, press and hold the power button for 10 seconds...",
        "file_path": "10.244.3.70:/exports/manuals/device-guide.pdf",
        "chunk_id": "3",
        "score": 0.92,
        "page_numbers": [12],
        "vector_type": "dense"
      }
    ],
    "metadata": {
      "results_number": 1,
      "search_type": "hybrid_search",
      "collection": "edgeragapp"
    }
  }
}

The response is keyed by the collection name prefix you provided (e.g., edgeragapp). When querying multiple collections, each appears as a separate key in the response.
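
Because the tool output arrives as a JSON string inside the text content block, a client has to decode it a second time. A minimal sketch, where `parse_tool_output` is a hypothetical helper and the sample envelope is abbreviated from the example above:

```python
import json
from typing import Any


def parse_tool_output(envelope: dict[str, Any]) -> dict[str, Any]:
    """Decode the JSON string carried in the first text content block."""
    for block in envelope["result"]["content"]:
        if block.get("type") == "text":
            return json.loads(block["text"])
    raise ValueError("no text content block in tool response")


tool_output = {
    "edgeragapp": {
        "results": [{"content": "To reset your device...", "score": 0.92}],
        "metadata": {"results_number": 1, "search_type": "hybrid_search",
                     "collection": "edgeragapp"},
    }
}
envelope = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"content": [{"type": "text", "text": json.dumps(tool_output)}]},
}

# One key per collection prefix that was queried.
for collection, payload in parse_tool_output(envelope).items():
    print(collection, payload["metadata"]["results_number"])
```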

Response model: SearchChunksResponse

search_vector

Pure semantic (vector) retrieval. Best when query wording differs from document language.

Request

Same parameters as search_hybrid (SearchToolRequest).

Example

Request

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_vector",
    "arguments": {
      "query": "power cycling procedure",
      "collection_names": ["edgeragapp"],
      "top_n": 3,
      "strictness": 2
    }
  }
}

Response

Same structure as search_hybrid. metadata.search_type is "vector_search".

Response model: SearchChunksResponse

search_text

Sparse vector keyword search using BGE-M3 learned sparse representations. Best for exact terms, codes, and names.

Request

Same parameters as search_hybrid (SearchToolRequest).

Example

Request

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_text",
    "arguments": {
      "query": "ERR-4012",
      "collection_names": ["edgeragapp"],
      "top_n": 10,
      "strictness": 0
    }
  }
}

Response

Same structure as search_hybrid. metadata.search_type is "full_text_search".

Response model: SearchChunksResponse

search_image

Image retrieval using vision embeddings.

Request

Same parameters as search_hybrid (SearchToolRequest).

Example

Request

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_image",
    "arguments": {
      "query": "wiring diagram for sensor module",
      "collection_names": ["edgeragapp"],
      "top_n": 5,
      "strictness": 1
    }
  }
}

Response

Same structure as search_hybrid. metadata.search_type is "image_search" and vector_type is "image".

Response model: SearchChunksResponse

search_multimodal

Runs hybrid search (dense + sparse) and image search in parallel. Returns separate text_chunks and image_chunks.

Request

| Parameter | Type | Required | Default | Constraints | Description |
| --- | --- | --- | --- | --- | --- |
| query | string | Yes | | | Search query |
| collection_names | list[string] | No | ["edgeragapp"] | | Milvus collection names to search |
| top_n | integer | No | 5 | 1–50 | Number of results per modality |
| text_strictness | integer | No | 1 | 0–5 | Text relevance threshold. 0 returns all results; higher values filter out lower-scoring chunks |
| image_strictness | integer | No | 1 | 0–5 | Image relevance threshold. Same scale as text_strictness |
| filters | string | No | null | | Milvus boolean filter expression (e.g., file_path like "%manual%") |

Example

Request

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_multimodal",
    "arguments": {
      "query": "installation steps with diagrams",
      "collection_names": ["edgeragapp"],
      "top_n": 5,
      "text_strictness": 2,
      "image_strictness": 1
    }
  }
}

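Response

Parsed tool output (returned as a JSON string inside the JSON-RPC envelope, as described for search_hybrid):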

{
  "edgeragapp": {
    "text_chunks": [
      {
        "content": "Step 1: Mount the bracket to the wall...",
        "file_path": "10.244.3.70:/exports/manuals/install-guide.pdf",
        "chunk_id": "7",
        "score": 0.88,
        "page_numbers": [3, 4],
        "vector_type": "dense"
      }
    ],
    "image_chunks": [
      {
        "content": "",
        "file_path": "10.244.3.70:/exports/manuals/install-guide.pdf",
        "chunk_id": "img-2",
        "score": 0.76,
        "page_numbers": [4],
        "vector_type": "image"
      }
    ],
    "metadata": {
      "results_number": 2,
      "search_type": "multimodal_search",
      "collection": "edgeragapp"
    }
  }
}

Response model: MultimodalSearchResponse
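
A client that wants a single ranked list can combine the two result arrays. The sketch below is one possible approach, not something the service prescribes: `rank_multimodal` is a hypothetical helper, and it assumes text and image scores are comparable on the same scale, which this page does not guarantee.

```python
from typing import Any


def rank_multimodal(collection_payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Combine text_chunks and image_chunks into one list, highest score first."""
    combined = (collection_payload.get("text_chunks", [])
                + collection_payload.get("image_chunks", []))
    return sorted(combined, key=lambda chunk: chunk["score"], reverse=True)


# Abbreviated from the example response above.
payload = {
    "text_chunks": [{"chunk_id": "7", "score": 0.88, "vector_type": "dense"}],
    "image_chunks": [{"chunk_id": "img-2", "score": 0.76, "vector_type": "image"}],
}
ranked_ids = [chunk["chunk_id"] for chunk in rank_multimodal(payload)]
```

The `vector_type` field remains on each chunk, so the caller can still tell text hits from image hits after merging.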

get_available_collections

Lists all collections the authenticated user has access to.

Request

No input parameters.

Example

Request

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "get_available_collections",
    "arguments": {}
  }
}

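Response

Parsed tool output (returned as a JSON string inside the JSON-RPC envelope, as described for search_hybrid):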

{
  "collections": [
    {
      "name": "edgeragapp",
      "description": "Default system collection (auto-created)",
      "status": "active",
      "created_at": "2026-02-25T15:32:19.097957+00:00"
    },
    {
      "name": "my-docs",
      "description": "Technical manuals",
      "status": "active",
      "created_at": "2026-03-04T12:00:00.000000+00:00"
    }
  ],
  "total_count": 2
}

Response model: CollectionsResponse
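
Once the tool output has been parsed, a client could use it to decide which collections to query. A minimal sketch, assuming the dictionary shape shown above; `active_collections_newest_first` is a hypothetical helper name.

```python
from datetime import datetime


def active_collections_newest_first(response: dict) -> list[str]:
    """Return names of collections whose status is "active", newest first."""
    active = [c for c in response["collections"] if c["status"] == "active"]
    # created_at is an ISO 8601 string, so it parses with the standard library.
    active.sort(key=lambda c: datetime.fromisoformat(c["created_at"]), reverse=True)
    return [c["name"] for c in active]


sample = {
    "collections": [
        {"name": "edgeragapp", "description": "Default system collection (auto-created)",
         "status": "active", "created_at": "2026-02-25T15:32:19.097957+00:00"},
        {"name": "my-docs", "description": "Technical manuals",
         "status": "active", "created_at": "2026-03-04T12:00:00.000000+00:00"},
    ],
    "total_count": 2,
}
names = active_collections_newest_first(sample)
```

The resulting list can be passed as `collection_names` to any of the search tools.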

Request schemas

SearchToolRequest

Used by: search_hybrid, search_vector, search_text, search_image.

| Field | Type | Required | Default | Constraints | Description |
| --- | --- | --- | --- | --- | --- |
| query | string | Yes | | | Search query text |
| collection_names | list[string] | No | ["edgeragapp"] | | Index name prefixes to search (model suffix added automatically) |
| top_n | integer | No | 5 | ge=1, le=50 | Number of results (1–50) |
| strictness | integer | No | 1 | ge=0, le=5 | Relevance threshold (0–5). 0 returns all results; higher values filter out lower-scoring chunks |
| filters | string \| null | No | null | | Milvus boolean filter expression (e.g., file_path like "%manual%") |
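
The defaults and constraints listed above can be mirrored on the client so that out-of-range values are caught before a request is sent. This is a sketch only: `SearchToolArgs` is a hypothetical client-side class, not the server's SearchToolRequest model.

```python
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class SearchToolArgs:
    """Client-side mirror of the SearchToolRequest fields and constraints."""
    query: str
    collection_names: list[str] = field(default_factory=lambda: ["edgeragapp"])
    top_n: int = 5
    strictness: int = 1
    filters: Optional[str] = None

    def __post_init__(self) -> None:
        if not 1 <= self.top_n <= 50:
            raise ValueError("top_n must be between 1 and 50")
        if not 0 <= self.strictness <= 5:
            raise ValueError("strictness must be between 0 and 5")

    def to_arguments(self) -> dict:
        """Return a dict suitable for the "arguments" field of tools/call."""
        return asdict(self)


args = SearchToolArgs(query="ERR-4012", top_n=10, strictness=0).to_arguments()
```

Validating locally gives a clearer message than the -32602 error the server would otherwise return for the same mistake.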

MultimodalSearchRequest

Used by: search_multimodal.

| Field | Type | Required | Default | Constraints | Description |
| --- | --- | --- | --- | --- | --- |
| query | string | Yes | | | Search query |
| collection_names | list[string] | No | ["edgeragapp"] | | Milvus collection names to search |
| top_n | integer | No | 5 | ge=1, le=50 | Number of results per modality (1–50) |
| text_strictness | integer | No | 1 | ge=0, le=5 | Text relevance threshold (0–5). 0 returns all results; higher values filter out lower-scoring chunks |
| image_strictness | integer | No | 1 | ge=0, le=5 | Image relevance threshold (0–5). Same scale as text_strictness |
| filters | string \| null | No | null | | Milvus boolean filter expression (e.g., file_path like "%manual%") |

Response schemas

ToolChunkResult

A single chunk result from a search operation.

| Field | Type | Description |
| --- | --- | --- |
| content | string | Chunk text content |
| file_path | string | Source document path |
| chunk_id | string | Chunk position in document |
| score | float | Relevance score (0.0–1.0) |
| page_numbers | list[integer] | Pages where chunk appears |
| vector_type | string | Embedding type (dense, sparse, or image). Default: "dense" |
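
For typed access on the client, each chunk dictionary could be mapped onto a small record. `ChunkResult` below is a hypothetical client-side class modeled on the fields above; it applies the documented "dense" default when `vector_type` is absent and ignores any keys it doesn't know.

```python
from dataclasses import dataclass, fields


@dataclass
class ChunkResult:
    """Client-side mirror of the ToolChunkResult fields."""
    content: str
    file_path: str
    chunk_id: str
    score: float
    page_numbers: list[int]
    vector_type: str = "dense"  # documented default

    @classmethod
    def from_dict(cls, raw: dict) -> "ChunkResult":
        known = {f.name for f in fields(cls)}
        # Drop unknown keys so extra fields in a response don't break parsing.
        return cls(**{k: v for k, v in raw.items() if k in known})


chunk = ChunkResult.from_dict({
    "content": "To reset your device, press and hold the power button for 10 seconds...",
    "file_path": "10.244.3.70:/exports/manuals/device-guide.pdf",
    "chunk_id": "3",
    "score": 0.92,
    "page_numbers": [12],
})
```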

SearchMetadata

Metadata about a search operation.

| Field | Type | Description |
| --- | --- | --- |
| results_number | integer | Total number of results returned |
| search_type | string | Type of search performed (for example, hybrid_search, vector_search, full_text_search, image_search, multimodal_search) |
| collection | string | Milvus collection name used |

SearchChunksResponse

Returned by: search_hybrid, search_vector, search_text, search_image.

| Field | Type | Description |
| --- | --- | --- |
| results | list[ToolChunkResult] | List of matching chunks |
| metadata | SearchMetadata | Search operation metadata |

MultimodalSearchResponse

Returned by: search_multimodal.

| Field | Type | Description |
| --- | --- | --- |
| text_chunks | list[ToolChunkResult] | Text search results |
| image_chunks | list[ToolChunkResult] | Image search results (IDs only) |
| metadata | SearchMetadata | Search metadata |

CollectionInfo

| Field | Type | Description |
| --- | --- | --- |
| name | string | Collection name |
| description | string \| null | Collection description |
| status | string | Collection status (for example, "active", "deleting") |
| created_at | string | ISO 8601 creation timestamp |

CollectionsResponse

Returned by: get_available_collections.

| Field | Type | Description |
| --- | --- | --- |
| collections | list[CollectionInfo] | List of accessible collections |
| total_count | integer | Total number of collections returned |

Response codes

The JSON-RPC response returns errors in the error field. These codes apply to all tools.

| Code | Description |
| --- | --- |
| -32600 | Invalid Request: the JSON-RPC request is malformed. |
| -32601 | Method not found: the requested MCP method doesn't exist. |
| -32602 | Invalid params: tool arguments failed schema validation (for example, top_n out of range). |
| -32603 | Internal error: unexpected server error or missing auth token. |
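
A client can turn these codes into exceptions rather than inspecting every response by hand. The following is a sketch under assumptions: `McpError` and `unwrap_result` are hypothetical names, and the error object is assumed to follow the standard JSON-RPC 2.0 shape with `code` and `message` fields.

```python
from typing import Any

# Descriptions taken from the table above.
ERROR_DESCRIPTIONS = {
    -32600: "Invalid Request: the JSON-RPC request is malformed",
    -32601: "Method not found: the requested MCP method doesn't exist",
    -32602: "Invalid params: tool arguments failed schema validation",
    -32603: "Internal error: unexpected server error or missing auth token",
}


class McpError(Exception):
    """Raised when a JSON-RPC response carries an error object."""

    def __init__(self, code: int, message: str):
        super().__init__(f"[{code}] {message}")
        self.code = code


def unwrap_result(response: dict[str, Any]) -> Any:
    """Return the result field, or raise McpError if the response is an error."""
    error = response.get("error")
    if error is None:
        return response["result"]
    code = error.get("code")
    # Prefer the server's message; fall back to the documented description.
    detail = error.get("message") or ERROR_DESCRIPTIONS.get(code, "Unknown error")
    raise McpError(code, detail)
```

Because -32603 also covers a missing auth token, a caller might respond to that code on tools/call by acquiring a fresh token and retrying once.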