Issue with Custom Open API Tool in Foundry

Nimesh 210 Reputation points
2026-05-14T05:14:45.0033333+00:00

Hi Everyone,

I am using a Custom OpenAPI tool that calls an Azure AI Search endpoint. When I prompt the agent, I intermittently get this error. The tool endpoint searches a KB index that holds a lot of data.

I am using this tool instead of attaching the KB directly via Foundry IQ because I want to publish this agent to a Teams group chat. With Foundry IQ, the agent continuously prompts for approval when added to a Teams chat. To avoid that, I tried a custom MCP, but the custom MCP doesn't retrieve all the data and returns only chunked data, so the answer is not accurate.

Is there a way to address this issue via Foundry Workflows or any other method?

Foundry Tools

Formerly known as Azure AI Services or Azure Cognitive Services, Foundry Tools is a unified collection of prebuilt AI capabilities within the Microsoft Foundry platform.


Answer accepted by question author

Karnam Venkata Rajeswari 3,675 Reputation points Microsoft External Staff Moderator
2026-05-22T20:31:16.8566667+00:00

Hello Nimesh,

Welcome to Microsoft Q&A. Thank you for reaching out to us.

Thank you for sharing the detailed scenario and observations. Based on the behavior and the error message, this situation is commonly associated with tool response size limits combined with the retrieval and orchestration design.

The error below typically indicates that the response returned from the Custom OpenAPI tool exceeds the limits allowed by the agent orchestration layer: “tool_user_error: Received message exceeds the maximum configured message size”

This can occur when large payloads (multiple chunks, full search responses, or unnecessary metadata) are returned. Since search responses vary by query, broader queries may intermittently trigger this condition. Returning the full knowledge base in a single response is generally not recommended due to these limits.

Please check whether the following helps:

  1. Stabilizing execution and isolating the behavior
    A reliable starting point is to validate the Search API independently.
    1. Test Azure AI Search queries using Postman or cURL
    2. Inspect:
      • Payload size
      • Number of returned chunks
      • Included metadata and vector fields
      • Response latency
    If direct API calls succeed consistently, the issue is most likely related to tool-response handling within the agent orchestration layer.
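As a rough aid for the inspection in step 1, the sketch below summarises a response body that was saved from a direct Postman or cURL call. It assumes the usual Azure AI Search response shape with hits under a `value` array. The function name and the "long list of numbers means embedding" heuristic are illustrative assumptions, not part of any SDK, and latency still has to be measured at the client.

```python
import json

def summarize_search_response(raw_body: bytes) -> dict:
    """Report the size and shape of a saved search response body."""
    payload = json.loads(raw_body)
    docs = payload.get("value", [])  # assumed location of the hits
    # Every field name seen across the returned documents.
    fields = sorted({key for doc in docs for key in doc})
    # Heuristic (an assumption): long lists of numbers are probably embeddings.
    vector_like = sorted({
        key for doc in docs for key, val in doc.items()
        if isinstance(val, list) and len(val) > 32
        and all(isinstance(x, (int, float)) for x in val)
    })
    return {
        "payload_bytes": len(raw_body),
        "result_count": len(docs),
        "fields": fields,
        "vector_like_fields": vector_like,
    }

# Made-up sample standing in for a real saved response.
sample = json.dumps({"value": [
    {"chunk_id": "1", "title": "Doc A", "chunk": "some text",
     "text_vector": [0.1] * 1536},
]}).encode("utf-8")
print(summarize_search_response(sample))
```

If `vector_like_fields` is non-empty or `payload_bytes` swings widely between queries, that points toward the size-related mitigations in the following steps.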
  2. Controlling response size
    The most effective mitigation is to keep tool responses compact and targeted:
    • Limit results: use $top=3 to $top=5
    • Restrict fields: use $select for essential fields (chunk, title, chunk_id, parent_id)
    • Exclude unnecessary data:
      • Avoid vector fields such as text_vector
      • Remove extra metadata, scores, or debug fields
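A minimal sketch of step 2, assuming the search is sent as a POST request whose JSON body carries `search`, `top`, and `select` (the body equivalents of `$top` and `$select`). The field names come from the list above and must match your own index schema. Both helper names and the sample query are hypothetical.

```python
ESSENTIAL_FIELDS = ("chunk", "title", "chunk_id", "parent_id")

def build_search_body(query, top=5, fields=ESSENTIAL_FIELDS):
    """Build a compact search request body limited to a few results and fields."""
    if not 1 <= top <= 5:
        raise ValueError("top should stay between 1 and 5 to keep responses small")
    return {"search": query, "top": top, "select": ",".join(fields)}

def strip_extras(doc, keep=ESSENTIAL_FIELDS):
    """Drop vectors, scores, and any other field that is not explicitly kept."""
    return {name: value for name, value in doc.items() if name in keep}

body = build_search_body("vacation policy", top=3)
hit = {"chunk": "text", "title": "HR", "chunk_id": "7", "parent_id": "p1",
       "text_vector": [0.1, 0.2], "@search.score": 1.23}
print(body)
print(strip_extras(hit))
```

`strip_extras` is a second line of defence: even with `select` set, the service adds score metadata to each hit, and that metadata rarely needs to reach the agent.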
  3. Avoid returning raw search responses
    Instead of returning complete Azure AI Search payloads:
    • Retrieve only relevant chunks
    • Apply filtering or reranking
    • Aggregate or summarize responses
    • Return only concise, grounded context
    This significantly improves reliability and reduces message-size failures.
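One way to picture step 3 is a small function that turns already-ranked hits into short title-plus-excerpt entries. This is a sketch under assumptions: the `chunk` and `title` field names follow the earlier list, the limits of 3 chunks and 800 characters are arbitrary, and simple truncation stands in for whatever summarization you actually use.

```python
def to_grounded_context(docs, max_chunks=3, max_chars=800):
    """Convert ranked search hits into a short list of title + excerpt entries."""
    context = []
    for doc in docs[:max_chunks]:
        # Collapse runs of whitespace so the excerpt is as compact as possible.
        text = " ".join(str(doc.get("chunk", "")).split())
        if len(text) > max_chars:
            text = text[:max_chars].rstrip() + "..."
        context.append({"title": doc.get("title", ""), "excerpt": text})
    return context

hits = [
    {"title": "Leave policy", "chunk": "Employees   accrue\nleave monthly. " * 100},
    {"title": "Travel policy", "chunk": "Book through the portal."},
]
for entry in to_grounded_context(hits, max_chars=60):
    print(entry)
```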
  4. Supporting complete knowledge retrieval safely
    When broader knowledge retrieval is required, a single large response can exceed limits. A more reliable approach is:
    • Use pagination or staged retrieval
    • Retrieve smaller batches across multiple calls
    • Aggregate and process results outside the agent
    • Return only the final curated response
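The staged retrieval in step 4 can be sketched as a loop that pulls small pages and aggregates them outside the agent. `fetch_page` is a hypothetical callable that you would implement around your own search call (for example by passing `skip` and `top` in the request); here it is replaced by an in-memory stand-in so the loop can be run as-is.

```python
from typing import Callable, Dict, List

def fetch_all_in_batches(fetch_page: Callable[[int, int], List[Dict]],
                         batch_size: int = 5,
                         max_results: int = 50) -> List[Dict]:
    """Collect results page by page until the source runs dry or a cap is hit."""
    results: List[Dict] = []
    skip = 0
    while len(results) < max_results:
        page = fetch_page(skip, batch_size)
        if not page:
            break
        results.extend(page)
        if len(page) < batch_size:  # a short page means there is nothing left
            break
        skip += batch_size
    return results[:max_results]

# In-memory stand-in for the search index (made-up data).
fake_index = [{"chunk_id": str(i)} for i in range(12)]

def fake_fetch(skip: int, top: int) -> List[Dict]:
    return fake_index[skip:skip + top]

print(len(fetch_all_in_batches(fake_fetch)))
```

The aggregated list is then filtered or summarized before anything is returned to the agent, so no single tool response carries the full result set.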
  5. Improving retrieval quality while minimizing size
    To maintain accuracy with smaller payloads:
    • Use Hybrid Search (keyword + vector)
    • Enable semantic ranking / reranking
    • Use focused queries (top‑K retrieval)
    • Optimize the chunking strategy:
      • Chunk size: ~500–1000 tokens
      • Overlap: ~50–100 tokens
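To illustrate the chunk size and overlap figures in step 5, here is a minimal sliding-window chunker. It works on any list of tokens; the demo uses whitespace-separated words as a rough stand-in for model tokens, which is an assumption, since real tokenizers count differently. The defaults of 800 and 80 are simply values inside the suggested ranges.

```python
def chunk_tokens(tokens, chunk_size=800, overlap=80):
    """Split a token list into overlapping windows of at most chunk_size tokens."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):  # this window reached the end
            break
    return chunks

words = ("lorem ipsum " * 1000).split()  # 2000 pseudo-tokens
pieces = chunk_tokens(words)
print([len(p) for p in pieces])
```

Each chunk repeats the last `overlap` tokens of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.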
  6. Introducing a proxy or orchestration layer
    A scalable architecture is to introduce a middleware layer. The flow would be: Agent > OpenAPI Tool > Function / Workflow > Azure AI Search > Aggregation > Final response. This approach provides:
    • Controlled payload handling
    • Pagination and batching support
    • Improved logging and diagnostics
    • Better error handling
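The middleware idea in step 6 might look roughly like the handler below, which sits between the OpenAPI tool and the search call and guarantees that the serialized response stays under a byte budget. Everything here is an assumption for illustration: `search` is a hypothetical callable wrapping your own search request, and `max_bytes` is an arbitrary budget, because the actual orchestration limit is not stated in this thread.

```python
import json
from typing import Callable, Dict, List

def handle_tool_call(query: str,
                     search: Callable[[str], List[Dict]],
                     max_bytes: int = 16_000) -> Dict:
    """Run a search and trim the lowest-ranked hits until the response fits."""
    docs = search(query)
    kept = [{"title": d.get("title", ""), "chunk": d.get("chunk", "")} for d in docs]
    total = len(kept)
    while True:
        response = {"results": kept, "truncated": len(kept) < total}
        size = len(json.dumps(response).encode("utf-8"))
        if not kept or size <= max_bytes:
            return response
        kept = kept[:-1]  # hits are assumed to be ordered best-first

# Stand-in search returning ten large hits (made-up data).
def fake_search(query: str) -> List[Dict]:
    return [{"title": f"Doc {i}", "chunk": "x" * 1000, "text_vector": [0.0] * 8}
            for i in range(10)]

result = handle_tool_call("pricing", fake_search, max_bytes=4000)
print(len(result["results"]), result["truncated"])
```

The `truncated` flag gives the agent (and your logs) a signal that more results exist, which pairs naturally with the staged retrieval described in step 4.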

To summarise:

The intermittent failures are most likely caused by large tool responses exceeding orchestration limits, with variability depending on query size. The most reliable long-term approach combines:

  • Payload optimization
  • Focused retrieval
  • Summarization/compression
  • Pagination (staged retrieval)
  • Workflow or middleware-based orchestration

The following references might be helpful; please check them out.

Thank you



1 additional answer

  1. Sedat SALMAN 14,455 Reputation points MVP
    2026-05-14T06:30:10.09+00:00

    This looks like a performance issue between the Foundry Agent, your custom OpenAPI tool, and Azure AI Search.

    Here are some suggestions:

    • Test the API directly with Postman/cURL
    • Reduce the Azure AI Search result size ($top=3 to $top=5)
    • Return compact responses only
    • Check API timeout and latency
    • Ensure every API operation has a valid operationId
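The last suggestion can be checked mechanically. The sketch below walks an OpenAPI document that has already been loaded into a Python dict and lists every operation lacking an `operationId`. The function name and the sample spec are made up for illustration.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}

def missing_operation_ids(spec: dict) -> list:
    """Return 'METHOD /path' for each operation that has no operationId."""
    missing = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            # Path items may also hold non-operation keys such as 'parameters'.
            if method.lower() in HTTP_METHODS and not operation.get("operationId"):
                missing.append(f"{method.upper()} {path}")
    return missing

sample_spec = {
    "openapi": "3.0.1",
    "paths": {
        "/search": {
            "parameters": [],
            "post": {"operationId": "searchKnowledgeBase", "responses": {}},
            "get": {"responses": {}},
        }
    },
}
print(missing_operation_ids(sample_spec))
```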

    https://learn.microsoft.com/en-us/azure/foundry-classic/agents/how-to/tools-classic/openapi-spec
