
Function Calling via Foundry Agent in Voice Live API

Cem Işık Doğru 45 Reputation points
2026-02-03T11:49:44.1133333+00:00

Below are the quickstarts for a Foundry agent with the Voice Live API, function calling with the Voice Live API, and a Foundry agent with function calling, respectively:

In the Azure Voice Live API function calling sample, function calling is supported in real-time voice sessions by defining FunctionTool back-end functions that are executed by the application (not by the model itself) as the conversation runs.

Azure AI Foundry agents also support function calling via the Foundry agent service, where the agent returns a function call that must be executed by application code.

Is it possible to combine these two capabilities — i.e., use a Microsoft Foundry Agent with Voice Live so that the agent will perform function calling and have those function calls executed during a real-time Voice Live session? If so, how does the Voice Live client detect and handle the Foundry agent’s function call requests and return results back into the session?

Azure Speech in Foundry Tools

Answer accepted by question author

Anshika Varshney 12,115 Reputation points Microsoft External Staff Moderator
2026-02-09T21:07:40.53+00:00

Hi Cem Işık Doğru,

Thanks for the detailed clarification; this helps narrow down the issue.

At the moment, Microsoft Foundry Agents do not natively execute functions/tools on your behalf, whether the interaction is text-based or Voice Live. This is an important architectural distinction.

How function calling actually works today

  • A Foundry Agent can decide to call a function and will emit a function-call payload (tool invocation) as part of its response.
  • Foundry does not execute the function. There is no built-in runtime inside Foundry that invokes Azure Functions, webhooks, or backend APIs automatically.
  • Execution responsibility always lives in your application/backend, regardless of whether the input comes from:
    • Text chat
    • Voice Live API
    • Realtime streaming scenarios
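To illustrate what "execution lives in your application" can look like, here is a minimal sketch of a local dispatcher. It assumes the function-call payload has a name and a JSON-encoded arguments string; that shape, the `get_weather` tool, and the `TOOLS` registry are all hypothetical placeholders, not a Foundry or Voice Live SDK API.

```python
import json

# Hypothetical local tool; a real application would call an API or service here.
def get_weather(city: str) -> dict:
    return {"city": city, "forecast": "sunny"}

# Registry mapping tool names (as the agent would emit them) to local callables.
TOOLS = {"get_weather": get_weather}

def execute_function_call(call: dict) -> str:
    """Run the function named in a function-call payload.

    `call` is assumed to look like {"name": ..., "arguments": "<JSON string>"}.
    Returns the result serialized as a JSON string.
    """
    func = TOOLS.get(call["name"])
    if func is None:
        # Report unknown tools back instead of raising, so the caller
        # still has something to return to the agent.
        return json.dumps({"error": f"unknown function: {call['name']}"})
    args = json.loads(call["arguments"] or "{}")
    return json.dumps(func(**args))
```

Returning an error payload for unknown tools is a design choice in this sketch: it gives the application something to send back so the conversation is not left waiting on a result.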

Expected lifecycle

  1. Voice Live captures speech and sends transcribed text to your application.
  2. Your application forwards the input to the Foundry Agent.
  3. The Agent returns either:
    • Normal assistant content, or
    • A function call request (tool name + arguments).
  4. Your application must:
    • Detect the function call in the agent response
    • Execute the function (API, service, Azure Function, etc.)
    • Send the function result back to the Agent as a follow-up turn
  5. The Agent then continues the conversation, which Voice Live can speak back to the user.
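The lifecycle above can be sketched as a small orchestration loop. This is only an illustration under stated assumptions: `StubAgent`, `AgentReply`, `send_user_text`, `send_function_result`, and `get_order_status` are hypothetical stand-ins invented for this example, not real Foundry or Voice Live SDK classes or methods. In a real application you would replace the stub with your actual agent client and pass in the transcript received from Voice Live.

```python
import json
from dataclasses import dataclass
from typing import Optional

@dataclass
class AgentReply:
    text: Optional[str] = None            # normal assistant content
    function_call: Optional[dict] = None  # {"call_id", "name", "arguments"}

class StubAgent:
    """Stand-in for an agent client (hypothetical, not a real SDK class)."""
    def send_user_text(self, text: str) -> AgentReply:
        # Pretend the agent decided it needs a tool to answer.
        return AgentReply(function_call={
            "call_id": "call_1",
            "name": "get_order_status",
            "arguments": json.dumps({"order_id": "A123"}),
        })

    def send_function_result(self, call_id: str, output: str) -> AgentReply:
        status = json.loads(output)["status"]
        return AgentReply(text=f"Your order is {status}.")

# Hypothetical backend function executed by the application, not by the agent.
def get_order_status(order_id: str) -> dict:
    return {"order_id": order_id, "status": "shipped"}

TOOLS = {"get_order_status": get_order_status}

def handle_turn(agent, transcript: str, max_calls: int = 5) -> str:
    """Steps 2-5: forward input, run requested functions, return final text."""
    reply = agent.send_user_text(transcript)
    calls = 0
    while reply.function_call is not None:
        if calls >= max_calls:
            raise RuntimeError("agent kept requesting function calls")
        call = reply.function_call
        result = TOOLS[call["name"]](**json.loads(call["arguments"]))
        # Returning the result is the step that lets the agent continue.
        reply = agent.send_function_result(call["call_id"], json.dumps(result))
        calls += 1
    return reply.text or ""

# The returned text is what would be handed back for speech output.
print(handle_turn(StubAgent(), "Where is my order?"))
```

The `max_calls` cap is an assumption added for safety so that a misbehaving agent cannot keep the loop running indefinitely.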

Voice Live–specific note

Voice Live does not change this flow. It does not automatically “listen” for Foundry tool calls or execute them. If the function result is never returned to the Agent, the session can appear to hang, which matches the behavior you’re seeing.

Why direct Azure OpenAI works but Foundry Agent stalls

When calling Azure OpenAI directly, you’re likely handling the function loop explicitly in your app. With Foundry, that same loop must still exist: Foundry only centralizes decision-making, not execution.

Current state of documentation

  • The existing quickstarts correctly describe function calling conceptually, but there is no end‑to‑end Voice Live + Foundry + backend execution sample yet.
  • No native Foundry mechanism currently invokes tools automatically or resumes the agent without the application returning the function result.

Your proposed target architecture is correct; it just needs the backend orchestration layer today.

Hopefully this clarifies the underlying mechanism and why responses never return in your current setup. Do let me know if you have any further queries.


If this answers your query, please do click Accept Answer and Yes for was this answer helpful.

Thank you!



1 additional answer

  1. AI answer

    2026-02-03T11:49:57.81+00:00

    Yes, it is possible to combine the capabilities of a Microsoft Foundry Agent with the Voice Live API to perform function calling during a real-time Voice Live session. In this setup, the Foundry agent can return function call requests that the application code must execute.

    The Voice Live client detects the Foundry agent’s function call requests by integrating the function calling capabilities into the real-time voice session. When the agent identifies that a function needs to be called, it will send the request to the application, which is responsible for executing the function and returning the results back into the session. This allows for dynamic interactions where the agent can leverage back-end functions while maintaining a conversational flow with the user.

    AI-generated content may be incorrect. Read our transparency notes for more information.

