An Azure service that integrates speech processing into apps and services.
Hi Cem Işık Doğru,
Thanks for the detailed clarification this helps narrow the issue.
At the moment, Microsoft Foundry Agents do not natively execute functions/tools on your behalf, whether the interaction is text-based or Voice Live. This is an important architectural distinction.
How function calling actually works today
- A Foundry Agent can decide to call a function and will emit a function-call payload (tool invocation) as part of its response.
- Foundry does not execute the function. There is no built-in runtime inside Foundry that invokes Azure Functions, webhooks, or backend APIs automatically.
- Execution responsibility always lives in your application/backend, regardless of whether the input comes from:
- Text chat
- Voice Live API
- Realtime streaming scenarios
- Voice Live API
- Text chat
Expected lifecycle
- Voice Live captures speech and sends transcribed text to your application.
- Your application forwards the input to the Foundry Agent.
- The Agent returns either:
- Normal assistant content, or
- A function call request (tool name + arguments).
- Your application must:
- Detect the function call in the agent response
- Execute the function (API, service, Azure Function, etc.)
- Send the function result back to the Agent as a follow-up turn
- The Agent then continues the conversation, which Voice Live can speak back to the user.
Voice Live–specific note
Voice Live does not change this flow. It does not automatically “listen” for Foundry tool calls or execute them. If the function result is never returned to the Agent, the session can appear to hang which matches the behavior you’re seeing.
Why direct Azure OpenAI works but Foundry Agent stalls
When calling Azure OpenAI directly, you’re likely handling the function loop explicitly in your app. With Foundry, that same loop must still exist Foundry only centralizes decision-making, not execution.
Current state of documentation
- The existing quickstarts correctly describe function calling conceptually, but there is no end‑to‑end Voice Live + Foundry + backend execution sample yet.
- No native Foundry mechanism currently invokes tools automatically or resumes the agent without the application returning the function result.
Your proposed goal architecture is correct it just needs the backend orchestration layer today.
Hopefully this clarifies the underlying mechanism and why responses never return in your current setup. Do let me know if you have any further queries.
If this answers your query, please do click Accept Answer and Yes for was this answer helpful.
Thankyou!