Hi Sriram Vasan
To configure and access the cache for Azure AI Foundry agents' responses, you can enable semantic caching in Azure API Management.
Here are some steps:
- Add the `azure-openai-semantic-cache-lookup` policy to check the cache before sending requests. You need to specify the `embeddings-backend-id` attribute for the Embeddings API backend you created:

  ```xml
  <azure-openai-semantic-cache-lookup
      score-threshold="0.8"
      embeddings-backend-id="embeddings-deployment"
      embeddings-backend-auth="system-assigned"
      ignore-system-messages="true"
      max-message-count="10">
      <vary-by>@(context.Subscription.Id)</vary-by>
  </azure-openai-semantic-cache-lookup>
  ```
- Add the `azure-openai-semantic-cache-store` policy to store responses for future reuse:

  ```xml
  <azure-openai-semantic-cache-store duration="60" />
  ```

- To verify that semantic caching is functioning correctly, trace a test Completion or Chat Completion operation using the test console in the portal, then inspect the trace on subsequent tries to check whether the cache was used.
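Putting the two policies together, a complete policy definition might look like the following sketch: the lookup policy belongs in the `inbound` section and the store policy in the `outbound` section. The backend ID `embeddings-deployment` and the 60-second cache duration are illustrative values carried over from the snippets above; substitute the names from your own API Management instance.

```xml
<policies>
    <inbound>
        <base />
        <!-- Check the semantic cache before forwarding the request to the backend -->
        <azure-openai-semantic-cache-lookup
            score-threshold="0.8"
            embeddings-backend-id="embeddings-deployment"
            embeddings-backend-auth="system-assigned"
            ignore-system-messages="true"
            max-message-count="10">
            <vary-by>@(context.Subscription.Id)</vary-by>
        </azure-openai-semantic-cache-lookup>
    </inbound>
    <outbound>
        <!-- Store the completion response for reuse; duration is in seconds -->
        <azure-openai-semantic-cache-store duration="60" />
        <base />
    </outbound>
</policies>
```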
For more details, please refer to this link: configure-semantic-caching-policies
Thank You.