Hi Sriram Vasan
To configure and access the cache for Azure AI Foundry agents' responses, you can enable semantic caching in Azure API Management.
Here are some steps:
- Add the `azure-openai-semantic-cache-lookup` policy to check the cache before sending requests. You need to specify the `embeddings-backend-id` attribute for the Embeddings API backend you created:

  ```xml
  <azure-openai-semantic-cache-lookup
      score-threshold="0.8"
      embeddings-backend-id="embeddings-deployment"
      embeddings-backend-auth="system-assigned"
      ignore-system-messages="true"
      max-message-count="10">
      <vary-by>@(context.Subscription.Id)</vary-by>
  </azure-openai-semantic-cache-lookup>
  ```
- Add the `azure-openai-semantic-cache-store` policy to store responses for future reuse:

  ```xml
  <azure-openai-semantic-cache-store duration="60" />
  ```

- To verify that semantic caching is functioning correctly, trace a test Completion or Chat Completion operation using the test console in the portal, then inspect the trace on subsequent tries to check whether the cache was used.
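Putting the two policies together, a complete policy definition might look like the following sketch: the lookup policy belongs in the `inbound` section and the store policy in the `outbound` section. The backend ID `embeddings-deployment` and the 60-second cache duration are illustrative values carried over from the snippets above; substitute the names from your own API Management instance.

```xml
<policies>
    <inbound>
        <base />
        <!-- Check the semantic cache before forwarding the request to the backend -->
        <azure-openai-semantic-cache-lookup
            score-threshold="0.8"
            embeddings-backend-id="embeddings-deployment"
            embeddings-backend-auth="system-assigned"
            ignore-system-messages="true"
            max-message-count="10">
            <vary-by>@(context.Subscription.Id)</vary-by>
        </azure-openai-semantic-cache-lookup>
    </inbound>
    <outbound>
        <!-- Store the completion response for reuse; duration is in seconds -->
        <azure-openai-semantic-cache-store duration="60" />
        <base />
    </outbound>
</policies>
```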
For more details, please refer to this link: configure-semantic-caching-policies
Thank You.