Events
17 Mar, 9 pm - 21 Mar, 10 am
Join the meetup series to build scalable AI solutions based on real-world use cases with fellow developers and experts.
Register nowThis browser is no longer supported.
Upgrade to Microsoft Edge to take advantage of the latest features, security updates, and technical support.
APPLIES TO: All API Management tiers
The azure-openai-semantic-cache-store
policy caches responses to Azure OpenAI Chat Completion API and Completion API requests to a configured external cache. Response caching reduces bandwidth and processing requirements imposed on the backend Azure OpenAI API and lowers latency perceived by API consumers.
Note
Note
Set the policy's elements and child elements in the order provided in the policy statement. Learn more about how to set or edit API Management policies.
The policy is used with APIs added to API Management from the Azure OpenAI Service of the following types:
API type | Supported models |
---|---|
Chat completion | gpt-3.5 gpt-4 gpt-4o1 |
Completion | gpt-3.5-turbo-instruct |
Embeddings | text-embedding-3-large text-embedding-3-small text-embedding-ada-002 |
1 The gpt-4o
model is multimodal (accepts text or image inputs and generates text).
For more information, see Azure OpenAI Service models.
<azure-openai-semantic-cache-store duration="seconds"/>
Attribute | Description | Required | Default |
---|---|---|---|
duration | Time-to-live of the cached entries, specified in seconds. Policy expressions are allowed. | Yes | N/A |
<policies>
<inbound>
<base />
<azure-openai-semantic-cache-lookup
score-threshold="0.05"
embeddings-backend-id ="azure-openai-backend"
embeddings-backend-auth ="system-assigned" >
<vary-by>@(context.Subscription.Id)</vary-by>
</azure-openai-semantic-cache-lookup>
</inbound>
<outbound>
<azure-openai-semantic-cache-store duration="60" />
<base />
</outbound>
</policies>
For more information about working with policies, see:
Events
17 Mar, 9 pm - 21 Mar, 10 am
Join the meetup series to build scalable AI solutions based on real-world use cases with fellow developers and experts.
Register now