Hi,
Thanks for reaching out to Microsoft Q&A.
In Azure OpenAI, you can chain turns with previous_response_id and the service maintains the conversation state for you; however, every prior turn still counts toward input tokens for billing and rate limits on each call. Azure does not provide a standalone token-estimation endpoint comparable to OpenAI's /responses/input_tokens; token usage is reported only after the call, in the response's usage metadata. Likewise, Azure's implementation of the Responses API does not currently expose OpenAI's server-side context-compaction controls (such as context_management), so if you want history trimming or summarisation you need to implement that logic in your application layer.
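To illustrate the chaining pattern, here is a sketch. With the real SDK you would construct `client = AzureOpenAI(azure_endpoint=..., api_key=..., api_version=...)` from the `openai` package and pass your deployment name as `model`; below, a tiny in-memory stub stands in for the client so the call shape and the post-hoc usage accounting can be seen without a live resource (the stub's 4-characters-per-token estimate is purely illustrative):

```python
# Sketch of chaining Responses API turns with previous_response_id.
# A stub stands in for AzureOpenAI; the call signature mirrors
# client.responses.create(model=..., input=..., previous_response_id=...).
import itertools
from types import SimpleNamespace

class StubResponses:
    """Stand-in for client.responses: stores turns keyed by response id."""
    def __init__(self):
        self._ids = itertools.count(1)
        self._history = {}  # response id -> accumulated input turns

    def create(self, *, model, input, previous_response_id=None):
        prior = self._history.get(previous_response_id, [])
        turns = prior + [input]
        rid = f"resp_{next(self._ids)}"
        self._history[rid] = turns
        # All prior turns count toward input tokens again on each call
        # (rough illustrative estimate: 1 token ~= 4 characters).
        input_tokens = sum(len(t) // 4 + 1 for t in turns)
        return SimpleNamespace(
            id=rid,
            output_text=f"(stub answer to: {input})",
            usage=SimpleNamespace(input_tokens=input_tokens, output_tokens=8),
        )

client = SimpleNamespace(responses=StubResponses())

first = client.responses.create(model="my-deployment", input="What is the Responses API?")
follow = client.responses.create(
    model="my-deployment",
    input="How is it billed?",
    previous_response_id=first.id,  # server replays the first turn too
)
# Token usage is only known after the call, via response.usage:
print(follow.usage.input_tokens > first.usage.input_tokens)  # → True: prior turns re-billed
```

The key point the stub demonstrates: the follow-up call's input_tokens include the earlier turn even though you only sent the new message, which is why long chains grow in cost even with server-side state.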
https://learn.microsoft.com/en-in/azure/foundry/openai/how-to/responses
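Since the compaction has to live in your application layer, a minimal client-side trimming sketch might look like the following. This assumes you manage the message list yourself (sending the full input each turn instead of relying on previous_response_id), and the 4-characters-per-token estimate is a rough heuristic; for accuracy, run a real tokenizer such as tiktoken over the messages before sending:

```python
# Minimal client-side history trimming to an approximate token budget.
def estimate_tokens(text: str) -> int:
    """Very rough estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def trim_history(messages, max_input_tokens: int):
    """Keep the newest messages whose combined estimate fits the budget."""
    kept, total = [], 0
    for msg in reversed(messages):  # walk newest-first
        cost = estimate_tokens(msg["content"])
        if total + cost > max_input_tokens:
            break
        kept.append(msg)
        total += cost
    return list(reversed(kept))  # restore chronological order

history = [
    {"role": "user", "content": "First question " * 50},
    {"role": "assistant", "content": "First answer " * 50},
    {"role": "user", "content": "Latest question"},
]
trimmed = trim_history(history, max_input_tokens=60)
print(len(trimmed))  # → 1: only the latest turn fits the budget
```

Instead of dropping old turns outright, you could also summarise them with a separate model call and prepend the summary as a single message; either way the control stays in your code, not the service.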
Please 'Upvote' (Thumbs-up) and 'Accept' the answer if the reply was helpful. This will benefit other community members who face the same issue.