Hello Sanjeet!
Thank you for posting on Microsoft Learn.
This is a known limitation of background job handling in the Azure OpenAI Responses API (and similarly in OpenAI's own platform APIs). Cancellation requests do not guarantee immediate or complete termination of processing, especially for background jobs using stream: false.
Is it known that background jobs may complete after being cancelled?
Yes, this is known and expected behavior. Cancellation is "best effort":
- the backend does not forcibly interrupt GPU/CPU execution mid-process, especially in non-streamed (long-running) tasks
- if cancellation is requested after the job is already picked up for execution, it might still complete
What are the guarantees around cancellation?
There are no hard guarantees that:
- the task will be terminated immediately
- the task won’t complete and return a response
- the backend resources will be freed right away
In short, cancellation is advisory, and due to backend constraints, some jobs will still run to completion.
Should clients ignore completed results of previously cancelled tasks?
Yes. Your client-side logic should maintain a job-state map that records which jobs you have cancelled, for example:
{
"job_id": "abc123",
"status": "cancelled"
}
Then, if a later request to GET /jobs/abc123 returns "completed":
- you should discard the payload if your internal state marks the job as cancelled
- this ensures consistency from the user’s perspective
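The discard rule above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: JobTracker and its method names are hypothetical, not part of any Azure OpenAI SDK, and no network calls are made. The caller is assumed to pass in the status and payload it received from the service.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class JobTracker:
    """Hypothetical client-side map of job_id -> locally known status."""

    _status: Dict[str, str] = field(default_factory=dict)

    def mark_submitted(self, job_id: str) -> None:
        self._status[job_id] = "submitted"

    def mark_cancelled(self, job_id: str) -> None:
        # Record the user's intent locally, whatever the service ends up doing.
        self._status[job_id] = "cancelled"

    def accept_result(
        self, job_id: str, remote_status: str, payload: Any
    ) -> Optional[Any]:
        """Return the payload only if the job was not cancelled locally."""
        if self._status.get(job_id) == "cancelled":
            # A late "completed" result for a cancelled job is discarded.
            return None
        if remote_status == "completed":
            self._status[job_id] = "completed"
            return payload
        # Any other remote status carries no usable result yet.
        return None
```

With this shape, the local "cancelled" mark always wins over whatever the service later reports, so the user never sees output for a job they cancelled.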
Is there a race condition between cancel and actual termination?
Yes, a race condition can occur:
- if the cancellation request reaches the backend after the job has been assigned to a worker (especially with large prompts), it may be too late
- status propagation (from "running" to "cancelled" or "completed") may also lag slightly
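Because of this race, one option is to keep polling after sending the cancel request until the job reports a terminal status, instead of assuming the cancel took effect. The sketch below assumes a caller-supplied fetch_status function (hypothetical, standing in for whatever status lookup you use) and a set of terminal status names. Only "completed" and "cancelled" come from the discussion above; "failed" and the "unknown" timeout value are assumptions.

```python
import time
from typing import Callable

# Assumed terminal status names; "failed" is an assumption, adjust to your API.
TERMINAL_STATUSES = {"completed", "cancelled", "failed"}


def wait_for_terminal_status(
    fetch_status: Callable[[], str],
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job reaches a terminal status or the timeout expires.

    Returns the terminal status, or "unknown" if the timeout is reached first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            # Status propagation may lag; report that we could not confirm.
            return "unknown"
        sleep(interval_s)
```

If the result is "completed" even though a cancel was sent, the job won the race, and the client should treat the job as cancelled and drop the payload.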