Hi Yamini Modipalli,
Thank you for reaching out on Microsoft Q&A.
A sudden spike like this is understandably concerning, especially when you’ve already confirmed it didn’t originate from your application.
At the platform level, Azure OpenAI itself doesn’t expose per‑caller or per‑IP request attribution. If keys are compromised or reused unintentionally, the service will still process those requests and meter usage normally. This means that once usage is observed, the investigation has to rely on Azure-side monitoring and diagnostics, rather than application logs alone.
A few things that may help narrow the scope:
**Azure Monitor metrics.** Review metrics such as Tokens Processed, Requests, and Latency for the affected OpenAI resource. These can sometimes indicate whether the spike was due to a burst of short requests, a smaller number of very large prompts, or sustained traffic over time. Azure OpenAI exposes these metrics but not caller identities.
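As a starting point, the metrics above can be pulled with the Azure CLI. This is a minimal sketch: the resource ID is a placeholder, the time window is an example you should replace with your incident window, and exact metric names can vary, so list the definitions first.

```shell
# Placeholder resource ID - substitute your subscription, group, and account name
RESOURCE_ID="/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.CognitiveServices/accounts/<aoai-name>"

# Confirm which metric names are actually available on this resource
az monitor metrics list-definitions --resource "$RESOURCE_ID" --output table

# Hourly request totals around the spike (times below are examples - adjust)
az monitor metrics list \
  --resource "$RESOURCE_ID" \
  --metric "TotalCalls" \
  --interval PT1H \
  --start-time 2024-01-01T00:00:00Z \
  --end-time 2024-01-02T00:00:00Z \
  --aggregation Total \
  --output table
```

Comparing the shape of the curve (many short spikes vs. one sustained plateau) against your application's known traffic pattern is often enough to tell whether the usage came from outside.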
**Diagnostic logs (if enabled).** If diagnostic settings were enabled before the spike, you may be able to see request volume and timing patterns exported to Log Analytics, Event Hubs, or Storage. If diagnostics were not enabled at the time, unfortunately there's no historical reconstruction available afterward.
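If logs were being exported to Log Analytics, a query along these lines can surface the request pattern. This is a sketch under assumptions: the workspace GUID is a placeholder, and it assumes the logs landed in the `AzureDiagnostics` table (the usual destination for Cognitive Services diagnostics unless resource-specific tables were configured).

```shell
# Placeholder workspace GUID - substitute your Log Analytics workspace ID
WORKSPACE_ID="<log-analytics-workspace-guid>"

# Hourly request counts over the last 7 days for Cognitive Services resources
az monitor log-analytics query \
  --workspace "$WORKSPACE_ID" \
  --analytics-query '
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| where TimeGenerated > ago(7d)
| summarize Requests = count() by bin(TimeGenerated, 1h)
| order by TimeGenerated asc' \
  --output table
```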
**Key regeneration was the right immediate action.** Rotating and deleting keys is the correct containment step when unexpected usage is detected. This prevents any further unauthorized access going forward, even if the exact source can't be identified retroactively.
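For reference, the rotation you already performed can also be scripted. A minimal sketch, with placeholder names: regenerating the keys one at a time lets legitimate workloads move to the new key before the old one is invalidated.

```shell
# Placeholders - substitute your resource name and resource group
# Regenerate key2 first, point applications at it, then regenerate key1
az cognitiveservices account keys regenerate \
  --name "<aoai-name>" --resource-group "<rg>" --key-name key2

# ...after applications are switched over to the new key2:
az cognitiveservices account keys regenerate \
  --name "<aoai-name>" --resource-group "<rg>" --key-name key1
```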
If, after reviewing metrics and logs, the source of the spike is still unclear, do let me know.
For future prevention, consider:
- Enabling diagnostic logs permanently on Azure OpenAI resources
- Using separate keys per environment
- Applying network restrictions or private endpoints, where feasible
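The first and third points above can be put in place with the Azure CLI. This is a hedged sketch: the resource and workspace IDs are placeholders, the `RequestResponse` log category should be confirmed against what your resource actually exposes, and the IP range shown is a documentation example, not a recommendation.

```shell
# Placeholder IDs - substitute your own values
RESOURCE_ID="/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.CognitiveServices/accounts/<aoai-name>"
WORKSPACE_ID="/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.OperationalInsights/workspaces/<law-name>"

# Check which log categories this resource supports before enabling
az monitor diagnostic-settings categories list --resource "$RESOURCE_ID" --output table

# Route request logs and all metrics to Log Analytics permanently
az monitor diagnostic-settings create \
  --resource "$RESOURCE_ID" \
  --name "aoai-diagnostics" \
  --workspace "$WORKSPACE_ID" \
  --logs '[{"category":"RequestResponse","enabled":true}]' \
  --metrics '[{"category":"AllMetrics","enabled":true}]'

# Allow only a known IP range (example range; the resource's network
# default action must also be set to Deny for this to take effect)
az cognitiveservices account network-rule add \
  --name "<aoai-name>" --resource-group "<rg>" --ip-address 203.0.113.0/24
```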
Hope this helps provide some direction, and thank you for sharing the details.
Thank you!