Hello HAL9000,
Welcome to Microsoft Q&A, and thank you for posting your question here.
I understand that your Agent Service in Sweden Central is out of service.
To clear up the confusion about the timeframe (27th vs 28–29th): the discrepancy appears to come from the status page lagging behind actual user experience and from the downstream service coupling (OpenAI > Foundry Agents) being underestimated.
In Sweden Central, Foundry Agent Service v1 shows agents missing and creation returns HTTP 500, persisting after the Jan 27 mitigation window. Therefore, we must (a) restore working agents, (b) recover/validate metadata, and (c) prevent recurrence with EU‑resident resilience.
If you need to restore service immediately for production workloads:
- Create or enable paired deployments (models + Agents) in West Europe and route traffic there now. This aligns with Azure's own outage advice to route to alternative regions during Sweden Central incidents. - https://rssfeed.azure.status.microsoft/en-us/status/feed/
- Confirm that the Persistent Agents or Agents v1 client points at the West Europe project base URL, then re-attempt agent listing/creation. The Foundry Agents SDK and docs show how to bind to a specific project endpoint. - https://learn.microsoft.com/en-us/agent-framework/user-guide/agents/agent-types/azure-ai-foundry-agent
This unblocks traffic through West Europe right away and takes Sweden Central off the critical path while you repair it.
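As a rough client-side illustration of keeping Sweden Central off the critical path, the sketch below tries a list of project endpoints in order and uses the first one whose agent listing call succeeds. The endpoint URLs and the `list_agents` callable are placeholders I am assuming for illustration; in practice the callable would wrap whichever Agents client call you already use against a given project endpoint.

```python
from typing import Callable, Sequence

# Hypothetical project endpoints; replace with your own project base URLs.
ENDPOINTS = [
    "https://example-westeurope.invalid/api/projects/my-project",
    "https://example-swedencentral.invalid/api/projects/my-project",
]


def first_working_endpoint(
    endpoints: Sequence[str],
    list_agents: Callable[[str], list],
) -> tuple[str, list]:
    """Return (endpoint, agents) for the first endpoint whose listing call succeeds."""
    errors = {}
    for endpoint in endpoints:
        try:
            return endpoint, list_agents(endpoint)
        except Exception as exc:  # e.g. an HTTP 500 surfaced by the client
            errors[endpoint] = exc
    raise RuntimeError(f"No endpoint responded: {errors}")
```

Putting West Europe first in the list means Sweden Central is only tried if West Europe itself fails.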
You can also recover the Sweden Central Agent Service v1 state (the "agents disappeared" issue) with the following steps:
- Force a control-plane refresh and re-enumerate the agents.
- Re-attach downstream resources/tools and re-save each agent configuration.
- If agents still don't list or load cleanly, re-create them from your source of truth.
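The "re-create from source of truth" step can be sketched as a small reconciliation loop. This is only an outline under assumptions: `desired` stands for agent definitions you keep outside the service (for example in source control), and `list_agent_names` and `create_agent` are hypothetical callables that you would implement with your own Agents client.

```python
from typing import Callable, Iterable, Mapping


def reconcile_agents(
    desired: Mapping[str, dict],
    list_agent_names: Callable[[], Iterable[str]],
    create_agent: Callable[[str, dict], None],
) -> list[str]:
    """Re-create every agent in `desired` that the service no longer lists."""
    existing = set(list_agent_names())
    recreated = []
    for name in sorted(desired):
        if name not in existing:
            # Hypothetical hook: call your real agent-creation API here.
            create_agent(name, desired[name])
            recreated.append(name)
    return recreated
```

The function returns the names it re-created, which gives you a simple record to validate against once the region is healthy again.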
In addition, you can make a recurrence operationally harmless by running an Active‑Active setup within the EU (West Europe + North Europe or West Europe): keep redundant Agents + model deployments in two EU regions and front them with Azure Front Door or Traffic Manager, using health probes and weighted/failover routing. This is consistent with Azure's own workaround guidance (route to alternative regions during OpenAI incidents) and with the industry best practice highlighted after the Jan 27 outage. - https://rssfeed.azure.status.microsoft/en-us/status/feed/
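To make the routing behaviour concrete, here is a minimal sketch of weighted selection among backends that pass a health probe. It is not Front Door or Traffic Manager configuration, which is done in Azure itself; it only illustrates the logic. The region names, the weights, and the `is_healthy` probe are assumptions for illustration.

```python
import random
from typing import Callable, Mapping, Optional


def pick_backend(
    weights: Mapping[str, int],
    is_healthy: Callable[[str], bool],
    rng: Optional[random.Random] = None,
) -> str:
    """Pick a backend at random, proportionally to weight, among healthy ones."""
    rng = rng or random.Random()
    # Drop backends with no weight or a failing health probe.
    healthy = {name: w for name, w in weights.items() if w > 0 and is_healthy(name)}
    if not healthy:
        raise RuntimeError("No healthy backend available")
    names = list(healthy)
    return rng.choices(names, weights=[healthy[n] for n in names], k=1)[0]
```

With equal weights on two regions, traffic is split while both are healthy; when one fails its probe, all traffic goes to the other.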
I hope this is helpful! Do not hesitate to let me know if you have any other questions or need clarification.
Please don't forget to close out the thread by upvoting this answer and accepting it if it was helpful.