Share via

production down

Dani Anjaya 0 Reputation points
2026-06-12T02:19:06.7266667+00:00

"Face API resource 'rise-try-me' was moved from subscription 12e86485... to b657b2a6... Resource shows Active/Succeeded, NetworkRuleSet DefaultAction=Allow, PublicNetworkAccess=Enabled, but all data-plane requests to the endpoint are connection-reset at the AI gateway with no HTTP response (HTTP/2 stream CANCEL / TCP RST after request sent). Suspect gateway hostname-to-backend mapping not re-registered after subscription move. Production service down."

Azure AI Face
Azure AI Face

An Azure service that provides artificial intelligence algorithms that detect, recognize, and analyze human faces in images.


3 answers

Sort by: Most helpful
  1. Dani Anjaya 0 Reputation points
    2026-06-12T02:55:53.2266667+00:00

    All three re-sync methods attempted — issue persists, suspect data-plane suspension

    Hi, thank you for the suggestions. I have now attempted all three of the recommended synchronization methods, and unfortunately, none of them restored data-plane connectivity:

    1. Metadata modification — Added a new resource tag via Update-AzTag (confirmed applied: tag visible on the resource). No change after propagation wait.
    2. Network rule toggle — Switched NetworkRuleSet DefaultAction from Allow → Deny, waited 90 seconds, then reverted to Allow (confirmed: DefaultAction: Allow, verified via Get-AzCognitiveServicesAccount). No change.
    3. Key regeneration — Regenerated Key2 via New-AzCognitiveServicesAccountKey (completed successfully). No change.

    All three operations succeeded at the control plane, but data-plane routing was not restored.

    Current symptoms remain exactly as before:

    • Resource: Face API "rise-try-me" (RG: mcpp-purchase, Southeast Asia), state Active/Succeeded, SKU S0, PublicNetworkAccess: Enabled, DefaultAction: Allow
    • DNS resolves correctly through the AI gateway chain (*.ai-gateway.southeastasia-01.azure-api.net → Traffic Manager → regional APIM)
    • TLS handshake completes, request is fully sent, then the connection is reset with no HTTP response (HTTP/2: stream CANCEL err 8; HTTP/1.1: TCP RST / "Connection reset by peer")
    • Requests with an intentionally invalid subscription key are also reset — no 401 is returned, which indicates the gateway is not routing the request to the backend at all (authentication is never reached)
    • The same behavior existed before the resource was moved cross-subscription (from 12e86485-... to b657b2a6-...), so the block followed the resource through the move
    • Additionally, a newly created Face resource ("rise-prod") in the destination subscription has been stuck in "Creating" provisioning state for over an hour, which may indicate a related backend/regional issue

    Given that control-plane operations succeed but no data-plane request ever reaches the backend (not even far enough to fail authentication), this looks like a gateway backend-mapping failure or a data-plane suspension that cannot be resolved from the customer side.

    This is a production service with enrolled customer face data, currently down. Could you please advise whether this can be escalated for backend investigation, or confirm that a Microsoft support ticket is the appropriate next step? I have the full diagnostic timeline available if needed.All three re-sync methods attempted — issue persists, suspect data-plane suspension

    Hi, thank you for the suggestions. I have now attempted all three of the recommended synchronization methods, and unfortunately none of them restored data-plane connectivity:

    1. Metadata modification — Added a new resource tag via Update-AzTag (confirmed applied: tag visible on the resource). No change after propagation wait.
    2. Network rule toggle — Switched NetworkRuleSet DefaultAction from Allow → Deny, waited 90 seconds, then reverted to Allow (confirmed: DefaultAction: Allow, verified via Get-AzCognitiveServicesAccount). No change.
    3. Key regeneration — Regenerated Key2 via New-AzCognitiveServicesAccountKey (completed successfully). No change.

    All three operations succeeded at the control plane, but data-plane routing was not restored.

    Current symptoms remain exactly as before:

    • Resource: Face API "rise-try-me" (RG: mcpp-purchase, Southeast Asia), state Active/Succeeded, SKU S0, PublicNetworkAccess: EnabledDefaultAction: Allow
    • DNS resolves correctly through the AI gateway chain (*.ai-gateway.southeastasia-01.azure-api.net → Traffic Manager → regional APIM)
    • TLS handshake completes, request is fully sent, then the connection is reset with no HTTP response (HTTP/2: stream CANCEL err 8; HTTP/1.1: TCP RST / "Connection reset by peer")
    • Requests with an intentionally invalid subscription key are also reset — no 401 is returned, which indicates the gateway is not routing the request to the backend at all (authentication is never reached)
    • The same behavior existed before the resource was moved cross-subscription (from 12e86485-... to b657b2a6-...), so the block followed the resource through the move
    • Additionally, a newly created Face resource ("rise-prod") in the destination subscription has been stuck in "Creating" provisioning state for over an hour, which may indicate a related backend/regional issue

    Given that control-plane operations succeed but no data-plane request ever reaches the backend (not even far enough to fail authentication), this looks like a gateway backend-mapping failure or a data-plane suspension that cannot be resolved from the customer side.

    This is a production service with enrolled customer face data, currently down. Could you please advise whether this can be escalated for backend investigation, or confirm that a Microsoft support ticket is the appropriate next step? I have the full diagnostic timeline available if needed.

    Was this answer helpful?

    0 comments No comments

  2. KUM 0 Reputation points
    2026-06-12T02:49:27.7833333+00:00

    Same for OCR document, production is down also

    Was this answer helpful?

    0 comments No comments

  3. Jose Benjamin Solis Nolasco 8,401 Reputation points Volunteer Moderator
    2026-06-12T02:32:18.0966667+00:00

    Welcome to Microsoft Q&A

    Hello Dani Anjaya, I hope you are doing well

    Based on the symptoms described, the behavior could indicate a potential synchronization latency or gap between the Azure Resource Manager (ARM) control plane and the data-plane gateway routing infrastructure following a cross-subscription migration. While ARM reflects an updated state (Active/Succeeded)

    To address this condition, administrators frequently choose to trigger a manual state refresh to push a new configuration payload to the resource provider. If you decide to proceed with troubleshooting, the following optional methods are common practices for forcing a synchronization event:

    • Key Regeneration: You may choose to navigate to Keys and Endpoint and regenerate an access key. If selected, please note that your consuming applications must be manually updated with the new key string to prevent authentication failures.
    • Network Rule Toggle: You may opt to navigate to Networking and temporarily adjust the firewall rules (e.g., switching from All networks to Selected networks, saving the change, and then reverting the configuration).
    • Metadata Modification: Alternatively, you may apply a modification to the resource metadata by adding or updating a resource tag within the Tags menu.

    Please evaluate these options carefully against your organization's change management policies and production impact thresholds.

    😊 If my answer helped you resolve your issue, please consider marking it as the correct answer. This helps others in the community find solutions more easily. Thanks!

    Was this answer helpful?

    0 comments No comments

Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.