Cosmos DB SQL API account unreachable — TCP 443 fails, SDK returns 503/20003, ActivityId = 00000000-0000-0000-0000-000000000000

Dave Thaler 20 Reputation points
2025-12-19T18:54:39.4+00:00

I appear to be experiencing a complete data‑plane outage on my Azure Cosmos DB SQL API account named orcasound (WestUS2).

All SDK operations fail with:

503 ServiceUnavailable (Substatus 20003)  
GatewayStoreClient Request Timeout  
ActivityId: 00000000-0000-0000-0000-000000000000

The ActivityId is always all zeroes, which I understand indicates the request never reaches the Cosmos DB gateway. To verify this is not an SDK or client issue, I tested connectivity directly.

1. Test-NetConnection:

Test-NetConnection -ComputerName orcasound.documents.azure.com -Port 443

Result:

TcpTestSucceeded : False

2. curl test:

curl -v https://orcasound.documents.azure.com/

cmd.exe result:

* Host orcasound.documents.azure.com:443 was resolved.
* IPv6: (none)
* IPv4: 40.64.135.2
*   Trying 40.64.135.2:443...
* connect to 40.64.135.2 port 443 from 0.0.0.0 port 63195 failed: Timed out
* Failed to connect to orcasound.documents.azure.com port 443 after 21086 ms: Could not connect to server
* closing connection #0
curl: (28) Failed to connect to orcasound.documents.azure.com port 443 after 21086 ms: Could not connect to server

powershell result:

VERBOSE: GET with 0-byte payload
curl : Unable to connect to the remote server
At line:1 char:1
+ curl -v https://orcasound.documents.azure.com/
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (System.Net.HttpWebRequest:HttpWebRequest) [Invoke-WebRequest], WebException
    + FullyQualifiedErrorId : WebCmdletWebResponseException,Microsoft.PowerShell.Commands.InvokeWebRequestCommand

It seems that the data‑plane endpoint for my account is not accepting TCP connections from any network.
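For anyone who wants to repeat the two checks above without PowerShell or curl, here is a minimal sketch that separates the DNS step from the TCP connect step, so it is clear which stage fails. The function name `probe_endpoint` and its result layout are illustrative assumptions, not part of any Azure tooling; it only uses Python's standard `socket` module.

```python
import socket


def probe_endpoint(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Resolve `host` over IPv4, then try a plain TCP connect to each address.

    Hypothetical helper for illustration only; it does not speak TLS or HTTP,
    it only reports whether the TCP handshake completes.
    """
    result = {"host": host, "port": port, "resolved": [], "reachable": [], "errors": {}}

    # Stage 1: DNS. A failure here means the name itself does not resolve.
    try:
        infos = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror as exc:
        result["errors"]["dns"] = str(exc)
        return result

    addresses = sorted({info[4][0] for info in infos})
    result["resolved"] = addresses

    # Stage 2: TCP. A timeout or refusal here matches "TcpTestSucceeded : False".
    for addr in addresses:
        try:
            with socket.create_connection((addr, port), timeout=timeout):
                result["reachable"].append(addr)
        except OSError as exc:
            result["errors"][addr] = str(exc)

    return result
```

Calling `probe_endpoint("orcasound.documents.azure.com")` would be expected to list the resolved address under `resolved` and, if the symptom above reproduces, record a timeout for that address under `errors` while `reachable` stays empty.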

Additional details:

  • DNS resolution succeeds
  • Resource Health shows “Available”
  • RU consumption is normal (~62%)
  • Direct mode also fails
  • Gateway mode fails
  • Attempts from different networks also fail
  • No firewall restrictions (public access = All networks)
  • No service health events

The control plane seems healthy, but the data plane appears to be unreachable.

Any hints?

Azure Cosmos DB
An Azure NoSQL database service for app development.

1 answer

  1. Pilladi Padma Sai Manisha 920 Reputation points Microsoft External Staff Moderator
    2025-12-19T19:24:14.4266667+00:00

    Hi Dave Thaler,
    It sounds like your Azure Cosmos DB account is unreachable, which can be frustrating. Given the diagnostics you've already run, here are some steps and things to consider that might help resolve the issue:

    1. Check Firewall Rules: Since you've noted that public access is set to "All networks", confirm that nothing outside the account settings is blocking traffic at the network level. Even if the settings in Azure are correct, a local firewall, proxy, or other security appliance could still block the necessary ports.
    2. Verify Required Ports: As per the documentation, when using Direct mode, ensure that the required TCP port range (10000-20000) is open. If using Gateway mode, ensure port 443 is accessible. Run a test to check if all specified ports are properly open.
    3. Network Configuration: Since you mentioned testing from different networks and still facing issues, consider testing connectivity with a VPN set to a different location. This can help rule out any specific network restrictions.
    4. SDK Update: If you're using an older version of the SDK, consider updating to the latest version. This can resolve compatibility issues or bugs that might be causing connectivity problems.
    5. Switch Connection Mode: If you're not already using it, try switching from Direct mode to Gateway mode to see whether the behavior changes. You can adjust the client configuration as follows:

         using Microsoft.Azure.Cosmos;

         // Use Gateway mode (HTTPS over port 443) instead of Direct mode.
         CosmosClient client = new CosmosClient(connectionString, new CosmosClientOptions
         {
             ConnectionMode = ConnectionMode.Gateway
         });

    6. Check for Resource Exhaustion: Review CPU and memory usage on your client machine. High resource utilization can lead to connection issues. If CPU usage is over 70%, consider scaling up the resources.
    7. Review Azure Service Health: Even though you've mentioned Resource Health shows "Available", check the Azure status page for any ongoing incidents that may not specifically relate to your account.
    8. Retry Logic: Implement exponential backoff retry logic in your application so that transient errors and brief connection drops are handled more gracefully.
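
As a rough illustration of the retry logic mentioned in step 8, the sketch below shows exponential backoff with jitter. It is not Cosmos DB SDK code: `TransientError`, `retry_with_backoff`, and all parameter names are hypothetical placeholders, and the default delays are arbitrary. Note that backoff only helps with transient failures; it would not help if every connection attempt to the endpoint times out.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure such as a 503."""


def retry_with_backoff(operation, max_attempts=5, base_delay=0.5,
                       max_delay=30.0, sleep=time.sleep):
    """Call `operation` until it succeeds or `max_attempts` is exhausted.

    The wait before retry n (0-based) is drawn uniformly from
    [0, min(max_delay, base_delay * 2**n)], i.e. "full jitter".
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            # Out of attempts: surface the last error to the caller.
            if attempt == max_attempts - 1:
                raise
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, cap))
```

The `sleep` parameter is injectable only so the delays can be observed or skipped in tests; in an application you would leave it at its default and wrap the actual database call in `operation`.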

    If the problem persists, here are some follow-up questions that might give more insight into the situation:

    • Can you confirm any recent changes made in your Azure account or network configurations?
    • Are any additional security features, like VPN or proxies, implemented that may impact traffic?
    • Have you tried accessing the Cosmos DB instance through a different SDK or tool, and were there any differences in behavior?

    Hope this helps! Let me know if you need further assistance!

