Production AKS Cluster Stuck in Failed State - KubernetesAPICallFailed Error
Our production AKS cluster has been in a Failed state for over 7 hours and all recovery attempts have been unsuccessful. The cluster is completely inaccessible and blocking all operations.
Timeline of Events
- 02 Oct 2025, 05:45:52 UTC: Initial operation failure began
- 02 Oct 2025, 07:32:05 UTC: Start operation failed with KubernetesAPICallFailed error
- Current time: Cluster remains in Failed state (7+ hours of downtime)
Error Details
{
  "status": "Failed",
  "error": {
    "code": "ResourceOperationFailure",
    "message": "The resource operation completed with terminal provisioning state 'Failed'.",
    "details": [{
      "code": "KubernetesAPICallFailed",
      "message": "failed to get crd overlayextensionconfigs.acn.azure.com"
    }]
  }
}
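For anyone triaging a similar failure, the nested error payload above can be flattened so the inner code is visible next to the generic outer one. This is only a minimal sketch: the helper name `flatten_error_codes` is hypothetical, and it assumes the payload has exactly the shape shown in this post.

```python
import json

# Error payload copied from this post.
RAW_ERROR = """
{
  "status": "Failed",
  "error": {
    "code": "ResourceOperationFailure",
    "message": "The resource operation completed with terminal provisioning state 'Failed'.",
    "details": [{
      "code": "KubernetesAPICallFailed",
      "message": "failed to get crd overlayextensionconfigs.acn.azure.com"
    }]
  }
}
"""


def flatten_error_codes(payload):
    """Return (code, message) pairs: the outer error first, then each detail."""
    error = payload.get("error", {})
    pairs = [(error.get("code"), error.get("message"))]
    for detail in error.get("details", []):
        pairs.append((detail.get("code"), detail.get("message")))
    return pairs


if __name__ == "__main__":
    for code, message in flatten_error_codes(json.loads(RAW_ERROR)):
        print(f"{code}: {message}")
```

The last pair is the one that matters here: the outer `ResourceOperationFailure` is generic, while the inner `KubernetesAPICallFailed` names the CRD lookup that failed.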
Correlation ID: 051f299c-dc47-4837-8e60-9be8b26bb2e2
Recovery Attempts Made
We have attempted the following recovery actions without success:
- Cluster Start/Stop Operations
  - az aks stop: Failed
  - az aks start: Failed with KubernetesAPICallFailed
- Update Operations
  - az aks update: Returns KubernetesAPICallFailed
  - az resource update: Returns KubernetesAPICallFailed
- Upgrade Attempt
  - az aks upgrade: Failed with "OperationNotAllowed: Upgrades are disallowed while cluster is in a failed state"
- Operation Abort
  - az aks operation-abort: Failed with "OperationNotAllowed: Cancel operation is not allowed when ProvisioningState is Failed"
- REST API Reconciliation
  - Attempted PATCH operations via Azure REST API to force reconciliation
- Attempted to disable network policies to bypass the CRD issue
Current Status
Cluster Operation Status: Failed
Provisioning State: Failed
Power State: Unknown/Failed
API Server: Unreachable
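The provisioning and power states listed above can be read from the JSON that `az aks show` returns. The sketch below is an illustration rather than the exact procedure we used: the function names and the `my-rg` / `my-aks` placeholders are hypothetical, and it assumes the Azure CLI is installed, on the PATH, and already logged in.

```python
import json
import subprocess


def summarize_state(cluster):
    """Pick the provisioning and power state out of an `az aks show` JSON document."""
    return {
        "provisioningState": cluster.get("provisioningState", "Unknown"),
        "powerState": (cluster.get("powerState") or {}).get("code", "Unknown"),
    }


def fetch_cluster(resource_group, name):
    """Run `az aks show` and return its parsed JSON output."""
    result = subprocess.run(
        ["az", "aks", "show",
         "--resource-group", resource_group,
         "--name", name,
         "--output", "json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    # Placeholder names, not values from this post.
    print(summarize_state(fetch_cluster("my-rg", "my-aks")))
```

Missing fields are reported as "Unknown" rather than raising, since a cluster in this condition may not return a complete document.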
Azure Kubernetes Service