At this time, the Activity Logs only show a generic backend error:
“Upgrade Failed with status Unspecified, error: Unknown error”
The failures are tied to the automated Create or Update Agent Pool operations initiated by Microsoft.ContainerService during the scheduled weekly node image upgrade process.
Based on the current findings:
- The Kubernetes control plane itself was not impacted and remains healthy on version 1.35.0
- These failures indicate that the node image upgrade process did not complete successfully for the affected node pools
- Since the node image upgrade failed, the latest weekly node OS/security image updates were most likely not applied to those nodes during this maintenance cycle
- Existing nodes continue running on their previous validated node image, so workloads should remain operational, but the latest security and bug-fix patches may not yet be present
Possible causes for this type of AKS node image upgrade failure commonly include:
- Temporary Azure backend/platform issues
- Node drain failures caused by workloads or Pod Disruption Budgets (PDBs)
- Insufficient surge capacity during upgrade
- VM allocation/SKU availability constraints in the region
- Transient agent pool reconciliation failures
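Two of the causes above (drain failures from PDBs and insufficient surge capacity) can be checked directly. The following is a minimal sketch, not a definitive procedure. The resource group and cluster names are placeholders, and the function names are hypothetical helpers introduced only for illustration. It assumes the Azure CLI and kubectl are installed and already authenticated against the cluster.

```shell
#!/usr/bin/env bash
# Placeholder values (assumptions, not taken from the incident); replace before use.
RESOURCE_GROUP="${RESOURCE_GROUP:-my-resource-group}"
CLUSTER_NAME="${CLUSTER_NAME:-my-aks-cluster}"

# List Pod Disruption Budgets in every namespace.
# A PDB showing 0 under ALLOWED DISRUPTIONS can block a node drain.
list_pdbs() {
  kubectl get pdb --all-namespaces
}

# Show the maxSurge upgrade setting of one node pool (argument: pool name).
# Empty output means no explicit value is set on the pool.
show_max_surge() {
  az aks nodepool show \
    --resource-group "$RESOURCE_GROUP" \
    --cluster-name "$CLUSTER_NAME" \
    --name "$1" \
    --query upgradeSettings.maxSurge \
    --output tsv
}

# Show recent Warning events, which often include drain or eviction failures.
list_warning_events() {
  kubectl get events --all-namespaces --field-selector type=Warning
}
```

For example, `show_max_surge <pool-name>` prints the surge setting for one affected pool, and `list_pdbs` gives a quick view of budgets that currently allow no disruptions.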
Recommended next steps:
- Retry the node image upgrade manually, either from the AKS Upgrade / Node pools section of the Azure portal or via the Azure CLI
- Review node pool events and upgrade operations in Azure Monitor / Activity Logs
- Check whether any strict PDBs or workload constraints prevented node draining
- Confirm the current node image versions on the affected node pools
- If the issue persists, raise a Microsoft support case with correlation IDs from the failed operations for backend investigation
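The steps above could be carried out from the command line roughly as follows. This is a sketch under stated assumptions: the resource group and cluster names are placeholders, the function names are hypothetical helpers, and the 7-day lookback window is an arbitrary choice. The Azure CLI and kubectl are assumed to be installed and authenticated.

```shell
#!/usr/bin/env bash
# Placeholder values (assumptions, not taken from the incident); replace before use.
RESOURCE_GROUP="${RESOURCE_GROUP:-my-resource-group}"
CLUSTER_NAME="${CLUSTER_NAME:-my-aks-cluster}"

# Node image version currently recorded for one node pool (argument: pool name).
current_node_image() {
  az aks nodepool show \
    --resource-group "$RESOURCE_GROUP" \
    --cluster-name "$CLUSTER_NAME" \
    --name "$1" \
    --query nodeImageVersion \
    --output tsv
}

# Latest node image version available for one node pool (argument: pool name).
latest_node_image() {
  az aks nodepool get-upgrades \
    --resource-group "$RESOURCE_GROUP" \
    --cluster-name "$CLUSTER_NAME" \
    --nodepool-name "$1" \
    --query latestNodeImageVersion \
    --output tsv
}

# Per-node view from inside the cluster, using the AKS node image label.
node_images_by_node() {
  kubectl get nodes -L kubernetes.azure.com/node-image-version
}

# Retry only the node image upgrade; the Kubernetes version is left unchanged.
retry_node_image_upgrade() {
  az aks nodepool upgrade \
    --resource-group "$RESOURCE_GROUP" \
    --cluster-name "$CLUSTER_NAME" \
    --name "$1" \
    --node-image-only
}

# Failed operations from the last 7 days, with the correlation IDs
# that a support case would need.
failed_operations() {
  az monitor activity-log list \
    --resource-group "$RESOURCE_GROUP" \
    --status Failed \
    --offset 7d \
    --query "[].{time:eventTimestamp, operation:operationName.value, correlationId:correlationId}" \
    --output table
}
```

If `current_node_image <pool-name>` and `latest_node_image <pool-name>` print different values, the pool is still behind the latest image and `retry_node_image_upgrade <pool-name>` is the manual retry. Depending on where the activity log entries were recorded, the resource group passed to `failed_operations` may need to be adjusted.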