We tried to reproduce the issue on our side. In our testing, the NICs were deleted correctly along with the VMs when using node auto-provisioning (NAP). Here’s what we did:
- Enabled NAP (--node-provisioning-mode Auto)
- Tainted existing nodes, deployed 25 high-resource pods
- NAP provisioned Karpenter VM (aks-default-4f25g, karpenter.sh/nodepool=default)
- Pods ran on Karpenter node
- Scaled deployment to 0
- Karpenter deprovisioned node (~10min)
- VM + NIC cleaned up completely (Portal/CLI confirmed).
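For anyone who wants to repeat the test, the steps above can be sketched roughly as the commands below. This is only an illustration, not the exact commands we ran: the resource group, cluster, deployment, and taint names are hypothetical placeholders, the manifest for the 25 pods is omitted, and the helper only builds and prints the commands rather than running them.

```python
import shlex


def repro_commands(resource_group, cluster, node_resource_group, deployment):
    """Build (but do not run) CLI commands approximating the repro steps.

    All argument values are placeholders supplied by the caller.
    """
    return [
        # Enable node auto-provisioning on an existing cluster.
        ["az", "aks", "update", "-g", resource_group, "-n", cluster,
         "--node-provisioning-mode", "Auto"],
        # Taint existing nodes so new pods need a NAP-provisioned node.
        # The taint key/value here is made up for illustration.
        ["kubectl", "taint", "nodes", "--all", "repro=nap:NoSchedule"],
        # (Deploy the high-resource workload here; manifest not shown.)
        # List nodes that belong to the default Karpenter node pool.
        ["kubectl", "get", "nodes", "-l", "karpenter.sh/nodepool=default"],
        # Scale the workload to zero so the node becomes empty.
        ["kubectl", "scale", "deployment", deployment, "--replicas=0"],
        # After deprovisioning, list NICs left in the node resource group.
        ["az", "network", "nic", "list", "-g", node_resource_group,
         "-o", "table"],
    ]


commands = repro_commands("my-rg", "my-aks", "MC_my-rg_my-aks_eastus", "stress")
for cmd in commands:
    print(shlex.join(cmd))
```

If the final listing still shows a NIC named after the removed node once deprovisioning has finished, that would match the stale NIC behavior described in the question.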
Based on our discussion with Nate, this doesn’t appear to be an inherent AKS issue. It seems more likely to occur when node lifecycle ownership becomes unclear, especially if Karpenter-managed nodes are deprovisioned outside of the AKS node auto-provisioning path. In those situations, the VM may be deleted successfully while dependent resources such as NICs are left behind because of timing, dependency, or policy constraints.
Since we couldn’t reproduce the issue, we’re unable to determine the exact root cause of the stale or unattached NICs from the original scenario, and therefore can’t point to a specific mitigation for that event.
That said, we recommend a few supported ways to detect and manage stale NICs proactively.
- Azure Policy can be used to identify NICs not attached to any VM or VMSS (while excluding Private Endpoint NICs), and with remediation tasks it can help automate controlled cleanup. See: https://docs.azure.cn/en-us/governance/policy/overview
- Azure Resource Graph provides a scalable, read-only way to query and list orphaned NICs across subscriptions for review or automation. See: https://learn.microsoft.com/en-us/azure/governance/resource-graph/concepts/azure-resource-graph-get-list-api
- Azure Automation runbooks (PowerShell or CLI) can be scheduled to periodically scan for unattached NICs and safely remove them, which is a common approach for ongoing environment hygiene. See: https://learn.microsoft.com/en-us/azure/automation/automation-runbook-types?tabs=lps74%2Cpy10
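As a rough illustration of the detection step shared by these options, here is a minimal Python sketch. It assumes NIC records shaped like the JSON from `az network nic list -o json`, where `virtualMachine` and `privateEndpoint` are null when nothing is attached. The Resource Graph query string is likewise a sketch rather than an official query, and the function and variable names are made up for this example. Please validate both against your own environment before automating anything.

```python
# Sketch of a Resource Graph (KQL) query for NICs with no VM and no
# Private Endpoint attached. Review results before acting on them.
ORPHANED_NIC_QUERY = """
Resources
| where type =~ 'microsoft.network/networkinterfaces'
| where isnull(properties.virtualMachine)
| where isnull(properties.privateEndpoint)
| project id, name, resourceGroup, location
"""


def find_orphaned_nics(nics):
    """Return NIC records that have neither a VM nor a Private Endpoint.

    Assumes each record is a dict like an entry from
    `az network nic list -o json` (an assumption about the input shape).
    """
    return [
        nic for nic in nics
        if not nic.get("virtualMachine") and not nic.get("privateEndpoint")
    ]


# Hand-written sample records, not real resources.
sample = [
    {"name": "nic-attached", "virtualMachine": {"id": "vm1"}, "privateEndpoint": None},
    {"name": "nic-private-endpoint", "virtualMachine": None, "privateEndpoint": {"id": "pe1"}},
    {"name": "nic-stale", "virtualMachine": None, "privateEndpoint": None},
]

print([nic["name"] for nic in find_orphaned_nics(sample)])  # ['nic-stale']
```

Deletion is intentionally left out of the sketch. In a scheduled runbook it is usually safer to report or tag candidates first and remove them only after review, since a NIC can be briefly unattached while a node is still being provisioned.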