Get rid of AKS pods stuck in terminating state

Reshma Nair 120 Reputation points
2024-05-21T08:27:11.6066667+00:00

I have an AKS cluster set up with Terraform, and the Kubernetes deployments are managed by Helm charts. During a Helm upgrade, new pods are provisioned, but the old ones enter a Terminating state and get stuck there.


This does not affect the functioning of my cluster, but having numerous pods stuck in a Terminating state is not developer friendly.

I am aware that such pods can be deleted manually with the command below, but I need an automatic way of doing so.

kubectl delete pod <pod-name> --grace-period=0 --force
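
Essentially I am looking for an automated version of something like the sketch below, which force-deletes every pod that has been marked for deletion but is still present (a sketch only, assuming jq is installed, not a definitive fix):

# Sketch: find pods with a deletionTimestamp set (i.e. stuck terminating)
# and force-delete them. Requires jq.
kubectl get pods -A -o json \
  | jq -r '.items[] | select(.metadata.deletionTimestamp != null) | "\(.metadata.namespace) \(.metadata.name)"' \
  | while read -r ns pod; do
      kubectl delete pod "$pod" -n "$ns" --grace-period=0 --force
    done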

I tried the following to find out why the pods are getting stuck in a terminating state, but with no luck:

  • No errors found in my Helm deployment
  • Checked the pod status and found nothing more specific than the following:
    state:
      terminated:
        containerID: containerd://bd887f75ac7aa7d6173fb25dc58eae6bd486daf87a0d305579da99d0a807789f
        exitCode: -1073741510
        finishedAt: "2024-05-21T07:36:00Z"
        reason: Error
        startedAt: "2024-05-13T13:51:32Z"
    hostIP:
    phase: Failed
    podIP:
    podIPs:
    - ip:
    qosClass: BestEffort
    startTime: "2024-05-13T13:51:09Z"
  • No finalizers found on the stuck pods (see the check sketched after this list)
  • Set the pod's spec.terminationGracePeriodSeconds to 60 seconds
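
For reference, the finalizer check was roughly the following (<pod-name> and <namespace> are placeholders):

# Prints the pod's finalizers (empty output means none) and the time the
# deletion was requested.
kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.metadata.finalizers}'
kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.metadata.deletionTimestamp}'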

I am not sure whether the issue lies with the AKS cluster creation or the Helm deployment, or whether there is an alternative way to automatically remove pods stuck in a terminating state.

Any assistance would be greatly appreciated. Thanks in advance.


1 answer

  1. Goncalo Correia 256 Reputation points Microsoft Employee
    2024-05-21T10:53:08.1133333+00:00

    Hi Reshma,

    Thanks for posting your question here!

    This could be happening for multiple reasons; either way, the automatic removal of those pods/containers is (or should be) handled by the garbage collector: https://kubernetes.io/docs/concepts/architecture/garbage-collection/

    You can review the kubelet logs on the nodes where those pods were running and try to figure out why the containers are not being removed or are not found.

    You might also need to follow up with the logs from the container runtime.
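
    For example, on an AKS Linux node you can open a debug shell and read both sets of logs from there. A minimal sketch, assuming a Linux node pool; the node name is a placeholder and the image is just one commonly used debug image:

    # Start an interactive debug pod on the node (node name is a placeholder).
    kubectl debug node/<node-name> -it --image=mcr.microsoft.com/cbl-mariner/busybox:2.0

    # Inside the debug pod the node's filesystem is mounted at /host.
    chroot /host

    # Kubelet logs around the time the pods got stuck:
    journalctl -u kubelet | tail -n 200

    # Container runtime (containerd) view of the stuck containers:
    crictl ps -a
    crictl logs <container-id>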

    Hope this helps!

    Best Regards

    1 person found this answer helpful.