@Vignesh Murugan, thank you for your question.
The kube-controller-manager option --terminated-pod-gc-threshold defines the number of terminated pods that can exist before the terminated pod garbage collector starts deleting terminated pods. If <= 0, the terminated pod garbage collector is disabled. (Reference: kube-controller-manager command-line documentation.)
As you correctly pointed out, the control plane of an AKS cluster is managed by Microsoft, so at the time of writing it is not possible to set this flag.
There can be different approaches to working around this with varying use cases and varying degrees of complexity. For instance, one might build a custom Kubernetes controller which runs outside the control plane and watches for Events on the API Server and performs certain actions based on the defined logic.
Alternatively, if most of the completed Pods that are not removed automatically are managed by a higher-level controller such as a CronJob, the CronJob can clean up its own Jobs (and their Pods) according to its capacity-based cleanup policy, which is set through the .spec.successfulJobsHistoryLimit and .spec.failedJobsHistoryLimit fields.
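As a minimal sketch of the capacity-based policy, the snippet below writes a CronJob manifest that retains at most one successful and one failed Job. The CronJob name, schedule, image and file name are placeholders chosen for illustration and are not taken from the question.

```shell
# Write an example CronJob manifest to the current directory.
# All names and values below are illustrative assumptions.
cat > cronjob-history-limits.yaml <<'EOF'
apiVersion: batch/v1
kind: CronJob
metadata:
  name: example-cronjob
spec:
  schedule: "*/5 * * * *"
  # Keep only the most recent successful Job and the most recent failed Job.
  successfulJobsHistoryLimit: 1
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: task
            image: busybox:1.36
            command: ["sh", "-c", "echo done"]
EOF

# Apply it to the cluster when ready:
# kubectl apply -f cronjob-history-limits.yaml
```

Older Jobs beyond these limits are deleted by the CronJob controller, and their Pods are removed with them.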
TTL mechanism for finished Jobs
FEATURE STATE: Kubernetes v1.21 [beta]
Another way to clean up finished Jobs (either Complete or Failed) automatically is to use a TTL mechanism provided by a TTL controller for finished resources, by specifying the .spec.ttlSecondsAfterFinished field of the Job.
When the TTL controller cleans up the Job, it will delete the Job cascadingly, i.e. delete its dependent objects, such as Pods, together with the Job. Note that when the Job is deleted, its lifecycle guarantees, such as finalizers, will be honored.
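For illustration, a Job that opts into this TTL cleanup could look roughly like the manifest written below. The Job name, image, file name and the 300-second value are arbitrary assumptions. Only the ttlSecondsAfterFinished field comes from the text above.

```shell
# Write an example Job manifest that is cleaned up 300 seconds after it finishes.
# Name, image and TTL value are illustrative assumptions.
cat > job-with-ttl.yaml <<'EOF'
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job-with-ttl
spec:
  # The Job and its Pods become eligible for deletion this many seconds
  # after the Job reaches Complete or Failed.
  ttlSecondsAfterFinished: 300
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: task
        image: busybox:1.36
        command: ["sh", "-c", "echo done"]
EOF

# Apply it to the cluster when ready:
# kubectl apply -f job-with-ttl.yaml
```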
Here is yet another approach that might interest you. With this approach we shall:
- Create a bash script that runs inside a Kubernetes Pod, finds Pods whose pod.status.phase is Succeeded or Failed, and deletes them. (See the Kubernetes documentation on the Pod lifecycle for more on Pod status.)
- Create a docker image that runs this script once every minute
- Push the image to a container registry
- Create a Namespace, Service Account, ClusterRole and ClusterRoleBinding before we deploy the solution. Through the ClusterRoleBinding, the Service Account is granted the cluster-wide get, list and delete permissions on Pods defined in the ClusterRole. In the next step we will assign this Service Account to the Pods of our Deployment so that they can access the required Kubernetes APIs.
- Create a Deployment in the AKS cluster with the aforementioned docker image, using the Service Account created in the previous step.
Creating the bash script
- Create a fresh directory on your client machine and change into it:
mkdir directory-name && cd directory-name
- Create a pod-gc-script.sh file with the following content:
#!/bin/bash
APISERVER=https://kubernetes.default.svc
# Directory where the ServiceAccount files are mounted inside the Pod
SERVICEACCOUNT=/var/run/secrets/kubernetes.io/serviceaccount
# Read this Pod's namespace (not used below)
NAMESPACE=$(cat ${SERVICEACCOUNT}/namespace)
# Read the ServiceAccount bearer token
TOKEN=$(cat ${SERVICEACCOUNT}/token)
# Reference the internal certificate authority (CA)
CACERT=${SERVICEACCOUNT}/ca.crt

# List the Pods whose pod.status.phase is Succeeded or Failed.
# To match additional phases, extend the jq select with further clauses of the form: or .status.phase=="<phase>"
curl --cacert ${CACERT} --header "Authorization: Bearer ${TOKEN}" -X GET ${APISERVER}/api/v1/pods | jq '[.items[] | select (.status.phase=="Succeeded" or .status.phase=="Failed") | .metadata | {name,namespace}]' >/test.json

# Delete the Pods listed in the previous step (jq -r prints the values without quotes)
for(( i=0 ; i < $(jq '.|length' /test.json) ; i++ )) ; do
  curl --cacert ${CACERT} --header "Authorization: Bearer ${TOKEN}" -X DELETE ${APISERVER}/api/v1/namespaces/$(jq -r ".[$i].namespace" /test.json)/pods/$(jq -r ".[$i].name" /test.json)
done
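Before building the image, it may help to try the same selection by hand from a client machine. The sketch below assumes kubectl is already configured against the cluster. The function name delete_terminated_pods and the DRY_RUN variable are hypothetical helpers introduced here, and it uses kubectl field selectors rather than the curl and jq calls of the script above.

```shell
# Delete Pods in the Succeeded or Failed phase across all namespaces.
# Set DRY_RUN=1 to print the kubectl commands instead of running them.
delete_terminated_pods() {
  for phase in Succeeded Failed; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "kubectl delete pods --all-namespaces --field-selector=status.phase=${phase}"
    else
      kubectl delete pods --all-namespaces --field-selector="status.phase=${phase}"
    fi
  done
}
```

Running it with DRY_RUN=1 first shows exactly which commands would be issued. Replacing delete with get in those commands lists the affected Pods without removing anything.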
Create a docker image
- Create a file named Dockerfile in the same working directory with the following contents:
# CentOS 7 reached end of life on June 30, 2024. If the yum steps fail, use a maintained base image that provides bash, curl and jq.
FROM centos:7
RUN yum install epel-release -y
RUN yum update -y
RUN yum install jq -y
COPY ./pod-gc-script.sh /pod-gc-script.sh
RUN chmod +x /pod-gc-script.sh
# Run the script once every minute. To change the interval, adjust the sleep duration.
CMD ["/bin/bash", "-c", "while :; do /pod-gc-script.sh;sleep 60;done"]
- Build the docker image using:
docker build -t <your-registry-server>/<your-repository-name>:<your-tag> .
Push the image to a container registry
- Log in to your container registry. [Reference]
- Push the docker image to your container registry using:
docker push <your-registry-server>/<your-repository-name>:<your-tag>
Create a Namespace, Service Account, ClusterRole and ClusterRoleBinding
In the AKS cluster,
- Create a namespace like:
kubectl create ns pod-gc
- Create a Service Account in the namespace like:
kubectl create sa pod-gc -n pod-gc
- Create a ClusterRole like:
kubectl create clusterrole pod-gc-clusterrole --resource=pods,pods/status --verb=get,list,delete
- Create a ClusterRoleBinding like:
kubectl create clusterrolebinding pod-gc-clusterrolebinding --clusterrole pod-gc-clusterrole --serviceaccount pod-gc:pod-gc
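Once the binding exists, it can be worth confirming that the Service Account really holds the intended permissions before deploying anything. The following is a sketch that impersonates the pod-gc Service Account with kubectl auth can-i. The function name check_pod_gc_rbac is a hypothetical helper, and it assumes your own user is allowed to impersonate Service Accounts.

```shell
# Ask the API server whether the pod-gc Service Account may get, list and
# delete Pods in every namespace. Each check prints "yes" or "no".
check_pod_gc_rbac() {
  for verb in get list delete; do
    printf '%s pods: ' "$verb"
    kubectl auth can-i "$verb" pods --all-namespaces \
      --as=system:serviceaccount:pod-gc:pod-gc
  done
}
```

Calling check_pod_gc_rbac should report yes for all three verbs. A no usually points at a typo in the ClusterRoleBinding subject or ClusterRole name.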
Create Deployment
Create the Deployment on the AKS cluster in the pod-gc namespace, where the Service Account was created:
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: pod-gc
  name: pod-gc
  namespace: pod-gc
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pod-gc
  template:
    metadata:
      labels:
        app: pod-gc
    spec:
      containers:
      - image: <your-registry-server>/<your-repository-name>:<your-tag>
        name: pod-gc
      serviceAccountName: pod-gc
EOF
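After applying the Deployment, a quick check along the following lines can show whether the cleanup loop is running. This is only a sketch: the function name verify_pod_gc and the POD_GC_NAMESPACE variable are hypothetical, and the namespace should be set to wherever the Deployment was actually created.

```shell
# Namespace that holds the pod-gc Deployment. Adjust if it was deployed elsewhere.
POD_GC_NAMESPACE="${POD_GC_NAMESPACE:-pod-gc}"

verify_pod_gc() {
  # Wait for the Deployment to finish rolling out, giving up after the timeout.
  kubectl -n "$POD_GC_NAMESPACE" rollout status deployment/pod-gc --timeout=120s || return 1
  # Show recent output from the cleanup loop.
  kubectl -n "$POD_GC_NAMESPACE" logs deployment/pod-gc --tail=20
  # List terminated Pods that are still present in the cluster.
  kubectl get pods --all-namespaces --field-selector=status.phase=Succeeded
  kubectl get pods --all-namespaces --field-selector=status.phase=Failed
}
```

If the rollout never completes, checking the Deployment's events for a missing Service Account or an image pull error is a reasonable next step. Once the loop is running, the two listings should empty out within roughly a minute.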
Hope this helps.