For convenience, I am posting my suggestion here as well:
Hi Diptesh,
I tried to mimic your setup, and I think I figured out the root cause and found the fix for it.
The issue you're facing, where the pod pulls the container image successfully from ACR but then restarts repeatedly with the message "Back-off restarting failed container", is not due to an image pull error but to a runtime crash inside the container.
Why the container crashes
Ans- Your Docker container expects a required environment variable (in your case, AZP_TOKEN) to authenticate and start a DevOps agent inside the container. If this token is missing or invalid, the container exits immediately, causing the BackOff crash loop you're seeing.
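To make the failure mode concrete, here is a minimal sketch of a startup guard of this kind. It is an assumption, not your actual start.sh: the function name check_azp_token is hypothetical, and it only covers the "missing" case. Detecting an invalid token would additionally require an authentication attempt against Azure DevOps, which is left out here.

```shell
#!/bin/sh
# Hypothetical sketch of an entrypoint guard; not the real start.sh from the image.
# It only checks that AZP_TOKEN is set and non-empty.
check_azp_token() {
  if [ -z "${AZP_TOKEN:-}" ]; then
    # Message text taken from the pod logs quoted in this answer.
    echo "[error] Missing or invalid AZP_TOKEN. Agent will not start." >&2
    return 1
  fi
  return 0
}

# In an entrypoint script the guard would end the process, for example:
#   check_azp_token || exit 1
# A non-zero exit stops the container, Kubernetes restarts it, and the
# repeated restarts show up as the back-off crash loop.
```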
To reproduce this, I built and pushed the image acrsss.azurecr.io/acrrepo:20250419.39 to Azure Container Registry, then deployed it to AKS using a Deployment.yaml with a secret named azp-secret containing a dummy token (invalid-token-xyz). My container's start.sh script checked whether the token was valid. Since it was not, the script exited with code 1 and the pod logs showed:
[error] Missing or invalid AZP_TOKEN. Agent will not start.
How to fix it?
Ans- Make sure that a valid Azure DevOps PAT is stored in the Kubernetes secret.
First, check the current value (kubectl shows it base64-encoded):
kubectl get secret azp-secret -o yaml
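The value under data in that output is base64-encoded, so it has to be decoded before you can compare it with your PAT. A small sketch, assuming the secret name and key used in this answer; the two function names are hypothetical helpers:

```shell
# Decode a base64 string as it appears under .data in a Kubernetes secret.
decode_secret_value() {
  printf '%s' "$1" | base64 --decode
}

# Read the AZP_TOKEN key of azp-secret in the current namespace and decode it.
# Requires kubectl access to the cluster.
show_azp_token() {
  decode_secret_value "$(kubectl get secret azp-secret -o jsonpath='{.data.AZP_TOKEN}')"
}
```

Calling show_azp_token prints the stored token in plain text, so avoid running it where the output ends up in a shared terminal or a CI log.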
If it is incorrect, replace it using:
kubectl create secret generic azp-secret \
--from-literal=AZP_TOKEN=<your-valid-pat-token> \
--dry-run=client -o yaml | kubectl apply -f -
Note that the PAT must have at least the Agent Pools (Read & manage) scope.
Ensure your Deployment YAML refers to this secret:
env:
  - name: AZP_TOKEN
    valueFrom:
      secretKeyRef:
        name: azp-secret
        key: AZP_TOKEN
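Then restart the deployment so the pods pick up the updated secret (environment variables are only read when a container starts):
kubectl rollout restart deployment aks-agent-deployment
Now check the pods; the agent pod should reach the Running state: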
kubectl get pods
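If the pod still restarts, the rollout status and the logs of the previous (crashed) container usually show why. A sketch of those checks, assuming the deployment name aks-agent-deployment used above; the function names are hypothetical and the pod name is whatever kubectl get pods reports:

```shell
# Wait for the restarted deployment to finish rolling out (fails on timeout).
wait_for_rollout() {
  kubectl rollout status deployment/aks-agent-deployment --timeout=120s
}

# Show logs from the previous (crashed) container instance of a pod.
# Usage: previous_logs <pod-name>
previous_logs() {
  kubectl logs "$1" --previous
}
```

The --previous flag matters in a crash loop, because the current container may not have written any output yet when you look.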