How to fix Azure Machine Learning extension deployment in AKS?
Derick John
0
Reputation points
I ran the below command in Azure cloud shell:
az k8s-extension create --name test-ml-extension --extension-type Microsoft.AzureML.Kubernetes --config enableInference=True allowInsecureConnections=True inferenceRouterServiceType=LoadBalancer --cluster-type managedClusters --cluster-name test-kubernetes28a922149 --resource-group training-vm_group --scope cluster
And got the following error:
(ExtensionOperationFailed) The extension operation failed with the following error: Error: [ InnerError: [Helm installation failed : : InnerError [release test-ml-extension failed, and has been uninstalled due to atomic being set: failed pre-install: 1 error occurred:
* pod healthcheck failed
]]] occurred while doing the operation : [Create] on the config, For general troubleshooting visit: https://aka.ms/k8s-extensions-TSG, For more application specific troubleshooting visit: Troubleshooting: https://aka.ms/arcmltsg.
Code: ExtensionOperationFailed
Message: The extension operation failed with the following error: Error: [ InnerError: [Helm installation failed : : InnerError [release test-ml-extension failed, and has been uninstalled due to atomic being set: failed pre-install: 1 error occurred:
* pod healthcheck failed
]]] occurred while doing the operation : [Create] on the config, For general troubleshooting visit: https://aka.ms/k8s-extensions-TSG, For more application specific troubleshooting visit: Troubleshooting: https://aka.ms/arcmltsg.
So I ran the following command to Troubleshoot:
kubectl describe configmap -n azureml arcml-healthcheck
And got back this:
Name: arcml-healthcheck
Namespace: azureml
Labels:
Azure Kubernetes Service
Azure Kubernetes Service
An Azure service that provides serverless Kubernetes, an integrated continuous integration and continuous delivery experience, and enterprise-grade security and governance.
2,447 questions
Sign in to answer