AciDeploymentFailed

Mihai Munteanu 1 Reputation point
2022-06-17T11:13:50.947+00:00

Hello,

I have been struggling with this error for a few days, and the message gives me no indication of the cause.
The error is shown below. Every time I try to access the logs, I get "None". The init() function is a very basic one, taken from your tutorials, and I did not encounter this error while following them.

score.py script:

import pandas as pd  
import numpy as np  
import joblib  
import json  
import os  
from azureml.core.model import Model  # needed by Model.get_model_path() in init()  
  
# Called when the service is loaded  
def init():  
    global model  
    # Get the path to the deployed model file and load it  
    model = joblib.load(Model.get_model_path(model_name='aml_live_model_end'))  
  
# Called when a request is received  
def run(raw_data):  
    # Get the input data as a numpy array  
    data = np.array(json.loads(raw_data)['data'])  
    # Get a prediction from the model  
    predictions = model.predict(data)  
    # Get the corresponding classname for each prediction (0 or 1)  
    classnames = ['De avizat', 'De analizat']  
    predicted_classes = []  
    for prediction in predictions:  
        predicted_classes.append(classnames[prediction])  
    # Return the predictions as JSON  
    return json.dumps(predicted_classes)  

.yaml file

name: aml_live_env  
dependencies:  
- python=3.6.2  
- scikit-learn  
- ipykernel  
- matplotlib  
- pandas  
- pip  
- pip:  
  - azureml-defaults  
  - pyarrow  

The error I receive

Deploying model...  
Tips: You can try get_logs(): https://aka.ms/debugimage#dockerlog or local deployment: https://aka.ms/debugimage#debug-locally to debug if deployment takes longer than 10 minutes.  
Running  
2022-06-17 10:52:16+00:00 Creating Container Registry if not exists.  
2022-06-17 10:52:16+00:00 Registering the environment.  
2022-06-17 10:52:17+00:00 Use the existing image.  
2022-06-17 10:52:17+00:00 Generating deployment configuration.  
2022-06-17 10:52:18+00:00 Submitting deployment to compute.  
2022-06-17 10:52:20+00:00 Checking the status of deployment aml-live-service-model..  
2022-06-17 10:54:07+00:00 Checking the status of inference endpoint aml-live-service-model.  
Failed  
Service deployment polling reached non-successful terminal state, current service state: Failed  
Operation ID: 93d48e89-cb16-4d1c-bbb6-f453acaeaa7f  
More information can be found using '.get_logs()'  
Error:  
{  
  "code": "AciDeploymentFailed",  
  "statusCode": 400,  
  "message": "Aci Deployment failed with exception: Your container application crashed. This may be caused by errors in your scoring file's init() function.  
	1. Please check the logs for your container instance: aml-live-service-model. From the AML SDK, you can run print(service.get_logs()) if you have service object to fetch the logs.  
	2. You can interactively debug your scoring file locally. Please refer to https://learn.microsoft.com/azure/machine-learning/how-to-debug-visual-studio-code#debug-and-troubleshoot-deployments for more information.  
	3. You can also try to run image libraaimlwor4c93b458.azurecr.io/azureml/azureml_8fd1decee2b379a3d59fed509c692f31 locally. Please refer to https://aka.ms/debugimage#service-launch-fails for more information.",  
  "details": [  
    {  
      "code": "CrashLoopBackOff",  
      "message": "Your container application crashed. This may be caused by errors in your scoring file's init() function.  
	1. Please check the logs for your container instance: aml-live-service-model. From the AML SDK, you can run print(service.get_logs()) if you have service object to fetch the logs.  
	2. You can interactively debug your scoring file locally. Please refer to https://learn.microsoft.com/azure/machine-learning/how-to-debug-visual-studio-code#debug-and-troubleshoot-deployments for more information.  
	3. You can also try to run image libraaimlwor4c93b458.azurecr.io/azureml/azureml_8fd1decee2b379a3d59fed509c692f31 locally. Please refer to https://aka.ms/debugimage#service-launch-fails for more information."  
    },  
    {  
      "code": "AciDeploymentFailed",  
      "message": "Your container application crashed. Please follow the steps to debug:  
	1. From the AML SDK, you can run print(service.get_logs()) if you have service object to fetch the logs. Please refer to https://aka.ms/debugimage#dockerlog for more information.  
	2. If your container application crashed. This may be caused by errors in your scoring file's init() function. You can try debugging locally first. Please refer to https://aka.ms/debugimage#debug-locally for more information.  
	3. You can also interactively debug your scoring file locally. Please refer to https://learn.microsoft.com/azure/machine-learning/how-to-debug-visual-studio-code#debug-and-troubleshoot-deployments for more information.  
	4. View the diagnostic events to check status of container, it may help you to debug the issue.  
"RestartCount": 5  
"CurrentState": {"state":"Waiting","startTime":null,"exitCode":null,"finishTime":null,"detailStatus":"CrashLoopBackOff: Back-off restarting failed"}  
"PreviousState": {"state":"Terminated","startTime":"2022-06-17T10:56:57.554Z","exitCode":111,"finishTime":"2022-06-17T10:57:01.314Z","detailStatus":"Error"}  
"Events":  
{"count":1,"firstTimestamp":"2022-06-17T10:26:38Z","lastTimestamp":"2022-06-17T10:26:38Z","name":"Pulling","message":"pulling image "libraaimlwor4c93b458.azurecr.io/azureml/azureml_8fd1decee2b379a3d59fed509c692f31@sha256:47fc896a553e4b7bb972cbaa9a31de99b4755688dd525b9c725563ddde86aa0c"","type":"Normal"}  
{"count":1,"firstTimestamp":"2022-06-17T10:27:42Z","lastTimestamp":"2022-06-17T10:27:42Z","name":"Pulled","message":"Successfully pulled image "libraaimlwor4c93b458.azurecr.io/azureml/azureml_8fd1decee2b379a3d59fed509c692f31@sha256:47fc896a553e4b7bb972cbaa9a31de99b4755688dd525b9c725563ddde86aa0c"","type":"Normal"}  
{"count":10,"firstTimestamp":"2022-06-17T10:28:00Z","lastTimestamp":"2022-06-17T10:47:11Z","name":"Started","message":"Started container","type":"Normal"}  
{"count":9,"firstTimestamp":"2022-06-17T10:28:03Z","lastTimestamp":"2022-06-17T10:40:46Z","name":"Killing","message":"Killing container with id a7e717efa63259b36b19bc4951b3f3dcc5f1093177e729c589355a7371353ca3.","type":"Normal"}  
{"count":1,"firstTimestamp":"2022-06-17T10:31:33Z","lastTimestamp":"2022-06-17T10:31:33Z","name":"Pulling","message":"pulling image "libraaimlwor4c93b458.azurecr.io/azureml/azureml_8fd1decee2b379a3d59fed509c692f31@sha256:47fc896a553e4b7bb972cbaa9a31de99b4755688dd525b9c725563ddde86aa0c"","type":"Normal"}  
{"count":1,"firstTimestamp":"2022-06-17T10:32:31Z","lastTimestamp":"2022-06-17T10:32:31Z","name":"Pulled","message":"Successfully pulled image "libraaimlwor4c93b458.azurecr.io/azureml/azureml_8fd1decee2b379a3d59fed509c692f31@sha256:47fc896a553e4b7bb972cbaa9a31de99b4755688dd525b9c725563ddde86aa0c"","type":"Normal"}  
{"count":5,"firstTimestamp":"2022-06-17T10:47:52Z","lastTimestamp":"2022-06-17T10:52:09Z","name":"Started","message":"Started container","type":"Normal"}  
{"count":6,"firstTimestamp":"2022-06-17T10:48:26Z","lastTimestamp":"2022-06-17T10:52:37Z","name":"Killing","message":"Killing container with id 2bdec1005a6dd58312e10ab939d88ea08b312771de6573d9b86c5f571104277e.","type":"Normal"}  
"  
    }  
  ]  
}  
  
---------------------------------------------------------------------------  
WebserviceException                       Traceback (most recent call last)  
<ipython-input-17-315dbb5f83ec> in <module>  
     16 service_name = "aml-live-service-model"  
     17 service = Model.deploy(ws, service_name, [model], inference_config, deployment_config, overwrite=True)  
---> 18 service.wait_for_deployment(True)  
     19 print(service.state)  
  
/anaconda/envs/azureml_py38/lib/python3.8/site-packages/azureml/core/webservice/webservice.py in wait_for_deployment(self, show_output, timeout_sec)  
    916                     logs_response = 'Current sub-operation type not known, more logs unavailable.'  
    917   
--> 918                 raise WebserviceException('Service deployment polling reached non-successful terminal state, current '  
    919                                           'service state: {}\n'  
    920                                           'Operation ID: {}\n'  
  
WebserviceException: WebserviceException:  
	Message: Service deployment polling reached non-successful terminal state, current service state: Failed  
Operation ID: 93d48e89-cb16-4d1c-bbb6-f453acaeaa7f  
More information can be found using '.get_logs()'  
	InnerException None  
Azure Machine Learning

3 answers

  1. Mihai Munteanu 1 Reputation point
    2022-06-20T07:11:19.413+00:00

    @Ramr-msft I've tried both of the commands you mentioned above. Unfortunately, they tell me nothing: both of them display "None", which does not point to any cause.
    212841-image.png

    Also, I've tried to debug it locally. This is the error that I receive:

    The environment variable 'AZUREML_MODEL_DIR' has not been set.  
    Use the --model_dir command line argument to set it.  
      
    Azure ML Inferencing HTTP server v0.4.11  
      
      
    Server Settings  
    ---------------  
    Entry Script Name: score_aml_live.py  
    Model Directory: None  
    Worker Count: 1  
    Worker Timeout (seconds): None  
    Server Port: 5001  
    Application Insights Enabled: false  
    Application Insights Key: None  
      
      
    Server Routes  
    ---------------  
    Liveness Probe: GET   127.0.0.1:5001/  
    Score:          POST  127.0.0.1:5001/score  
      
    Starting gunicorn 20.1.0  
    Listening at: http://0.0.0.0:5001 (17478)  
    Using worker: sync  
    Booting worker with pid: 17483  
    Initializing logger  
    2022-06-20 06:41:02,353 | root | INFO | Starting up app insights client  
    logging socket not found. logging not available.  
    logging socket not found. logging not available.  
    2022-06-20 06:41:02,358 | root | INFO | Starting up request id generator  
    2022-06-20 06:41:02,358 | root | INFO | Starting up app insight hooks  
    2022-06-20 06:41:02,358 | root | INFO | Invoking user's init function  
    2022-06-20 06:41:02,358 | root | ERROR | User's init function failed  
    2022-06-20 06:41:02,374 | root | ERROR | Encountered Exception Traceback (most recent call last):  
      File "/anaconda/envs/azureml_py38/lib/python3.8/site-packages/azureml_inference_server_http/server/aml_blueprint.py", line 201, in register  
        main.init()  
      File "/score_aml_live.py", line 41, in init  
        model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'aml_live_model_end.pkl')  
      File "/lib/python3.8/posixpath.py", line 76, in join  
        a = os.fspath(a)  
    TypeError: expected str, bytes or os.PathLike object, not NoneType  
    

  2. Ramr-msft 17,611 Reputation points
    2022-06-20T07:39:14.62+00:00

    @Mihai Munteanu Thanks for the details. Are your model files in the local file system or in the AzureML directory? Here is a document on debugging a failing model path.
    Also, here is a document on advanced entry scripts. If you are still facing an issue, please share the notebook.
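
The local traceback above fails because `os.getenv('AZUREML_MODEL_DIR')` returns `None` when the variable is not set, and `os.path.join` cannot accept `None`. Below is a minimal sketch of a more defensive lookup, assuming the model file is named `aml_live_model_end.pkl` as in the traceback. The helper name `resolve_model_path` and its `fallback_dir` and `env` parameters are hypothetical and not part of the Azure ML SDK.

```python
import os


def resolve_model_path(filename, fallback_dir=None, env=None):
    """Return the full path to the model file, or raise a readable error.

    `resolve_model_path`, `fallback_dir` and `env` are illustrative names only.
    """
    env = os.environ if env is None else env
    # Azure ML sets AZUREML_MODEL_DIR inside the deployed container;
    # it is usually unset when the entry script runs on a local machine.
    model_dir = env.get("AZUREML_MODEL_DIR") or fallback_dir
    if model_dir is None:
        raise RuntimeError(
            "AZUREML_MODEL_DIR is not set and no fallback_dir was given; "
            "set the variable or pass --model_dir to the local server."
        )
    return os.path.join(model_dir, filename)
```

Inside `init()` this could be used as `model = joblib.load(resolve_model_path('aml_live_model_end.pkl'))`. Depending on how the model was registered, the file may sit in a subfolder of the model directory, so the exact relative path is something to verify rather than assume.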


  3. Roxana 1 Reputation point
    2022-11-28T09:38:04.237+00:00

    Hi @Mihai Munteanu @Ramr-msft I am also facing exactly the same issue. Are there any tips that you can share with us? Thank you, Roxana.
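
One general tip for this kind of failure is to exercise the entry script's request and response handling locally before deploying, so that problems in the script can be separated from problems in the container. The sketch below is only an illustration under stated assumptions: `StubModel` is a hypothetical stand-in for the real model, and this `run()` takes the model as a second argument and skips numpy so that it has no dependencies. The payload shape (`{"data": [...]}`) and the class names come from the score.py posted in the question.

```python
import json

CLASSNAMES = ['De avizat', 'De analizat']


class StubModel:
    """Hypothetical stand-in: predicts class 1 when the first feature is positive."""

    def predict(self, rows):
        return [1 if row[0] > 0 else 0 for row in rows]


def run(raw_data, model):
    # Same flow as the posted run(): parse JSON, predict, map to class names
    data = json.loads(raw_data)['data']
    predictions = model.predict(data)
    return json.dumps([CLASSNAMES[int(p)] for p in predictions])


# Build a request body in the format the scoring endpoint expects
payload = json.dumps({'data': [[0.5, 1.0], [-2.0, 3.0]]})
result = run(payload, StubModel())
print(result)
```

If a check like this passes but the deployed service still crashes, the cause is more likely in `init()` (for example the model path or a missing import) or in the environment definition than in `run()`.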
