Error when calling an automated ML endpoint service

Xu, Pengcheng (CN - AB) 0 Reputation points
2024-07-06T09:10:58.85+00:00

I followed the "Tutorial: Training a Classification Model in Azure Machine Learning Studio with No-Code AutoML" to create the endpoint service. Calling it with the sample Python code returns this error message:

{"message": "An unexpected error occurred in scoring script. Check the logs for more info."}

The endpoint log shows this error:

{
    "error": {
        "code": "UserError",
        "message": "Expected column(s) 0 not found in fitted data.",
        "target": "X",
        "inner_error": {
            "code": "BadArgument",
            "inner_error": {
                "code": "MissingColumnsInData"
            }
        },
        "reference_code": "17049f70-3bbe-4060-a63f-f06590e784e5"
    }
}

The input data I used:

data = {
    "Inputs": {
        # "columns": ["age", "job", "marital", "education", "default", "housing", "loan", "contact", "month", "duration", "campaign",
        #             "pdays", "previous", "poutcome", "emp.var.rate", "cons.price.idx", "cons.conf.idx", "euribor3m", "nr.employed"],
        "columns": [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18],
        "index": [0, 1],
        "data": [
            [57, "technician", "married", "high.school", "no", "no", "yes", "cellular", "may", 371, 1, 999, 1, "failure", -1.8, 92.893, -46.2, 1.299, 5099.1],
            [30, "blue-collar", "single", "basic.9y", "no", "yes", "no", "cellular", "jul", 221, 1, 999, 0, "nonexistent", 1.4, 93.994, -36.4, 4.857, 5191]
        ]
    }
}

Please advise how to resolve this.

Azure AI services

3 answers

  1. Amira Bedhiafi 24,531 Reputation points
    2024-07-07T12:42:54.38+00:00

    First, verify that the column names and their order in the input data exactly match what was used during training. As I read the error message, the service expects column names rather than numerical indices.

    In other words, the input data provided for scoring must have the same structure (column names and data types) as the data used to train the model.

    So, based on the example provided, you should replace the numerical indices with the corresponding column names:

    
    data = {
        "Inputs": {
            "columns": [
                "age", "job", "marital", "education", "default", "housing", "loan", "contact", "month", "duration", "campaign",
                "pdays", "previous", "poutcome", "emp.var.rate", "cons.price.idx", "cons.conf.idx", "euribor3m", "nr.employed"
            ],
            "index": [0, 1],
            "data": [
                [57, "technician", "married", "high.school", "no", "no", "yes", "cellular", "may", 371, 1, 999, 1, "failure", -1.8, 92.893, -46.2, 1.299, 5099.1],
                [30, "blue-collar", "single", "basic.9y", "no", "yes", "no", "cellular", "jul", 221, 1, 999, 0, "nonexistent", 1.4, 93.994, -36.4, 4.857, 5191]
            ]
        }
    }
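
As a rough sketch of the substitution described above, the helper below swaps positional column labels for names in a payload shaped like the one in the question. `with_named_columns` is a hypothetical helper, not part of any Azure SDK, and the column list is copied from the commented-out line in the question. Whether the scoring script accepts this payload shape at all is not confirmed here.

```python
import copy

# Column names copied from the commented-out list in the question.
FEATURE_COLUMNS = [
    "age", "job", "marital", "education", "default", "housing", "loan",
    "contact", "month", "duration", "campaign", "pdays", "previous",
    "poutcome", "emp.var.rate", "cons.price.idx", "cons.conf.idx",
    "euribor3m", "nr.employed",
]


def with_named_columns(payload, names):
    """Return a copy of payload whose "columns" entry holds names.

    Hypothetical helper: it only relabels columns and checks lengths,
    it does not validate data types or call the endpoint.
    """
    inputs = payload["Inputs"]
    if len(inputs["columns"]) != len(names):
        raise ValueError("number of names does not match number of columns")
    for row in inputs["data"]:
        if len(row) != len(names):
            raise ValueError("row length does not match number of columns")
    result = copy.deepcopy(payload)  # leave the caller's payload untouched
    result["Inputs"]["columns"] = list(names)
    return result
```

Calling `with_named_columns(data, FEATURE_COLUMNS)` on the 19-column payload from the question would return a new dictionary with named columns and leave `data` unchanged.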
    
    

  2. Xu, Pengcheng (CN - AB) 0 Reputation points
    2024-07-08T09:27:06.11+00:00

    Thanks for the reply.

    I replaced the numerical indices with the column names, but I still get the same error.

    I also tried the data you provided, and the error is the same.

    I also compared it against the dataset used for the model: the values and the order of the columns are the same.

    [Image: dataset capture]

  3. romungi-MSFT 45,731 Reputation points Microsoft Employee
    2024-07-11T14:29:54.93+00:00

    @Xu, Pengcheng (CN - AB) I think the request data may not be formatted correctly in this case. I do not have this setup to test, but the same example with the same dataset is available in the azureml-examples repo. Please check the notebook there: its "Test the deployment" section uses the following request format instead of the one in your request.

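    import pandas as pd

    # Load the sample test data and drop the label column "y".
    test_data = pd.read_csv("./data/test-mltable-folder/bank_marketing_test_data.csv")
    test_data = test_data.drop("y", axis=1)

    # Serialize each row as a JSON object keyed by column name.
    test_data_json = test_data.to_json(orient="records", indent=4)

    # Wrap the records in the "input_data" envelope used by the notebook.
    data = '{"input_data": {"data": ' + test_data_json + "}}"

    request_file_name = "sample-request-bankmarketing.json"

    with open(request_file_name, "w") as request_file:
        request_file.write(data)

    # ml_client and online_endpoint_name are assumed to be defined
    # earlier in the notebook.
    ml_client.online_endpoints.invoke(
        endpoint_name=online_endpoint_name,
        deployment_name="bankmarketing-deploy",
        request_file=request_file_name,
    )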
    
    
    

    If you run the sample and print the request data, you can see the correct format to use with your deployment. Thanks!
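
To see the request shape without the sample CSV, here is a minimal sketch that uses only the standard library. It builds the same kind of `{"input_data": {"data": [...]}}` body that the snippet above produces with `to_json(orient="records")`, that is, one JSON object per row keyed by column name. `build_request_body` is a hypothetical helper name, the three columns are truncated for illustration, and whether a given deployment expects this shape is an assumption to verify against its own scoring script.

```python
import json


def build_request_body(columns, rows):
    """Build a records-style request body as a JSON string.

    Hypothetical helper: pairs each row with the column names and wraps
    the resulting list in the "input_data" envelope.
    """
    records = []
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("row length does not match number of columns")
        records.append(dict(zip(columns, row)))
    return json.dumps({"input_data": {"data": records}}, indent=4)


# Illustrative subset of the columns and values from the question.
columns = ["age", "job", "marital"]
rows = [
    [57, "technician", "married"],
    [30, "blue-collar", "single"],
]

body = build_request_body(columns, rows)
print(body)  # inspect the request format before sending it
```

Printing the body first makes it easy to compare against the payload in the question, which uses an `"Inputs"` key with separate `"columns"`, `"index"`, and `"data"` entries instead.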
