AzureML and PowerBI compatible endpoint error - "list index out of range"

Ed Lockhart 1

I have produced a model that is deployed as an ACI, which I can send data to via the REST API in Python using the code described here (https://learn.microsoft.com/en-us/azure/machine-learning/how-to-consume-web-service?tabs=python#call-the-service-python).

However, I have run into a problem producing a PowerBI compatible endpoint, which requires an inference schema as described here (https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-advanced-entry-script#power-bi-compatible-endpoint).

I have adapted the input samples as required; the deployment succeeds and is displayed as healthy. The schema in the generated JSON looks correct. However, when I send data through the REST API using the first script, the only response is "list index out of range". The only change is the addition of the inference schema to the entry script.

Any idea what might be causing this error? I have tried changing various aspects of the data being sent and of the entry script, but it always ends with the same error.

Thanks

EDIT:

I should also clarify that the model returns a probability for each class using a wrapper around .predict(), which worked fine before I added the PowerBI compatibility.

import mlflow.pyfunc

class SklearnModelWrapper(mlflow.pyfunc.PythonModel):
    def __init__(self, model):
        self.model = model

    def predict(self, model_input):
        # Return per-class probabilities instead of class labels
        return self.model.predict_proba(model_input)

This is a binary problem, so the output for two rows of data has the format [[0.3, 0.7], [0.64, 0.36]], giving the probability of each class. I have tried this as the sample output in the schema with no change: the same error is produced whether I deploy a wrapped model that returns per-class probabilities or a plain model that returns the class. Both work fine without the inference schema.

Entry script:

import numpy as np  
import pandas as pd  
import joblib  
from azureml.core.model import Model  
  
from inference_schema.schema_decorators import input_schema, output_schema  
from inference_schema.parameter_types.standard_py_parameter_type import StandardPythonParameterType  
from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType  
from inference_schema.parameter_types.pandas_parameter_type import PandasParameterType  
  
def init():  
    global model  
    #Model name is the name of the model registered under the workspace  
    model_path = Model.get_model_path(model_name = 'databricksmodelpowerbi2')  
    model = joblib.load(model_path)  
  
#Provide 3 sample inputs for schema generation for 2 rows of data  
numpy_sample_input = NumpyParameterType(np.array([[2400.0, 78.26086956521739, 11100.0, 3.612565445026178, 3.0, 0.0], [368.55, 96.88311688311687, 709681.1600000012, 73.88059701492537, 44.0, 0.0]], dtype = 'float64'))  
pandas_sample_input = PandasParameterType(pd.DataFrame({'value': [2400.0, 368.55], 'delayed_percent': [78.26086956521739, 96.88311688311687], 'total_value_delayed': [11100.0, 709681.1600000012], 'num_invoices_per30_dealing_days': [3.612565445026178, 73.88059701492537], 'delayed_streak': [3.0, 44.0], 'prompt_streak': [0.0, 0.0]}))  
standard_sample_input = StandardPythonParameterType(0.0)  
  
# This is a nested input sample, any item wrapped by `ParameterType` will be described by schema  
sample_input = StandardPythonParameterType({'input1': numpy_sample_input,   
                                            'input2': pandas_sample_input,   
                                            'input3': standard_sample_input})  
  
sample_global_parameters = StandardPythonParameterType(1.0) #this is optional  
sample_output = StandardPythonParameterType([1.0, 1.0])  
  
@input_schema('inputs', sample_input)
@input_schema('global_parameters', sample_global_parameters) #this is optional
@output_schema(sample_output)
def run(inputs, global_parameters):
    try:  
        data = inputs['input1']  
        # data will be converted to the target format
        assert isinstance(data, np.ndarray)  
        result = model.predict(data)  
        return result.tolist()  
    except Exception as e:  
        error = str(e)  
        return error

Prediction script:

import requests  
import json  
from ast import literal_eval  
  
# URL for the web service  
scoring_uri = ''  
## If the service is authenticated, set the key or token  
#key = '<your key or token>'  
  
# Two sets of data to score, so we get two results back  
data = {"data": [[2400.0, 78.26086956521739, 11100.0, 3.612565445026178, 3.0, 0.0], [368.55, 96.88311688311687, 709681.1600000012, 73.88059701492537, 44.0, 0.0]]}  
# Convert to JSON string  
input_data = json.dumps(data)  
  
# Set the content type  
headers = {'Content-Type': 'application/json'}  
## If authentication is enabled, set the authorization header  
#headers['Authorization'] = f'Bearer {key}'  
  
# Make the request and display the response  
resp = requests.post(scoring_uri, input_data, headers=headers)  
print(resp.text)  
  
result = literal_eval(resp.text)
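
Because run() in the entry script returns str(e) when it catches an exception, the response body may be an error message rather than a list of predictions. A minimal sketch of telling the two apart, assuming the service returns a JSON-encoded body; parse_scoring_response is a hypothetical helper name, not part of any Azure ML SDK:

```python
import json

def parse_scoring_response(text):
    """Return (predictions, error); exactly one of the two is None."""
    try:
        body = json.loads(text)
    except json.JSONDecodeError:
        # Not JSON at all: treat the raw text as the error message.
        return None, text
    if isinstance(body, list):
        return body, None
    # A JSON string (or any other non-list value) is treated as an error here.
    return None, str(body)

# Illustrative response bodies, not captured from a real service.
predictions, error = parse_scoring_response("[1.0, 0.0]")
_, index_error = parse_scoring_response('"list index out of range"')
```

In the prediction script this could be applied to resp.text in place of literal_eval, which fails on a bare error message.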

Yifei Yu 1 Reputation point

2020-10-16T02:26:28.357+00:00

Hi,

Could you please share the sample input & output, run() function signature (including the input schema annotation) and a sample request to help me repro what you are encountering? Thank you very much!

Yifei

Ed Lockhart 1 Reputation point

2020-10-16T07:46:29.467+00:00

Added the requested code. Thanks!

Yifei Yu 1 Reputation point

2020-11-02T04:00:12.733+00:00

Apologies for the delay. The PowerBI compatible endpoint requires "Inputs" and "GlobalParameters" fields in the input, because that is the existing contract between the PowerBI service and its clients. If you don't need to integrate with PowerBI, you can skip providing a compatible Swagger schema. If you do need PowerBI integration, the scoring payload in your case should look something like:

{"Inputs": {"data": [[2400.0, 78.26086956521739, 11100.0, 3.612565445026178, 3.0, 0.0], [368.55, 96.88311688311687, 709681.1600000012, 73.88059701492537, 44.0, 0.0]]}, "GlobalParameters": {}}

You could provide the below sample input:

sample_input = StandardPythonParameterType({'data': numpy_sample_input})

And in the scoring script, use:

data = Inputs['data']

to obtain the input.
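
As a minimal sketch of the request shape described above: the helper name build_powerbi_payload is hypothetical, and the empty "GlobalParameters" object simply mirrors the sample payload. Whether the service accepts it depends on the schema declared in the entry script.

```python
import json

def build_powerbi_payload(rows, global_parameters=None):
    """Wrap feature rows in the "Inputs"/"GlobalParameters" envelope (hypothetical helper)."""
    return {
        "Inputs": {"data": rows},
        "GlobalParameters": {} if global_parameters is None else global_parameters,
    }

# The two rows from the question's prediction script.
rows = [
    [2400.0, 78.26086956521739, 11100.0, 3.612565445026178, 3.0, 0.0],
    [368.55, 96.88311688311687, 709681.1600000012, 73.88059701492537, 44.0, 0.0],
]

# JSON string to pass as the request body.
input_data = json.dumps(build_powerbi_payload(rows))
```

The resulting input_data string can replace the one built in the prediction script; the rest of the request stays the same.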

Ed Lockhart 1 Reputation point

2020-11-02T10:55:37.843+00:00

Thanks for your reply!

I updated my entry script as you suggested with the following amendments:

# This is a nested input sample, any item wrapped by `ParameterType` will be described by schema
#sample_input = StandardPythonParameterType({'input1': numpy_sample_input, 'input2': pandas_sample_input, 'input3': standard_sample_input})
sample_input = StandardPythonParameterType({'data': numpy_sample_input})

and

def run(inputs, global_parameters):
    try:
        data = inputs['data']

I used your exact payload format in my script but I now get the error "Invalid input data type to parse. Expected: <class 'float'> but got <class 'dict'>".

Thanks for your help!
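
One possible reading of this error, offered as an assumption rather than something confirmed in the thread: the entry script declares sample_global_parameters = StandardPythonParameterType(1.0), so the schema expects a float for the global parameters, while the payload sends "GlobalParameters": {}, which is a dict. The sketch below is plain Python that only imitates that kind of type check; check_against_sample is a hypothetical helper, not the inference-schema library's code.

```python
def check_against_sample(sample, value):
    """Raise if `value` is not of the sample's type (hypothetical stand-in for a schema check)."""
    expected = type(sample)
    if not isinstance(value, expected):
        raise ValueError(
            "Invalid input data type to parse. "
            f"Expected: {expected} but got {type(value)}"
        )
    return value

sample_global_parameters = 1.0  # the float sample from the entry script

try:
    check_against_sample(sample_global_parameters, {})  # what the payload sent
    message = None
except ValueError as exc:
    message = str(exc)

# A float passes the same check.
accepted = check_against_sample(sample_global_parameters, 1.0)
```

If that reading is right, either sending a float such as "GlobalParameters": 1.0 or declaring the sample global parameters as a dict would make the two sides agree. This has not been tested against the deployed service.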
