Enabling Model Data Collector for ACI through az ml deploy
Hello,
I want to create a release pipeline in Azure DevOps that deploys a model to ACI. I have a PowerShell script that deploys using the az ml CLI. However, even though I set the model data collector flag in the deploy command, the endpoint does not collect the inputs. I have tried deploying the same model with the exact same score.py programmatically through azureml.core.webservice, and data collection worked just fine, so the problem seems to lie somewhere other than score.py.
I have deployed my model using the following command in a PowerShell script (paraphrasing some of the arguments):
az ml model deploy -n $name --model $model -g $resource-group -w $workspace --es $entry_script_path --cf $conda_file_path --dc $deployment_config_path --md True --overwrite -v
Notice that I have set the --md argument to True, which is the model data collector flag.
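For comparison, the SDK route can be sketched roughly as below. This is a minimal sketch, not my exact script: the function names `aci_config_kwargs` and `deploy`, the environment name `my_env`, and the CPU and memory values are placeholders chosen for illustration, and it assumes a local config.json for the workspace. The relevant part is `collect_model_data=True`, which is the SDK counterpart of the CLI `--md` flag.

```python
def aci_config_kwargs(cpu_cores=1, memory_gb=1):
    """Keyword arguments for AciWebservice.deploy_configuration (values are placeholders)."""
    return {
        "cpu_cores": cpu_cores,
        "memory_gb": memory_gb,
        "collect_model_data": True,   # SDK counterpart of the CLI --md flag
        "enable_app_insights": True,
    }


def deploy(service_name, model_name, entry_script, conda_file):
    # Imported lazily so the helper above can be used without azureml-core installed.
    from azureml.core import Environment, Workspace
    from azureml.core.model import InferenceConfig, Model
    from azureml.core.webservice import AciWebservice

    ws = Workspace.from_config()  # assumes a local config.json for the workspace
    env = Environment.from_conda_specification(name="my_env", file_path=conda_file)
    inference_config = InferenceConfig(entry_script=entry_script, environment=env)
    deployment_config = AciWebservice.deploy_configuration(**aci_config_kwargs())
    service = Model.deploy(
        ws, service_name, [Model(ws, name=model_name)],
        inference_config, deployment_config, overwrite=True,
    )
    service.wait_for_deployment(show_output=True)
    return service
```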
In my score.py, I import ModelDataCollector, initialize it in init(), and call collect() in run(). I have something like this:
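import numpy as np
import pandas as pd
from azureml.monitoring import ModelDataCollector
from inference_schema.schema_decorators import input_schema, output_schema
from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType
from inference_schema.parameter_types.pandas_parameter_type import PandasParameterType

def init():
    global model, scaler, input_name, label_name, inputs_dc, prediction_dc
    # variables to monitor model input and output data
    inputs_dc = ModelDataCollector("model", designation="inputs")

input_sample = pd.DataFrame(sample)

@input_schema('data', PandasParameterType(input_sample))
@output_schema(NumpyParameterType(np.array([0])))
def run(data):
    try:
        inputs_dc.collect(data)
        # model inference
        result = model.predict(data)
        return {"result": result.tolist()}
    except Exception as e:
        # return the error message as a string so the response stays JSON-serializable
        result = str(e)
        return {"result": result}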
When I deploy my model using the PowerShell script, everything goes fine: the endpoint is callable, healthy, and returns correct results. But the data does not get saved in the storage account or in Application Insights (which I have enabled, and can see is enabled on the endpoint as well).
In my deployment config I see the following:
Data collection is not enabled. Set environment variable ML_MODEL_DC_STORAGE_ENABLED to 'true' to enable.
How do I go about that? I have enabled data collection in my deployment command, so why does it not work, and how can I set the environment variable?
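To see what the container actually receives, the variable can be read from inside score.py. The following is a minimal diagnostic sketch: the variable name comes from the message above, `data_collection_enabled` is a hypothetical helper (not part of azureml), and treating the value case-insensitively is my assumption about how it is interpreted.

```python
import os

DC_ENV_VAR = "ML_MODEL_DC_STORAGE_ENABLED"  # name taken from the message above


def data_collection_enabled(environ=None):
    """Hypothetical helper: True if the variable is set to 'true' (case-insensitive)."""
    env = os.environ if environ is None else environ
    return env.get(DC_ENV_VAR, "").strip().lower() == "true"


def init():
    # Anything printed here should end up in the logs of the deployed container.
    print(f"{DC_ENV_VAR}={os.environ.get(DC_ENV_VAR)!r}, "
          f"enabled={data_collection_enabled()}")
```

Calling this at the top of the real init() in both the CLI and the SDK deployments would show whether the variable differs between the two.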
I tried adding the variable to my conda_env.yml, which is passed to the deployment command as "--cf $conda_file_path", like so:
name: my_env
dependencies:
  - python=3.6.2
  - pip:
    - numpy
    - onnxruntime
    - joblib
    - azureml-core~=1.10.0
    - azureml-defaults~=1.10.0
    - scikit-learn==0.22.2.post1
    - inference-schema
    - inference-schema[numpy-support]
    - azureml-monitoring
channels:
  - anaconda
  - conda-forge
variables:
  ML_MODEL_DC_STORAGE_ENABLED = true
But that just produces another error. How do I solve this problem?
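One alternative I can think of, as an untested sketch: the SDK's Environment object exposes an `environment_variables` dictionary, so the variable could be set there instead of in the conda file. `with_data_collection` is a hypothetical helper, and whether this makes the message above go away is an assumption. A stand-in object is used so the sketch runs without azureml-core installed.

```python
from types import SimpleNamespace

DC_ENV_VAR = "ML_MODEL_DC_STORAGE_ENABLED"  # name taken from the message above


def with_data_collection(env):
    """Hypothetical helper: add the variable to env.environment_variables in place."""
    merged = dict(env.environment_variables or {})
    merged[DC_ENV_VAR] = "true"
    env.environment_variables = merged
    return env


# Stand-in for an azureml.core.Environment; in a real script the object would
# come from Environment.from_conda_specification(name=..., file_path=...).
demo_env = with_data_collection(SimpleNamespace(environment_variables={"EXAMPLE": "1"}))
print(demo_env.environment_variables)
```

The helper copies the existing dictionary before adding the key, so any variables already defined on the environment are kept.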