Azure Machine Learning - Batch Scoring with ParallelRunConfig output_action='summary_only'

Daniel Tudorache
2022-02-03T13:37:13.557+00:00

Hello,

I have deployed a batch inferencing service and I want to save the minibatch results in JSON format. My understanding after reading the ParallelRunConfig documentation is that with output_action='append_row' the run() function can only return a list or a pandas DataFrame.
I have tried switching to output_action='summary_only', but then nothing is saved to the datastore anymore.
I could not find any examples of how to use output_action='summary_only' apart from the explanation below, which does not give details on how to store the output:

'append_row' – All values output by run() method invocations will be aggregated into one unique file named parallel_run_step.txt that is created in the output location.
'summary_only' – User script is expected to store the output by itself. An output row is still expected for each successful input item processed. The system uses this output only for error threshold calculation (ignoring the actual value of the row).
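Reading the 'summary_only' description literally, the entry script itself would have to write each minibatch to disk and still return one row per processed item. Below is a minimal sketch of what that might look like, not a confirmed solution. It assumes the input is a FileDataset (so mini_batch is a list of file paths), that the output folder reaches the script through a hypothetical --output_dir argument, and that score() is a placeholder for the real model call.

```python
import argparse
import json
import os
import uuid

OUTPUT_DIR = None  # set in init(); folder that receives the JSON files


def init(argv=None):
    """Called once per worker process. Reads the output folder from the arguments.

    --output_dir is a hypothetical argument name; it has to match whatever
    the pipeline step actually passes to the script.
    """
    global OUTPUT_DIR
    parser = argparse.ArgumentParser()
    parser.add_argument("--output_dir", default="outputs")
    # parse_known_args ignores any other arguments passed to the script.
    args, _ = parser.parse_known_args(argv)
    OUTPUT_DIR = args.output_dir
    os.makedirs(OUTPUT_DIR, exist_ok=True)


def score(item):
    """Placeholder for the real inference call (assumption, not from the docs)."""
    return {"input": str(item), "prediction": None}


def run(mini_batch):
    """Scores one minibatch, writes it as one JSON file, returns one row per item."""
    results = [score(item) for item in mini_batch]

    # A random file name keeps workers from overwriting each other's output.
    path = os.path.join(OUTPUT_DIR, f"minibatch_{uuid.uuid4().hex}.json")
    with open(path, "w") as f:
        json.dump(results, f)

    # With 'summary_only' the returned rows are only counted for the
    # error threshold, so returning the item names is enough.
    return [str(item) for item in mini_batch]
```

For the files to end up in the datastore, the folder passed as --output_dir would presumably need to be the step's output location (for example, an output object forwarded through the arguments parameter of ParallelRunStep). I have not verified that part.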

Do you know how I can save the results of each minibatch processed by the run() function as JSON in the datastore?

Thank you,
Daniel

Azure Machine Learning