What does "model" parameter do in RunFunction class (v2 SDK)?

Rahul Kurian Jacob 40 Reputation points
2024-09-26T18:05:55.57+00:00

I am using parallel_run_function from azure.ai.ml.parallel as follows:

from azure.ai.ml import Input
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.parallel import parallel_run_function, RunFunction

parallel_job = parallel_run_function(
    # Other configs
    inputs=dict(
        model=Input(type=AssetTypes.CUSTOM_MODEL)
    ),
    task=RunFunction(
        code="./parallel_code/",
        entry_script="parallel_entry_script.py",
        program_arguments="--model_dir ${{inputs.model}}",
        model="${{inputs.model}}",
        append_row_to="${{outputs.job_output_file}}",
        environment=environment.id,
    ),
)

My question is: what does the model parameter do here? I thought it would work like BatchDeployment, where passing the model parameter as a Model object or a model asset ID string will result in the model being available through the AZUREML_MODEL_DIR environment variable. I tested this, and it is NOT the case.

I even passed a non-existent asset (e.g. model="${{inputs.non_exist_model}}"), but this does not raise any error, unlike other parameters such as program_arguments.

I could not find anything in the docs or in the https://github.com/Azure/azureml-examples GitHub repo. The only line in the class documentation is "The model of the parallel task", which does not explain how or where the model can be retrieved during the job.

Azure Machine Learning

Accepted answer
  1. santoshkc 8,775 Reputation points Microsoft Vendor
    2024-09-27T15:29:45.69+00:00

    Hi @Rahul Kurian Jacob,

    Thank you for reaching out to Microsoft Q&A forum!

The model parameter in the RunFunction class records a model asset for the parallel task, but it does not automatically make the model available the way BatchDeployment does (i.e., it does not populate AZUREML_MODEL_DIR). The fact that a non-existent reference raises no error suggests it is treated as metadata only. Since there is no built-in way to access the model during job execution, you need to load it explicitly in your entry script, for example via the path you already pass through program_arguments.
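Since nothing is surfaced through AZUREML_MODEL_DIR, the entry script has to pick up the path passed via program_arguments itself. Here is a minimal sketch using the standard init()/run() shape of a parallel-job entry script; the model.pkl filename and the plain string standing in for real deserialization are placeholders:

```python
# parallel_entry_script.py -- minimal sketch of explicit model loading.
# The --model_dir value arrives via program_arguments
# ("--model_dir ${{inputs.model}}"), not via AZUREML_MODEL_DIR.
import argparse
import os

model = None  # loaded once per worker process in init()


def init():
    """Called once per worker before any mini-batch is processed."""
    global model
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_dir", type=str, required=True)
    args, _ = parser.parse_known_args()
    # Placeholder: build the artifact path and treat it as the "model".
    # Replace with your real deserialization (joblib, torch, etc.).
    model = os.path.join(args.model_dir, "model.pkl")


def run(mini_batch):
    """Called once per mini-batch; returns one result per input item."""
    return [f"{item} scored with {model}" for item in mini_batch]
```

Because --model_dir is parsed with parse_known_args, any extra framework-supplied arguments are ignored rather than causing a failure.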

    To run parallel jobs in Azure Machine Learning, you can use the Azure CLI or Python SDK. This process involves splitting a task into mini-batches, distributing them across multiple compute nodes, and configuring inputs, data division, and compute resources.

    1. Setup: Ensure you have an Azure ML account, workspace, and necessary SDKs installed.
    2. Define Parallel Job: Create a parallel job step in your pipeline, specifying input data, instance count, mini-batch size, and error handling settings.
    3. Automation: Utilize optional settings for automatic error handling and resource monitoring.
    4. Pipeline Integration: Incorporate the parallel job as a step within your pipeline, binding inputs and outputs to coordinate with other steps.
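The steps above can be sketched as a parallel job defined in pipeline YAML for the CLI path. The asset names, paths, and counts below are illustrative, and the field layout follows the parallel job schema described in the linked article:

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/pipelineJob.schema.json
type: pipeline
jobs:
  parallel_score:
    type: parallel
    compute: azureml:cpu-cluster        # illustrative compute target
    inputs:
      model:
        type: custom_model
        path: azureml:my-model:1        # illustrative model asset
      scoring_data:
        type: uri_folder
        path: azureml:my-data:1         # illustrative data asset
    input_data: ${{inputs.scoring_data}}
    mini_batch_size: "10"
    resources:
      instance_count: 2
    error_threshold: -1                 # -1 ignores mini-batch failures
    task:
      type: run_function
      code: ./parallel_code/
      entry_script: parallel_entry_script.py
      program_arguments: --model_dir ${{inputs.model}}
```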

    This approach can significantly reduce execution time and improve efficiency in tasks like model training and batch inferencing.

    Please look into: Use parallel jobs in pipelines.

    I hope this helps. Thank you.

