Pre-existing Compute Resource necessary for running a scheduled ML pipeline?

Himanshu Gautam 1 Reputation point
2021-05-12T07:14:16.847+00:00

Hi fellows,
I have been exploring Azure ML Pipeline. I am referring to this notebook for the below code:

Here is a small snippet from a MS Repo:
train_step = PythonScriptStep(name = "Prepare Data",
source_directory = experiment_folder,
script_name = "prep_diabetes.py",
arguments = ['--input-data', diabetes_ds.as_named_input('raw_data'),
'--prepped-data', prepped_data_folder],
outputs=[prepped_data_folder],
compute_target = pipeline_cluster,
runconfig = pipeline_run_config,
allow_reuse = True)

This suggests that while defining a pipeline, we must provide it a compute resource. This obviously makes sense, since specific compute might be required for a specific step.

But do we need to have this compute resource up and running always, so that whenever a pipeline is triggered, it can find the compute resource?

Also, i figured we can probably keep a cluster with Zero minimum nodes, in which cases cluster is resized whenever pipeline is triggered. But i think there is a minimal cost incurrent in probably container registry regularly in such a setup. Is this the recommended way to deploy ML pipelines or some more efficient approach is possible?

Azure Machine Learning
Azure Machine Learning
An Azure machine learning service for building and deploying models.
3,337 questions
0 comments No comments
{count} votes

1 answer

Sort by: Most helpful
  1. Ramr-msft 17,826 Reputation points
    2021-05-13T10:02:30.86+00:00

    @Himanshu Gautam Thanks for the question. AmlCompute clusters are designed to scale dynamically based on your workload. The cluster can be scaled up to the maximum number of nodes you configure. As each run completes, the cluster will release nodes and scale to your configured minimum node count.

    To avoid charges when no jobs are running, set the minimum nodes to 0. This setting allows Azure Machine Learning to de-allocate the nodes when they aren't in use. Any value larger than 0 will keep that number of nodes running, even if they are not in use.

    Another way to save money on compute resources is Azure Reserved VM Instance. amlcompute supports reserved instances out of the box. These reservations can be used across azure compute resources (vmss/batch/vm) and AzureML compute.

    Check out this article to learn more about planning and managing costs : https://learn.microsoft.com/en-us/azure/machine-learning/concept-plan-manage-cost

    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.