How can I pass on a PipelineInput to a component - SDK-v2

Manoni 10 Reputation points
2024-10-23T08:46:39.4433333+00:00

I have a component in AzureML, defined using SDK-v2. It runs fine if I hardcode my input strings in the pipeline definition, but as soon as I declare these as inputs to the pipeline and then pass them on to my component, I get this error:

"Unsupported input type: <class 'azure.ai.ml.entities._job.pipeline._io.base.PipelineInput'>, only Input, dict, str, bool, int and float are supported."

How can I adjust my component to use the parent input?
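A minimal sketch of what I am trying to do (the component YAML and all names here are placeholders):

```python
from azure.ai.ml import load_component
from azure.ai.ml.dsl import pipeline

# Placeholder component definition - the real one lives in my YAML
my_component = load_component(source="./my_component.yml")

@pipeline()
def my_pipeline(my_string: str):
    # Inside the @pipeline-decorated function, my_string is a
    # PipelineInput wrapper, not the concrete string value yet.
    # Passing it to the component call as-is is what I expected
    # to work; converting it first (str(), f-strings, ...) is
    # where I suspect things go wrong.
    step = my_component(text=my_string)

pipeline_job = my_pipeline(my_string="hello")
```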

Thanks!!


Azure Machine Learning

1 answer

  1. romungi-MSFT 49,106 Reputation points Microsoft Employee Moderator
    2024-10-23T11:57:22.5+00:00

    @Manoni You can use any of the samples in the azureml-examples repo to load components from YAML and pass the inputs through the pipeline. See this sample, for example.

    The pipeline there defines train-model, score-model and evaluate-model steps, each in its own YAML. The components are loaded with load_component, and a pipeline job is then created with the required inputs and parameters.

    from azure.ai.ml import MLClient, Input
    from azure.ai.ml.dsl import pipeline
    from azure.ai.ml import load_component 
    
    
    # parent_dir (set in an earlier notebook cell) points at the folder holding the YAMLs
    train_model = load_component(source=parent_dir + "/train_model.yml")
    score_data = load_component(source=parent_dir + "/score_data.yml")
    eval_model = load_component(source=parent_dir + "/eval_model.yml")
    
    
    @pipeline()
    def pipeline_with_components_from_yaml(
        training_input,
        test_input,
        training_max_epochs=20,
        training_learning_rate=1.8,
        learning_rate_schedule="time-based",
    ):
        """E2E dummy train-score-eval pipeline with components defined via yaml."""
        # Call component obj as function: apply given inputs & parameters to create a node in pipeline
        train_with_sample_data = train_model(
            training_data=training_input,
            max_epochs=training_max_epochs,
            learning_rate=training_learning_rate,
            learning_rate_schedule=learning_rate_schedule,
        )
    
        score_with_sample_data = score_data(
            model_input=train_with_sample_data.outputs.model_output, test_data=test_input
        )
        score_with_sample_data.outputs.score_output.mode = "upload"
    
        eval_with_sample_data = eval_model(
            scoring_result=score_with_sample_data.outputs.score_output
        )
    
        # Return: pipeline outputs
        return {
            "trained_model": train_with_sample_data.outputs.model_output,
            "scored_data": score_with_sample_data.outputs.score_output,
            "evaluation_report": eval_with_sample_data.outputs.eval_output,
        }
    
    
    pipeline_job = pipeline_with_components_from_yaml(
        training_input=Input(type="uri_folder", path=parent_dir + "/data/"),
        test_input=Input(type="uri_folder", path=parent_dir + "/data/"),
        training_max_epochs=20,
        training_learning_rate=1.8,
        learning_rate_schedule="time-based",
    )
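    The notebook then submits the pipeline job to the workspace. Assuming you have a configured workspace and compute cluster, submission looks roughly like this (subscription, resource group, workspace and compute names are placeholders you must replace):

```python
from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient

# Placeholder workspace details - fill in your own values
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace>",
)

# Compute cluster name is illustrative
pipeline_job.settings.default_compute = "cpu-cluster"

# Submit the pipeline job built by pipeline_with_components_from_yaml above
returned_job = ml_client.jobs.create_or_update(pipeline_job)
```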

    For simplicity, I have skipped some cells from the sample notebook (for example, the one that sets parent_dir). The above should help you create a pipeline and a job where the *.yml files define the components and the pipeline inputs are passed through to them. I hope this helps!!

    If this answers your query, do click Accept Answer and Yes for was this answer helpful. And, if you have any further query do let us know.

