Questions about AML Python SDK v2 pipelines

Yongchao Liu (Neusoft America Inc) 191 Reputation points Microsoft External Staff
2023-03-21T02:13:31.0366667+00:00

We are currently migrating from SDK v1 to v2.
In v2, pipelines are defined with the @pipeline decorator. Is there another way to build a pipeline, similar to how v1 works?


from azureml.data import OutputFileDatasetConfig
from azureml.pipeline.core import Pipeline
from azureml.pipeline.steps import PythonScriptStep

# ws, compute and aml_run_config (Workspace, ComputeTarget, RunConfiguration)
# are defined elsewhere in our code.
datastore = ws.datastores['my_adlsgen2']
step1_output_data = OutputFileDatasetConfig(
    name="processed_data",
    destination=(datastore, "mypath/{run-id}/{output-name}")
).as_upload()

step1 = PythonScriptStep(
    name="generate_data",
    script_name="step1.py",
    runconfig=aml_run_config,
    arguments=["--output_path", step1_output_data]
)

step2 = PythonScriptStep(
    name="read_pipeline_data",
    script_name="step2.py",
    compute_target=compute,
    runconfig=aml_run_config,
    arguments=["--pd", step1_output_data.as_input()]
)

pipeline = Pipeline(workspace=ws, steps=[step1, step2])

The v1 approach is very convenient for CI/CD deployment and for generalizing the code.
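For example (a rough sketch only; build_pipeline and its step_specs argument are hypothetical names, not code we actually have), since v1 steps are plain objects collected into a list, one generic function can assemble different pipelines from configuration data:

# Hypothetical helper: builds any pipeline from a list of step descriptions.
def build_pipeline(ws, aml_run_config, step_specs):
    steps = []
    for spec in step_specs:
        steps.append(
            PythonScriptStep(
                name=spec["name"],
                script_name=spec["script"],
                runconfig=aml_run_config,
                arguments=spec.get("arguments", []),
            )
        )
    return Pipeline(workspace=ws, steps=steps)

# Reuses ws, aml_run_config and step1_output_data from the snippet above.
pipeline = build_pipeline(
    ws,
    aml_run_config,
    step_specs=[
        {"name": "generate_data", "script": "step1.py",
         "arguments": ["--output_path", step1_output_data]},
        {"name": "read_pipeline_data", "script": "step2.py",
         "arguments": ["--pd", step1_output_data.as_input()]},
    ],
)

By contrast, the decorator-based v2 style looks like this: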


from azure.ai.ml import Input, Output
from azure.ai.ml.dsl import pipeline

# train_model, score_data and eval_model are components defined elsewhere.
cluster_name = "cpu-cluster"
custom_path = "azureml://datastores/workspaceblobstore/paths/custom_path/${{name}}/"

# define a pipeline with component
@pipeline(default_compute=cluster_name)
def pipeline_with_python_function_components(input_data, test_data, learning_rate):
    """E2E dummy train-score-eval pipeline with components defined via python function components"""

    # Calling a component object as a function applies the given inputs/parameters and creates a node in the pipeline.
    train_with_sample_data = train_model(
        training_data=input_data, max_epochs=5, learning_rate=learning_rate
    )
    score_with_sample_data = score_data(
        model_input=train_with_sample_data.outputs.model_output, test_data=test_data
    )
    # Example of changing an output's path at the step level.
    # Note: if the output is promoted to the pipeline level, the path must be changed on the pipeline job instead (see below).
    score_with_sample_data.outputs.score_output = Output(
        type="uri_folder", mode="rw_mount", path=custom_path
    )
    eval_with_sample_data = eval_model(
        scoring_result=score_with_sample_data.outputs.score_output
    )

    # Return: pipeline outputs
    return {
        "eval_output": eval_with_sample_data.outputs.eval_output,
        "model_output": train_with_sample_data.outputs.model_output,
    }


pipeline_job = pipeline_with_python_function_components(
    input_data=Input(
        path="wasbs://******@dprepdata.blob.core.windows.net/Titanic.csv", type="uri_file"
    ),
    test_data=Input(
        path="wasbs://******@dprepdata.blob.core.windows.net/Titanic.csv", type="uri_file"
    ),
    learning_rate=0.1,
)
# Example of changing an output's path at the pipeline level.
pipeline_job.outputs.model_output = Output(
    type="uri_folder", mode="rw_mount", path=custom_path
)

With v2 I have two problems at the moment:

1. Decorator issue. The @pipeline decorator is not supported on methods inside a class, so I had to work around it (screenshot attached).
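For context, here is a rough sketch of the kind of pattern I mean (PipelineBuilder and _body are hypothetical names and this is only an illustration, not the exact code from the screenshot). Decorating the method directly fails, presumably because the decorator would treat self as a pipeline input, so pipeline ends up being applied by hand to a nested helper function:

from azure.ai.ml.dsl import pipeline

class PipelineBuilder:
    """Hypothetical wrapper class used in our CI/CD code."""

    def __init__(self, default_compute):
        self.default_compute = default_compute

    # @pipeline(default_compute=...)    # cannot be used here: `self` would be
    # def build(self, input_data): ...  # picked up as a pipeline input

    def build(self, input_data, learning_rate):
        # Workaround: define the pipeline body without `self` and apply
        # `pipeline` as a plain function instead of a decorator.
        def _body(input_data, learning_rate):
            train_node = train_model(   # component from the example above
                training_data=input_data, max_epochs=5, learning_rate=learning_rate
            )
            return {"model_output": train_node.outputs.model_output}

        pipeline_func = pipeline(default_compute=self.default_compute)(_body)
        return pipeline_func(input_data=input_data, learning_rate=learning_rate)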

2. Generalization issue. A decorated function such as pipeline_with_python_function_components does not seem easy to generalize.

If I have multiple pipelines to build, I have to declare a separate decorated function and pipeline job for each one, because the inputs and outputs may differ.
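For example (pipeline_a, pipeline_b and their inputs are hypothetical; this just illustrates the duplication I mean):

# Every distinct pipeline needs its own decorated function and its own job object,
# even when the overall structure is almost identical.
@pipeline(default_compute=cluster_name)
def pipeline_a(input_data, learning_rate):
    train_node = train_model(
        training_data=input_data, max_epochs=5, learning_rate=learning_rate
    )
    return {"model_output": train_node.outputs.model_output}


@pipeline(default_compute=cluster_name)
def pipeline_b(input_data, test_data, learning_rate):
    train_node = train_model(
        training_data=input_data, max_epochs=5, learning_rate=learning_rate
    )
    score_node = score_data(
        model_input=train_node.outputs.model_output, test_data=test_data
    )
    return {"score_output": score_node.outputs.score_output}


job_a = pipeline_a(input_data=Input(type="uri_file", path="<path-a>"), learning_rate=0.1)
job_b = pipeline_b(
    input_data=Input(type="uri_file", path="<path-b>"),
    test_data=Input(type="uri_file", path="<path-b-test>"),
    learning_rate=0.1,
)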

So, does anyone have a way to work around the decorator?

Thanks

Azure Machine Learning
