How to specify the source directory for a pipeline step in Azure Machine Learning

Roopesh Bharatwaj K R 6 Reputation points
2022-12-05T17:50:34.633+00:00

Hi

I'm trying to specify the source directory for a pipeline step. I have tried several approaches, but I could not find a solution.

Below is an example of my setup. I'm trying to specify CodeBase as the source directory, with Data.py as the script file and Datapipeline.py as my pipeline file.

Folder structure:

CodeBase
---Data.py
---Data1.py
Pipeline
---Datapipeline.py

Code Example:
from azureml.pipeline.steps import PythonScriptStep

path = './CodeBase/'
dataprep_source_dir = path
entry_point = "Data.py"
data_prep_step = PythonScriptStep(
    name='Inference_Service',
    script_name=entry_point,
    source_directory=dataprep_source_dir,
    inputs=[Top_150_Merchants.as_named_input('Top_150_Merchants'),
            acquire.as_named_input('Weekly_volume_Acquire')],
    outputs=[datafolder],
    compute_target=compute_target,
    runconfig=aml_run_config,
    allow_reuse=True
)

ValueError: Step [Inference_Service]: script not found at: /mnt/batch/tasks/shared/LS_root/mounts/clusters/dev-mural-gpu/code/Users/Roopesh.Bharatwaj/Mural_Code/Pipeline/CodeBase/Data.py.

Make sure to specify an appropriate source_directory on the Step or default_source_directory on the Pipeline.

Please let me know if anyone can help me with this. Thank you!

Azure Machine Learning

1 answer

  1. Ramr-msft 17,616 Reputation points
    2022-12-06T13:12:53.947+00:00

    @Roopesh Bharatwaj K R Thanks for the question. Here is a sample that specifies the source directory. If you are still facing the problem, please share the sample that you are using.

    https://learn.microsoft.com/en-us/python/api/azureml-pipeline-steps/azureml.pipeline.steps.python_script_step.pythonscriptstep?view=azure-ml-py

    # step4 consumes the datasource (Datareference) in the previous step  
    # and produces processed_data1  
    trainStep = PythonScriptStep(  
        script_name="train.py",   
        arguments=["--input_data", blob_input_data, "--output_train", processed_data1],  
        inputs=[blob_input_data],  
        outputs=[processed_data1],  
        compute_target=aml_compute,   
        source_directory=source_directory,  
        runconfig=run_config  
    )  
    print("trainStep created")
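
One possible reading of the error in the question: the relative path './CodeBase/' was resolved against the Pipeline folder (the error message looks for Pipeline/CodeBase/Data.py), while the folder structure shows CodeBase sitting next to Pipeline, not inside it. A minimal sketch, assuming that layout, builds the source directory from the location of the pipeline file instead of the working directory. The helper name `resolve_source_dir` is hypothetical and uses only the Python standard library; its result would be passed as `source_directory` to `PythonScriptStep`.

```python
from pathlib import Path


def resolve_source_dir(pipeline_file, source_folder="CodeBase", entry_point="Data.py"):
    """Return the absolute path of a folder that sits next to the pipeline folder.

    Hypothetical helper: assumes the layout from the question, where
    CodeBase/ and Pipeline/ share the same parent directory.
    """
    pipeline_dir = Path(pipeline_file).resolve().parent  # .../Pipeline
    source_dir = pipeline_dir.parent / source_folder     # .../CodeBase
    script = source_dir / entry_point
    # Fail early with a clear message if the entry script is not there.
    if not script.is_file():
        raise FileNotFoundError(f"script not found at: {script}")
    return str(source_dir)


# Usage inside Datapipeline.py (in a notebook, where __file__ is not defined,
# pass the path to the pipeline file explicitly):
#
# dataprep_source_dir = resolve_source_dir(__file__)
# data_prep_step = PythonScriptStep(
#     name="Inference_Service",
#     script_name="Data.py",
#     source_directory=dataprep_source_dir,
#     # remaining arguments unchanged from the question
# )
```

If the layout is fixed, the shorter relative path '../CodeBase' may also work when the pipeline is always launched from the Pipeline folder, but an absolute path does not depend on the current working directory.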