Is it possible to access the "outputs" directory from a preempted job?

Ed 0 Reputation points
2023-12-08T21:53:24.16+00:00

Sometimes my Azure ML jobs that use low priority compute nodes are preempted, leaving directories like "outputs/retry_001", "outputs/retry_002", etc. I am wondering if it is possible to programmatically access these directories once the job is restarted.

Example: Job that creates files A.txt, B.txt, and C.txt, with no interdependencies.

Scenario: Job starts, creating files A.txt and B.txt in outputs/. Job is preempted. Job restarts, sees that A.txt and B.txt already exist in outputs/, so it creates only C.txt in outputs/retry_001 (rather than all of A+B+C).

Azure Machine Learning

1 answer

  1. YutongTie-9091 54,016 Reputation points Moderator
    2023-12-09T03:24:58.1266667+00:00

    @Ed Thanks for reaching out to us. Are you using the SDK?

    You can access the files programmatically with the Azure Machine Learning Python SDK (v1, the azureml.core package). Get the Run object that corresponds to your job, then call its download_file or download_files method to download files from the "outputs" directory.

    Here is a sample Python code snippet that shows how you can do this:

    from azureml.core import Workspace, Experiment, Run

    # get workspace, experiment, and run
    ws = Workspace.from_config()
    exp = Experiment(workspace=ws, name='my_experiment')
    run = Run(exp, run_id='my_run_id')  # replace with your run id

    # download a single file; output_file_path is the local path to save it to
    run.download_file('outputs/A.txt', output_file_path='./A.txt')

    # or download all files whose names start with the given prefix
    run.download_files(prefix='outputs/', output_directory='./')

    In the above code, replace 'my_experiment' and 'my_run_id' with the name of your experiment and the ID of your run, respectively. output_file_path is the local path where the single file is saved, and output_directory is the local folder that receives the files matching prefix.

    Please note that directories like "outputs/retry_001" and "outputs/retry_002" are just examples; the actual directories created may vary depending on your job and how it handles preemption. Adjust the prefix parameter of the download_files method accordingly (for example, prefix='outputs/retry_001/') to download files from the correct directory.
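
    To see which retry directories a run actually produced before choosing a prefix, one option is to list the run's file names and group them by attempt. The following is a minimal sketch rather than a confirmed recipe: it assumes the "outputs/retry_NNN/" layout described in the question, and group_by_attempt and missing_outputs are hypothetical helper names. The list of names would come from run.get_file_names() on the Run object above; whether files written before a preemption are already visible there is an assumption to verify in your workspace.

```python
import re
from collections import defaultdict

# Assumed layout (taken from the question, not from the SDK docs):
#   outputs/<file>            -> first attempt
#   outputs/retry_NNN/<file>  -> attempt after a preemption
OUTPUT_PATTERN = re.compile(r"^outputs/(?:(retry_\d+)/)?(.+)$")


def group_by_attempt(file_names):
    """Group file names under outputs/ by attempt label.

    Returns a dict such as {"initial": [...], "retry_001": [...]}.
    Names outside outputs/ are ignored.
    """
    attempts = defaultdict(list)
    for name in file_names:
        match = OUTPUT_PATTERN.match(name)
        if match is None:
            continue  # e.g. logs/..., not part of outputs/
        attempt = match.group(1) or "initial"
        attempts[attempt].append(match.group(2))
    return {label: sorted(files) for label, files in attempts.items()}


def missing_outputs(file_names, expected):
    """Return the expected files that no attempt has produced yet."""
    produced = set()
    for files in group_by_attempt(file_names).values():
        produced.update(files)
    return sorted(set(expected) - produced)


# Hypothetical usage with the Run object from the snippet above:
#   names = run.get_file_names()
#   todo = missing_outputs(names, ["A.txt", "B.txt", "C.txt"])
```

    In the scenario from the question, if only outputs/A.txt and outputs/B.txt have been uploaded, missing_outputs(names, ['A.txt', 'B.txt', 'C.txt']) returns ['C.txt'], so the restarted job would only need to create C.txt.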

    Please give it a try and let us know how it goes.

    Regards,

    Yutong
