Unable to download output of job using mlclient - Azure python sdk v2

Question

Tadikonda Tarun HYD DIWID23 20

Below is the folder structure of the job's output. I am trying to download metrics.json.

(Image: folder structure of the job output)

ml_client.jobs.download(name=job_name, output_name='outputs/metrics.json')

The call above neither downloads the file nor raises an exception.

ml_client.jobs.download(name=job_name)

This call downloads all the files in the output. Am I missing something when trying to download a single file?

Tadikonda Tarun HYD DIWID23 20 Reputation points

2024-01-24T14:31:18.4633333+00:00

The example above uses both SDK v1 and SDK v2. According to the documentation, using both in a single project is not recommended.
dupammi 8,615 Reputation points Microsoft External Staff

2024-01-24T15:08:00.7233333+00:00
Hi @Tadikonda Tarun HYD DIWID23, thank you for the response. You can also use the Run.download_file method to download files.

run.download_file('definition.json', output_file_path='./')

Thank you for understanding.
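
As a rough sketch of the suggestion above, assuming the SDK v1 azureml-core package is installed and `run` is an azureml.core.Run object, a small helper can fetch one file by its path inside the run. The helper name `download_run_file` and its arguments are illustrative and not part of either SDK.

```python
import os


def download_run_file(run, file_name, dest_dir="."):
    """Download a single file from an SDK v1 run into dest_dir.

    `run` is assumed to be an azureml.core.Run (or any object exposing the
    same download_file method). `file_name` is the path of the file inside
    the run, for example "outputs/metrics.json".
    """
    os.makedirs(dest_dir, exist_ok=True)
    # Store the file under its base name so the run's folder layout
    # is not recreated locally.
    local_path = os.path.join(dest_dir, os.path.basename(file_name))
    run.download_file(file_name, output_file_path=local_path)
    return local_path
```

For the file in the question, a call could look like download_run_file(run, "outputs/metrics.json", "./downloads"). Note that this relies on SDK v1, so it mixes both SDK versions in one project.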
dupammi 8,615 Reputation points Microsoft External Staff

2024-01-25T02:05:55.14+00:00

Hi @Tadikonda Tarun HYD DIWID23, we haven't heard from you since the last response and are just checking back to see whether you were able to follow the suggestions above to resolve the download issue. Thank you.
Tadikonda Tarun HYD DIWID23 20 Reputation points

2024-01-25T03:01:17.3+00:00

Please read the above content.
dupammi 8,615 Reputation points Microsoft External Staff

2024-01-25T03:17:50.07+00:00

Hi @Tadikonda Tarun HYD DIWID23, thank you for the details. To help with debugging, I recommend adding debug or print statements and try/except blocks to your code. Printing relative or full paths can also help identify where specific files or artifacts are located. I hope this helps! Thank you for understanding.
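
One way to apply this advice is sketched below: wrap the download call in try/except and compare the files on disk before and after, so that a call which silently downloads nothing becomes visible. The helper name `download_with_diagnostics` is hypothetical; only `ml_client.jobs.download` with its `name` and `download_path` arguments comes from the thread.

```python
from pathlib import Path


def download_with_diagnostics(ml_client, job_name, download_path, **kwargs):
    """Call ml_client.jobs.download and report which files actually arrived.

    Extra keyword arguments (for example output_name or all) are passed
    through to the download call unchanged.
    """
    target = Path(download_path).resolve()
    target.mkdir(parents=True, exist_ok=True)
    # Remember what was already there so only new files are reported.
    before = {p for p in target.rglob("*") if p.is_file()}
    try:
        ml_client.jobs.download(name=job_name, download_path=str(target), **kwargs)
    except Exception as exc:
        print(f"Download of {job_name!r} failed: {exc}")
        raise
    new_files = sorted(
        p for p in target.rglob("*") if p.is_file() and p not in before
    )
    if not new_files:
        print(f"No new files under {target}; check the arguments {kwargs}")
    for path in new_files:
        print(path)
    return new_files
```

An empty return value means the call finished without error but wrote nothing new, which matches the behavior described in the question.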


Answer 1

Hi @Tadikonda Tarun HYD DIWID23, thank you for reaching out. I understand you are having trouble downloading a specific output file from your job using MLClient in the Azure Python SDK v2.

The output_name parameter of the ml_client.jobs.download method may not behave as expected in some cases, e.g. if the job saves results as a URL and the SDK expects a string in output_name. As a workaround, you can try the all=True parameter along with additional debug or print statements to identify the exact relative or full path from which the API is attempting to download files.

Here is an example I used to reproduce the issue with ml_client.jobs.download:

from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
from azureml.core import Workspace, Experiment, Run
from datetime import datetime
import os

# Replace with your own values
subscription_id = 'YOUR_SUBSCRIPTION_ID'
resource_group = 'YOUR_RESOURCE_GROUP'
workspace_name = 'YOUR_WORKSPACE_NAME'
experiment_name = 'YOUR_EXPERIMENT_NAME'
run_id = 'YOUR_RUN_ID'  

# Get the workspace
ws = Workspace(subscription_id=subscription_id, resource_group=resource_group, workspace_name=workspace_name)

# Get the experiment
exp = Experiment(workspace=ws, name=experiment_name)

# Get the run
run = Run(exp, run_id=run_id)

# Check if the run has completed successfully before downloading files
if run.get_status() == "Completed":
    # Create a timestamped log directory
    log_dir = f"./logs/{datetime.now().strftime('%Y%m%d-%H%M%S')}"
    os.makedirs(log_dir, exist_ok=True)
    
    # Create MLClient using DefaultAzureCredential
    ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace_name)
    
    # Debugging step 1: List output files and artifacts
    outputs = run.get_file_names()
    print("Output files and artifacts:")
    for output in outputs:
        print(output)

    # Debugging step 2: Print debug information before download
    print("Before download:")
    
    # Download all logs and named outputs of the job
    try:
        ml_client.jobs.download(name=run.id, download_path=log_dir, all=True)
        print(f"Files downloaded successfully to: {log_dir}")
    except Exception as e:
        print(f"Error during download: {e}")
else:
    print("Run is not completed. Cannot download files.")

With the additional debugging steps in my repro, along with the all=True parameter, I found that the SDK is trying to download from a URL, which is different from output_name, which is a string and not a URL. For more details, please refer to joboperations-download.

This approach will download all files, but it will help you identify the exact file paths. You can then manually filter and use the desired file from the downloaded files. I hope you understand. Thank you.
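
The manual filtering step could be automated along the following lines. This is a minimal sketch, not SDK functionality: it downloads everything into a temporary directory, then copies only the requested file to the destination. The helper name `fetch_one_output_file` is hypothetical, and the search by trailing path parts is an assumption chosen so the code does not depend on the exact folder layout the SDK creates.

```python
import shutil
import tempfile
from pathlib import Path


def fetch_one_output_file(ml_client, job_name, relative_path, dest_dir="."):
    """Download all job files to a temp dir and keep only relative_path.

    relative_path is the path as shown in the job's output view,
    for example "outputs/metrics.json".
    """
    wanted = Path(relative_path).parts
    with tempfile.TemporaryDirectory() as tmp:
        ml_client.jobs.download(name=job_name, download_path=tmp, all=True)
        # Match on the trailing path parts, wherever the SDK placed them.
        matches = [
            p
            for p in Path(tmp).rglob(wanted[-1])
            if p.is_file() and p.parts[-len(wanted):] == wanted
        ]
        if not matches:
            raise FileNotFoundError(
                f"{relative_path} not found in the download of job {job_name}"
            )
        dest = Path(dest_dir) / wanted[-1]
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(matches[0], dest)
    return dest
```

This still transfers every file over the network, so it only saves local clutter, not download time.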

Vinh Cao 0 Reputation points

2024-03-11T23:44:41.47+00:00

This does not work for me. The code gets stuck at the line

outputs = run.get_file_names()

But this is just a workaround. Do you know when the bug will be fixed?

By the way, the suggested solution doesn't work.

ml_client.jobs.download(name=run.id, download_path=log_dir)

This line already downloads all the artifacts, which we don't actually want. We just need, e.g., the latest checkpoint.
dupammi 8,615 Reputation points Microsoft External Staff

2024-03-13T09:07:20.3566667+00:00

Hi Vinh Cao,

I apologize for the confusion. Regarding the issue you are facing with the run.get_file_names() method, it's possible that there might be some issues with the specific version of the Azure Python SDK you are using or with the job output format. I recommend checking the job output format to ensure that you are using the correct parameters and formats. Additionally, you can try reaching out to the Azure support team for further assistance in debugging the issue.

Regarding your requirement to download only the latest checkpoint, you can try using the ml_client.jobs.download method with the output_name parameter set to the specific checkpoint file you want to download. For example:

ml_client.jobs.download(name=run.id, output_name='outputs/checkpoints/latest_checkpoint.pth', download_path=log_dir)

This should download only the latest checkpoint file to the specified download path.

I hope you understand. Thank you.
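
For context, the SDK v2 reference describes output_name as the named output to download, which suggests it is meant to match a key of the job's outputs rather than a file path inside the output folder. The sketch below checks for that before downloading. It assumes the job object returned by `ml_client.jobs.get` exposes an `outputs` mapping, as command and pipeline jobs do; both helper names are hypothetical.

```python
def list_named_outputs(ml_client, job_name):
    """Return the names of the outputs declared on the job, sorted."""
    job = ml_client.jobs.get(job_name)
    # Assumption: the job entity has an `outputs` mapping keyed by name.
    outputs = getattr(job, "outputs", None) or {}
    return sorted(outputs)


def download_named_output(ml_client, job_name, output_name, download_path="."):
    """Download one named output, failing loudly if the name is unknown."""
    available = list_named_outputs(ml_client, job_name)
    if output_name not in available:
        raise ValueError(
            f"{output_name!r} is not a named output of job {job_name}; "
            f"available names: {available}"
        )
    ml_client.jobs.download(
        name=job_name, output_name=output_name, download_path=download_path
    )
```

With this check, a value such as 'outputs/checkpoints/latest_checkpoint.pth' raises an error instead of returning silently, unless the job really declares an output under that name.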
Vinh Cao 0 Reputation points

2024-03-14T13:14:04.83+00:00

Hi @dupammi ,

ml_client.jobs.download(name=run.id, output_name='outputs/checkpoints/latest_checkpoint.pth', download_path=log_dir)

This code doesn't work, as the original thread owner mentioned.
dupammi 8,615 Reputation points Microsoft External Staff

2024-03-15T01:22:44.0733333+00:00

Hi @Vinh Cao

In this case, I would recommend you raise a support case through Azure portal.

I hope you understand. Thank you!
