How to tag a job's output data assets directly

Constantin 25 Reputation points
2024-07-10T06:28:58.9133333+00:00

Hi,

I am following the documentation at https://learn.microsoft.com/en-us/azure/machine-learning/how-to-read-write-data-v2?view=azureml-api-2&tabs=python (section "Write data from your Azure Machine Learning job to Azure Storage") to output a named data asset from a job.

Therefore my job output definition roughly looks like this:

outputs = {
    "output_data": Output(
        type=data_type,
        path=output_path,
        mode=output_mode,
        name=my_name,
        # HOW to add tags? tags={"mytag": "test"} did not work
    )
}

However, I now want to set tags automatically (and perhaps properties in the future) on the named data asset.

What is the best way of doing this?

Thanks a lot

Azure Machine Learning

Accepted answer
  1. Amira Bedhiafi 24,531 Reputation points
    2024-07-10T10:45:43.86+00:00

    I am not an expert, but this is what I found across different forums. Start by running your job to produce the output:

    
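    from azure.ai.ml import MLClient, Output, command
    from azure.identity import DefaultAzureCredential

    # Define the job output; setting `name` registers it as a named data asset
    outputs = {
        "output_data": Output(
            type="uri_folder",  # replace with your data_type
            path="azureml://datastores/workspaceblobstore/paths/output-path",  # replace with your output_path
            mode="rw_mount",  # replace with your output_mode (outputs use "rw_mount" or "upload")
            name="my_output_data",
        )
    }

    # Create a job (replace this part with your actual job definition).
    # Job itself is an abstract base class, so build a concrete job such as a command job.
    job = command(
        # other job properties (command, environment, compute, ...)
        outputs=outputs,
    )

    # Get a handle to your MLClient (from_config requires a credential)
    ml_client = MLClient.from_config(credential=DefaultAzureCredential())

    # Submit the job and block until it finishes
    returned_job = ml_client.jobs.create_or_update(job)
    ml_client.jobs.stream(returned_job.name)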
    

    After your job completes, fetch the output data asset with the ml_client.data.get method, set the desired tags on it, and save the change with the ml_client.data.create_or_update method (the data operations do not expose a separate update method).

    
    # Get the data asset name from the job outputs
    output_data_name = returned_job.outputs["output_data"].name

    # Fetch the latest version of the data asset (get needs a version or a label)
    data_asset = ml_client.data.get(name=output_data_name, label="latest")

    # Set the tags on the data asset
    data_asset.tags = {"mytag": "test"}

    # Save the updated data asset in the workspace
    ml_client.data.create_or_update(data_asset)
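
If you need to do this after every job, the fetch, tag, and save steps can be wrapped in a small helper. This is only a sketch: merge_tags and tag_data_asset are hypothetical names, not part of the SDK, and it assumes that saving an existing asset version through create_or_update is enough to persist changed tags in your workspace. It merges the new tags into the existing ones instead of overwriting them.

```python
from typing import Any, Dict, Optional


def merge_tags(existing: Optional[Dict[str, str]], new: Dict[str, str]) -> Dict[str, str]:
    """Return a new dict with `new` layered over `existing` (new values win)."""
    merged = dict(existing or {})
    merged.update(new)
    return merged


def tag_data_asset(
    ml_client: Any,
    name: str,
    tags: Dict[str, str],
    version: Optional[str] = None,
) -> Any:
    """Fetch a registered data asset, merge `tags` into it, and save it back.

    `ml_client` is assumed to be an azure.ai.ml.MLClient, or any object that
    exposes the same `data.get` and `data.create_or_update` methods.
    """
    if version is not None:
        asset = ml_client.data.get(name=name, version=version)
    else:
        # No explicit version given: fall back to the latest registered one
        asset = ml_client.data.get(name=name, label="latest")
    asset.tags = merge_tags(asset.tags, tags)
    return ml_client.data.create_or_update(asset)
```

With the variables from the answer above, the call would be tag_data_asset(ml_client, output_data_name, {"mytag": "test"}).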
    
