Azure ML job logging issues - Transformers model

Mike Klinkhammer 0 Reputation points
2025-03-09T20:47:48.6166667+00:00

I am working in Azure ML, trying to run a job that calls a training notebook. I can train and even evaluate my model just fine within said notebook, but when I try to log it at the end, it throws errors. The error I am seeing is:

HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': './models/finetuned_llama3/'. Use `repo_type` argument if needed.

From some research, it seems this means the call is trying to pull straight from the Hugging Face Hub based on my artifact path. I know the model exists where I am referencing it, because I am logging the directory contents and can see the files are there. I have tried setting arguments and environment variables telling it not to look for a Hub repo, with no success.
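As a rough illustration of why this error appears: `from_pretrained` treats an existing local directory as a local model, and anything else is validated as a Hub repo id. The sketch below approximates that decision with stdlib code only; the regex is an illustration, not the exact validation `huggingface_hub` performs.

```python
import os
import re
import tempfile

# Approximation of a Hub repo id: 'repo_name' or 'namespace/repo_name'
# (at most one slash, word characters, dots, and dashes).
REPO_ID_RE = re.compile(r"^[\w.-]+(/[\w.-]+)?$")

def resolve_pretrained_source(name_or_path: str) -> str:
    """Sketch of how transformers resolves its first argument."""
    if os.path.isdir(name_or_path):
        return "local"            # an existing directory is loaded from disk
    if REPO_ID_RE.match(name_or_path):
        return "hub"              # otherwise it must look like a repo id
    # Paths like './models/finetuned_llama3/' that do not exist on disk
    # fall through to repo-id validation and raise HFValidationError.
    return "invalid"

# A directory that exists is treated as local ...
with tempfile.TemporaryDirectory() as d:
    print(resolve_pretrained_source(d))   # local

# ... while a non-existent './...' path fails repo-id validation,
# which is exactly the HFValidationError above.
print(resolve_pretrained_source("./models/finetuned_llama3/"))              # invalid
print(resolve_pretrained_source("astronomer/Llama-3-8B-Instruct-GPTQ-8-Bit"))  # hub
```

So if the working directory of the job differs from the notebook's, the relative path no longer points at a real directory and the string gets (mis)interpreted as a repo id.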

Here is what my logging logic looks like:

job_model_path = 'models/finetuned_llama3'

peft_model = AutoPeftModelForCausalLM.from_pretrained(
    job_model_path,
    config=LoraConfig(
        r=lora_config_dict["r"],
        lora_alpha=lora_config_dict["lora_alpha"],
        target_modules=lora_config_dict["target_modules"],
        lora_dropout=lora_config_dict["lora_dropout"],
        bias=lora_config_dict["bias"],
        task_type=lora_config_dict["task_type"],
    ),
    device_map="cuda"
)

peft_model.model.config.quantization_config.use_exllama = True
peft_model.model.config.quantization_config.exllama_config = {"version": 2}

mlflow.transformers.log_model(
    transformers_model={"model": peft_model, "tokenizer": tokenizer},
    artifact_path="finetuned_llama3",  # Ensure the artifact path is correct
    registered_model_name="huggingface-finetuned-model",
    task="text-generation"  # Specify the task type here
)

When I log the model this way in an ML Studio notebook, it works as expected, so it must be something in how we configure the job.

Since the MLflow transformers flavor is relatively new, it has been hard to find much information about it. I have looked for other posts and forums about this issue but haven't found anything helpful. GPT and Copilot seem to have no clue how to solve my issue either.

I've seen people say that the artifact path cannot look like a full URL, so I have changed that variable many times, from full URLs to relative paths. I have also played around with my `transformers_model` argument, from referencing the objects to just passing the path.

I am expecting this to log a model to the Azure model registry.

For reference, this is the model we are fine-tuning: astronomer/Llama-3-8B-Instruct-GPTQ-8-Bit on Hugging Face.

I've posted this question on the Hugging Face forums and Stack Overflow, but this seems to be an Azure-specific issue.

Azure Machine Learning

3 answers

  1. Manas Mohanty 5,620 Reputation points Microsoft External Staff Moderator
    2025-03-12T12:41:00.2+00:00

    Hi Mike Klinkhammer

    Sorry for the delay in response.

    I found relevant documentation on fine-tuning with QLoRA and PEFT which might be helpful to you.

    The process in that documentation suggests pulling the run ID of the last training job with `mlflow.last_active_run()` and logging the fine-tuned model from the trainer inside `mlflow.start_run(run_id=...)`:

    import mlflow

    # Get the ID of the MLflow run that was automatically created above
    last_run_id = mlflow.last_active_run().info.run_id

    # Save a tokenizer without padding because it is only needed for training
    tokenizer_no_pad = AutoTokenizer.from_pretrained(base_model_id, add_bos_token=True)

    # If you interrupt the training, uncomment the following line to stop the MLflow run
    # mlflow.end_run()

    with mlflow.start_run(run_id=last_run_id):
        mlflow.log_params(peft_config.to_dict())
        mlflow.transformers.log_model(
            transformers_model={"model": trainer.model, "tokenizer": tokenizer_no_pad},
            prompt_template=prompt_template,
            signature=signature,
            artifact_path="model",  # Relative path for the model files within the MLflow run
        )
    
    

    Relevant sections to reference

    define-a-peft-model

    Kick off a training job

    Save and log the PEFT models

    Hope it helps address the issues.

    Thank you.


  2. Mike Klinkhammer 20 Reputation points
    2025-03-18T14:48:41.02+00:00

  3. Manas Mohanty 5,620 Reputation points Microsoft External Staff Moderator
    2025-03-19T06:43:05.31+00:00

    Hi Mike Klinkhammer

    I'm glad that you were able to resolve your issue and thank you for posting your solution so that others experiencing the same thing can easily reference this! Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others ", I'll repost your solution in case you'd like to accept the answer.

    Ask: Facing the error "Repo id must be in the form 'repo_name' or 'namespace/repo_name': './models/finetuned_llama3/'. Use `repo_type` argument if needed" while trying to log a fine-tuned model using MLflow.

    Solution: The issue is resolved by adding an extra step that saves `peft_model.config` as JSON:

    import json

    with open("models/finetuned_llama3/config.json", "w") as f:
        json.dump(peft_model.config.to_dict(), f, indent=4)

    mlflow.transformers.log_model(
        transformers_model='models/finetuned_llama3',
        artifact_path="models/finetuned_llama3",
        registered_model_name="huggingface-finetuned-model",
        task="text-generation",
        save_pretrained=True
    )
    
    

    This is because the config file we need is an attribute of the PEFT model but is not in the folder where your fine-tuned model is saved.
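
    The extra save step in isolation, against a scratch directory (`FakeConfig` and the directory path are placeholders standing in for `peft_model.config` and `models/finetuned_llama3`):

```python
import json
import os
import tempfile

# Stand-in for peft_model.config, for illustration only: the PEFT model
# keeps its config as an in-memory attribute rather than on disk.
class FakeConfig:
    def to_dict(self):
        return {"model_type": "llama", "task_type": "CAUSAL_LM"}

model_dir = tempfile.mkdtemp()   # stands in for 'models/finetuned_llama3'
config = FakeConfig()

# Serialize the config into the model folder, where the transformers
# loader expects to find config.json.
config_path = os.path.join(model_dir, "config.json")
with open(config_path, "w") as f:
    json.dump(config.to_dict(), f, indent=4)

# With config.json present, the folder passes as a valid local model
# directory instead of being mistaken for a Hub repo id.
with open(config_path) as f:
    print(json.load(f)["model_type"])   # llama
```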

    Reference thread

    If I missed anything please let me know and I'd be happy to add it to this answer, or feel free to comment below with any additional information.

    If you have any other questions, please let me know. Thank you again for your time and patience throughout this issue. 

    Please don’t forget to Accept Answer and Yes for "was this answer helpful" wherever the information provided helps you, this can be beneficial to other community members.

    Thank You.

