Azure ML job logging issues - Transformers model

Mike Klinkhammer 0 Reputation points
2025-03-09T20:47:48.6166667+00:00

I am working in Azure ML, trying to run a job that calls a training notebook. I can train and even evaluate my model just fine within said notebook, but when I try to log it at the end, it throws errors. The error I am seeing is:

HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': './models/finetuned_llama3/'. Use `repo_type` argument if needed.

From some research, it seems this means the call is trying to pull straight from the Hugging Face Hub based on my artifact path. I know the model exists where I am referencing it, because I am logging the directory contents and can see the files are there. I have tried setting arguments and environment variables telling it not to look for a Hub repo, with no success.
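As a rough illustration of why this error appears: `from_pretrained` treats an existing local directory as a local model, and anything else is validated as a Hub repo id. The sketch below approximates that decision with stdlib code only; the regex is an illustration, not the exact validation `huggingface_hub` performs.

```python
import os
import re
import tempfile

# Approximation of a Hub repo id: 'repo_name' or 'namespace/repo_name'
# (at most one slash, word characters, dots, and dashes).
REPO_ID_RE = re.compile(r"^[\w.-]+(/[\w.-]+)?$")

def resolve_pretrained_source(name_or_path: str) -> str:
    """Sketch of how transformers resolves its first argument."""
    if os.path.isdir(name_or_path):
        return "local"            # an existing directory is loaded from disk
    if REPO_ID_RE.match(name_or_path):
        return "hub"              # otherwise it must look like a repo id
    # Paths like './models/finetuned_llama3/' that do not exist on disk
    # fall through to repo-id validation and raise HFValidationError.
    return "invalid"

# A directory that exists is treated as local ...
with tempfile.TemporaryDirectory() as d:
    print(resolve_pretrained_source(d))   # local

# ... while a non-existent './...' path fails repo-id validation,
# which is exactly the HFValidationError above.
print(resolve_pretrained_source("./models/finetuned_llama3/"))              # invalid
print(resolve_pretrained_source("astronomer/Llama-3-8B-Instruct-GPTQ-8-Bit"))  # hub
```

So if the working directory of the job differs from the notebook's, the relative path no longer points at a real directory and the string gets (mis)interpreted as a repo id.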

Here is what my logging logic looks like:

job_model_path = 'models/finetuned_llama3'

peft_model = AutoPeftModelForCausalLM.from_pretrained(
    job_model_path,
    config=LoraConfig(
        r=lora_config_dict["r"],
        lora_alpha=lora_config_dict["lora_alpha"],
        target_modules=lora_config_dict["target_modules"],
        lora_dropout=lora_config_dict["lora_dropout"],
        bias=lora_config_dict["bias"],
        task_type=lora_config_dict["task_type"],
    ),
    device_map="cuda"
)

peft_model.model.config.quantization_config.use_exllama = True
peft_model.model.config.quantization_config.exllama_config = {"version": 2}

mlflow.transformers.log_model(
    transformers_model={"model": peft_model, "tokenizer": tokenizer},
    artifact_path="finetuned_llama3",  # Ensure the artifact path is correct
    registered_model_name="huggingface-finetuned-model",
    task="text-generation"  # Specify the task type here
)

When I log the model this way in an ML Studio notebook, it works as expected, so it must be something in how we configure the job.

Since the MLflow transformers flavor is relatively new, it has been hard to find much information about it. I have looked for other posts and forums about this issue but haven't found anything helpful. GPT and Copilot seem to have no clue how to solve my issue either.

I've seen people say that the artifact path cannot look like a full URL, so I have changed that variable many times, from full URLs to relative paths. I have also played around with my `transformers_model` argument, from referencing the objects to just passing the path.

I am expecting this to log a model to the Azure model registry.

For reference, this is the model we are fine-tuning: astronomer/Llama-3-8B-Instruct-GPTQ-8-Bit on Hugging Face.

I've posted this question on the Hugging Face forums and Stack Overflow, but this seems to be an Azure-specific issue.

Azure Machine Learning

3 answers

  1. Manas Mohanty 5,620 Reputation points Microsoft External Staff Moderator
    2025-03-12T12:41:00.2+00:00

    Hi Mike Klinkhammer

    Sorry for the delay in response.

    I found relevant documentation on fine-tuning with QLoRA and PEFT which might be helpful to you.

    The process in that documentation suggests pulling the run ID of the last training job with `mlflow.last_active_run()` and logging the fine-tuned model from the trainer inside `mlflow.start_run(run_id=...)`:

    import mlflow

    # Get the ID of the MLflow run that was automatically created above
    last_run_id = mlflow.last_active_run().info.run_id

    # Save a tokenizer without padding because it is only needed for training
    tokenizer_no_pad = AutoTokenizer.from_pretrained(base_model_id, add_bos_token=True)

    # If you interrupt the training, uncomment the following line to stop the MLflow run
    # mlflow.end_run()

    with mlflow.start_run(run_id=last_run_id):
        mlflow.log_params(peft_config.to_dict())
        mlflow.transformers.log_model(
            transformers_model={"model": trainer.model, "tokenizer": tokenizer_no_pad},
            prompt_template=prompt_template,
            signature=signature,
            artifact_path="model",  # Relative path for the model files within the MLflow run
        )
    
    

    Relevant sections to reference

    define-a-peft-model

    Kick off a training job

    Save and log the PEFT models

    Hope it helps address the issues.

    Thank you.


  2. Mike Klinkhammer 20 Reputation points
    2025-03-18T14:48:41.02+00:00

  3. Manas Mohanty 5,620 Reputation points Microsoft External Staff Moderator
    2025-03-19T06:43:05.31+00:00

    Hi Mike Klinkhammer

    I'm glad that you were able to resolve your issue and thank you for posting your solution so that others experiencing the same thing can easily reference this! Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others ", I'll repost your solution in case you'd like to accept the answer.

    Ask: Facing the error "Repo id must be in the form 'repo_name' or 'namespace/repo_name': './models/finetuned_llama3/'. Use `repo_type` argument if needed" while trying to log a fine-tuned model using MLflow.

    Solution: The issue is resolved by adding an extra step that saves `peft_model.config` as JSON:

    import json

    with open("models/finetuned_llama3/config.json", "w") as f:
        json.dump(peft_model.config.to_dict(), f, indent=4)

    mlflow.transformers.log_model(
        transformers_model='models/finetuned_llama3',
        artifact_path="models/finetuned_llama3",
        registered_model_name="huggingface-finetuned-model",
        task="text-generation",
        save_pretrained=True
    )
    
    

    This is because the config file we need is an attribute of the PEFT model but is not in the folder where your fine-tuned model is saved.
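
    The extra save step in isolation, against a scratch directory (`FakeConfig` and the directory path are placeholders standing in for `peft_model.config` and `models/finetuned_llama3`):

```python
import json
import os
import tempfile

# Stand-in for peft_model.config, for illustration only: the PEFT model
# keeps its config as an in-memory attribute rather than on disk.
class FakeConfig:
    def to_dict(self):
        return {"model_type": "llama", "task_type": "CAUSAL_LM"}

model_dir = tempfile.mkdtemp()   # stands in for 'models/finetuned_llama3'
config = FakeConfig()

# Serialize the config into the model folder, where the transformers
# loader expects to find config.json.
config_path = os.path.join(model_dir, "config.json")
with open(config_path, "w") as f:
    json.dump(config.to_dict(), f, indent=4)

# With config.json present, the folder passes as a valid local model
# directory instead of being mistaken for a Hub repo id.
with open(config_path) as f:
    print(json.load(f)["model_type"])   # llama
```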

    Reference thread

    If I missed anything please let me know and I'd be happy to add it to this answer, or feel free to comment below with any additional information.

    If you have any other questions, please let me know. Thank you again for your time and patience throughout this issue. 

    Please don’t forget to Accept Answer and Yes for "was this answer helpful" wherever the information provided helps you, this can be beneficial to other community members.

    Thank You.

