How to serve a fine-tuned LLM in Azure?

Michael Dong 40 Reputation points Microsoft Employee
2024-12-08T23:21:14.93+00:00

I have a fine-tuned Llama 3.1 model and a fine-tuned Qwen2.5-1.5B model, and I want to deploy them to Azure AI and serve them with managed hosting:

  1. How can I upload my fine-tuned models to Azure?
  2. How do I find an uploaded fine-tuned model and serve it with managed compute? https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deploy-models-managed
Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

Accepted answer
  1. kothapally Snigdha 3,020 Reputation points Microsoft External Staff Moderator
    2024-12-09T07:18:16.9066667+00:00

    Hi Michael Dong

    Greetings & Welcome to the Microsoft Q&A forum! Thank you for sharing your query.

    To serve your fine-tuned Llama 3.1 and Qwen2.5-1.5B models in Azure using managed compute, please follow these steps:

    • Upload your fine-tuned model to Azure Machine Learning by using Azure Machine Learning studio.
    • Select the workspace where you want to deploy the model and register the model there, so that it appears in the workspace's list of models.
    • Next, find and serve the fine-tuned model with managed compute.
    • After uploading, navigate to the model's overview page in Azure Machine Learning studio.
    • Select the option to deploy the model and choose "Managed Compute."
    • Once the deployment is complete, open the endpoint's details page to obtain code samples for consuming the deployed model in your application.
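
    If you prefer to script these steps instead of clicking through the studio, the following is a minimal sketch using the Azure Machine Learning Python SDK v2 (the azure-ai-ml and azure-identity packages). It is one possible approach, not the only one. The model names, folder paths, endpoint names, and GPU instance types are all hypothetical placeholders, and the sketch assumes you already have an environment and a scoring script (score.py) that can load your weights, which a custom (non-MLflow) model needs on managed compute.

```python
from dataclasses import dataclass


@dataclass
class DeploymentPlan:
    """Settings for one fine-tuned model. Every value used below is a placeholder."""
    model_name: str         # name the model is registered under in the workspace
    model_dir: str          # local folder with the fine-tuned weights and tokenizer
    endpoint_name: str      # name of the managed online endpoint to create
    instance_type: str      # GPU VM size; must be available in your region and quota
    instance_count: int = 1


# Hypothetical plans for the two models from the question.
PLANS = [
    DeploymentPlan("llama31-finetuned", "./llama31-ft", "llama31-ft-ep", "Standard_NC24ads_A100_v4"),
    DeploymentPlan("qwen25-1p5b-finetuned", "./qwen25-ft", "qwen25-ft-ep", "Standard_NC6s_v3"),
]


def register_and_deploy(plan, subscription_id, resource_group, workspace_name,
                        environment, code_dir, scoring_script="score.py"):
    """Register a local model folder and serve it from a managed online endpoint.

    `environment` is an existing environment reference (or Environment entity) and
    `code_dir` is the folder containing the scoring script; both are assumed to exist.
    """
    # Imported lazily so the plans above can be built without the Azure SDK installed.
    from azure.ai.ml import MLClient
    from azure.ai.ml.constants import AssetTypes
    from azure.ai.ml.entities import (
        CodeConfiguration,
        ManagedOnlineDeployment,
        ManagedOnlineEndpoint,
        Model,
    )
    from azure.identity import DefaultAzureCredential

    ml_client = MLClient(
        DefaultAzureCredential(),
        subscription_id=subscription_id,
        resource_group_name=resource_group,
        workspace_name=workspace_name,
    )

    # 1. Upload the local folder and register it as a model in the workspace.
    registered = ml_client.models.create_or_update(
        Model(path=plan.model_dir, name=plan.model_name, type=AssetTypes.CUSTOM_MODEL)
    )

    # 2. Create the managed online endpoint (key-based authentication).
    endpoint = ManagedOnlineEndpoint(name=plan.endpoint_name, auth_mode="key")
    ml_client.online_endpoints.begin_create_or_update(endpoint).result()

    # 3. Deploy the registered model to managed compute behind that endpoint.
    deployment = ManagedOnlineDeployment(
        name="blue",
        endpoint_name=plan.endpoint_name,
        model=registered,
        environment=environment,
        code_configuration=CodeConfiguration(code=code_dir, scoring_script=scoring_script),
        instance_type=plan.instance_type,
        instance_count=plan.instance_count,
    )
    ml_client.online_deployments.begin_create_or_update(deployment).result()

    # 4. Send all traffic to the new deployment and return the scoring URL.
    endpoint.traffic = {"blue": 100}
    ml_client.online_endpoints.begin_create_or_update(endpoint).result()
    return ml_client.online_endpoints.get(plan.endpoint_name).scoring_uri
```

    You would call register_and_deploy(PLANS[0], ...) with your own subscription ID, resource group, workspace name, environment, and scoring-code folder. If your model is saved in MLflow format, registering it with AssetTypes.MLFLOW_MODEL may let you skip the environment and scoring script, but that depends on how the model was packaged.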

    If this answers your query, please click Accept Answer and Yes for "Was this answer helpful". If you have any further queries, do let us know.

    1 person found this answer helpful.
