Hi Michael Dong
Greetings & Welcome to the Microsoft Q&A forum! Thank you for sharing your query.
You can serve your fine-tuned Llama 3.1 and Qwen 2.5-1.5B models in Azure using managed compute. Here are the steps to follow:
- You can upload your fine-tuned model to Azure Machine Learning by using the Azure Machine Learning studio.
- You will need to select the workspace where you want to deploy the model and choose the model from the studio's model catalog.
- Next, find the fine-tuned model and serve it with managed compute.
- After uploading, navigate to the model's overview page in Azure Machine Learning studio.
- Select the option to deploy the model and choose "Managed Compute."
- Once the deployment is complete, open the endpoint's details page to get code samples for consuming the deployed model in your application.

Hope this helps.
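As a rough illustration of the last step above, here is a minimal sketch of calling a deployed endpoint from Python using only the standard library. The scoring URI, the key, and the payload shape are placeholders I am assuming for illustration. The real values and the exact request schema come from the code samples on your endpoint's details page, and the schema depends on how the model was packaged. The sketch also assumes the endpoint uses key-based authentication.

```python
import json
import urllib.request

# Placeholders (assumptions): copy the real values from the endpoint's details page.
SCORING_URI = "https://<your-endpoint>.<region>.inference.ml.azure.com/score"
API_KEY = "<your-endpoint-key>"


def build_request(scoring_uri, api_key, payload):
    """Build a POST request carrying a JSON payload and a bearer key."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Key-based auth: the endpoint key is sent as a bearer token.
        "Authorization": f"Bearer {api_key}",
    }
    return urllib.request.Request(
        scoring_uri, data=body, headers=headers, method="POST"
    )


def score(scoring_uri, api_key, payload, timeout=60):
    """Send the request and return the parsed JSON response."""
    request = build_request(scoring_uri, api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical payload shape: check the endpoint's own sample for the
# schema that your deployment actually expects.
example_payload = {
    "input_data": {
        "input_string": ["Write a haiku about the sea."],
        "parameters": {"max_new_tokens": 64, "temperature": 0.7},
    }
}

# Usage, once the placeholders above are filled in:
# print(score(SCORING_URI, API_KEY, example_payload))
```

Keeping the request construction separate from the network call makes it easy to check the headers and body before sending anything to the endpoint.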
If this answers your query, please click Accept Answer and Yes for "Was this answer helpful". And if you have any further queries, do let us know.