Host LLM Webapp on Azure - What is the way to go?

Maximilian Weißenbacher (DE) 30 Reputation points
2024-01-10T15:05:11.6266667+00:00

Hi,

I am new to Azure but I want to host a Webapp on Azure. The Webapp is a RAG application and I am using a quantized model from "TheBloke" (Mixtral 8x7B, so I need some GPU power) at the moment and Streamlit as a UI.

Now I am not sure what the best way to host such a web app is. I saw that Azure Machine Learning offers model endpoints for Mixtral. However, I wasn't able to find all of the Hugging Face models I have used in the model catalogue.

So would it be better to switch to a Virtual Machine and upload the quantized models there? I am not sure whether the computing power would be enough, though. Also, does anyone have experience with the cost of a similar application? For now, I only want to use the application for demo purposes, so only a couple of people (<5) will use the app.

Thanks for suggestions!

Azure Machine Learning
An Azure machine learning service for building and deploying models.
Azure Virtual Machines
An Azure service that is used to provision Windows and Linux virtual machines.

Accepted answer
  1. YutongTie-MSFT 53,966 Reputation points Moderator
    2024-01-11T02:03:49.8566667+00:00

    @Maximilian Weißenbacher (DE) Thanks for reaching out to us. There are a few ways to do this; please take a look at them and let us know which one you are interested in.

    Azure Machine Learning: As you mentioned, you can deploy your quantized model as a web service using Azure Machine Learning and then call the model endpoint from your web app. However, not all Hugging Face models may be available in the Azure Machine Learning model catalog. Additionally, deploying a model as a web service in Azure Machine Learning can be more complex than the other options.

    Azure Virtual Machines: You can create a virtual machine in Azure and upload your quantized model to it. This gives you more control over the environment and lets you choose a VM size with GPUs if needed. However, you will need to manage the virtual machine yourself, which can be more time-consuming.

    Azure App Service: You can use Azure App Service to host your web app. This lets you deploy the web app quickly and easily, without having to manage the underlying infrastructure. However, to run your quantized model with GPU power you may need a different approach, such as a separate API or service that handles the model.
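To illustrate the "separate API or service" idea from the App Service option, here is a minimal sketch of how the UI side could call a model that runs elsewhere (for example on a GPU VM or a model endpoint). It uses only the Python standard library. The URL, the key, the payload fields (`prompt`, `max_tokens`) and the function names are all assumptions for illustration, not the schema of any particular Azure service; adapt them to whatever your model server actually expects.

```python
import json
import urllib.request

# Hypothetical placeholders: replace with the URL and key of your own endpoint.
ENDPOINT_URL = "https://example.invalid/score"
API_KEY = "<your-key>"


def build_request(question, context_chunks, url=ENDPOINT_URL, api_key=API_KEY):
    """Build an HTTP POST that carries the question plus the retrieved context."""
    # The payload layout is an assumption; match it to your model server.
    prompt = "Context:\n" + "\n".join(context_chunks) + "\n\nQuestion: " + question
    payload = {"prompt": prompt, "max_tokens": 256}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


def ask(question, context_chunks):
    """Send the request to the model service and return the decoded JSON reply."""
    request = build_request(question, context_chunks)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

With this split, the Streamlit app only needs CPU resources and calls `ask(...)` for each user question, while the GPU-backed model service can be sized, started and stopped independently.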

    In terms of cost, the cost of hosting your web app on Azure will depend on a variety of factors, including the size and complexity of your app, the amount of traffic it receives, and the resources it requires. For a demo app with only a few users, the cost should be relatively low. You can use the Azure pricing calculator to estimate the cost of hosting your app on Azure.
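As a rough back-of-the-envelope aid for the cost question, the sketch below estimates the memory needed for the model weights and the monthly cost of a pay-as-you-go GPU VM that only runs during demos. The parameter count (about 46.7 billion for Mixtral 8x7B) is approximate, the estimate covers weights only (no KV cache or runtime overhead), and the hourly rate is a made-up placeholder, not an Azure price. Take real prices from the Azure pricing calculator.

```python
def weight_memory_gb(num_parameters, bits_per_weight):
    """Approximate memory for the model weights alone, in gigabytes (1e9 bytes)."""
    return num_parameters * bits_per_weight / 8 / 1e9


def monthly_vm_cost(hourly_rate, hours_per_day, days_per_month):
    """Cost of a VM that is billed only while it is running."""
    return hourly_rate * hours_per_day * days_per_month


# Assumption: Mixtral 8x7B has roughly 46.7 billion parameters in total.
MIXTRAL_PARAMS = 46.7e9

if __name__ == "__main__":
    # 4-bit quantization; weights only, runtime overhead comes on top.
    print(f"Weights at 4 bits: ~{weight_memory_gb(MIXTRAL_PARAMS, 4):.1f} GB")
    # Hypothetical rate of 3.00 per hour, 4 hours a day, 20 days a month.
    print(f"Monthly VM cost: {monthly_vm_cost(3.0, 4, 20):.2f}")
```

For a demo with fewer than five users, stopping (deallocating) the VM between sessions is usually the biggest cost lever, which is why the estimate is parameterized by running hours rather than assuming the VM runs all month.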

    I hope this helps, let me know if you have further questions.

    Regards, Yutong

    Please accept the answer if you found it helpful; this supports the community. Thanks a lot.
