How much does it cost to deploy Llama2 on Azure?

Natsuki 5 Reputation points

Please tell me the price when deploying Llama2(Meta-LLM) on Azure.

[Condition]

・To keep costs down, I will handle deployment, configuration, and operation myself.

・I have not yet decided what I will use Llama2 for (probably summarizing or comparing documents), so for now I would like to know only the cost of deploying Llama2 on Azure.

[Questions]

・What resources are required to deploy and use Llama2 on Azure? Please tell me the minimum requirements for each model, with some scenarios.

・Cost for the above.

・Any additional information or tips regarding deploying or using Llama2 on Azure will be appreciated.

Thank you.

Answer 1

AshokPeddakotla-MSFT 35,971 Reputation points Moderator

Natsuki, greetings and welcome to the Microsoft Q&A forum!

Please tell me the price when deploying Llama2(Meta-LLM) on Azure.

Deploying Llama2 (Meta-LLM) on Azure will require virtual machines (VMs) to run the software and store the data.

The cost of deploying Llama2 on Azure will depend on several factors, such as the number and size of VMs, the storage capacity, and the data transfer costs.

To give you an idea of the cost, let's consider a scenario where you deploy Llama2 on a single VM with 4 cores, 8 GB of RAM, and 128 GB of storage. The estimated cost for this VM is around $0.16 per hour or $115 per month. However, this is just an estimate, and the actual cost may vary depending on the region, the VM size, and the usage.
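
As a rough check on the figures above, the hourly rate can be converted to a monthly figure. This is only a minimal sketch: it assumes a 730-hour billing month (an approximation not stated in the answer), and the function name is illustrative rather than part of any Azure tooling.

```python
HOURS_PER_MONTH = 730  # assumption: average hours in a month (365 * 24 / 12)


def monthly_vm_cost(hourly_rate_usd, hours=HOURS_PER_MONTH):
    """Return the cost in USD of running one VM continuously for `hours` hours."""
    return round(hourly_rate_usd * hours, 2)


if __name__ == "__main__":
    # $0.16/hour is the estimate quoted in the answer, not a verified Azure price.
    print(f"${monthly_vm_cost(0.16):.2f} per month")
```

Under that assumption, $0.16 per hour works out to roughly $117 per month, which is in line with the quoted $115.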

In addition to the VM cost, you will also need to consider the storage cost for storing the data and any additional costs for data transfer. The storage cost will depend on the amount of data you plan to store and the type of storage you choose (e.g., standard or premium). The data transfer cost will depend on the amount of data you transfer in and out of Azure.

What resources are required to deploy and use Llama2 on Azure? Please tell me the minimum requirements for each model, with some scenarios.

Llama2 has several models available for different natural language processing tasks such as text summarization and comparison. The resource requirements for deploying and using Llama2 on Azure will depend on the specific model you plan to use and the size of the data you plan to process.

A note about compute requirements when using Llama 2 models: fine-tuning, evaluating, and deploying Llama 2 models requires GPU compute of V100 / A100 SKUs. You can find the exact SKUs supported for each model in the information tooltip next to the compute selection field in the fine-tune / evaluate / deploy wizards. You can view and request AzureML compute quota here.

User's image

Llama2 Comparison Model: This model is designed for text comparison tasks and requires a VM with at least 4 cores and 8 GB of RAM to run efficiently. However, the resource requirements may vary depending on the size of the input data and the desired output. For example, if you plan to compare large documents or process multiple documents in parallel, you may need a VM with more cores and RAM to handle the workload.

Llama2 Multi-Document Summarization Model: This model is designed for summarizing multiple documents and requires a VM with at least 8 cores and 16 GB of RAM to run efficiently. However, the resource requirements may vary depending on the size of the input data and the desired output. For example, if you plan to summarize a large number of documents or process multiple document sets in parallel, you may need a VM with more cores and RAM to handle the workload.

Please note that these are general guidelines, and the resource requirements may vary depending on the specific use case and the size of the data. It is recommended to test the performance of Llama2 on a smaller VM size first and then scale up as needed.

Cost for the above.

The estimated cost for deploying Llama2 on a single VM with 4 cores, 8 GB of RAM, and 128 GB of storage is around $0.16 per hour or $115 per month. However, this is just an estimate, and the actual cost may vary depending on the region, the VM size, and the usage.

To reduce the cost, you can choose a smaller VM size or use Azure Spot VMs, which offer up to 90% cost savings compared to regular VMs. You can also use Azure Storage to store your data, which offers low-cost storage options.
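
To see what "up to 90%" could mean in practice, here is a small sketch that turns an on-demand estimate into a best-case and worst-case range. The 90% ceiling comes from the paragraph above. The function name is hypothetical, and the real Spot discount varies by region, VM size, and time.

```python
def spot_cost_range(on_demand_usd, max_discount=0.90):
    """Return (best_case, worst_case) cost in USD for a given maximum Spot discount.

    best_case applies the full discount; worst_case assumes no discount at all.
    """
    if not 0.0 <= max_discount <= 1.0:
        raise ValueError("max_discount must be between 0 and 1")
    best_case = round(on_demand_usd * (1.0 - max_discount), 2)
    worst_case = round(on_demand_usd, 2)
    return best_case, worst_case


if __name__ == "__main__":
    # $115/month is the estimate quoted in the answer, used here only as an input.
    low, high = spot_cost_range(115.0)
    print(f"Spot range: ${low:.2f} to ${high:.2f} per month")
```

Keep in mind that Spot VMs can be evicted when Azure needs the capacity back, so the saving comes at the price of availability.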

Any additional information or tips regarding deploying or using Llama2 on Azure will be appreciated.

Please see "Introducing Llama 2 on Azure" and "Deploy large language models responsibly with Azure AI" for more details.

Do let me know if that helps or if you have any other queries.

If the response helped, please click "Accept Answer" and "Yes" for "Was this answer helpful".

Doing so would help other community members with similar issues identify the solution. I highly appreciate your contribution to the community.

Natsuki 5 Reputation points

2023-10-31T08:25:52.4233333+00:00

Hi Ashok,

Thank you for your response.

I would like to ask the questions below. Please answer each one.

①I have two tickets on this portal about the same topic and received a response to each, but the quoted price for the minimum requirements to deploy Llama2 on Azure differs between them. Could you tell me the reason for that and which one is correct?

(The responder on the other ticket is Dillon Silzer. URL: https://learn.microsoft.com/en-us/answers/questions/1410069/how-much-does-it-cost-to-deploy-llama2-on-azure)

②Is the difference because the two answers assumed different VM types for different scenarios?

③Could you please tell me the price for the "Llama2 Multi-Document Summarization Model" as well, as you did for the "text comparison" version? And is the estimated cost for a single user or for multiple users?

④Is there a clearly defined minimum VM spec for running Llama2 on Azure, or does it just depend on how we use it? FYI, the responder on the other ticket claimed that the minimum VM spec is "'Standard_NC12s_v3' with 12 cores, 224GB RAM, 672GB storage. It costs 6.5$/h and 4K+ to run a month is it the only option to run llama 2 on azure."

⑤Please calculate the cost based on the scenario below.

・5 test users each use Llama2 on Azure to summarize a 10-page NDA (around 15,000 tokens) 10 times a day for 20 days.

[Question]

・Which VM spec will be the minimum spec for this scenario?

・How much will it cost in total? Please provide a detailed breakdown.

Thank you.

Natsuki
AshokPeddakotla-MSFT 35,971 Reputation points Moderator

2023-10-31T08:42:27.66+00:00

Natsuki, thanks for sharing the additional details. I will update you as soon as possible.
Natsuki 5 Reputation points

2023-11-01T02:19:30.4766667+00:00

Looking forward to hearing your answers.

Thank you.
AshokPeddakotla-MSFT 35,971 Reputation points Moderator

2023-11-01T17:03:16.6166667+00:00

Natsuki, please find the answers to your queries below.

①I have two tickets on this portal about the same topic and received a response to each, but the quoted price for the minimum requirements to deploy Llama2 on Azure differs between them. Could you tell me the reason for that and which one is correct? (The responder on the other ticket is Dillon Silzer. URL: https://learn.microsoft.com/en-us/answers/questions/1410069/how-much-does-it-cost-to-deploy-llama2-on-azure)

②Is the difference because the two answers assumed different VM types for different scenarios?

Your understanding is correct. The VM referenced in the other response is different.

As mentioned earlier, you can find the exact SKUs supported for each model in the information tooltip next to the compute selection field in the fine-tune / evaluate / deploy wizards. You can view and request AzureML compute quota here.

③Could you please tell me the price for the "Llama2 Multi-Document Summarization Model" as well, as you did for the "text comparison" version? And is the estimated cost for a single user or for multiple users?

As mentioned earlier as well, the cost of deploying the Llama2 Multi-Document Summarization Model on Azure will depend on various factors such as the size of the dataset, the complexity of the models, and the number of users accessing the system. Therefore, it is difficult to provide an accurate estimate without knowing the specific requirements. However, you can use the Azure pricing calculator to estimate the cost of running a virtual machine on Azure based on your specific requirements. Please see Azure Machine Learning pricing and Linux Virtual Machines Pricing for more details and estimation.

④Is there a clearly defined minimum VM spec for running Llama2 on Azure, or does it just depend on how we use it? FYI, the responder on the other ticket claimed that the minimum VM spec is "'Standard_NC12s_v3' with 12 cores, 224GB RAM, 672GB storage. It costs 6.5$/h and 4K+ to run a month is it the only option to run llama 2 on azure."

Please note that VM availability varies by region. The minimum VM spec required to run Llama2 on Azure will depend on the size of the dataset and the complexity of the models you plan to use. As a general guideline, you can start with a virtual machine with at least 4 cores and 8 GB of RAM. However, for larger datasets and more complex models, you may need to use a more powerful VM such as the 'Standard_NC12s_v3' with 12 cores, 224GB RAM, 672GB storage.
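
To put the two VM options mentioned in this thread side by side, the sketch below converts both quoted hourly rates to a monthly figure. Both rates are simply the numbers quoted in the thread ($0.16/hour for the 4-core VM, $6.5/hour for 'Standard_NC12s_v3') and have not been checked against current Azure pricing. The 730-hour month and all names in the code are assumptions made for illustration.

```python
HOURS_PER_MONTH = 730  # assumption: average hours in a month

# Hourly rates in USD as quoted in this thread; not verified against Azure pricing.
QUOTED_HOURLY_RATES_USD = {
    "4 cores / 8 GB RAM": 0.16,
    "Standard_NC12s_v3": 6.50,
}


def monthly_costs(rates, hours=HOURS_PER_MONTH):
    """Map each VM label to its cost in USD for `hours` hours of continuous use."""
    return {name: round(rate * hours, 2) for name, rate in rates.items()}


def cost_ratio(rates, expensive, cheap):
    """Return how many times more the `expensive` VM costs per hour than `cheap`."""
    return rates[expensive] / rates[cheap]


if __name__ == "__main__":
    for name, cost in monthly_costs(QUOTED_HOURLY_RATES_USD).items():
        print(f"{name}: ${cost:,.2f} per month")
    ratio = cost_ratio(QUOTED_HOURLY_RATES_USD, "Standard_NC12s_v3", "4 cores / 8 GB RAM")
    print(f"Ratio: about {ratio:.1f}x")
```

With these inputs the GPU VM comes to about $4,745 per month, consistent with the "4K+" figure from the other ticket, against about $117 for the 4-core VM.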

See the screenshot below for more clarification.

⑤Please calculate the cost based on the scenario below.

・5 test users each use Llama2 on Azure to summarize a 10-page NDA (around 15,000 tokens) 10 times a day for 20 days.

[Question]

・Which VM spec will be the minimum spec for this scenario?

・How much will it cost in total? Please provide a detailed breakdown.

Based on the scenario you provided, the minimum VM spec required would be a virtual machine with at least 4 cores and 8 GB of RAM. Assuming that each 10-page NDA document is around 15,000 tokens, the total number of tokens processed per day would be 150,000 (15,000 tokens x 10 times). Therefore, the total number of tokens processed over 20 days would be 3,000,000 (150,000 tokens/day x 20 days).

Using the Azure pricing calculator, the estimated cost for running a virtual machine with 4 cores and 8 GB of RAM for 20 days would be around $100. The estimated cost for storing 50 GB of data on Azure Storage for 20 days would be around $5. The estimated cost for data transfer for 5 users accessing the system would be negligible.

Therefore, the total estimated cost for running Llama2 on Azure for 5 test users for 20 days would be around $105. This estimate is based on the assumption that the users will only be summarizing 10 pages of NDA documents (around 15,000 tokens) each for 10 times a day. If the users process more data or use more complex models, the cost may increase accordingly.
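
The breakdown above can be reproduced with a few lines of arithmetic. This is a sketch only: the $100 and $5 inputs are the estimates given in the answer, data transfer is treated as zero because the answer calls it negligible, and the function names are illustrative. The second function is an added cross-check that assumes the VM runs 24 hours a day at the $0.16/hour rate quoted earlier in the thread.

```python
def total_cost(vm_usd, storage_usd, transfer_usd=0.0):
    """Sum the cost components of the estimate, in USD."""
    return round(vm_usd + storage_usd + transfer_usd, 2)


def vm_cost(hourly_rate_usd, days, hours_per_day=24):
    """Cost in USD of one VM running `hours_per_day` hours for `days` days."""
    return round(hourly_rate_usd * hours_per_day * days, 2)


if __name__ == "__main__":
    # Figures from the answer: $100 for the VM and $5 for 50 GB of storage.
    print(f"Total estimate: ${total_cost(100.0, 5.0):.2f}")
    # Cross-check under the stated assumption of 24-hour operation at $0.16/hour.
    print(f"VM at $0.16/hour for 20 days: ${vm_cost(0.16, 20):.2f}")
```

The cross-check gives about $77 for the VM, somewhat below the "around $100" in the estimate, so the calculator inputs were probably not identical to the $0.16/hour rate quoted earlier.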

If you still have questions regarding the cost estimation and billing related queries, you can always raise a free billing and subscription request @ https://aka.ms/azsupt?

Hope this helps. Do let me know if there are any further queries.
Natsuki 5 Reputation points

2023-11-06T02:13:36.63+00:00

Hi Ashok,

Thank you for your response.

I appreciate that you answered each of my queries.

However, I would still like you to confirm whether Azure can really run Llama2 with 4 cores and 8GB of RAM for the example usage I described earlier, ideally with screen captures showing that no warnings or errors appear in the configuration. Sorry for the inconvenience, but I do not have Llama2 yet, so I cannot confirm it myself.

The screenshot you shared in your last response appears to show a warning that at least 'Standard_NC24s_v3' is required to run Llama2, so I would like you to show me that your assumption is feasible.

Also, the example usage I described earlier is a little different from your formula, so let me restate it below. Please take this into account in your next calculation.

[Example usage]

Users: 5

Document (NDA): 10 pages (around 15,000 tokens in total)

Times: 10 times a day

※Each user will conduct this 10 times a day

Period: 20 days (working days in a month)
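
For reference, here is a minimal sketch of the token volume this restated scenario implies. It counts input tokens only, which is an assumption: the generated summaries would add output tokens on top. The function name is illustrative.

```python
def scenario_tokens(users, runs_per_user_per_day, tokens_per_document, days):
    """Return (tokens_per_day, tokens_over_period) for the input documents only."""
    per_day = users * runs_per_user_per_day * tokens_per_document
    return per_day, per_day * days


if __name__ == "__main__":
    # Values from the example usage: 5 users, 10 runs each per day,
    # about 15,000 tokens per NDA, 20 working days.
    per_day, total = scenario_tokens(5, 10, 15_000, 20)
    print(f"{per_day:,} tokens per day, {total:,} tokens over the period")
```

With these values the scenario comes to 750,000 input tokens per day and 15,000,000 over the 20 days.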

Thank you for your support.

Have a great day.

Best,

Natsuki
Natsuki 5 Reputation points

2023-11-06T05:54:09.0866667+00:00

One addition to the example scenario above: please calculate it for the Japan region (Japan East).

I plan to use Azure in this scenario as shown below, so could you please show me the cost if I use Azure only to connect the LLM to a piece of software via API?

LLM(Llama2)

↓(API)

Azure

↓(API)

Software in which the client can use the LLM (Llama2) to summarize documents and so on.

Please calculate the cost based on the above scenarios.

Also, I would appreciate it if you could walk me through how to check the minimum VM size required to run Llama2. I tried to find the page you showed me, where the message "'Standard_NC24s_v3' is required" appears, but I could not find it.

Thank you.

Natsuki
