I want to deploy a new Llama-2-70b-chat model via an Azure ML endpoint.

Mykola Kyrychenko 35 Reputation points
2023-07-21T07:11:52.5366667+00:00

Hello,
I'm planning to deploy the Llama-2-70b-chat model and want to integrate custom embeddings based on my data. I've read that A10, A100, or V100 GPUs are recommended for training. The tutorial notebook provides the following:

sku_name = "Standard_NC24s_v3"  # Name of the sku(instance type) Check the model-list(can be found in the parent folder(inference)) to get the most optimal sku for your model (Default: Standard_DS2_v2)

In the parent folder I could not find any information about the optimal SKU.
Could you provide a list of supported compute instances suitable for this task? I'm interested in testing the performance of different instances with this model.
Thanks

Azure Machine Learning
An Azure machine learning service for building and deploying models.

1 answer

  1. Maciej Skorupka 25 Reputation points
    2023-07-26T14:13:13.53+00:00

    I believe the recommended SKUs are shown in a tooltip when you select a compute for deployment in Azure ML Studio:

    "The allowed skus for this model are Standard_ND40rs_v2, Standard_ND96asr_v4, Standard_ND96amsr_v4"
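
As a rough illustration of how the tooltip list could be used in the notebook, here is a minimal sketch that checks a chosen `sku_name` against the three SKUs quoted above before a deployment is attempted. The names `ALLOWED_SKUS` and `pick_sku` are hypothetical helpers, not part of the notebook or the Azure SDK, and the list itself is only what the tooltip showed at the time, so it may change.

```python
# SKUs quoted from the Azure ML Studio tooltip for Llama-2-70b-chat.
# Assumption: this list is copied from the answer above and may be outdated.
ALLOWED_SKUS = (
    "Standard_ND40rs_v2",
    "Standard_ND96asr_v4",
    "Standard_ND96amsr_v4",
)


def pick_sku(requested: str, allowed=ALLOWED_SKUS) -> str:
    """Return `requested` if it is in `allowed`, otherwise raise ValueError."""
    if requested not in allowed:
        raise ValueError(
            f"{requested!r} is not allowed for this model; choose one of: "
            + ", ".join(allowed)
        )
    return requested


# The value from the notebook ("Standard_NC24s_v3") would be rejected here;
# a SKU from the tooltip passes the check.
sku_name = pick_sku("Standard_ND40rs_v2")
```

The validated `sku_name` would then replace the value in the notebook line quoted in the question. Whether a given SKU is actually available still depends on the quota and region of your subscription.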

