Testing LLMs

Stephen 85 Reputation points
2025-04-07T13:36:05.1666667+00:00

I am trying to test different LLMs to see which one best addresses our use case.

I was allocated quota on a Standard_NC24ads_A100_v4 with 24 cores and 220 GB RAM.

I was able to run my first LLM test on that. However, when I go to test other LLMs, they all seem to be available only on different SKUs, even though the Standard_NC24ads_A100_v4 should be able to handle them.

What is the best way to test different LLMs? Do I essentially have to go through the quota process for a different machine SKU for each model I want to test?

Thanks,

Stephen Pillow

Azure AI services

Accepted answer
  Manas Mohanty 2,930 Reputation points Microsoft External Staff
    2025-04-07T15:30:39.68+00:00

    Hi Stephen,

    Not all of these models require a high-end GPU.

    Some are available as serverless APIs or run on Azure-managed compute (the Azure OpenAI models), and some can even run on memory-optimized compute.

    You can filter these out in the model catalog with the "Deployment options = Serverless API" filter; a minimal sketch of calling such a deployment follows below.
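
    For example, once a model is deployed as a serverless API, you can test it from Python with the azure-ai-inference package. This is a minimal sketch, not a definitive implementation; the endpoint URL, key, and test prompt are placeholders you would replace with your own deployment's values.

    ```python
    # pip install azure-ai-inference
    from azure.ai.inference import ChatCompletionsClient
    from azure.ai.inference.models import SystemMessage, UserMessage
    from azure.core.credentials import AzureKeyCredential

    # Placeholder endpoint and key for a serverless (pay-as-you-go)
    # deployment created from the model catalog -- replace with your own.
    client = ChatCompletionsClient(
        endpoint="https://<your-deployment>.<region>.models.ai.azure.com",
        credential=AzureKeyCredential("<your-api-key>"),
    )

    # Send the same test prompt to every candidate model so the
    # results stay comparable across deployments.
    response = client.complete(
        messages=[
            SystemMessage(content="You are a helpful assistant."),
            UserMessage(content="<your use-case test prompt here>"),
        ],
        temperature=0.2,
    )

    print(response.choices[0].message.content)
    ```

    Because the client only needs an endpoint and a key, you can point the same script at each serverless deployment in turn without requesting any GPU quota.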

    Yes, for certain LLMs that require managed compute, you may still need to request GPU quota for the specific SKU their deployment requires, as sketched below.
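
    If you do need managed GPU compute, you can check your current per-region quota before raising a request. A minimal sketch using the azure-mgmt-compute package; the subscription ID and region are placeholders, and the family-name prefix filter is an assumption about how GPU family names appear in your subscription.

    ```python
    # pip install azure-mgmt-compute azure-identity
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.compute import ComputeManagementClient

    # Placeholder subscription ID -- replace with your own.
    client = ComputeManagementClient(
        DefaultAzureCredential(), "<subscription-id>"
    )

    # List vCPU usage vs. limit per VM family in the region, keeping
    # only GPU (NC/ND) families; the prefix check is an assumption
    # about the family-name format.
    for usage in client.usage.list(location="eastus"):
        name = usage.name.value
        if name.startswith(("standardNC", "standardND")):
            print(f"{name}: {usage.current_value}/{usage.limit} vCPUs used")
    ```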

    Reference- https://learn.microsoft.com/en-us/azure/machine-learning/concept-model-catalog?view=azureml-api-2

    Hope this addresses your query.

    If it helped, please accept this answer.

    Thank you.

    1 person found this answer helpful.

0 additional answers

