Why can I only deploy Falcon-7b on VM with minimum 440GB RAM?

Casey 65 Reputation points
2023-07-19T02:40:23.47+00:00

According to the Hugging Face model card, Falcon-7b requires at least 16GB of RAM, suggesting that it can run on 16GB (or close to it).
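As a rough sanity check on that figure (a sketch only: it assumes roughly 7 billion parameters at 2 bytes each for fp16/bf16 weights, and ignores activations, KV cache, and framework overhead):

```python
# Back-of-the-envelope memory estimate for serving Falcon-7B.
# Assumptions: ~7e9 parameters, 2 bytes per parameter (fp16/bf16 weights);
# activations, KV cache, and runtime overhead are ignored.
params = 7_000_000_000
bytes_per_param = 2  # fp16 / bf16
weight_gb = params * bytes_per_param / 1e9
print(f"weights alone: ~{weight_gb:.0f} GB")  # ~14 GB, consistent with the 16 GB figure
```

So the model card's 16 GB number is about the model weights themselves, not about any particular VM's host RAM.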

But when I try to deploy an endpoint for it in ML studio, the smallest available VM has 440 GB of RAM, which seems excessive and unnecessarily expensive. I also don't have quota for a VM of this size, so it's preventing me from deploying a model that, in theory, I should be able to run on a much smaller VM.

How come these are the only available VM sizes?

Azure Machine Learning
An Azure machine learning service for building and deploying models.

1 answer

  1. romungi-MSFT 48,906 Reputation points Microsoft Employee Moderator
    2023-07-19T15:43:23.1333333+00:00

    @Casey Thanks for adding the screenshot. I think in this case it is because of the A100 GPU requirement to run the model. As you might have observed, the Tech Community blog post mentions the same.

    These models need Nvidia A100 GPUs to run. You will need quota for one of the following Azure VM instance types that have the A100 GPU: "Standard_NC48ads_A100_v4", "Standard_NC96ads_A100_v4", "Standard_ND96asr_v4" or "Standard_ND96amsr_A100_v4".

    If this answers your query, do click Accept Answer and Yes for "Was this answer helpful". And if you have any further queries, do let us know.
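    To make the size floor concrete, here is a sketch comparing the four A100 SKUs listed above. The GPU counts and memory figures are assumptions taken from Azure's published VM size specs and may drift; RAM is in GiB, GPU memory in GB.

```python
# Approximate specs (assumed from Azure VM size documentation) for the
# A100-equipped SKUs that Azure ML offers for this model.
skus = {
    "Standard_NC48ads_A100_v4":  {"gpus": 2, "gpu_mem_gb": 80, "ram_gib": 440},
    "Standard_NC96ads_A100_v4":  {"gpus": 4, "gpu_mem_gb": 80, "ram_gib": 880},
    "Standard_ND96asr_v4":       {"gpus": 8, "gpu_mem_gb": 40, "ram_gib": 900},
    "Standard_ND96amsr_A100_v4": {"gpus": 8, "gpu_mem_gb": 80, "ram_gib": 1924},
}

# The studio surfaces the smallest A100-equipped SKU as the minimum
# deployable size -- hence the 440 GB floor the question describes.
smallest = min(skus, key=lambda s: skus[s]["ram_gib"])
print(smallest, skus[smallest]["ram_gib"])  # Standard_NC48ads_A100_v4 440
```

    In other words, the 440 GB is not a requirement of Falcon-7B itself; it is simply the host RAM bundled with the smallest VM in Azure's A100 families, because the GPU requirement rules out every smaller SKU.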

