Hi,
Thanks for reaching out to Microsoft Q&A.
- Choose a GPU-enabled Azure service (VM, AKS, AML, ACI)
- Deploy your API on a GPU-enabled VM/container
- Modify your ML code to leverage GPU (PyTorch, TensorFlow, etc.)
- Test and monitor GPU usage with `nvidia-smi`
- Enable auto-scaling for high-traffic workloads
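As a minimal sketch of the "modify your ML code" step above (assuming PyTorch and a hypothetical 768-dimensional embedding model; the real model and dimensions will differ), the usual pattern is to detect a GPU, move both the model and the inputs to it, and fall back to CPU when none is present:

```python
import torch

# Use the GPU when one is available; otherwise fall back to CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for an embedding model (768-dim, BERT-style)
model = torch.nn.Linear(768, 768).to(device)
model.eval()

# The input batch must live on the same device as the model
batch = torch.randn(4, 768, device=device)
with torch.no_grad():
    embeddings = model(batch)

print(embeddings.shape)  # one embedding per input row
```

The same pattern applies to real embedding libraries (e.g. sentence-transformers accepts a `device` argument); the key point is that both model weights and input tensors must be on the GPU for it to be used.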
Since you're running an embedding model via an API, you need to ensure that your service supports GPU acceleration. Here are common options:
- Azure Machine Learning (AML) Compute Instance or Cluster (for ML workloads)
- Azure Kubernetes Service (AKS) with GPU nodes (for scalable APIs)
- Azure Virtual Machines (VMs) with GPU (for dedicated model inference)
- Azure Container Instances (ACI) with GPU (for lightweight containerized inference)
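For the dedicated-VM option, a GPU VM can be provisioned from the Azure CLI. A hedged sketch (the resource group and VM name below are placeholders to adapt, and `Standard_NC6s_v3` is just one V100-backed NC-series size):

```shell
# Sketch only: resource names are placeholders, not a tested deployment
az vm create \
  --resource-group my-embeddings-rg \
  --name embed-gpu-vm \
  --image Ubuntu2204 \
  --size Standard_NC6s_v3 \
  --admin-username azureuser \
  --generate-ssh-keys

# Install the NVIDIA driver via the GPU driver VM extension
az vm extension set \
  --resource-group my-embeddings-rg \
  --vm-name embed-gpu-vm \
  --name NvidiaGpuDriverLinux \
  --publisher Microsoft.HpcCompute
```

Note that GPU sizes require available quota in your region; you may need to request a quota increase for the NC/ND families first.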
Choose a GPU-Enabled VM
Azure provides several GPU VM series optimized for ML workloads; choose one based on your needs:
| GPU VM Series | GPU Type | Use Case |
|---|---|---|
| NC-series | NVIDIA Tesla K80/V100 | Deep learning, training |
| ND-series | NVIDIA Tesla P40/P100 | AI/ML, training |
| NV-series | NVIDIA Tesla M60 | Graphics, inference |
| ND A100 v4 | NVIDIA A100 | High-performance AI |
In my view, for embedding model inference, the NC, ND, or ND A100 series should be optimal.
Please feel free to click the 'Upvote' (Thumbs-up) button and 'Accept as Answer'. This helps the community by allowing others with similar queries to easily find the solution.