Hi Tom Liu,
Welcome to the Microsoft Q&A Platform. Thank you for posting your query here.
If you're running your embedding model on Azure VMs, first check whether your current VM size includes a GPU. Standard A-series and D-series VMs have no GPU, so if you are on one of those you'll need to resize the VM to a GPU-enabled size (e.g., NC-series, ND-series).
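As a quick way to apply the check above, here is a minimal sketch that guesses from the size name alone whether a VM size is GPU-enabled. The function name `is_gpu_size` and the prefix list are my own assumptions based on the usual naming convention (NC, ND, NV, NG families); it is only a heuristic, and the sizes documentation linked below remains the authoritative source.

```python
import re

# Assumed list of GPU family prefixes; NP-series is left out on purpose
# because it is FPGA-based rather than GPU-based.
_GPU_FAMILY = re.compile(r"^Standard_(NC|ND|NV|NG)", re.IGNORECASE)


def is_gpu_size(vm_size: str) -> bool:
    """Heuristic: True if the size name looks like a GPU-enabled family."""
    return bool(_GPU_FAMILY.match(vm_size.strip()))


if __name__ == "__main__":
    for size in ("Standard_D4s_v3", "Standard_NC6s_v3", "Standard_ND40rs_v2"):
        print(size, "-> GPU family" if is_gpu_size(size) else "-> no GPU")
```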
For more information on choosing the right SKU, you can use the following resources:
- Sizes for VMs in Azure: This article lists all the VM sizes available in Azure.
- Azure VM Selector: This tool helps you find the right VM SKU based on your workload type, OS and software, and deployment region.
- GPU optimized VM sizes: specialized virtual machines available with single, multiple, or fractional GPUs. These sizes are designed for compute-intensive, graphics-intensive, and visualization workloads. See https://learn.microsoft.com/en-us/azure/virtual-machines/sizes/overview?toc=%2Fazure%2Fvirtual-machines%2Fwindows%2Ftoc.json&tabs=breakdownseries%2Cgeneralsizelist%2Ccomputesizelist%2Cmemorysizelist%2Cstoragesizelist%2Cgpusizelist%2Cfpgasizelist%2Chpcsizelist#gpu-accelerated
- NVIDIA GPU drivers: to take advantage of the GPU capabilities of Azure N-series VMs backed by NVIDIA GPUs, you must install NVIDIA GPU drivers. See https://learn.microsoft.com/en-us/azure/virtual-machines/windows/n-series-driver-setup
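Once the drivers are installed, you may want to confirm that the VM actually sees the GPU. The sketch below is one possible check, assuming the NVIDIA driver has put the `nvidia-smi` tool on the PATH; the helper name `list_gpus` is hypothetical. It returns an empty list when no driver or GPU is found.

```python
import shutil
import subprocess


def list_gpus() -> list[str]:
    """Return the GPU names reported by nvidia-smi, or [] if none are visible."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # Driver tools not installed (or not on PATH).
        return []
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=30,
            check=True,
        )
    except (subprocess.SubprocessError, OSError):
        # nvidia-smi exists but failed, e.g. no GPU attached to this VM size.
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = list_gpus()
    print(gpus if gpus else "No NVIDIA GPU detected")
```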
If you are using Azure Machine Learning to train and deploy your model, you can create a GPU compute cluster in AML that automatically scales the number of nodes based on your processing requirements. In Azure Machine Learning studio, go to Compute → Compute Clusters, click + New to create a new compute cluster, and select a GPU-enabled VM size like Standard_NC6 or Standard_ND24s (the sizes offered depend on your region and quota).
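If you prefer to script the cluster instead of using the portal, the following is a rough sketch using the Azure ML Python SDK v2 (`azure-ai-ml` and `azure-identity` packages). The cluster name, node limits, idle timeout, and the helper names `cluster_settings` and `create_cluster` are illustrative assumptions, not values from your environment; substitute your own subscription, resource group, workspace, and a VM size that your region and quota allow.

```python
def cluster_settings(size: str, name: str = "gpu-cluster", max_nodes: int = 4) -> dict:
    """Build keyword arguments for an autoscaling AmlCompute cluster."""
    return {
        "name": name,                         # hypothetical cluster name
        "size": size,                         # a GPU-enabled VM size
        "min_instances": 0,                   # scale to zero when idle
        "max_instances": max_nodes,           # upper bound for autoscaling
        "idle_time_before_scale_down": 120,   # seconds before idle nodes are released
    }


def create_cluster(subscription_id: str, resource_group: str, workspace: str, settings: dict):
    """Create or update the cluster; requires azure-ai-ml and azure-identity."""
    # Imported lazily so the settings helper works without the SDK installed.
    from azure.ai.ml import MLClient
    from azure.ai.ml.entities import AmlCompute
    from azure.identity import DefaultAzureCredential

    client = MLClient(
        DefaultAzureCredential(),
        subscription_id=subscription_id,
        resource_group_name=resource_group,
        workspace_name=workspace,
    )
    poller = client.compute.begin_create_or_update(AmlCompute(**settings))
    return poller.result()  # blocks until provisioning finishes
```

With `min_instances` set to 0, the cluster releases all nodes when no jobs are queued, so you are only billed for GPU nodes while work is running.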
When running the model, ensure that your code is configured to use the GPU, and make sure you install the GPU (CUDA) versions of your deep learning libraries. TensorFlow places operations on the GPU automatically if one is available, whereas PyTorch requires you to move the model and its input tensors to the GPU explicitly. For more on compute targets in Azure Machine Learning, see:
https://learn.microsoft.com/en-us/azure/machine-learning/concept-compute-target?view=azureml-api-2
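To illustrate the point about configuring your code for the GPU, here is a minimal PyTorch-oriented sketch. The helper names `pick_device` and `embed` are hypothetical, and `model` stands for whatever embedding model you load; the code falls back to the CPU when PyTorch or a CUDA device is not available.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def embed(model, batch):
    """Run one forward pass on the selected device (requires PyTorch).

    `model` is assumed to be a torch.nn.Module and `batch` a torch.Tensor.
    """
    import torch

    device = pick_device()
    model = model.to(device).eval()   # move the weights to the device
    with torch.no_grad():             # inference only, no gradients needed
        return model(batch.to(device)).cpu()


if __name__ == "__main__":
    print("Running on:", pick_device())
```

If this prints `cpu` on a GPU VM, the usual causes are a CPU-only build of the library or missing NVIDIA drivers.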
Hope this helps!
Let me know if you have any further queries!
If the information provided is helpful to you, please click "Upvote" on the post to let us know.