Install CUDA toolkit and drivers in VM

Question

Luis Vivas 20

I have been trying to install the CUDA toolkit on an N-series VM. I am able to install the drivers, but when I run nvcc --version, it does not work.

2 answers

Answer 1

I am answering my own question here, after Microsoft Azure Support helped me set up the whole machine.

I want to mention that I need this machine to create and test AI models for NLP and other deep learning work (mainly PyTorch). I struggled with this for weeks before I finally got it working.

Choose a VM in the N series. In Azure, for now, the N series are the sizes that have a GPU. Remember that CUDA is an NVIDIA technology, so make sure the size you choose has an NVIDIA GPU and not an AMD one. I used Standard NC8as T4 v3. The C in the name means compute; a V means the size is intended for visualization. If you don't find the size in the list, it could mean that you do not have quota for it. Azure and the other cloud providers are very restrictive with GPUs, I assume due to the chip shortage.
Select an OS. I used Ubuntu 22.04. It is important to set Security type to Standard: Azure recently made Trusted launch the default, and it can cause problems when you add extensions such as the NVIDIA one, or with the driver you are going to install.
Once the VM is running, install GCC: sudo apt-get install gcc
Then install make: sudo apt-get install make
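Before starting the installer, it can help to confirm that both tools are really on the PATH. The following is only a minimal sketch; missing_tools is a hypothetical helper name chosen for this example, not part of any NVIDIA or Azure tooling.

```shell
# Print the names of the given commands that cannot be found on PATH.
# missing_tools is a hypothetical helper for this example.
missing_tools() {
    missing=""
    for tool in "$@"; do
        # command -v exits non-zero when the command is not found
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Drop the leading space before printing
    echo "${missing# }"
}

need="$(missing_tools gcc make)"
if [ -n "$need" ]; then
    echo "Missing: $need (install with: sudo apt-get install $need)"
else
    echo "gcc and make are both available"
fi
```

If the message lists a missing tool, install it with apt-get as described above and run the check again.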
Then go to the CUDA Toolkit download page: https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=22.04&target_type=runfile_local
I recommend that you choose the local runfile installer. In other forums I saw the network installer recommended, but due to an update it was not working well. Please note that this is CUDA 12; by the time you read this there might be a newer version, which should work as well (hopefully).
Run the commands shown on that page. In this case:
wget https://developer.download.nvidia.com/compute/cuda/12.2.2/local_installers/cuda_12.2.2_535.104.05_linux.run
sudo sh cuda_12.2.2_535.104.05_linux.run
The installer should ask you to accept the terms and then ask what to install. You do not need to change anything: with the defaults you are installing driver 535.104.05 and toolkit 12.2.2.
Now you should be able to see the driver. Remember that the driver and the toolkit are different things, but the toolkit installer you are running also installs the driver.
If you type nvidia-smi, you should see the driver version and the CUDA version it supports. If you run anything that uses the GPU, the tasks will be listed and the memory usage will appear.
Now if you type nvcc --version, you should see the toolkit version. In my case this did not work right away, and I had to change one thing.
The CUDA toolkit was installed but could not be found. If you go to the folder /usr/local/ you will find the folder cuda-12.2 (the installer normally also creates /usr/local/cuda as a link to it), so we need to edit the bash file in /etc: run sudo nano /etc/bash.bashrc and add this as the last line: export PATH=$PATH:/usr/local/cuda/bin
Save the file, reboot, then close the SSH session and log in again.
If you run echo $PATH, the output should now include /usr/local/cuda/bin. What we did was tell Ubuntu to search in that folder as well.
Now if we type which nvcc, we should get /usr/local/cuda/bin/nvcc
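Because /etc/bash.bashrc is read by every interactive shell, a plain export line appends the folder again each time a shell is started from another shell. A guarded variant, sketched below, adds it only once. path_append is a hypothetical helper name used for this example; the folder is the same /usr/local/cuda/bin as above.

```shell
# Append a directory to PATH only if it is not already listed.
# path_append is a hypothetical helper for this example.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, nothing to do
        *) PATH="$PATH:$1" ;;     # not present, append it
    esac
}

path_append /usr/local/cuda/bin
export PATH
```

These lines could go at the end of /etc/bash.bashrc in place of the single export line; the effect on nvcc lookup is the same.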
Now nvcc --version should print something like this:

Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:02:13_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0
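If you want to use the toolkit version in a script (for example, to pick a matching PyTorch build), the release number can be pulled out of that output. This is a small sketch that assumes the "release X.Y" wording shown above; nvcc_release is a hypothetical helper name.

```shell
# Read `nvcc --version` output on stdin and print the release number.
# nvcc_release is a hypothetical helper for this example; it relies on
# the "release X.Y" wording in the output shown above.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Only call nvcc when it is actually on PATH, so the snippet is
# harmless on a machine without the toolkit.
if command -v nvcc >/dev/null 2>&1; then
    nvcc --version | nvcc_release
fi
```

If nothing is printed, nvcc is either not on the PATH or its output format differs from the one assumed here.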

I hope this saves you time when working with Azure and GPUs; it took me 2 weeks to solve. I hope new versions do not break this solution, and that in the future they make an image that has everything at once so that we data scientists do not have to deal with this.

Nikhil GN 5 Reputation points

2024-02-28T19:33:04.3066667+00:00

This was a savior honestly. Had been struggling for the past 4 days to install CUDA toolkit in an N series VM. Followed your instructions and now it's working fine.
kobulloc-MSFT 26,801 Reputation points Microsoft Employee Moderator

2024-02-28T22:25:31.55+00:00

Happy to hear that this has helped!

Answer 2

Hello, @Luis Vivas!

Thank you very much for following up with the process to install the CUDA toolkit on an N series VM. As you mentioned, this is a popular subject for data scientists, and I know others will find this write-up quite valuable.

I've upvoted your post, but since there is currently a limitation in Microsoft Q&A that "The question author cannot accept their own answer. They can only accept answers by others", I'll repost your solution in case you'd like to "Accept" the answer for additional visibility.
