Nvidia driver too old error when loading bart model onto CUDA, works on other models

matsuo_basho 30 Reputation points
2023-10-12T15:08:52.7033333+00:00

I'm getting an error loading a Hugging Face model on an AzureML GPU compute instance (using AzureML notebooks). Loading other models works, such as the first one in the example below (the code input here is really buggy; I gave up trying to format it as a code block after 10 minutes. And this is the company revolutionizing our world with AI, lol):

```python
from transformers import AutoModelForCausalLM

device = "cuda"

# this works!!
checkpoint1 = "Salesforce/codegen-350M-mono"
codegen = AutoModelForCausalLM.from_pretrained(checkpoint1, trust_remote_code=True).to(device)

# this doesn't
checkpoint2 = "facebook/bart-large"
bart = AutoModelForCausalLM.from_pretrained(checkpoint2, trust_remote_code=True).to(device)
```

Error:

```
RuntimeError: The NVIDIA driver on your system is too old (found version 11040). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver.
```

I understand the driver and PyTorch versions aren't compatible, but why does it work for the codegen model? Is there something about this particular BART model? It seems like I shouldn't have to reinstall CUDA drivers to get this to work.
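For reference, the integer in the error encodes the driver's supported CUDA version: 11040 is CUDA 11.4, while the default torch 2.1.0 wheel is built against CUDA 12.1 (visible in the `-cu12` package names below). A minimal sketch of the decoding and the compatibility rule (the helper names are my own illustration, not a PyTorch API, and the rule assumes CUDA's minor-version compatibility within a major release):

```python
# Decode the driver version integer from the PyTorch error message and
# check whether a torch wheel's CUDA build can load on that driver.
# (Illustrative helpers, not a PyTorch API.)

def decode_driver_version(v: int) -> tuple:
    """11040 -> (11, 4): CUDA major/minor supported by the driver."""
    return v // 1000, (v % 1000) // 10

def wheel_can_load(driver_version: int, wheel_cuda: str) -> bool:
    """A wheel built for CUDA x.y loads if the driver's major version is >= x
    (minor-version compatibility covers differing minors within a major)."""
    driver_major, _ = decode_driver_version(driver_version)
    wheel_major = int(wheel_cuda.split(".")[0])
    return wheel_major <= driver_major

print(decode_driver_version(11040))   # (11, 4)
print(wheel_can_load(11040, "12.1"))  # False: the default torch 2.1.0 wheel
print(wheel_can_load(11040, "11.8"))  # True: the cu118 wheel
```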

Relevant libraries:

```
transformers              4.34.0
torch                     2.1.0
nvidia-cublas-cu12        12.1.3.1
nvidia-cuda-cupti-cu12    12.1.105
nvidia-cuda-nvrtc-cu12    12.1.105
nvidia-cuda-runtime-cu12  12.1.105
nvidia-cudnn-cu12         8.9.2.26
nvidia-cufft-cu12         11.0.2.54
nvidia-curand-cu12        10.3.2.106
nvidia-cusolver-cu12      11.4.5.107
nvidia-cusparse-cu12      12.1.0.106
nvidia-nccl-cu12          2.18.1
nvidia-nvjitlink-cu12     12.2.140
nvidia-nvtx-cu12          12.1.105
```
1 answer

  1. Ramr-msft 17,826 Reputation points
    2023-10-26T04:08:50.96+00:00

    Thanks for the details. Installing PyTorch through the transformers extras is probably not the best way to get a compatible torch build into your environment. Based on the CUDA driver in your base image, try installing torch the recommended way from https://pytorch.org/, choosing the wheel that matches your driver's CUDA version. That should work fine with transformers, which only pins a lower bound on torch. Alternatively, you can use a more recent CUDA base image from nvcr.io if you have the option to specify one.
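    As a concrete sketch of that suggestion: with a driver that reports CUDA 11.4 (the 11040 in the error), the default torch 2.1.0 wheel targets CUDA 12.1 and will not load, but the CUDA 11.8 wheel should, via CUDA's minor-version compatibility. The exact command is worth verifying against the selector on pytorch.org for your setup:

```shell
# Replace the default (CUDA 12.1) build of torch with the CUDA 11.8
# build, which an 11.4-era driver can load.
pip uninstall -y torch
pip install torch==2.1.0 --index-url https://download.pytorch.org/whl/cu118
```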

