I am using an NCasT4 GPU (Tesla T4) and I am encountering the same issue.
NVIDIA-SMI couldn't communicate with the NVIDIA driver
Hi, I have run into a problem installing the NVIDIA driver.
My server is an Azure VM of size Standard_NC48ads_A100_v4, with no NVIDIA driver installed.
So I followed the steps in https://learn.microsoft.com/zh-cn/azure/virtual-machines/linux/n-series-driver-setup.
But after downloading and installing the driver, I ran "nvidia-smi" and received:
"NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running."
Can you help us with this problem, or with installing the NVIDIA driver?
Some information about my server follows:
lspci | grep -i NVIDIA
0001:00:00.0 3D controller: NVIDIA Corporation Device 20b5 (rev a1)
0002:00:00.0 3D controller: NVIDIA Corporation Device 20b5 (rev a1)
lsmod | grep
(empty)
dkms status
nvidia,535.54.03,5.15.0-1040-azure, x86_64: installed
nvidia,535.54.03,5.15.0-1041-azure, x86_64: installed
mokutil --sb-state
SecureBoot enabled
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
uname -a
Linux ai 5.15.0-1041-azure #48~20.04.1-Ubuntu SMP Wed Jun 21 15:03:04 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
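
In case anyone wants to compare against their own VM, the checks above can be gathered in one pass. This is only a rough sketch: `run_check` is a hypothetical helper name, and the `nvidia` pattern passed to `grep` after `lsmod` is an assumption, since the pattern is missing from the command as pasted above. Each command is skipped when the tool is not installed, so the script should not abort partway.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run one diagnostic command line if its tool exists.
# $1 = tool name to look up, $2 = full command line to run.
run_check() {
    local tool="$1" cmd="$2"
    echo "== $cmd =="
    if command -v "$tool" >/dev/null 2>&1; then
        # Report a non-zero exit status instead of stopping the script.
        bash -c "$cmd" 2>&1 || echo "(exit status $?)"
    else
        echo "($tool not found)"
    fi
}

run_check lspci      'lspci | grep -i nvidia'
run_check lsmod      'lsmod | grep nvidia'   # pattern is an assumption
run_check dkms       'dkms status'
run_check mokutil    'mokutil --sb-state'
run_check nvcc       'nvcc --version'
run_check uname      'uname -a'
run_check nvidia-smi 'nvidia-smi'
```

An "(exit status 1)" line after a `grep` check only means that `grep` found no matching line, which is what the empty `lsmod` result above corresponds to.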