After installing the NVIDIA driver on an NV12ads A10 VM and rebooting, the VM is inaccessible

塚田 真輝 10 Reputation points
2024-10-12T06:32:33.26+00:00
  • First, check that the GPU is attached.
azureuser@i-vm-a10:~$ lspci | grep -i NVIDIA
0002:00:00.0 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
  • Install the ubuntu-drivers utility.
sudo apt update && sudo apt install -y ubuntu-drivers-common
  • Install nvidia-driver-550.
azureuser@i-vm-a10:~$ sudo apt install -y nvidia-driver-550
  • Reboot.
sudo reboot
  • Try to access the VM via SSH. The connection fails:
channel 0: open failed: connect failed: Connection refused
stdio forwarding failed
Connection closed by UNKNOWN port 65535

The following two base images were tried.

  • NVIDIA GPU-Optimized VMI with vGPU driver
  • Ubuntu Server 22.04 LTS

Before the driver is installed, the VM can be rebooted and accessed again without any problem.
Azure Virtual Machines
An Azure service that is used to provision Windows and Linux virtual machines.

1 answer

  1. Nikhil Duserla 8,100 Reputation points Microsoft External Staff Moderator
    2024-10-14T15:20:46.5966667+00:00

    Hi @塚田 真輝,

    Welcome to the Microsoft Q&A Platform! Thank you for your question and for providing the information.

    We understand from your query that you are experiencing an issue when running nvidia-smi:

    $ nvidia-smi
    NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

    At this point, you need to check whether the NVIDIA driver is installed correctly, or update it to the latest version.

    Azure NVads A10 v5 VMs support only GRID 14.1 (510.73) or higher driver versions. The vGPU driver for the A10 SKU is a unified driver that supports both graphics and compute workloads. Please refer to: https://learn.microsoft.com/en-us/azure/virtual-machines/linux/n-series-driver-setup#supported-distributions-and-drivers
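
The minimum-version rule above can be checked mechanically once the VM is reachable again. The following is a minimal sketch and not part of the original answer. The function name meets_min_driver is hypothetical, the script assumes GNU sort (for the -V version ordering), and the version string 550.90.07 is only an example input. It compares version numbers only; it does not tell you whether the installed package is the vGPU (GRID) driver rather than a distribution driver such as nvidia-driver-550.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) if version $1 is >= minimum $2.
# Assumes GNU sort, whose -V flag orders dotted version strings.
meets_min_driver() {
    local installed="$1" minimum="$2"
    # sort -C exits 0 only if its input is already in sorted order,
    # i.e. minimum <= installed under version ordering.
    printf '%s\n%s\n' "$minimum" "$installed" | sort -V -C
}

# Example with literal version strings (510.73 is the minimum quoted above).
if meets_min_driver "550.90.07" "510.73"; then
    echo "driver version meets the 510.73 minimum"
else
    echo "driver version is below the 510.73 minimum"
fi
```

On a VM where the driver responds, the installed version could be read with `nvidia-smi --query-gpu=driver_version --format=csv,noheader` and passed as the first argument.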

    Uninstall the vGPU driver to start from a clean slate, then reinstall it using the correct installation method for your system. Make sure to follow the official NVIDIA documentation for installing the driver on Ubuntu Server 22.04 LTS: https://docs.nvidia.com/datacenter/tesla/driver-installation-guide/index.html
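
Because the driver in the question was installed with apt, one way to read "start from a clean slate" is to find the NVIDIA packages that apt put in place before removing them. This is a sketch under that assumption, not the procedure from the NVIDIA guide. The function name filter_nvidia_packages is hypothetical, and matching every package whose name contains "nvidia" is a deliberately broad choice, so the resulting list should be reviewed before anything is removed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read package names on stdin (one per line) and
# print those whose name contains "nvidia", case-insensitively.
filter_nvidia_packages() {
    grep -i 'nvidia' || true   # no match is not an error here
}

# List candidates only; nothing is removed by this block.
if command -v dpkg-query >/dev/null 2>&1; then
    dpkg-query -W -f='${Package}\n' | filter_nvidia_packages
fi
```

After reviewing the list, the packages could be removed with `sudo apt-get purge -y` followed by the package names, and then `sudo apt-get autoremove -y` to drop dependencies that are no longer needed.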

    If you have any further queries, do let us know.

    1 person found this answer helpful.
