Is it impossible to use Stable Diffusion with CUDA on the NVadsA10_v5 series?

유진 130 Reputation points
2024-04-24T02:07:43.0833333+00:00

Hello, I recently created a virtual machine with the Standard_NV36adms_A10_v5 product and tried to use Stable Diffusion.

However, running the nvidia-smi command returns a "No devices were found" error.

When I use the optimized image provided by Azure instead, nvidia-smi fails with "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running."

nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

dpkg --list | grep nvidia
ii  libnvidia-cfg1-535-server:amd64                   535.161.08-0ubuntu2.20.04.1       amd64        NVIDIA binary OpenGL/GLX configuration library
ii  libnvidia-common-535-server                       535.161.08-0ubuntu2.20.04.1       all          Shared files used by the NVIDIA libraries
ii  libnvidia-compute-535-server:amd64                535.161.08-0ubuntu2.20.04.1       amd64        NVIDIA libcompute package
ii  libnvidia-compute-535-server:i386                 535.161.08-0ubuntu2.20.04.1       i386         NVIDIA libcompute package
ii  libnvidia-container-tools                         1.15.0-1                          amd64        NVIDIA container runtime library (command-line tools)
ii  libnvidia-container1:amd64                        1.15.0-1                          amd64        NVIDIA container runtime library
ii  libnvidia-decode-535-server:amd64                 535.161.08-0ubuntu2.20.04.1       amd64        NVIDIA Video Decoding runtime libraries
ii  libnvidia-decode-535-server:i386                  535.161.08-0ubuntu2.20.04.1       i386         NVIDIA Video Decoding runtime libraries
ii  libnvidia-encode-535-server:amd64                 535.161.08-0ubuntu2.20.04.1       amd64        NVENC Video Encoding runtime library
ii  libnvidia-encode-535-server:i386                  535.161.08-0ubuntu2.20.04.1       i386         NVENC Video Encoding runtime library
ii  libnvidia-extra-535-server:amd64                  535.161.08-0ubuntu2.20.04.1       amd64        Extra libraries for the NVIDIA Server Driver
ii  libnvidia-fbc1-535-server:amd64                   535.161.08-0ubuntu2.20.04.1       amd64        NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-fbc1-535-server:i386                    535.161.08-0ubuntu2.20.04.1       i386         NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-gl-535-server:amd64                     535.161.08-0ubuntu2.20.04.1       amd64        NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  libnvidia-gl-535-server:i386                      535.161.08-0ubuntu2.20.04.1       i386         NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  linux-modules-nvidia-535-server-5.15.0-1061-azure 5.15.0-1061.70~20.04.1+1          amd64        Linux kernel nvidia modules for version 5.15.0-1061
ii  linux-modules-nvidia-535-server-azure             5.15.0-1061.70~20.04.1+1          amd64        Extra drivers for nvidia-535-server for the azure flavour
ii  linux-objects-nvidia-535-server-5.15.0-1061-azure 5.15.0-1061.70~20.04.1+1          amd64        Linux kernel nvidia modules for version 5.15.0-1061 (objects)
ii  linux-signatures-nvidia-5.15.0-1061-azure         5.15.0-1061.70~20.04.1+1          amd64        Linux kernel signatures for nvidia modules for version 5.15.0-1061-azure
ii  nvidia-compute-utils-535-server                   535.161.08-0ubuntu2.20.04.1       amd64        NVIDIA compute utilities
ii  nvidia-container-runtime                          3.10.0-1                          all          NVIDIA container runtime
ii  nvidia-container-toolkit                          1.10.0-1                          amd64        NVIDIA container runtime hook
ii  nvidia-driver-535-server                          535.161.08-0ubuntu2.20.04.1       amd64        NVIDIA Server Driver metapackage
hi  nvidia-fabricmanager-510                          510.73.08-1                       amd64        Fabric Manager for NVSwitch based systems.
ii  nvidia-firmware-535-server-535.161.08             535.161.08-0ubuntu2.20.04.1       amd64        Firmware files used by the kernel module
ii  nvidia-kernel-common-535-server                   535.161.08-0ubuntu2.20.04.1       amd64        Shared files used with the kernel module
ii  nvidia-kernel-source-535-server                   535.161.08-0ubuntu2.20.04.1       amd64        NVIDIA kernel source package
ii  nvidia-utils-535-server                           535.161.08-0ubuntu2.20.04.1       amd64        NVIDIA Server Driver support binaries
ii  xserver-xorg-video-nvidia-535-server              535.161.08-0ubuntu2.20.04.1       amd64        NVIDIA binary Xorg driver
azureuser@vm-rg-ailab-p-us-a10v5-vram24-003:~$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
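
A listing like the one above can be checked for consistency with a short Python sketch. It assumes the text of `dpkg --list | grep nvidia` is available as a string. The function names (`parse_dpkg_lines`, `driver_branches`, `held_packages`, `module_kernels`) are hypothetical helpers, not part of any NVIDIA or Azure tool. Mixed driver branches, held packages, or a module package built for a different kernel are only possible contributors to the nvidia-smi error, not a confirmed diagnosis for this VM.

```python
import re

def parse_dpkg_lines(text):
    """Return (status, name, version) tuples from `dpkg --list` package rows."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Package rows start with a two-letter dpkg status such as "ii" or "hi".
        if len(parts) >= 3 and re.fullmatch(r"[uihrp][ncHUFWti]", parts[0]):
            rows.append((parts[0], parts[1].split(":")[0], parts[2]))
    return rows

def driver_branches(rows):
    """Map each driver branch in a package name (e.g. '535') to those names."""
    branches = {}
    for _status, name, _version in rows:
        match = re.search(r"-(\d{3})(?:-server)?(?:-|$)", name)
        if match:
            branches.setdefault(match.group(1), []).append(name)
    return branches

def held_packages(rows):
    """Names of packages whose desired state is 'hold' (status starts with 'h')."""
    return [name for status, name, _version in rows if status.startswith("h")]

def module_kernels(rows):
    """Kernel releases that the prebuilt NVIDIA module packages target."""
    pattern = r"linux-modules-nvidia-\d+(?:-server)?-(\d+\.\d+\.\d+-\d+-\w+)"
    kernels = set()
    for _status, name, _version in rows:
        match = re.fullmatch(pattern, name)
        if match:
            kernels.add(match.group(1))
    return kernels

# Three rows copied from the listing above (descriptions shortened).
SAMPLE = """\
ii  linux-modules-nvidia-535-server-5.15.0-1061-azure 5.15.0-1061.70~20.04.1+1 amd64 Linux kernel nvidia modules
ii  nvidia-driver-535-server 535.161.08-0ubuntu2.20.04.1 amd64 NVIDIA Server Driver metapackage
hi  nvidia-fabricmanager-510 510.73.08-1 amd64 Fabric Manager for NVSwitch based systems.
"""

rows = parse_dpkg_lines(SAMPLE)
print("branches:", sorted(driver_branches(rows)))
print("held:", held_packages(rows))
print("module kernels:", module_kernels(rows))
```

On the VM itself, the set returned by `module_kernels` could be compared with `platform.release()`, which reports the running kernel release in the same form as `uname -r`.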

I also tried reinstalling the CUDA driver, but after that the virtual machine would not restart.

While trying to determine the cause, I found that the documentation states that the NVadsA10_v5 series supports only GRID drivers.

Is Stable Diffusion with CUDA impossible to use because the NVadsA10_v5 series supports only the GRID driver?

If so, can Stable Diffusion with CUDA be used on the NCadsA10_v4 series, which is not listed in the documentation?

Azure Virtual Machines
An Azure service that is used to provision Windows and Linux virtual machines.

1 answer

  1. kobulloc-MSFT 23,496 Reputation points Microsoft Employee
    2024-04-24T23:19:35.3166667+00:00

    Hello, @유진 !

    What drivers can I use on NVads A10 v5 VMs? What about A10 VMs?

    I reached out to the virtual machine team and you are correct:

    • For the NVads A10 v5 VMs, only GRID driver version 14.1 (510.73) or higher is supported.
    • For the A10 SKU, Azure uses a GPU-P implementation with a unified driver that supports both CUDA and GRID functionality. We publish the supported driver; once it is installed, skip the driver option when you install the CUDA package.

    This means that you would not use the NVads A10 v5 VM for CUDA-dependent workloads. In theory, the A10 unified driver for other VMs should support CUDA workloads.
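
The "GRID 14.1 (510.73) or higher" requirement can be expressed as a simple version comparison. The following is a minimal Python sketch; `parse_version` and `meets_minimum` are hypothetical helper names, and the sketch assumes the version string uses the dotted form shown by `dpkg --list` in the question. It compares numbers only and cannot tell whether an installed package is actually the GRID build of the driver.

```python
def parse_version(text):
    """Turn '535.161.08' or '535.161.08-0ubuntu2.20.04.1' into (535, 161, 8)."""
    # Drop any packaging suffix after the first hyphen, then split on dots.
    upstream = text.split("-", 1)[0]
    return tuple(int(part) for part in upstream.split("."))

# Minimum driver version quoted in the answer above: GRID 14.1 (510.73).
MIN_GRID = parse_version("510.73")

def meets_minimum(installed, minimum=MIN_GRID):
    """True if the installed version is at least the minimum (numeric compare)."""
    return parse_version(installed) >= minimum

# Version string taken from the package listing in the question.
print(meets_minimum("535.161.08-0ubuntu2.20.04.1"))
```

Tuples are compared element by element, so a three-part version such as 510.73.08 still counts as at least 510.73.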


    I hope this has been helpful! Your feedback is important so please take a moment to accept answers.

    If you still have questions, please let us know what is needed in the comments so the question can be answered. Thank you for helping to improve Microsoft Q&A!
