DevAzure-7832 asked DevAzure-7832 edited

Azure Tesla V100 Driver Problem

Hello

I wish you a happy day.


There is a driver problem with virtual machine instances on Azure. The Tesla V100 graphics cards cannot be set as the default graphics device. Detailed information is attached.

84146-image.png


84108-image.png


azure-virtual-machines

OlgaOS-msft answered OlgaOS-msft edited

@DevAzure-7832 Apologies for the delay in responding and for the inconvenience this issue has caused.

Please correct me if I am wrong: you are not able to install the NVIDIA Tesla V100 driver on an N-series Azure VM. Could you please share your VM SKU and OS? What steps are you following in your setup?

I have looked into several documents. According to them, the driver is installed when you install the NVIDIA GPU Driver Extension. I am not sure whether you have gone through these documents before:

For example, for the NCv3-series:

To take advantage of the GPU capabilities of Azure N-series VMs, NVIDIA GPU drivers must be installed.

The NVIDIA GPU Driver Extension installs appropriate NVIDIA CUDA or GRID drivers on an N-series VM. Install or manage the extension using the Azure portal or tools such as Azure PowerShell or Azure Resource Manager templates. See the NVIDIA GPU Driver Extension documentation for supported operating systems and deployment steps. For general information about VM extensions, see Azure virtual machine extensions and features.
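For illustration only, here is a rough Python sketch that builds the kind of extension resource an Azure Resource Manager template might contain for a Windows VM. The publisher and type names are taken from the extension documentation as I understand it. The `typeHandlerVersion` and `apiVersion` values, the VM name, and the location are assumptions, so please check them against the current docs before using anything like this.

```python
import json


def nvidia_extension_resource(vm_name, location, handler_version="1.3"):
    """Build an ARM template resource for the Windows NVIDIA GPU Driver Extension.

    vm_name and location are placeholders supplied by the caller.
    handler_version and apiVersion are assumed values, not verified ones.
    """
    return {
        "name": f"{vm_name}/NvidiaGpuDriverWindows",
        "type": "Microsoft.Compute/virtualMachines/extensions",
        "apiVersion": "2015-06-15",  # assumption: check the current docs
        "location": location,
        "properties": {
            "publisher": "Microsoft.HpcCompute",
            "type": "NvidiaGpuDriverWindows",
            "typeHandlerVersion": handler_version,  # assumption
            "autoUpgradeMinorVersion": True,
            "settings": {},
        },
    }


if __name__ == "__main__":
    # Hypothetical VM name and region, used only to show the output shape.
    resource = nvidia_extension_resource("my-nc24s-vm", "westeurope")
    print(json.dumps(resource, indent=2))
```

The printed JSON could be pasted into the `resources` section of a template, next to the VM it extends.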

If you choose to install NVIDIA GPU drivers manually, see N-series GPU driver setup for Windows or N-series GPU driver setup for Linux for supported operating systems, drivers, installation, and verification steps.

NVIDIA GPU Driver Extension for Windows
NVIDIA GPU Driver Extension for Linux
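As a quick way to run the verification step from a script, something like the Python sketch below could be used. It assumes the `nvidia-smi` tool that ships with the NVIDIA driver is on the PATH (on Windows it may live in a driver folder that is not on the PATH by default), and the helper names are made up for this example.

```python
import shutil
import subprocess


def parse_gpu_csv(text):
    """Parse 'name, driver_version' lines as printed by nvidia-smi in CSV mode."""
    gpus = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _, driver = line.partition(",")
        gpus.append({"name": name.strip(), "driver_version": driver.strip()})
    return gpus


def query_gpus():
    """Return the GPUs reported by nvidia-smi, or None if it is missing or fails."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    gpus = query_gpus()
    if gpus is None:
        print("nvidia-smi not found or failed; the driver may not be installed")
    else:
        for gpu in gpus:
            print(gpu["name"], gpu["driver_version"])
```

If the driver is installed correctly, each GPU should be listed with its driver version. An empty or missing result suggests the driver did not load.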

Hope it helps!!!

DevAzure-7832 answered DevAzure-7832 edited

Hello @olgaoos, thank you very much. How are you?

The VM size I use is as follows:

NC24s v3 (4x Tesla V100 GPUs), OS: Windows 10

I have tried installing the drivers two ways.

1) I downloaded the drivers from the official NVIDIA site. It did not work.

2) There is an "Extensions" menu in the Azure portal. I installed the drivers through this menu. It did not work either.

The devices show up fine, but when I try to set one as the default graphics device I get the error shown above. As a result, I cannot use the Tesla V100 devices at full performance.

I will be grateful if you help me.

Yours sincerely.


@DevAzure-7832 I haven't confirmed this yet, but I have one guess. From the docs, I can see that the "NVIDIA GPU Driver Extension" is supported on Windows 10. However, I don't see Windows 10 in the list under "Supported operating systems and drivers | NVIDIA Tesla (CUDA) drivers".
84542-image.png



I think I explained this in a complicated way. Sorry.

I installed the drivers without any problems. However, the GPU cannot be set as the default graphics device.

Does that make sense?

84562-image.png


84437-image.png



Yes. I understand that you were able to install the drivers, but the "verification" says it's not supported. I could be wrong and I am still not 100% confident. My point is that this driver may not be supported on Windows 10, even though you are still able to install it. Based on the public page I shared before, NVIDIA Tesla (CUDA) drivers are only supported on Windows Server 2019 and Windows Server 2016.

I am referring to this table:

NVIDIA Tesla (CUDA) drivers for NC, NCv2, NCv3, NCasT4_v3, ND, and NDv2-series VMs (optional for NV-series) are supported only on the operating systems listed in the following table:

OS                    Driver
Windows Server 2019   451.82 (.exe)
Windows Server 2016   451.82 (.exe)
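To make the point concrete, the table can be written as a small lookup in Python. This is only a sketch of the check I am describing: the dictionary copies the two rows quoted above, and the function name is made up for this example.

```python
# Rows copied from the quoted table: OS name -> supported CUDA driver version.
SUPPORTED_CUDA_DRIVERS = {
    "Windows Server 2019": "451.82",
    "Windows Server 2016": "451.82",
}


def cuda_driver_for(os_name):
    """Return the listed driver version for os_name, or None if the OS is not listed."""
    return SUPPORTED_CUDA_DRIVERS.get(os_name.strip())


if __name__ == "__main__":
    for os_name in ("Windows Server 2019", "Windows 10"):
        driver = cuda_driver_for(os_name)
        status = f"driver {driver}" if driver else "not listed as supported"
        print(f"{os_name}: {status}")
```

Windows 10 has no entry in the table, so the lookup returns nothing for it.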





Hello @DevAzure-7832.

I was able to confirm internally that Windows 10 is not among the Windows operating systems that support CUDA in Azure.

Does that answer your query?

Thanks!

