How to run TensorFlow 2.12 in Azure ML studio on a GPU compute

Question

Peter Tribout 25

When I call "model.fit(...)" with TensorFlow 2.12 on a GPU compute in Azure ML studio, I get errors related to the NVIDIA CUDA drivers.

The provided kernel "Python 38 Tensorflow Pytorch" has:

TensorFlow 2.12 -> OK
CUDA driver 11.4 (per nvidia-smi) -> not OK, this needs to be 12.0

When I run the same notebook on Colab, everything works fine.

On Stack Overflow, the advice is either to downgrade to TF 2.4 (not OK because it is too old) or to upgrade the CUDA drivers (I tried but did not succeed).

Are there other GPU compute architectures that support TF 2.12, or do you have any other advice?

thx Peter

Accepted answer

Answer 1

@Peter Tribout

Welcome to Q&A and thank you for posting your questions here.

You asked how to run TensorFlow 2.12 in Azure ML studio on a GPU compute, and whether other GPU compute architectures support TF 2.12.

To answer your question: to run TensorFlow 2.12 on a GPU compute in Azure ML studio, you need to make sure that the CUDA driver version is 12.0. The provided kernel “Python 38 Tensorflow Pytorch” has TensorFlow 2.12 and CUDA driver 11.4, which is not compatible with TensorFlow 2.12.
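Before changing anything, it may help to confirm which versions are actually in play. The version that nvidia-smi prints is the highest CUDA version the installed driver supports, while TensorFlow reports the CUDA version it was built against. The following is a minimal sketch that prints both, assuming nvidia-smi is on the PATH of the compute instance and TensorFlow is importable; the helper names are only illustrative.

```python
import re
import shutil
import subprocess


def parse_driver_cuda_version(nvidia_smi_output):
    """Extract the 'CUDA Version' field from nvidia-smi header text.

    Returns a (major, minor) tuple, or None if the field is not present.
    """
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def driver_cuda_version():
    """Run nvidia-smi (if available) and return the CUDA version it reports."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        result = subprocess.run(
            ["nvidia-smi"], capture_output=True, text=True, check=False, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return parse_driver_cuda_version(result.stdout)


def tensorflow_build_cuda_version():
    """Return the CUDA version TensorFlow was built against, if known."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # get_build_info() returns a dict; the 'cuda_version' key is only
    # present on GPU builds, so use .get() to tolerate CPU-only wheels.
    return tf.sysconfig.get_build_info().get("cuda_version")


if __name__ == "__main__":
    print("Driver supports CUDA up to:", driver_cuda_version())
    print("TensorFlow built against CUDA:", tensorflow_build_cuda_version())
```

If either value prints as None, the corresponding tool was not found in the active kernel, which is worth ruling out first.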

You can follow the instructions provided in this Microsoft Learn article to update the CUDA driver version to 12.0.

You can also learn how to train and deploy a TensorFlow model using Azure Machine Learning Python SDK v2 in this Microsoft Learn article.

If you want to learn more about distributed training with Azure Machine Learning SDK (v2) supported frameworks such as TensorFlow, you can check out this Microsoft Learn article.

As for other GPUs, I’m not sure which other GPU compute architectures support TensorFlow 2.12. However, TensorFlow 2.12 can run on a single GPU with no code changes required.

You can use tf.config.list_physical_devices('GPU') to confirm that TensorFlow can see the GPU.
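As a rough sketch, the check could look like this in a notebook cell. The function name is just for illustration, and an empty list would suggest that TensorFlow could not load the GPU libraries in the current kernel.

```python
def visible_gpu_names():
    """Return the names of the GPUs TensorFlow can see, or None if
    TensorFlow is not installed in this environment."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    gpus = tf.config.list_physical_devices("GPU")
    return [gpu.name for gpu in gpus]


names = visible_gpu_names()
if names is None:
    print("TensorFlow is not installed in this kernel.")
elif not names:
    print("TensorFlow is installed but sees no GPU.")
else:
    print("TensorFlow sees these GPUs:", names)
```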

The simplest way to run on multiple GPUs, on one or many machines, is to use Distribution Strategies. You can read more at the links below:

https://www.tensorflow.org/guide/gpu

https://docs.nvidia.com/deeplearning/frameworks/tensorflow-release-notes/rel-22-12.html
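To illustrate the Distribution Strategies point, here is a minimal sketch using tf.distribute.MirroredStrategy on a single machine. The model and the random data are placeholders, not your notebook's code; the only change relative to ordinary Keras code is that the model is built and compiled inside strategy.scope(). On a machine without GPUs the strategy falls back to a single replica.

```python
def fit_with_mirrored_strategy():
    """Train a tiny placeholder model under MirroredStrategy.

    Returns (number of replicas, list of per-epoch losses), or None if
    TensorFlow or NumPy is not installed.
    """
    try:
        import numpy as np
        import tensorflow as tf
    except ImportError:
        return None

    # Mirrors variables across all GPUs visible on this machine.
    strategy = tf.distribute.MirroredStrategy()

    with strategy.scope():
        # Model creation and compile() must happen inside the scope.
        model = tf.keras.Sequential([
            tf.keras.Input(shape=(8,)),
            tf.keras.layers.Dense(16, activation="relu"),
            tf.keras.layers.Dense(1),
        ])
        model.compile(optimizer="adam", loss="mse")

    # Random placeholder data, only to make the sketch runnable.
    x = np.random.rand(64, 8).astype("float32")
    y = np.random.rand(64, 1).astype("float32")

    history = model.fit(x, y, batch_size=16, epochs=1, verbose=0)
    return strategy.num_replicas_in_sync, history.history["loss"]


outcome = fit_with_mirrored_strategy()
if outcome is not None:
    replicas, losses = outcome
    print("Replicas in sync:", replicas)
```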

I hope that helps! Let me know if you have any other questions.

If this answer solves your issue, please vote for it so other community members know that this is a quality answer.

Regards,

Sina

Peter Tribout 25 Reputation points

2023-06-01T20:08:18.4166667+00:00

Thx for your answer, I will give it a try :-)

Can you please give the correct link for upgrading to CUDA 12.0? The link is missing.

You can follow the instructions provided in this [Microsoft Learn article](https://learn.microsoft.com/en-us/answers/questions/1296117/how-to-run-tensorflow-2-12-in-azure-ml-studio-on-a.html) to update the CUDA driver version to 12.0.

thx Peter

Sina Salam 22,031 Reputation points Volunteer Moderator

2023-06-01T20:23:39.3266667+00:00

@Peter Tribout

Sorry for the missing link.

To update the CUDA driver version to 12.0, you can follow the instructions provided by Microsoft Learn.

https://learn.microsoft.com/en-us/windows/ai/directml/gpu-cuda-in-wsl

The instructions cover installing the GPU driver, installing WSL, and getting started with NVIDIA CUDA.

In addition, to update the CUDA driver, you can remove any CUDA PPAs and the nvidia-cuda-toolkit if installed, remove all NVIDIA drivers, and then update the system. Alternatively, you can install a newer CUDA toolkit, which will have a newer GPU driver bundled with it, or retrieve a driver from NVIDIA's website and install it.

I hope this helps.

Regards,

Sina
