I am experimenting with constructing some DNNs in a notebook running in Azure Machine Learning Studio. To speed up model training in TensorFlow/Keras, I want to utilize the GPU of my compute instance. However, upon importing tensorflow in my notebook, I get the following warnings:
I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
I tensorflow/compiler/xla/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory
W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory
W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
I am running my notebook on a STANDARD_NC4AS_T4_V3 compute instance, which does have a GPU (also confirmed by running the nvidia-smi command in the terminal, which shows CUDA version 11.4).
I am using the vanilla Python 3.8 - Pytorch and Tensorflow environment that comes with the compute instance, and I have not installed any additional packages in it. I have also tried the other environments that are available by default on the compute instance, with the same result.
The installed version of tensorflow is 2.11, and the available version of CUDA is 11.4, which should be compatible as far as I can tell.
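For completeness, this is the quick check I run inside the notebook to see whether TensorFlow detects the GPU (a minimal sketch; the try/except guard is only there so the snippet also runs in environments without TensorFlow installed):

```python
# Quick check run in the notebook: does TensorFlow see any GPU devices?
try:
    import tensorflow as tf

    gpus = tf.config.list_physical_devices("GPU")
    print("TensorFlow version:", tf.__version__)
    print("Visible GPUs:", gpus)  # on my instance this is an empty list
except ImportError:
    # Guard so the snippet also runs where TensorFlow is not installed
    gpus = []
    print("TensorFlow is not installed in this environment")
```

On my instance the list comes back empty, consistent with the libcudart dlerror above, even though nvidia-smi in the terminal clearly reports the T4 GPU.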
Please advise: how can I enable GPU training on such a compute instance? Is there a way to rebuild tensorflow with the right compiler flags (and if so, what are they)?