I received the following warnings related to `libnvinfer.so.7` and `libnvinfer_plugin.so.7`:
W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: xxxxx
W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: xxxxx
W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
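Before installing anything, it helps to confirm what the dynamic loader can actually see. A quick diagnostic using standard Linux tools (nothing TensorRT-specific):

```shell
# List every libnvinfer the loader cache knows about
# (no output here is exactly what the warning above means)
ldconfig -p | grep -E 'libnvinfer(_plugin)?\.so' || echo "libnvinfer: not in loader cache"

# And what LD_LIBRARY_PATH currently contains, one entry per line
echo "$LD_LIBRARY_PATH" | tr ':' '\n'
```

If neither the cache nor `LD_LIBRARY_PATH` mentions the libraries, TensorFlow's `dlopen` has no way to find them, which is what the fix below addresses.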
- Download the NVIDIA TensorRT TAR package from here
$ tar xf TensorRT-8.5.2.2.Linux.x86_64-gnu.cuda-11.8.cudnn8.6.tar.gz
$ mkdir -p ~/local/opt
$ mv TensorRT-8.5.2.2 ~/local/opt
$ cd ~/local/opt/TensorRT-8.5.2.2/lib
$ ln -s libnvinfer.so.8.5.2 libnvinfer.so.7
$ ln -s libnvinfer_plugin.so.8.5.2 libnvinfer_plugin.so.7
- Add the following lines to `~/.bashrc` or `~/.bashenv` (for bash users):
export LD_LIBRARY_PATH=~/local/opt/TensorRT-8.5.2.2/lib:$LD_LIBRARY_PATH
export TensorRT_ROOT=~/local/opt/TensorRT-8.5.2.2
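The steps above hinge on two things: the `.so.7` symlinks resolving, and the exported variables being visible to child processes such as `python`. A minimal sketch that exercises both, using a scratch directory as a stand-in for `~/local/opt/TensorRT-8.5.2.2/lib` so it is safe to run anywhere:

```shell
# Scratch directory standing in for ~/local/opt/TensorRT-8.5.2.2/lib
TRT_LIB=$(mktemp -d)
touch "$TRT_LIB/libnvinfer.so.8.5.2" "$TRT_LIB/libnvinfer_plugin.so.8.5.2"

# The symlink trick: dlopen looks up the requested file name, so a link
# named libnvinfer.so.7 pointing at the 8.5.2 file satisfies the lookup
ln -s libnvinfer.so.8.5.2 "$TRT_LIB/libnvinfer.so.7"
ln -s libnvinfer_plugin.so.8.5.2 "$TRT_LIB/libnvinfer_plugin.so.7"
readlink "$TRT_LIB/libnvinfer.so.7"

# The exports: once set (e.g. via ~/.bashrc), every child process
# inherits LD_LIBRARY_PATH, which is how TensorFlow finds the directory
export LD_LIBRARY_PATH="$TRT_LIB:$LD_LIBRARY_PATH"
sh -c 'echo "$LD_LIBRARY_PATH"' | grep -oF "$TRT_LIB"
```

After the real setup, opening a fresh shell and importing TensorFlow should no longer print the `libnvinfer` warnings.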
It had been a long day of setting up CUDA and cuDNN, and lastly I wanted to install the right version of TensorRT, which is 7 (for TF v2.11), so that the errors were resolved rather than silenced. (I did set the environment variable TF_CPP_MIN_LOG_LEVEL to 3, because TensorFlow also complains about NUMA; I won't chase that one since it seems to be a kernel-compilation issue and I don't want to mess with the WSL kernel yet.) Version 7 is not in the Ubuntu 22 package repository, and NVIDIA's archives only distributed deb packages of version 7 up to Ubuntu 18.04. I nevertheless tried installing both the repository version and NVIDIA's archive package, but neither worked.
I looked at your method, thought "This shouldn't work, right?", and kept searching until I was about to give up. Just before giving up I opened this page, figured why not, and linked the TensorRT 10 files distributed through the package manager under the names TensorFlow requires (same names, only the versioning differs). And it worked: TensorFlow stopped complaining. But did it really work? I'm not sure, but here are some training times with the same data and the same model on the same machine:
There was no GPU usage at all in the 1st run (obviously), some but not throttling-level usage in the 2nd, and maximum usage in the 3rd, so much that I had to put my convertible laptop into tent mode to let it cool. There could be a lot of other factors affecting the training time, but I think the difference is big enough to say that it works.
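For reference, the TensorRT 10 linking step I describe above can be sketched as follows. The `.so.10` version suffix and the library location are assumptions about what your package manager installed; check yours first with `ldconfig -p | grep nvinfer`. The sketch uses a scratch directory with empty stand-in files so it is safe to run as-is; on a real system you would point `TRT10_DIR` at the actual location (often somewhere like `/usr/lib/x86_64-linux-gnu`) and need `sudo` for the links.

```shell
# Stand-in setup: on a real system TRT10_DIR would be wherever
# `ldconfig -p | grep nvinfer` says the TensorRT 10 libraries live
TRT10_DIR=$(mktemp -d)
touch "$TRT10_DIR/libnvinfer.so.10" "$TRT10_DIR/libnvinfer_plugin.so.10"

# Link the existing TensorRT 10 files under the .so.7 names TF asks for
ln -s libnvinfer.so.10 "$TRT10_DIR/libnvinfer.so.7"
ln -s libnvinfer_plugin.so.10 "$TRT10_DIR/libnvinfer_plugin.so.7"

ls -l "$TRT10_DIR"   # both .so.7 names now resolve to the .so.10 files
```

Note that this only satisfies the file-name lookup; whether the newer ABI actually behaves at runtime is what the training-time comparison above was meant to check.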
So thanks for this gist!!!
What did I do
Up to this point this has been more of a story, so here is exactly what I did: