nvidia-smi is very useful; in fact you can hack together a little nvtop with just watch -n 1 "nvidia-smi". Note, however, that the CUDA version in the top right corner of the output is the highest CUDA version the driver supports, NOT the version installed.
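If you want something a bit closer to nvtop, you can feed nvidia-smi's query flags to watch; a minimal sketch (these are standard --query-gpu fields, pick whichever you care about):

# refresh a compact, nvtop-ish readout every second
$ watch -n 1 'nvidia-smi --query-gpu=name,temperature.gpu,utilization.gpu,memory.used,memory.total --format=csv'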
You can also get your GPU's name with nvidia-smi --query-gpu=gpu_name --format=csv
(credit to this interesting forum thread)
And to get the locally installed CUDA toolkit version, use nvcc --version (note this is the toolkit/compiler version, not the driver version).
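To make the driver-vs-toolkit distinction concrete (driver_version is another standard --query-gpu field):

# driver version (the thing nvidia-smi's header reports)
$ nvidia-smi --query-gpu=driver_version --format=csv,noheader
# toolkit version (the compiler actually installed)
$ nvcc --version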
NVIDIA publishes a lot of Linux Docker images with different versions of CUDA installed on Docker Hub.
For more (and older) version combos, you can go straight to their own Docker registry.
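As a quick sanity check that containers can see your GPU, something like this works (the exact image tag here is just an example, swap in whatever CUDA/OS combo you need; --gpus all assumes the NVIDIA Container Toolkit is installed):

$ docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi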
$ workon myenv
(myenv) $ python -m torch.utils.collect_env
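For a quicker check than the full collect_env dump, you can also just ask torch directly (these attributes are part of the public torch API):

(myenv) $ python -c "import torch; print(torch.version.cuda, torch.cuda.is_available())"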
This post has a lot of solutions, but the one that really worked was this script.
The most important environment variables seem to be CUDA_HOME (torch will use this), LD_LIBRARY_PATH, and PATH. You can set these up in your ~/.bashrc per this gist:
export CUDA_HOME="/usr/local/cuda"
export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:$LD_LIBRARY_PATH
export PATH="$PATH:$CUDA_HOME/bin"
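After sourcing your ~/.bashrc, it's worth confirming the shell actually picked everything up; a quick sanity check:

$ source ~/.bashrc
$ echo $CUDA_HOME   # should print /usr/local/cuda
$ which nvcc        # should resolve to $CUDA_HOME/bin/nvcc
$ nvcc --version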
To see what CUDA versions are available, run ls /usr/local/cuda*. The path /usr/local/cuda can in fact be a symlink, created with, for example, sudo ln -sfT /usr/local/cuda-11.7 /usr/local/cuda to point it at version 11.7.
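Putting the whole switch together (assuming you have multiple toolkits installed under /usr/local; the version numbers here are just examples):

$ ls -d /usr/local/cuda*                              # e.g. cuda -> cuda-11.7, cuda-11.7, cuda-12.1
$ sudo ln -sfT /usr/local/cuda-12.1 /usr/local/cuda   # repoint the symlink at another version
$ nvcc --version                                      # confirm the toolkit version changed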