This guide tries to make sense of installing NVIDIA CUDA on Ubuntu.
Disclaimer: Installing CUDA is somewhat tedious and can be a problematic process. This guide worked for me, though if you have an unusual configuration you might need additional preparation to make it work. My machines are mostly blank Ubuntu machines.
For reference, see NVIDIA's official installation guides for CUDA and cuDNN.
Last updated: 2019-07-27
- Ubuntu 16.04
- NVIDIA driver 396.37
- CUDA 9.2 Patch 1
- cuDNN v7.2.1
- Blacklist Nouveau drivers:
sudo -i
rm -f /etc/modprobe.d/blacklist-nouveau.conf
echo 'blacklist nouveau' >> /etc/modprobe.d/blacklist-nouveau.conf
echo 'options nouveau modeset=0' >> /etc/modprobe.d/blacklist-nouveau.conf
update-initramfs -u
exit
- If you have previously installed CUDA using the runfile, you need to remove it. Check by running the following and looking for a folder named cuda-X.X:
ls /usr/local/
If you have CUDA installed, execute the following after replacing X.X with your version:
cd /usr/local/cuda-X.X/
sudo ./bin/uninstall_cuda_X.X.pl
sudo /usr/bin/nvidia-uninstall
- Purge all NVIDIA drivers:
dpkg --get-selections | grep nvidia
sudo apt remove --purge [packages ..]
- Purge any CUDA leftovers:
dpkg --get-selections | grep cuda
sudo apt purge '*cuda*'
- Purge any cuDNN leftovers:
dpkg --get-selections | grep libcudnn
sudo apt purge '*cudnn*'
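To see at a glance whether anything is left after the purge steps above, the three dpkg checks can be combined into one. This is only a minimal sketch: leftover_packages is a hypothetical helper name of my own, not a dpkg or apt command, and it matches on package names only.

```shell
#!/bin/sh
# Sketch: list packages that the purge steps above are meant to remove.
# leftover_packages is a hypothetical helper, not a dpkg or apt command.

# Reads "dpkg --get-selections" output on stdin and prints the names of
# packages whose name contains nvidia, cuda or cudnn, whatever their state.
leftover_packages() {
    awk '$1 ~ /nvidia|cuda|cudnn/ { print $1 }'
}

if command -v dpkg >/dev/null 2>&1; then
    leftovers=$(dpkg --get-selections | leftover_packages)
    if [ -n "$leftovers" ]; then
        echo "Still present:"
        echo "$leftovers"
    else
        echo "No NVIDIA, CUDA or cuDNN packages found."
    fi
else
    echo "dpkg not found; this check only applies to Debian-based systems."
fi
```

An empty result means the preparation steps left nothing behind. Packages in the deinstall state are listed too, since their configuration files are only removed by a purge.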
This guide uses the local deb file and installs the held-back version together with the drivers. In plain English: it installs the Debian package along with the graphics drivers in a way that disables automatic updates.
- Download the drivers from here.
- Install the packages using:
sudo dpkg -i cuda-repo-ubuntu1604-9-2-local_9.2.148-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1604-9-2-148-local-patch-1_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-9-2-local/7fa2af80.pub
sudo apt update
sudo apt install cuda-9-2
- Download cuDNN (requires an NVIDIA Developer account) from here: cuDNN v7.2.1 Runtime Library for Ubuntu16.04 and cuDNN v7.2.1 Developer Library for Ubuntu16.04.
- Install the packages:
sudo dpkg -i libcudnn7_7.2.1.38-1+cuda9.2_amd64.deb
sudo dpkg -i libcudnn7-dev_7.2.1.38-1+cuda9.2_amd64.deb
sudo apt install libcudnn7-dev libcudnn7
- Set up the shell environment. Add the following lines to ~/.bashrc or similar, then restart your terminal/shell.
export CUDA_HOME=/usr/local/cuda-9.2
export LD_LIBRARY_PATH=${CUDA_HOME}/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
PATH=${CUDA_HOME}/bin:${PATH}
export PATH
- Restart the computer.
After the restart, run:
nvcc -V
Should output:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Tue_Jun_12_23:07:04_CDT_2018
Cuda compilation tools, release 9.2, V9.2.148
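If you would rather check the release number in a script than by eye, something like the following might work. It is a sketch under assumptions: nvcc_release is a hypothetical helper, and the sed pattern relies on the "release X.Y," wording shown in the output above, which could differ in other CUDA versions.

```shell
#!/bin/sh
# Sketch: compare the release reported by "nvcc -V" with the one this guide
# installs. nvcc_release is a hypothetical helper name.

expected_release="9.2"

# Reads "nvcc -V" output on stdin and prints the release number, e.g. 9.2.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
    found=$(nvcc -V | nvcc_release)
    if [ "$found" = "$expected_release" ]; then
        echo "nvcc reports release $found as expected."
    else
        echo "nvcc reports release '$found', expected $expected_release."
    fi
else
    echo "nvcc not found on PATH; check the CUDA_HOME lines in ~/.bashrc."
fi
```

A missing nvcc usually points at the shell environment step rather than at the installation itself, which is why the last branch mentions ~/.bashrc.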
Run:
nvidia-smi
Should output something similar to:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.37                 Driver Version: 396.37                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN Xp            Off  | 00000000:04:00.0 Off |                  N/A |
| 23%   30C    P0    61W / 250W |      0MiB / 12196MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
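The driver version in that table can also be read in a script-friendly form with nvidia-smi's query options. The sketch below compares it with the 396.37 driver used in this guide. version_at_least is a hypothetical helper, and it assumes a sort that supports -V (GNU coreutils, as shipped with Ubuntu).

```shell
#!/bin/sh
# Sketch: check that the loaded driver is at least the version this guide used.
# version_at_least is a hypothetical helper; it assumes GNU sort with -V.

guide_driver="396.37"

# Succeeds when version $2 is greater than or equal to version $1.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

if command -v nvidia-smi >/dev/null 2>&1; then
    driver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    if [ -z "$driver" ]; then
        echo "nvidia-smi did not report a driver version."
    elif version_at_least "$guide_driver" "$driver"; then
        echo "Driver $driver is at least $guide_driver."
    else
        echo "Driver $driver is older than $guide_driver."
    fi
else
    echo "nvidia-smi not found; the driver is probably not installed."
fi
```

Only the first GPU is inspected here, on the assumption that all GPUs in the machine share one driver.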
If you want to test CUDA further, run the samples:
cp -r /usr/local/cuda-9.2/samples $HOME
cd $HOME/samples
make
./bin/x86_64/linux/release/deviceQuery
./bin/x86_64/linux/release/bandwidthTest
They should output:
CUDA Device Query (Runtime API) version (CUDART static linking)
[...]
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.2, CUDA Runtime Version = 9.2, NumDevs = 1
Result = PASS
[CUDA Bandwidth Test] - Starting...
Running on...
[...]
result = PASS
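Both samples end with a PASS line, so the check can be scripted. A minimal sketch, assuming the samples were copied to $HOME/samples and built as shown above. sample_passed is a hypothetical helper, and the match is case-insensitive because the two outputs above capitalise the word differently.

```shell
#!/bin/sh
# Sketch: run the two samples and report whether each printed a PASS line.
# sample_passed is a hypothetical helper name.

samples_dir="$HOME/samples/bin/x86_64/linux/release"

# Reads sample output on stdin; succeeds if a line starts with "Result = PASS"
# in any letter case.
sample_passed() {
    grep -qi '^result = pass'
}

for sample in deviceQuery bandwidthTest; do
    if [ -x "$samples_dir/$sample" ]; then
        if "$samples_dir/$sample" | sample_passed; then
            echo "$sample: PASS"
        else
            echo "$sample: FAIL"
        fi
    else
        echo "$sample: not built, skipping"
    fi
done
```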
The root cause of display problems after installation can be unclear, and just reinstalling the NVIDIA drivers might not help.
Here are some suggestions:
- Creating a new Xorg.conf, see this old guide.
- Checking the default display manager, making sure it's correctly configured and perhaps reinstalling it (more here):
cat /etc/X11/default-display-manager
- If you have multiple displays, it might help to temporarily use only one display until the problem is resolved.
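The display manager check can be taken one small step further by testing whether the configured binary actually exists. This is a rough sketch, not a full diagnosis: display_manager_ok is a hypothetical helper, and it assumes the file holds the full path of the display manager on its first line, as it does on Ubuntu.

```shell
#!/bin/sh
# Sketch: verify that the default display manager file points at an
# executable. display_manager_ok is a hypothetical helper name.

# Takes an optional path to the config file; succeeds when the first line of
# that file names an executable file.
display_manager_ok() {
    dm_file="${1:-/etc/X11/default-display-manager}"
    [ -r "$dm_file" ] || return 1
    dm_path=$(head -n 1 "$dm_file")
    [ -n "$dm_path" ] && [ -x "$dm_path" ]
}

if display_manager_ok; then
    echo "Default display manager: $(head -n 1 /etc/X11/default-display-manager)"
else
    echo "Default display manager is missing or not executable."
fi
```

A failure here suggests reinstalling or reconfiguring the display manager before touching the NVIDIA drivers again.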