scott@ksf16u:~$ lspci | grep -i vga
00:02.0 VGA compatible controller: Intel Corporation Skylake Integrated Graphics (rev 06)
scott@ksf16u:~$ lspci | grep -i 3D
02:00.0 3D controller: NVIDIA Corporation GM107M [GeForce GTX 960M] (rev a2)
The objective is to let the NVidia GTX 960M work as an accelerated GPU with Tensorflow 1.8 (current as of 6/10/18), which supports Eager execution for dynamic graphs.
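The two lspci checks above can be folded into one helper. A minimal sketch: classify_gpus is a hypothetical name, and it assumes the display GPU is listed as "VGA compatible controller" and the compute GPU as "3D controller", as on this laptop.

```shell
# classify_gpus: reads `lspci` output on stdin and labels each GPU line.
# Assumption: display GPU = "VGA compatible controller",
#             compute GPU = "3D controller" (true for the listing above).
classify_gpus() {
    while IFS= read -r line; do
        case "$line" in
            *"VGA compatible controller"*) echo "display: ${line#*: }" ;;
            *"3D controller"*)             echo "compute: ${line#*: }" ;;
        esac
    done
}

# Usage on a real machine:
#   lspci | classify_gpus
```

If only a "display" line comes back, the NVIDIA card is probably not visible to the kernel and the rest of these notes will not apply as written.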
Key References for iGPU for Display and GPU for CUDA
TL;DR - https://gist.github.com/hemenkapadia/9a36d4310b2bf6a945636f05d4a046c7 !
- Best Reference (even though it's for 18.04) - The procedure worked very well on a recently updated (June 2018) 16.04 install on a Dell 7559 with a GTX 960M. Following https://connorkuehl.github.io/dell-inspiron-7559-linux-guide/, I also added "acpi_backlight=native acpi_osi=" to GRUB_CMDLINE_LINUX; otherwise the machine hung on boot (originally I had "acpi=off", but that interfered with bbswitch). I also used CUDA 9.0 (as required by Tensorflow 1.8) with the 390.67 NVIDIA driver, because the packaged 384.81 failed to build on my system.
- Key acpi kernel params to allow booting
- Didn't work for me but fairly good:
- Other setup tips
Caveat
- A kernel upgrade will require that you reinstall the NVidia driver only, according to the instructions (summarized at bottom)
Build the install USB
- Download Ubuntu 16.04 iso file
- Download Rufus
- Find empty USB drive 4GB
- Use Rufus to make Ubuntu iso into bootable USB drive
Disable Secure Boot
- Reboot
- Interrupt the Dell boot screen by pressing F2
- Find Secure Boot under UEFI in BIOS and disable
Install
- Ensure there is unallocated disk 135GB+ for a new install
- Interrupt the Dell boot screen by pressing F12
- Boot from USB
- From install/run menu 'Ubuntu', press 'e' to edit startup script
- After 'quiet splash ' add 'acpi=off' to prevent hang on reboot
- Install side by side with Win 10 (e.g. sda6=ubuntu, sda7=ubuntu swap)
Grub
- sudo nano /etc/default/grub
- GRUB_CMDLINE_LINUX_DEFAULT > After 'quiet splash ' add 'acpi=off' to prevent hang next reboot
- sudo update-grub
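The manual edit above can also be scripted. A minimal sketch: add_kernel_param is a hypothetical helper, it assumes GNU sed (as on Ubuntu), and it assumes the GRUB_CMDLINE_LINUX_DEFAULT value is wrapped in double quotes on a single line.

```shell
# add_kernel_param FILE PARAM
# Appends PARAM to GRUB_CMDLINE_LINUX_DEFAULT in FILE unless it is already there.
# Assumption: the value is double-quoted and sits on one line.
add_kernel_param() {
    local file="$1" param="$2" value
    # Pull out the current value between the quotes.
    value=$(grep -E '^GRUB_CMDLINE_LINUX_DEFAULT="' "$file" | head -n 1 \
            | sed -E 's/^[^"]*"//; s/"[[:space:]]*$//')
    # Nothing to do if the parameter is already present as a whole word.
    case " $value " in
        *" $param "*) return 0 ;;
    esac
    # Insert the parameter just before the closing quote.
    sed -i -E "s|^(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*)\"|\\1 $param\"|" "$file"
}

# Usage on a real machine (try it on a copy first, then install it):
#   cp /etc/default/grub /tmp/grub.new
#   add_kernel_param /tmp/grub.new acpi=off
#   sudo cp /tmp/grub.new /etc/default/grub
#   sudo update-grub
```

Working on a copy keeps a typo from leaving the machine unbootable; the change only takes effect after update-grub and a reboot.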
Updates
- sudo apt-get update
- sudo apt-get upgrade
- Change OS Date/Time: timedatectl set-local-rtc 1 --adjust-system-clock
Video Display
- Need to update to match first ref above!
- Laptop Screen: 3840 x 2160 (needs scale x2 or run as 1920x1080)
- Dell 17" Screen: 1024 x 768
- Use Displays app & set laptop to 1920 x 1080 Dell screen, use scale=2
NVidia Driver Prep
- Need to update to match first ref above!
- GRUB_CMDLINE_LINUX_DEFAULT > Add 'nouveau.modeset=0' to prevent Nouveau loading into kernel
- sudo update-grub
- Create modprobe config (sudo gedit /etc/modprobe.d/disable-nouveau.conf) & add lines:
- blacklist nouveau
- options nouveau modeset=0
- sudo update-initramfs -u
- Reboot!
- Download matching set of run scripts from NVidia site, e.g.
- NVIDIA-Linux-x86_64-390.67.run
- cuda_9.1.85_387.26_linux.run
- Patches # 1-3
- cuda_9.0.176_384.81_linux.run
- Patches # 1-2
- Tensorflow 1.8 binaries require cuda 9.0
- libcudnn7_7.1.4.18-1+cuda9.0_amd64.deb
- Get matching 9.0 cuDNN
- Stop X/Graphics
- reboot
- at Grub menu, hit 'e' to edit before booting linux
- add a '3' to the default linux kernel params
- hit F10 to boot to terminal only (no X Server)
- hit Ctrl+Alt+F1 to switch to a text terminal (if necessary)
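The Nouveau blacklist step above can be sketched as a small script instead of a gedit session. Both function names are hypothetical; the file content is exactly the two lines listed in the prep steps.

```shell
# write_nouveau_blacklist FILE
# Writes the two modprobe lines from the prep steps to FILE.
write_nouveau_blacklist() {
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$1"
}

# nouveau_loaded: succeeds if the lsmod output on stdin lists the nouveau module.
nouveau_loaded() {
    grep -q '^nouveau[[:space:]]'
}

# Usage on a real machine:
#   write_nouveau_blacklist /tmp/disable-nouveau.conf
#   sudo cp /tmp/disable-nouveau.conf /etc/modprobe.d/disable-nouveau.conf
#   sudo update-initramfs -u
#   # after the reboot:
#   lsmod | nouveau_loaded && echo "nouveau is still loaded"
```

Checking lsmod after the reboot is worth the few seconds: if nouveau is still loaded, the NVIDIA installer is likely to complain.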
Nvidia Install
- After prep & reboot to terminal only, manually install NVidia driver using no-opengl-files flag to prevent clash:
- sudo sh NVIDIA-Linux-x86_64-390.67.run --no-opengl-files --no-drm
- reboot to terminal (as above)
- Manually install CUDA, responding 'No' when asked to install Nvidia drivers:
- sudo sh cuda_9.0.176_384.81_linux.run
- reboot
- Install cuDNN deb package
- sudo dpkg -i yourcuDNNfile.deb
- sudo apt-get install -f
- Post-Install
- reboot
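Since the Tensorflow 1.8 binaries need CUDA 9.0, it is easy to grab a mismatched runfile or cuDNN package. A minimal sketch of a name-based sanity check; cuda_version_of and same_cuda are hypothetical helpers, and they only look at file names in the style of the downloads listed above, not at file contents.

```shell
# cuda_version_of NAME
# Prints the CUDA major.minor that a download targets, judged by its name only.
# Assumption: names follow "cuda_<ver>_..." or "...+cuda<ver>_..." as listed above.
cuda_version_of() {
    case "$1" in
        cuda_*)  printf '%s\n' "$1" | sed -E 's/^cuda_([0-9]+\.[0-9]+)\..*/\1/' ;;
        *+cuda*) printf '%s\n' "$1" | sed -E 's/.*\+cuda([0-9]+\.[0-9]+).*/\1/' ;;
        *)       return 1 ;;
    esac
}

# same_cuda FILE_A FILE_B
# Succeeds when both names target the same CUDA release; returns 2 if a name
# is not recognised.
same_cuda() {
    local a b
    a=$(cuda_version_of "$1") || return 2
    b=$(cuda_version_of "$2") || return 2
    [ "$a" = "$b" ]
}

# Usage:
#   same_cuda cuda_9.0.176_384.81_linux.run libcudnn7_7.1.4.18-1+cuda9.0_amd64.deb \
#       && echo "CUDA and cuDNN match"
```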
Verify GPU On
- cat /proc/acpi/bbswitch
- sudo tee /proc/acpi/bbswitch <<< ON
- sudo modprobe nvidia
- lsmod | grep nvidia
- sudo nvidia-smi
- Verify Tensorflow can use GPU, e.g.
- source activate tf18
- ...
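The bbswitch check above can be wrapped so scripts can branch on it. A minimal sketch: bbswitch_state is a hypothetical helper and assumes the status file holds one line ending in ON or OFF (e.g. "0000:02:00.0 ON"). The Tensorflow one-liner in the comment is my own assumption of a quick check, not the author's elided steps.

```shell
# bbswitch_state [FILE]
# Prints the last field (ON or OFF) of the bbswitch status file.
# Assumption: one line such as "0000:02:00.0 ON".
bbswitch_state() {
    awk '{ print $NF; exit }' "${1:-/proc/acpi/bbswitch}"
}

# Usage on a real machine:
#   [ "$(bbswitch_state)" = "ON" ] && sudo nvidia-smi
# Possible Tensorflow 1.x check inside the tf18 environment (assumption):
#   python -c "import tensorflow as tf; print(tf.test.is_gpu_available())"
```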
Turn GPU Off
- sudo modprobe -r nvidia
- lsmod | grep nvidia (should see nothing)
- sudo tee /proc/acpi/bbswitch <<< OFF
- cat /proc/acpi/bbswitch
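The order matters in both directions: power the card on before loading the module, and unload the module before powering the card off. A minimal dry-run sketch; gpu_power is a hypothetical helper that only prints the commands from the two sections above and executes nothing itself.

```shell
# gpu_power ON|OFF
# Prints the switch commands in the order used in these notes.
# Nothing is executed here; pipe the output to `sh` to actually run it.
gpu_power() {
    case "$1" in
        ON)
            echo "echo ON | sudo tee /proc/acpi/bbswitch"
            echo "sudo modprobe nvidia"
            ;;
        OFF)
            echo "sudo modprobe -r nvidia"
            echo "echo OFF | sudo tee /proc/acpi/bbswitch"
            ;;
        *)
            echo "usage: gpu_power ON|OFF" >&2
            return 1
            ;;
    esac
}

# Usage:
#   gpu_power OFF        # review the commands
#   gpu_power OFF | sh   # run them
```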
Tensorflow
- https://www.tensorflow.org/install/install_linux#NVIDIARequirements
- https://www.tensorflow.org/install/install_linux#python_36
Kernel upgrade
- Add instructions for reinstalling the NVidia driver on kernel change
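Until those instructions are written up, a rough way to tell whether a reinstall is due is to look for an nvidia kernel module built for the running kernel. A minimal sketch under assumptions: driver_built_for is a hypothetical helper, and it assumes the installer places a file named nvidia.ko (possibly compressed) somewhere under /lib/modules/<kernel>.

```shell
# driver_built_for KERNEL [MODULES_ROOT]
# Succeeds if a file named nvidia.ko* exists under MODULES_ROOT/KERNEL.
# Assumption: the driver installer puts its module below /lib/modules/<kernel>.
driver_built_for() {
    local kernel="$1" root="${2:-/lib/modules}"
    [ -n "$(find "$root/$kernel" -name 'nvidia.ko*' 2>/dev/null | head -n 1)" ]
}

# Usage after a kernel upgrade:
#   driver_built_for "$(uname -r)" || echo "no nvidia module for this kernel: rerun the driver install step"
```

If the check fails, the driver command from the Nvidia Install section (run from a terminal-only boot) is the step to repeat; CUDA and cuDNN should not need reinstalling per the caveat at the top.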
AOC USB Monitor
- Picked up an inexpensive AOC USB Monitor that several pals at CB&I Fab Finance and Project Controls had used in multi-week conference room sessions. Useful for having a lightweight, second screen when travelling.
- Worked out of the box on Win 10
- For our Ubuntu GPU rig (recall how we restricted options for graphics drivers to load on demand)
- Install the DisplayLink Driver: http://www.displaylink.com/downloads/ubuntu
- I made the following config changes to load the evdi device at boot:
- Add "evdi" to /etc/modules, e.g.
- sudo gedit /etc/modules
- Add "options evdi initial_device_count=1" to /etc/modprobe.d/evdi.conf
- See: https://support.displaylink.com/knowledgebase/articles/1843660-screen-freezes-after-opening-an-application-only
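The two evdi config edits above can be made repeatable with a small append-if-missing helper. A minimal sketch; ensure_line is a hypothetical name, and the paths and lines are the ones given in the steps above.

```shell
# ensure_line FILE LINE
# Appends LINE to FILE unless an identical line is already present.
ensure_line() {
    touch "$1"
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Usage on a real machine (run as root, e.g. inside `sudo sh`):
#   ensure_line /etc/modules evdi
#   ensure_line /etc/modprobe.d/evdi.conf 'options evdi initial_device_count=1'
```

Running it twice leaves each file with a single copy of the line, which is handy when redoing the setup after a reinstall.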
#install
sudo dpkg -i intel-graphics-update-tool_2.0.2_amd64.deb
sudo apt-get -f install
Notes from install:
W: Possible missing firmware /lib/firmware/i915/kbl_guc_ver9_14.bin for module i915
W: Possible missing firmware /lib/firmware/i915/bxt_guc_ver8_7.bin for module i915
If you are upgrading from one release to another with this PPA activated, please install the ppa-purge package and use it to downgrade everything in here beforehand. sudo ppa-purge ppa:ubuntu-x-swat/updates will do it.
#uninstall
sudo ppa-purge ppa:ubuntu-x-swat/updates
sudo shutdown -r now
Best tutorial ever, although I spent nearly 3 weeks getting all of this working without errors on my Dell 559 with 4K display, Ubuntu 18.04.