Skip to content

Instantly share code, notes, and snippets.

@denguir
denguir / cuda_install.md
Last active September 19, 2024 14:47
Installation procedure for CUDA & cuDNN

How to install CUDA & cuDNN on Ubuntu 22.04

Install NVIDIA drivers

Update & upgrade

sudo apt update && sudo apt upgrade

Remove previous NVIDIA installation

@mingfeima
mingfeima / pytorch_performance_profiling.md
Last active September 7, 2024 06:21
How to do performance profiling on PyTorch

(Internal Tranining Material)

Usually the first step in performance optimization is to do profiling, e.g. to identify performance hotspots of a workload. This gist tells basic knowledge of performance profiling on PyTorch, you will get:

  • How to find the bottleneck operator?
  • How to trace source file of a particular operator?
  • How do I indentify threading issues? (oversubscription)
  • How do I tell a specific operator is running efficiently or not?

This tutorial takes one of my recent projects - pssp-transformer as an example to guide you through path of PyTorch CPU peformance optimization. Focus will be on Part 1 & Part 2.