We have 4 tiers of packages to install. They are tiered because each tier depends on packages from the previous tiers. Depending on your needs, the last 2 tiers are optional. Here is the landscape of the package tiers:
Other NVIDIA deep learning packages, such as TensorRT and NCCL, are not covered here.
Debian ships older versions of the NVIDIA packages, but installation is a breeze. As of Debian Bullseye, there is no cuDNN; the package `nvidia-cudnn` is still in testing.
NVIDIA provides the latest versions. NVIDIA has good documentation on CUDA installation, which describes installing both the graphics drivers and the CUDA toolkit. NVIDIA also has detailed documentation on cuDNN installation. Note that you must register with NVIDIA to download and install cuDNN. In the cuDNN documentation, you can clearly see the 2 prerequisites: the graphics driver and CUDA.
NVIDIA installation supports Debian Bullseye 11.2 (kernel 5.10).
Conda provides the CUDA toolkit and cuDNN. Note that they require compatible versions of the graphics driver to function. In fact, conda has multiple channels providing the CUDA toolkit and cuDNN: the default channel has `cudatoolkit` and `cudnn`, the conda-forge channel has newer versions of `cudatoolkit` and `cudnn`, and the NVIDIA channel has `cuda` and `cudnn`.
First, choose the version of the graphics driver that is compatible with the GPUs at hand. For example, a 2070 Super needs the graphics driver from `buster-backports` or later; a 3080 Ti needs `bullseye` or later.
Debian Release | NVIDIA graphics driver | Supported GPUs | Note |
---|---|---|---|
bullseye-backports | nvidia-driver 470.103.01 | supported devices | |
bullseye | nvidia-driver 470.161.03 | supported devices | 3070 ti, 3080 ti |
buster-backports | nvidia-driver 470.141.03 | supported devices | 20xx super, 30xx, 3060 ti |
buster | nvidia-driver 418.226.00 | supported devices | 20xx, 20xx ti |
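For scripting this choice, the table above can be captured in a small lookup. Here is a minimal sketch — the helper name and the dictionary are hypothetical, encoding only the rows of the table above:

```python
# Earliest Debian suite whose nvidia-driver supports a given GPU,
# per the table above (hypothetical helper, not an official tool).
MIN_SUITE = {
    "2070 super": "buster-backports",  # 20xx Super needs the 470-series driver
    "3080 ti": "bullseye",             # 3080 Ti needs nvidia-driver 470.161+
    "2080 ti": "buster",               # plain 20xx/20xx Ti works with 418-series
}

def min_debian_suite(gpu: str) -> str:
    """Return the earliest Debian suite shipping a driver for `gpu`."""
    return MIN_SUITE[gpu.lower()]

print(min_debian_suite("3080 Ti"))  # bullseye
```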
Second, it is critical that the CUDA version is supported by a compatible graphics driver. Here is a table copied from NVIDIA's release notes of the CUDA toolkit components:
CUDA Toolkit | Linux x86_64 Driver Version | Windows x86_64 Driver Version |
---|---|---|
CUDA 11.6 Update 1 | >=510.47.03 | >=511.65 |
CUDA 11.6 GA | >=510.39.01 | >=511.23 |
CUDA 11.5 Update 2 | >=495.29.05 | >=496.13 |
CUDA 11.5 Update 1 | >=495.29.05 | >=496.13 |
CUDA 11.5 GA | >=495.29.05 | >=496.04 |
CUDA 11.4 Update 4 | >=470.82.01 | >=472.50 |
CUDA 11.4 Update 3 | >=470.82.01 | >=472.50 |
CUDA 11.4 Update 2 | >=470.57.02 | >=471.41 |
CUDA 11.4 Update 1 | >=470.57.02 | >=471.41 |
CUDA 11.4.0 GA | >=470.42.01 | >=471.11 |
CUDA 11.3.1 Update 1 | >=465.19.01 | >=465.89 |
CUDA 11.3.0 GA | >=465.19.01 | >=465.89 |
CUDA 11.2.2 Update 2 | >=460.32.03 | >=461.33 |
CUDA 11.2.1 Update 1 | >=460.32.03 | >=461.09 |
CUDA 11.2.0 GA | >=460.27.03 | >=460.82 |
CUDA 11.1.1 Update 1 | >= 455.32 | >= 456.81 |
CUDA 11.1 GA | >= 455.23 | >= 456.38 |
CUDA 11.0.3 Update 1 | >= 450.51.06 | >= 451.82 |
CUDA 11.0.2 GA | >= 450.51.05 | >= 451.48 |
CUDA 11.0.1 RC | >= 450.36.06 | >= 451.22 |
CUDA 10.2.89 | >= 440.33 | >= 441.22 |
CUDA 10.1.105 | >= 418.39 | >= 418.96 |
CUDA 10.0.130 | >= 410.48 | >= 411.31 |
CUDA 9.2 (9.2.148 Update 1) | >= 396.37 | >= 398.26 |
CUDA 9.2 (9.2.88) | >= 396.26 | >= 397.44 |
CUDA 9.1 (9.1.85) | >= 390.46 | >= 391.29 |
CUDA 9.0 (9.0.76) | >= 384.81 | >= 385.54 |
CUDA 8.0 (8.0.61 GA2) | >= 375.26 | >= 376.51 |
CUDA 8.0 (8.0.44) | >= 367.48 | >= 369.30 |
CUDA 7.5 (7.5.16) | >= 352.31 | >= 353.66 |
CUDA 7.0 (7.0.28) | >= 346.46 | >= 347.62 |
When installing CUDA and cuDNN, you may need to lock down the versions to obtain compatibility.
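A pinned combination can be sanity-checked before installing by encoding a few rows of the table above and comparing versions numerically. A minimal sketch, assuming driver versions compare as dotted integer tuples (the function names and the abbreviated dictionary are hypothetical):

```python
# Minimum Linux x86_64 driver required by a few CUDA toolkit releases,
# taken from the table above (hypothetical helper, not an NVIDIA tool).
MIN_DRIVER = {
    "11.6": "510.39.01",
    "11.4": "470.42.01",
    "11.2": "460.27.03",
    "10.2": "440.33",
}

def parse(version: str) -> tuple:
    """Turn '470.82.01' into (470, 82, 1) for numeric comparison."""
    return tuple(int(part) for part in version.split("."))

def driver_supports(driver: str, cuda: str) -> bool:
    """True if the installed driver meets the CUDA toolkit's minimum."""
    return parse(driver) >= parse(MIN_DRIVER[cuda])

print(driver_supports("470.103.01", "11.4"))  # True: 470.103.01 >= 470.42.01
print(driver_supports("460.91.03", "11.4"))   # False: below the 470 minimum
```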
The NVIDIA driver installs modules into the kernel tree. In order to do that, the Linux headers are needed, and it is important to install the headers matching the exact running kernel version. Thus this is best done manually and separately.
First, do a quick verification before installing. Check for the GPU, the kernel release, and the architecture:

```
lspci | grep -i nvidia
uname -r
uname -m
```
To list the linux-headers packages already installed:

```
sudo dpkg -l | grep 'linux-headers'
```
Then, to install the Linux headers:

```
sudo apt-get install linux-headers-$(uname -r | sed 's/[^-]*-[^-]*-//')
```
The command `uname -r | sed 's/[^-]*-[^-]*-//'` outputs `amd64`. The package `linux-headers-amd64` is the architecture-specific metapackage; the package manager points it to the package of the correct kernel version, for example, `linux-headers-4.19.0-10-amd64`. So in the list of packages to be installed, double-check that there is a `linux-headers-4.19.0-10-amd64`, where the `4.19.0-10-amd64` part should match the kernel of your system.
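To see what the `sed` expression does, run it on a sample kernel release string (the version here is just an example):

```shell
# The sed strips everything up to and including the second dash,
# leaving only the architecture suffix of `uname -r`.
release="4.19.0-10-amd64"                  # sample `uname -r` output
echo "$release" | sed 's/[^-]*-[^-]*-//'   # prints: amd64
```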
```
sudo apt-get install dkms
```

The dkms package is singled out to make it clear that NVIDIA installs into the kernel tree. From the Ubuntu documentation: "This DKMS (Dynamic Kernel Module Support) package provides support for installing supplementary versions of kernel modules. The package compiles and installs into the kernel tree." It turns out this package is also required by other software such as VirtualBox and Docker, hence locking it down as a manual install.
Note the package `nvidia-driver` requires non-free software enabled in `/etc/apt/sources.list`.
```
# nvidia-driver 470.103
sudo apt-get -t bullseye-backports install nvidia-driver nvidia-smi nvidia-persistenced
```

Or:

```
# nvidia-driver 460.91
sudo apt-get install nvidia-driver nvidia-smi nvidia-persistenced
```
The `nvidia-driver` metapackage has `nvidia-kernel-dkms`, which should be installed and uninstalled together with the other NVIDIA packages. That is to say, do not install `nvidia-kernel-dkms` by itself.

The `nvidia-driver` metapackage has a hard dependency on `xserver-xorg-video-nvidia`, which in turn depends on `xserver-xorg-core`. Installing `nvidia-driver` thus pulls in the X server. However, `xserver-xorg-core` alone is incomplete; it is missing the input drivers. This is addressed at the step of installing Gnome, by installing the metapackage `xserver-xorg`.
In the end, restart to replace nouveau with nvidia. You will be prompted during installation if a reboot is needed.
To verify, run `nvidia-smi`.
This step is optional, as the CUDA toolkit can be provided by Anaconda.
```
# nvidia-cuda-toolkit 11.2.2
sudo apt-get install nvidia-cuda-toolkit
```

Alternatively, without installing the nvcc compiler (which is included in `nvidia-cuda-toolkit`):

```
# nvidia-cuda-dev 11.2.2
sudo apt-get install nvidia-cuda-dev
```
Here is the CUDA toolkit package tree:
```
nvidia-cuda-toolkit
|
|-----> nvidia-cuda-dev
|         |
|         |-----> libcudart: CUDA runtime
|         |
|         |-----> libcublas: cuBLAS
|         |
|         |-----> libnvblas: nvBLAS
|         |
|         |-----> libcufft: cuFFT
|         |
|         |-----> libcufftw: cuFFTW
|         |
|         |-----> libcurand: cuRAND
|         |
|         |-----> libcusolver: cuSOLVER, LAPACK-like functions
|         |
|         |-----> libcusparse8.0: cuSPARSE
|         |
|         |=====> libcuda1 (already a hard dependency)
|         |         |
|         |         |=====> nvidia-cuda-mps (not installed)
|         |
|         |-----> libnvvm3 (library used by NVCC)
|
|-----> libnvvm3
|
|-----> nvidia-opencl-dev
|
|-----> nvidia-profiler
|
|=====> nvidia-cuda-gdb
|
|=====> nvidia-cuda-doc
```
To verify, `nvcc --version` should display the CUDA version.
If `conda` is not already installed, install it first. Then:

```
conda create --name numba python=3.9
conda activate numba
conda install cudatoolkit cudnn numba
```
To verify:

```
from numba import cuda
cuda.detect()
```

It should list the CUDA devices, e.g. 'GeForce RTX 3080 Ti'.
```
conda create --name tf python=3.9
conda activate tf
conda install tensorflow-gpu
```

To verify:

```
import tensorflow as tf
tf.config.list_physical_devices()
```
```
conda create --name torch python=3.9
conda activate torch
conda install pytorch cudatoolkit=10.2 -c pytorch
```

To verify:

```
import torch
torch.cuda.is_available()
```
Next step: Gnome