Cuda: Handle Conflicting Installation Methods

Posted by This is bill

Editor's note: This article was compiled by the cha138.com editors. It introduces Cuda: Handle Conflicting Installation Methods; hopefully it is of some reference value to you.

Completely uninstall CUDA: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#handle-uninstallation

 

NVIDIA CUDA Installation Guide for Linux

1. Introduction

CUDA® is a parallel computing platform and programming model invented by NVIDIA. It enables dramatic increases in computing performance by harnessing the power of the graphics processing unit (GPU).

CUDA was developed with several design goals in mind:

  • Provide a small set of extensions to standard programming languages, like C, that enable a straightforward implementation of parallel algorithms. With CUDA C/C++, programmers can focus on the task of parallelization of the algorithms rather than spending time on their implementation.
  • Support heterogeneous computation where applications use both the CPU and GPU. Serial portions of applications are run on the CPU, and parallel portions are offloaded to the GPU. As such, CUDA can be incrementally applied to existing applications. The CPU and GPU are treated as separate devices that have their own memory spaces. This configuration also allows simultaneous computation on the CPU and GPU without contention for memory resources.

CUDA-capable GPUs have hundreds of cores that can collectively run thousands of computing threads. These cores have shared resources including a register file and a shared memory. The on-chip shared memory allows parallel tasks running on these cores to share data without sending it over the system memory bus.

This guide will show you how to install and check the correct operation of the CUDA development tools.

1.1. System Requirements

To use CUDA on your system, you will need the following installed:

  • A CUDA-capable GPU
  • A supported version of Linux with a gcc compiler and toolchain
  • The NVIDIA CUDA Toolkit (available at http://developer.nvidia.com/cuda-downloads)

The CUDA development environment relies on tight integration with the host development environment, including the host compiler and C runtime libraries, and is therefore only supported on distribution versions that have been qualified for this CUDA Toolkit release.

Table 1. Native Linux Distribution Support in CUDA 10.1 Update 1
(Blank cells share the value shown in the first row of their group, as in the original merged table.)

x86_64
  Distribution        | Kernel* | GCC   | GLIBC | ICC  | PGI        | XLC | CLANG
  RHEL 7.6            | 3.10    | 4.8.5 | 2.17  | 19.0 | 18.x, 19.x | NO  | 8.0.0
  RHEL 6.10           | 2.6.32  | 4.4.7 | 2.12  |      |            |     |
  CentOS 7.6          | 3.10    | 4.8.5 | 2.17  |      |            |     |
  CentOS 6.10         | 2.6.32  | 4.4.7 | 2.12  |      |            |     |
  Fedora 29           | 4.16    | 8.0.1 | 2.27  |      |            |     |
  OpenSUSE Leap 15.0  | 4.15.0  | 7.3.1 | 2.26  |      |            |     |
  SLES 15.0           | 4.12.14 | 7.2.1 | 2.26  |      |            |     |
  SLES 12.4           | 4.12.14 | 4.8.5 | 2.22  |      |            |     |
  Ubuntu 18.10        | 4.18.0  | 8.2.0 | 2.28  |      |            |     |
  Ubuntu 18.04.2 (**) | 4.15.0  | 7.3.0 | 2.27  |      |            |     |
  Ubuntu 16.04.6 (**) | 4.4     | 5.4.0 | 2.23  |      |            |     |
  Ubuntu 14.04.6 (**) | 3.13    | 4.8.4 | 2.19  |      |            |     |

POWER8 (***)
  Distribution        | Kernel* | GCC   | GLIBC | ICC | PGI        | XLC            | CLANG
  RHEL 7.6            | 3.10    | 4.8.5 | 2.17  | NO  | 18.x, 19.x | 13.1.x, 16.1.x | 8.0.0
  Ubuntu 18.04.1      | 4.15.0  | 7.3.0 | 2.27  | NO  | 18.x, 19.x | 13.1.x, 16.1.x | 8.0.0

POWER9 (****)
  Distribution          | Kernel* | GCC   | GLIBC | ICC | PGI        | XLC            | CLANG
  Ubuntu 18.04.1        | 4.15.0  | 7.3.0 | 2.27  | NO  | 18.x, 19.x | 13.1.x, 16.1.x | 8.0.0
  RHEL 7.6 IBM Power LE | 4.14.0  | 4.8.5 | 2.17  | NO  | 18.x, 19.x | 13.1.x, 16.1.x | 8.0.0

(*) For specific kernel versions supported on Red Hat Enterprise Linux, visit https://access.redhat.com/articles/3078. A list of kernel versions, including release dates, for SUSE Linux Enterprise Server is available at https://wiki.microfocus.com/index.php/SUSE/SLES/Kernel_versions.

(**) For Ubuntu LTS on x86-64, both the HWE kernel (e.g. 4.13.x for 16.04.4) and the server LTS kernel (e.g. 4.4.x for 16.04) are supported in CUDA 10.1. Visit https://wiki.ubuntu.com/Kernel/Support for more information.

(***) Only the Tesla GP100 GPU is supported for CUDA 10.1 on POWER8.

(****) Only the Tesla GV100 GPU is supported for CUDA 10.1 on POWER9.

1.2. About This Document

This document is intended for readers familiar with the Linux environment and the compilation of C programs from the command line. You do not need previous experience with CUDA or experience with parallel computation. Note: This guide covers installation only on systems with X Windows installed.

Note: Many commands in this document might require superuser privileges. On most distributions of Linux, this will require you to log in as root. For systems that have enabled the sudo package, use the sudo prefix for all necessary commands.

2. Pre-installation Actions

Some actions must be taken before the CUDA Toolkit and Driver can be installed on Linux:

  • Verify the system has a CUDA-capable GPU.
  • Verify the system is running a supported version of Linux.
  • Verify the system has gcc installed.
  • Verify the system has the correct kernel headers and development packages installed.
  • Download the NVIDIA CUDA Toolkit.
  • Handle conflicting installation methods.

Note: You can override the install-time prerequisite checks by running the installer with the -override flag. Remember that the prerequisites will still be required to use the NVIDIA CUDA Toolkit.

2.1. Verify You Have a CUDA-Capable GPU

To verify that your GPU is CUDA-capable, go to your distribution's equivalent of System Properties, or, from the command line, enter:

$ lspci | grep -i nvidia

If you do not see any settings, update the PCI hardware database that Linux maintains by entering update-pciids (generally found in /sbin) at the command line and rerun the previous lspci command.

If your graphics card is from NVIDIA and it is listed in http://developer.nvidia.com/cuda-gpus, your GPU is CUDA-capable.

The Release Notes for the CUDA Toolkit also contain a list of supported products.

2.2. Verify You Have a Supported Version of Linux

The CUDA Development Tools are only supported on some specific distributions of Linux. These are listed in the CUDA Toolkit release notes.

To determine which distribution and release number you're running, type the following at the command line:

$ uname -m && cat /etc/*release

You should see output similar to the following, modified for your particular system:

x86_64
Red Hat Enterprise Linux Workstation release 6.0 (Santiago)

The x86_64 line indicates you are running on a 64-bit system. The remainder gives information about your distribution.
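On newer, systemd-based distributions the same information is also available in /etc/os-release; the sketch below reads it with plain shell (assuming the file exists, which is not the case on older releases such as RHEL 6, which only provide the /etc/*release files queried above):

```shell
# Read the distribution name and version from /etc/os-release
# (systemd-era systems). NAME and VERSION_ID are standard os-release
# variables; fall back to placeholders if the file is absent.
. /etc/os-release 2>/dev/null
sysdesc="${NAME:-unknown} ${VERSION_ID:-?} on $(uname -m)"
echo "$sysdesc"
```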

2.3. Verify the System Has gcc Installed

The gcc compiler is required for development using the CUDA Toolkit. It is not required for running CUDA applications. It is generally installed as part of the Linux installation, and in most cases the version of gcc installed with a supported version of Linux will work correctly.

To verify the version of gcc installed on your system, type the following on the command line:

$ gcc --version

If an error message displays, you need to install the development tools from your Linux distribution or obtain a version of gcc and its accompanying toolchain from the Web.

2.4. Verify the System has the Correct Kernel Headers and Development Packages Installed

The CUDA Driver requires that the kernel headers and development packages for the running version of the kernel be installed at the time of the driver installation, as well as whenever the driver is rebuilt. For example, if your system is running kernel version 3.17.4-301, the 3.17.4-301 kernel headers and development packages must also be installed.

While the Runfile installation performs no package validation, the RPM and Deb installations of the driver will make an attempt to install the kernel header and development packages if no version of these packages is currently installed. However, it will install the latest version of these packages, which may or may not match the version of the kernel your system is using. Therefore, it is best to manually ensure the correct version of the kernel headers and development packages are installed prior to installing the CUDA Drivers, as well as whenever you change the kernel version.

The version of the kernel your system is running can be found by running the following command:

$ uname -r

This is the version of the kernel headers and development packages that must be installed prior to installing the CUDA Drivers. This command will be used multiple times below to specify the version of the packages to install. Note that below are the common-case scenarios for kernel usage. More advanced cases, such as custom kernel branches, should ensure that their kernel headers and sources match the kernel build they are running.

Note: If you perform a system update which changes the version of the linux kernel being used, make sure to rerun the commands below to ensure you have the correct kernel headers and kernel development packages installed. Otherwise, the CUDA Driver will fail to work with the new kernel.

RHEL/CentOS

The kernel headers and development packages for the currently running kernel can be installed with:

$ sudo yum install kernel-devel-$(uname -r) kernel-headers-$(uname -r)
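As a pre-flight sketch of the version-match rule above, the following compares the running kernel against what rpm reports before installing; the messages are illustrative, not part of any NVIDIA tooling:

```shell
# Check whether headers matching the *running* kernel are already
# installed (RHEL/CentOS package names); otherwise print the exact
# install command to run. If rpm is absent or the query fails, we
# fall through to the install hint.
running="$(uname -r)"
if rpm -q "kernel-devel-$running" "kernel-headers-$running" >/dev/null 2>&1; then
  msg="headers already match running kernel $running"
else
  msg="missing: sudo yum install kernel-devel-$running kernel-headers-$running"
fi
echo "$msg"
```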

Fedora

The kernel headers and development packages for the currently running kernel can be installed with:

$ sudo dnf install kernel-devel-$(uname -r) kernel-headers-$(uname -r)

OpenSUSE/SLES

Use the output of the uname command to determine the running kernel's version and variant:

$ uname -r
3.16.6-2-default

In this example, the version is 3.16.6-2 and the variant is default. The kernel headers and development packages can then be installed with the following command, replacing <variant> and <version> with the variant and version discovered from the previous uname command:

$ sudo zypper install kernel-<variant>-devel=<version>
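The variant and version can also be split off mechanically with shell parameter expansion; a minimal sketch using the example string from the uname output above:

```shell
# Split a kernel release string into <version> and <variant>.
# "3.16.6-2-default" is the example value from the uname output above.
kr="3.16.6-2-default"
variant="${kr##*-}"        # text after the last dash  -> "default"
version="${kr%-$variant}"  # everything before it      -> "3.16.6-2"
echo "sudo zypper install kernel-$variant-devel=$version"
```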

Ubuntu

The kernel headers and development packages for the currently running kernel can be installed with:

$ sudo apt-get install linux-headers-$(uname -r)

2.5. Choose an Installation Method

The CUDA Toolkit can be installed using either of two different installation mechanisms: distribution-specific packages (RPM and Deb packages), or a distribution-independent package (runfile packages). The distribution-independent package has the advantage of working across a wider set of Linux distributions, but does not update the distribution's native package management system. The distribution-specific packages interface with the distribution's native package management system. It is recommended to use the distribution-specific packages, where possible.

Note: Standalone installers are not provided for architectures other than x86_64. For both native and cross development, the toolkit must be installed using the distribution-specific installer. See the CUDA Cross-Platform Installation section for more details.

2.6. Download the NVIDIA CUDA Toolkit

The NVIDIA CUDA Toolkit is available at http://developer.nvidia.com/cuda-downloads.

Choose the platform you are using and download the NVIDIA CUDA Toolkit.

The CUDA Toolkit contains the CUDA driver and tools needed to create, build and run a CUDA application as well as libraries, header files, CUDA samples source code, and other resources.

Download Verification

The download can be verified by comparing the MD5 checksum posted at http://developer.nvidia.com/cuda-downloads/checksums with that of the downloaded file. If either of the checksums differ, the downloaded file is corrupt and needs to be downloaded again.

To calculate the MD5 checksum of the downloaded file, run the following:

$ md5sum <file>
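A sketch of the full verification step, comparing the computed sum against the published one; a small temporary file stands in for the real runfile here, and the expected value would normally be copied from the checksums page:

```shell
# Verify a download against a published MD5 sum. The temporary file
# below stands in for the downloaded runfile; substitute your file and
# the checksum posted on the NVIDIA page.
file="$(mktemp)"
printf 'hello' > "$file"
expected="5d41402abc4b2a76b9719d911017c592"   # normally copied from the checksums page
actual="$(md5sum "$file" | awk '{print $1}')"
if [ "$actual" = "$expected" ]; then
  result="checksum OK"
else
  result="checksum MISMATCH - download the file again"
fi
echo "$result"
rm -f "$file"
```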

2.7. Handle Conflicting Installation Methods

Before installing CUDA, any previous installations that could conflict should be uninstalled. This will not affect systems which have not had CUDA installed previously, or systems where the installation method has been preserved (RPM/Deb vs. Runfile). See the following tables for specifics.

Table 2. CUDA Toolkit Installation Compatibility Matrix

  Installing Toolkit X.Y via | Installed == X.Y, RPM/Deb | Installed == X.Y, run | Installed != X.Y, RPM/Deb | Installed != X.Y, run
  RPM/Deb                    | No Action                 | Uninstall Run         | No Action                 | No Action
  run                        | Uninstall RPM/Deb         | Uninstall Run         | No Action                 | No Action

Table 3. NVIDIA Driver Installation Compatibility Matrix

  Installing Driver X.Y via  | Installed == X.Y, RPM/Deb | Installed == X.Y, run | Installed != X.Y, RPM/Deb | Installed != X.Y, run
  RPM/Deb                    | No Action                 | Uninstall Run         | No Action                 | Uninstall Run
  run                        | Uninstall RPM/Deb         | No Action             | Uninstall RPM/Deb         | No Action

Use the following command to uninstall a Toolkit runfile installation:

$ sudo /usr/local/cuda-X.Y/bin/uninstall_cuda_X.Y.pl

Use the following command to uninstall a Driver runfile installation:

$ sudo /usr/bin/nvidia-uninstall

Use the following commands to uninstall a RPM/Deb installation:

$ sudo yum remove <package_name>                      # Redhat/CentOS
$ sudo dnf remove <package_name>                      # Fedora
$ sudo zypper remove <package_name>                   # OpenSUSE/SLES
$ sudo apt-get --purge remove <package_name>          # Ubuntu
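Before choosing one of the commands above, it can help to detect which installation method left artifacts on the system; a minimal sketch using the default paths named in this section (the labels are illustrative, and the driver check takes precedence if both are found):

```shell
# Detect CUDA uninstall artifacts: the Toolkit runfile uninstaller
# under /usr/local/cuda-X.Y, and the driver runfile uninstaller at
# /usr/bin/nvidia-uninstall.
method="none detected"
for u in /usr/local/cuda-*/bin/uninstall_cuda_*.pl; do
  if [ -e "$u" ]; then
    method="toolkit runfile: sudo $u"
    break
  fi
done
if [ -x /usr/bin/nvidia-uninstall ]; then
  method="driver runfile: sudo /usr/bin/nvidia-uninstall"
fi
echo "$method"
```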

3. Package Manager Installation

Basic instructions can be found in the Quick Start Guide. Read on for more detailed instructions.

3.1. Overview

The Package Manager installation interfaces with your system's package management system. When using RPM or Deb, the downloaded package is a repository package. Such a package only informs the package manager where to find the actual installation packages, but will not install them.

If those packages are available in an online repository, they will be automatically downloaded in a later step. Otherwise, the repository package also installs a local repository containing the installation packages on the system. Whether the repository is available online or installed locally, the installation procedure is identical and made of several steps.

Distribution-specific instructions detail how to install CUDA:

Finally, some helpful package manager capabilities are detailed.

These instructions are for native development only. For cross-platform development, see the CUDA Cross-Platform Environment section.

Note: The package "cuda-core" has been deprecated in CUDA 9.1. Please use "cuda-compiler" instead.

3.2. Redhat/CentOS

  1. Perform the pre-installation actions.
  2. Satisfy third-party package dependency
    • Satisfy DKMS dependency: The NVIDIA driver RPM packages depend on other external packages, such as DKMS and libvdpau. Those packages are only available on third-party repositories, such as EPEL. Any such third-party repositories must be added to the package manager repository database before installing the NVIDIA driver RPM packages, or missing dependencies will prevent the installation from proceeding.
    • Enable optional repos:

      On RHEL 7 Linux only, execute the following steps to enable optional repositories.

      • On x86_64 workstation:
        $ subscription-manager repos --enable=rhel-7-workstation-optional-rpms
      • On POWER9 system:
        $ subscription-manager repos --enable=rhel-7-for-power-9-optional-rpms
      • On x86_64 server:
        $ subscription-manager repos --enable=rhel-7-server-optional-rpms 
  3. Address custom xorg.conf, if applicable

    The driver relies on an automatically generated xorg.conf file at /etc/X11/xorg.conf. If a custom-built xorg.conf file is present, this functionality will be disabled and the driver may not work. You can try removing the existing xorg.conf file, or adding the contents of /etc/X11/xorg.conf.d/00-nvidia.conf to the xorg.conf file. The xorg.conf file will most likely need manual tweaking for systems with a non-trivial GPU configuration.

  4. Install repository meta-data
    $ sudo rpm --install cuda-repo-<distro>-<version>.<architecture>.rpm
  5. Clean Yum repository cache
    $ sudo yum clean expire-cache
  6. Install CUDA
    $ sudo yum install cuda
    If the i686 libvdpau package dependency fails to install, try using the following steps to fix the issue:
    $ yumdownloader libvdpau.i686
    $ sudo rpm -U --oldpackage libvdpau*.rpm
  7. Add libcuda.so symbolic link, if necessary

    The libcuda.so library is installed in the /usr/lib{,64}/nvidia directory. For pre-existing projects which use libcuda.so, it may be useful to add a symbolic link from libcuda.so in the /usr/lib{,64} directory.

  8. Perform the post-installation actions.
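Step 7's symbolic link can be sketched as follows; the commands are demonstrated in a temporary directory, while on a real system the source would be the nvidia library directory and the link would be created one level up (with sudo):

```shell
# Create a libcuda.so symlink pointing at the driver's copy. A temporary
# directory stands in for /usr/lib64 here so the sketch is side-effect free.
root="$(mktemp -d)"
mkdir -p "$root/lib64/nvidia"
touch "$root/lib64/nvidia/libcuda.so"           # stand-in for the installed library
ln -s "$root/lib64/nvidia/libcuda.so" "$root/lib64/libcuda.so"
link_target="$(readlink "$root/lib64/libcuda.so")"
echo "$link_target"
rm -rf "$root"
```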

3.3. Fedora

  1. Perform the pre-installation actions.
  2. Address custom xorg.conf, if applicable

    The driver relies on an automatically generated xorg.conf file at /etc/X11/xorg.conf. If a custom-built xorg.conf file is present, this functionality will be disabled and the driver may not work. You can try removing the existing xorg.conf file, or adding the contents of /etc/X11/xorg.conf.d/00-nvidia.conf to the xorg.conf file. The xorg.conf file will most likely need manual tweaking for systems with a non-trivial GPU configuration.

  3. Satisfy Akmods dependency

    The NVIDIA driver RPM packages depend on the Akmods framework which is provided by the RPMFusion free repository. The RPMFusion free repository must be added to the package manager repository database before installing the NVIDIA driver RPM packages, or missing dependencies will prevent the installation from proceeding.

  4. Install repository meta-data
    $ sudo rpm --install cuda-repo-<distro>-<version>.<architecture>.rpm
  5. Clean DNF repository cache
    $ sudo dnf clean expire-cache
  6. Install CUDA
    $ sudo dnf install cuda
    The CUDA driver installation may fail if the RPMFusion non-free repository is enabled. In this case, CUDA installations should temporarily disable the RPMFusion non-free repository:
    $ sudo dnf --disablerepo="rpmfusion-nonfree*" install cuda
    If a system has installed both packages with the same instance of dnf, some driver components may be missing. Such an installation can be corrected by running:
    $ sudo dnf install cuda-drivers
    If the i686 libvdpau package dependency fails to install, try using the following steps to fix the issue:
    $ dnf download libvdpau.i686
    $ sudo rpm -U --oldpackage libvdpau*.rpm
    It may be necessary to rebuild the grub configuration files, particularly if you use a non-default partition scheme. If so, run the following command and reboot the system:
    $ sudo grub2-mkconfig -o /boot/grub2/grub.cfg
    Remember to reboot the system.
  7. Add libcuda.so symbolic link, if necessary

    The libcuda.so library is installed in the /usr/lib{,64}/nvidia directory. For pre-existing projects which use libcuda.so, it may be useful to add a symbolic link from libcuda.so in the /usr/lib{,64} directory.

  8. Perform the post-installation actions.

3.4. SLES

  1. Perform the pre-installation actions.
  2. On SLES12 SP4, install the Mesa-libgl-devel Linux packages before proceeding. See Mesa-libGL-devel.
  3. Install repository meta-data
    $ sudo rpm --install cuda-repo-<distro>-<version>.<architecture>.rpm
  4. Refresh Zypper repository cache
    $ sudo zypper refresh
  5. Install CUDA
    $ sudo zypper install cuda
  6. Add the user to the video group
    $ sudo usermod -a -G video <username>
  7. Install CUDA Samples GL dependencies

    The CUDA Samples package on SLES does not include dependencies on GL and X11 libraries as these are provided in the SLES SDK. These packages must be installed separately, depending on which samples you want to use.

  8. Perform the post-installation actions.

3.5. OpenSUSE

  1. Perform the pre-installation actions.
  2. Install repository meta-data
    $ sudo rpm --install cuda-repo-<distro>-<version>.<architecture>.rpm
  3. Refresh Zypper repository cache
    $ sudo zypper refresh
  4. Install CUDA
    $ sudo zypper install cuda
  5. Add the user to the video group
    $ sudo usermod -a -G video <username>
  6. Perform the post-installation actions.

3.6. Ubuntu

  1. Perform the pre-installation actions.
  2. Install repository meta-data
    $ sudo dpkg -i cuda-repo-<distro>_<version>_<architecture>.deb
  3. Install the CUDA public GPG key

    When installing using the local repo:

    $ sudo apt-key add /var/cuda-repo-<version>/7fa2af80.pub

    When installing using network repo on Ubuntu 18.04/18.10:

    $ sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/<distro>/<architecture>/7fa2af80.pub

    When installing using network repo on Ubuntu 16.04:

    $ sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/<distro>/<architecture>/7fa2af80.pub
  4. Update the Apt repository cache
    $ sudo apt-get update
  5. Install CUDA
    $ sudo apt-get install cuda
  6. Perform the post-installation actions.

3.7. Additional Package Manager Capabilities

Below are some additional capabilities of the package manager that users can take advantage of.

3.7.1. Available Packages

The recommended installation package is the cuda package. This package will install the full set of other CUDA packages required for native development and should cover most scenarios.

The cuda package installs all the available packages for native development. That includes the compiler, the debugger, the profiler, and the math libraries. For x86_64 platforms, this also includes Nsight Eclipse Edition and the Visual Profiler. It also includes the NVIDIA driver package.

On supported platforms, the cuda-cross-armhf, cuda-cross-aarch64, and cuda-cross-ppc64el packages install all the packages required for cross-platform development to ARMv7, ARMv8, and POWER8, respectively. The libraries and header files of the target architecture's display driver package are also installed to enable the cross compilation of driver applications. The cuda-cross-<arch> packages do not install the native display driver.

The packages installed by the packages above can also be installed individually by specifying their names explicitly. The list of available packages can be obtained with:

$ yum --disablerepo="*" --enablerepo="cuda*" list available    # RedHat
$ dnf --disablerepo="*" --enablerepo="cuda*" list available    # Fedora
$ zypper packages -r cuda                                      # OpenSUSE & SLES
$ cat /var/lib/apt/lists/*cuda*Packages | grep "Package:"      # Ubuntu

3.7.2. Package Upgrades

The cuda package points to the latest stable release of the CUDA Toolkit. When a new version is available, use the following commands to upgrade the toolkit and driver:

$ sudo yum install cuda                                        # RedHat
$ sudo dnf install cuda                                        # Fedora
$ sudo zypper install cuda                                     # OpenSUSE & SLES
$ sudo apt-get install cuda                                    # Ubuntu

The cuda-cross-<arch> packages can also be upgraded in the same manner.

The cuda-drivers package points to the latest driver release available in the CUDA repository. When a new version is available, use the following commands to upgrade the driver:

$ sudo yum install cuda-drivers                                # RedHat
$ sudo dnf install cuda-drivers                                # Fedora
$ sudo zypper install cuda-drivers \
                      nvidia-gfxG04-kmp-default                # OpenSUSE & SLES
$ sudo apt-get install cuda-drivers                            # Ubuntu

Some desktop environments, such as GNOME or KDE, will display a notification alert when new packages are available.

To avoid any automatic upgrade, and lock down the toolkit installation to the X.Y release, install the cuda-X-Y or cuda-cross-<arch>-X-Y package.

Side-by-side installations are supported. For instance, to install both the X.Y CUDA Toolkit and the X.Y+1 CUDA Toolkit, install the cuda-X.Y and cuda-X.Y+1 packages.
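With side-by-side installations, the /usr/local/cuda symbolic link selects which toolkit a project sees; repointing it can be sketched as below (demonstrated in a temporary directory rather than /usr/local, where the same ln command would need sudo):

```shell
# Repoint the "cuda" symlink from one installed toolkit to another.
# -s symbolic, -f replace an existing link, -n do not follow it.
root="$(mktemp -d)"
mkdir "$root/cuda-10.0" "$root/cuda-10.1"
ln -sfn "$root/cuda-10.1" "$root/cuda"   # projects keep using $root/cuda
target="$(readlink "$root/cuda")"
echo "$target"
rm -rf "$root"
```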

3.7.3. Meta Packages

Meta packages are RPM/Deb packages which contain no (or few) files but have multiple dependencies. They are used to install many CUDA packages when you may not know the details of the packages you want. Below is the list of meta packages.

Table 4. Meta Packages Available for CUDA 10.1

  Meta Package            | Purpose
  cuda                    | Installs all CUDA Toolkit and Driver packages. Handles upgrading to the next version of the cuda package when it's released.
  cuda-10-1               | Installs all CUDA Toolkit and Driver packages. Remains at version 10.1 until an additional version of CUDA is installed.
  cuda-toolkit-10-1       | Installs all CUDA Toolkit packages required to develop CUDA applications. Does not include the driver.
  cuda-tools-10-1         | Installs all CUDA command line and visual tools.
  cuda-runtime-10-1       | Installs all CUDA Toolkit packages required to run CUDA applications, as well as the Driver packages.
  cuda-compiler-10-1      | Installs all CUDA compiler packages.
  cuda-libraries-10-1     | Installs all runtime CUDA Library packages.
  cuda-libraries-dev-10-1 | Installs all development CUDA Library packages.
  cuda-drivers            | Installs all Driver packages. Handles upgrading to the next version of the Driver packages when they're released.

4. Runfile Installation

Basic instructions can be found in the Quick Start Guide. Read on for more detailed instructions.

This section describes the installation and configuration of CUDA when using the standalone installer. The standalone installer is a ".run" file and is completely self-contained.

4.1. Overview

The Runfile installation installs the NVIDIA Driver, the CUDA Toolkit, and CUDA Samples, via an interactive ncurses-based interface.

The installation steps are listed below. Distribution-specific instructions for disabling the Nouveau drivers, and the steps for verifying device node creation, are also provided.

Finally, the advanced options for the installer and the uninstallation steps are detailed below.

The Runfile installation does not include support for cross-platform development. For cross-platform development, see the CUDA Cross-Platform Environment section.

4.2. Installation

  1. Perform the pre-installation actions.

  2. Disable the Nouveau drivers.

  3. Reboot into text mode (runlevel 3).

    This can usually be accomplished by adding the number "3" to the end of the system's kernel boot parameters.

    Since the NVIDIA drivers are not yet installed, the text terminals may not display correctly. Temporarily adding "nomodeset" to the system's kernel boot parameters may fix this issue.

    Consult your system's bootloader documentation for information on how to make the above boot parameter changes.

    The reboot is required to completely unload the Nouveau drivers and prevent the graphical interface from loading. The CUDA driver cannot be installed while the Nouveau drivers are loaded or while the graphical interface is active.

  4. Verify that the Nouveau drivers are not loaded. If the Nouveau drivers are still loaded, consult your distribution's documentation to see if further steps are needed to disable Nouveau.

  5. Run the installer and follow the on-screen prompts:
    $ sudo sh cuda_<version>_linux.run

    See Installer UI for navigating the ncurses-based installer UI.

    As of CUDA 10.1 some libraries will be installed in the system standard locations rather than in the Toolkit installation directory. Depending on your distribution, the installed location can be /usr/lib/x86_64-linux-gnu, /usr/lib64, or /usr/lib. See the Advanced Options section for how to change this location.

    The default installation locations for the toolkit and samples are:
      Component    | Default Installation Directory
      CUDA Toolkit | /usr/local/cuda-10.1
      CUDA Samples | $(HOME)/NVIDIA_CUDA-10.1_Samples

    The /usr/local/cuda symbolic link points to the location where the CUDA Toolkit was installed. This link allows projects to use the latest CUDA Toolkit without any configuration file update.

    The installer must be executed with sufficient privileges to perform some actions. When the current privileges are insufficient to perform an action, the installer will ask for the user's password to attempt to install with root privileges. Actions that cause the installer to attempt to install with root privileges are:
    • installing the CUDA Driver
    • installing the CUDA Toolkit to a location the user does not have permission to write to
    • installing the CUDA Samples to a location the user does not have permission to write to
    • creating the /usr/local/cuda symbolic link

    Running the installer with sudo, as shown above, will give permission to install to directories that require root permissions. Directories and files created while running the installer with sudo will have root ownership.

    If installing the driver, the installer will also ask if the OpenGL libraries should be installed. If the GPU used for display is not an NVIDIA GPU, the NVIDIA OpenGL libraries should not be installed. Otherwise, the OpenGL libraries used by the graphics driver of the non-NVIDIA GPU will be overwritten and the GUI will not work. If performing a silent installation, the --no-opengl-libs option should be used to prevent the OpenGL libraries from being installed. See the Advanced Options section for more details.

    If the GPU used for display is an NVIDIA GPU, the X server configuration file, /etc/X11/xorg.conf, may need to be modified. In some cases, nvidia-xconfig can be used to automatically generate a xorg.conf file that works for the system. For non-standard systems, such as those with more than one GPU, it is recommended to manually edit the xorg.conf file. Consult the xorg.conf documentation for more information.

    Note: Installing Mesa may overwrite the /usr/lib/libGL.so that was previously installed by the NVIDIA driver, so a reinstallation of the NVIDIA driver might be required after installing these libraries.

  6. Reboot the system to reload the graphical interface.

  7. Verify the device nodes are created properly.

  8. Perform the post-installation actions.

4.3. Installer UI

The installer UI has three main states:

  1. EULA Acceptance.
    1. Scroll through the EULA using the arrow keys, the page up/down keys, or a scroll wheel.
  2. Component Selection.
    1. Navigate the menu using the arrow keys. The left/right keys will expand/collapse sub-elements.
    2. Select or deselect items to install by pressing the spacebar or enter key with the cursor on that item.
    3. With the cursor over an item with advanced options available, press 'A' to see that options menu. This is currently available for CUDA Toolkit and CUDA Samples items only.
  3. Advanced Options.
    1. Options such as setting the install path for a specific component are available here.

4.4. Disabling Nouveau

To install the Display Driver, the Nouveau drivers must first be disabled. Each distribution of Linux has a different method for disabling Nouveau.

The Nouveau drivers are loaded if the following command prints anything:

$ lsmod | grep nouveau

4.4.1. Fedora

  1. Create a file at /usr/lib/modprobe.d/blacklist-nouveau.conf with the following contents:
    blacklist nouveau
    options nouveau modeset=0
  2. Regenerate the kernel initramfs:
    $ sudo dracut --force
  3. Run the below command:
    $ sudo grub2-mkconfig -o /boot/grub2/grub.cfg
  4. Reboot the system.

4.4.2. RHEL/CentOS

  1. Create a file at /etc/modprobe.d/blacklist-nouveau.conf with the following contents:
    blacklist nouveau
    options nouveau modeset=0
  2. Regenerate the kernel initramfs:
    $ sudo dracut --force

4.4.3. OpenSUSE

  1. Create a file at /etc/modprobe.d/blacklist-nouveau.conf with the following contents:
    blacklist nouveau
    options nouveau modeset=0
  2. Regenerate the kernel initrd:
    $ sudo /sbin/mkinitrd

4.4.4. SLES

No actions to disable Nouveau are required as Nouveau is not installed on SLES.

4.4.5. Ubuntu

  1. Create a file at /etc/modprobe.d/blacklist-nouveau.conf with the following contents:
    blacklist nouveau
    options nouveau modeset=0
  2. Regenerate the kernel initramfs:
    $ sudo update-initramfs -u

4.5. Device Node Verification

Check that the device files /dev/nvidia* exist and have the correct (0666) file permissions. These files are used by the CUDA Driver to communicate with the kernel-mode portion of the NVIDIA Driver. Applications that use the NVIDIA driver, such as a CUDA application or the X server (if any), will normally automatically create these files if they are missing using the setuid nvidia-modprobe tool that is bundled with the NVIDIA Driver. However, some systems disallow setuid binaries, so if these files do not exist, you can create them manually by using a startup script such as the one below:

#!/bin/bash

/sbin/modprobe nvidia

if [ "$?" -eq 0 ]; then
  # Count the number of NVIDIA controllers found.
  NVDEVS=`lspci | grep -i NVIDIA`
  N3D=`echo "$NVDEVS" | grep "3D controller" | wc -l`
  NVGA=`echo "$NVDEVS" | grep "VGA compatible controller" | wc -l`

  N=`expr $N3D + $NVGA - 1`
  for i in `seq 0 $N`; do
    mknod -m 666 /dev/nvidia$i c 195 $i
  done

  mknod -m 666 /dev/nvidiactl c 195 255

else
  exit 1
fi

/sbin/modprobe nvidia-uvm

if [ "$?" -eq 0 ]; then
  # Find out the major device number used by the nvidia-uvm driver
  D=`grep nvidia-uvm /proc/devices | awk '{print $1}'`

  mknod -m 666 /dev/nvidia-uvm c $D 0
else
  exit 1
fi
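The final step of the script parses /proc/devices for the major number assigned to nvidia-uvm. That extraction can be sketched against sample data (the major number 510 below is illustrative only; the kernel assigns it dynamically):

```shell
# Sample lines in /proc/devices format; on a real system read /proc/devices.
sample='195 nvidia
510 nvidia-uvm'

# First field of the matching line is the major device number.
D=$(printf '%s\n' "$sample" | awk '/nvidia-uvm/ {print $1}')
echo "nvidia-uvm major number: $D"
# The device node would then be created with: mknod -m 666 /dev/nvidia-uvm c $D 0
```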

4.6. Advanced Options

Silent Installation: --silent
    Required for any silent installation. Performs an installation with no further user input and minimal command-line output based on the options provided below. Silent installations are useful for scripting the installation of CUDA. Using this option implies acceptance of the EULA. The following flags can be used to customize the actions taken during installation. At least one of --driver, --uninstall, --toolkit, and --samples must be passed if running with non-root permissions.
    --driver: Install the CUDA Driver.
    --toolkit: Install the CUDA Toolkit.
    --toolkitpath=<path>: Install the CUDA Toolkit to the <path> directory. If not provided, the default path of /usr/local/cuda-10.1 is used.
    --samples: Install the CUDA Samples.
    --samplespath=<path>: Install the CUDA Samples to the <path> directory. If not provided, the default path of $(HOME)/NVIDIA_CUDA-10.1_Samples is used.
    --defaultroot=<path>: Install libraries to the <path> directory. If <path> is not provided, the default path of your distribution is used. This only applies to the libraries installed outside of the CUDA Toolkit path.

Extraction: --extract=<path>
    Extracts the standalone driver runfile and the raw files of the toolkit and samples to <path>. This is especially useful when one wants to install the driver using one or more of the command-line options provided by the driver installer which are not exposed in this installer.

Overriding Installation Checks: --override
    Ignores compiler, third-party library, and toolkit detection checks which would prevent the CUDA Toolkit and CUDA Samples from installing.

No OpenGL Libraries: --no-opengl-libs
    Prevents the driver installation from installing NVIDIA's GL libraries. Useful for systems where the display is driven by a non-NVIDIA GPU. In such systems, NVIDIA's GL libraries could prevent X from loading properly.

No Man Pages: --no-man-page
    Do not install the man pages under /usr/share/man.

Overriding Kernel Source: --kernel-source-path=<path>
    Tells the driver installation to use <path> as the kernel source directory when building the NVIDIA kernel module. Required for systems where the kernel source is installed to a non-standard location.

Running nvidia-xconfig: --run-nvidia-xconfig
    Tells the driver installation to run nvidia-xconfig to update the system X configuration file so that the NVIDIA X driver is used. The pre-existing X configuration file will be backed up.

No nvidia-drm Kernel Module: --no-drm
    Do not install the nvidia-drm kernel module. This option should only be used to work around failures to build or install the nvidia-drm kernel module on systems that do not need the provided features.

Custom Temporary Directory Selection: --tmpdir=<path>
    Performs any temporary actions within <path> instead of /tmp. Useful in cases where /tmp cannot be used (doesn't exist, is full, is mounted with 'noexec', etc.).

Show Installer Options: --help
    Prints the list of command-line options to stdout.
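As an example, a scripted toolkit-only installation might be invoked as sketched below. The runfile name and install path are placeholders, not the actual installer filename; substitute the runfile you downloaded. The command is shown as a dry run since it requires root and the installer itself:

```shell
# Hypothetical silent-installation command; --silent implies EULA acceptance.
installer=cuda_10.1_linux.run            # placeholder runfile name
args='--silent --toolkit --toolkitpath=/usr/local/cuda-10.1 --no-man-page'
echo "sudo sh $installer $args"          # dry run: remove the echo to execute
```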

4.7. Uninstallation

To uninstall the CUDA Toolkit, run the uninstallation script provided in the bin directory of the toolkit. By default, it is located in /usr/local/cuda-10.1/bin:

$ sudo /usr/local/cuda-10.1/bin/cuda-uninstaller

To uninstall the NVIDIA Driver, run nvidia-uninstall:

$ sudo /usr/bin/nvidia-uninstall

To enable the Nouveau drivers, remove the blacklist file created in the Disabling Nouveau section, and regenerate the kernel initramfs/initrd again as described in that section.

5. Cluster Management Packages

5.1. Overview

Cluster management packages are provided as an alternative set of RPM and Deb packages intended to be used by deployment management tools as standalone packages. These packages are available for RHEL 6, RHEL 7, Ubuntu 14.04, and Ubuntu 16.04 on the x86_64 architecture. There are three parts to the cluster management packages: the CUDA toolkit packages, the NVIDIA driver packages, and the README.

The cluster management toolkit packages are split into a runtime package, cuda-cluster-runtime-10-1, and a development package, cuda-cluster-devel-10-1.
