Cuda: Handle Conflicting Installation Methods

Posted by This is bill

Editor's note: This article was compiled by the cha138.com editors. It introduces Cuda: Handle Conflicting Installation Methods; hopefully it is of some reference value to you.

Completely uninstall CUDA: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#handle-uninstallation

 

NVIDIA CUDA Installation Guide for Linux

1. Introduction

CUDA® is a parallel computing platform and programming model invented by NVIDIA. It enables dramatic increases in computing performance by harnessing the power of the graphics processing unit (GPU).

CUDA was developed with several design goals in mind:

  • Provide a small set of extensions to standard programming languages, like C, that enable a straightforward implementation of parallel algorithms. With CUDA C/C++, programmers can focus on the task of parallelization of the algorithms rather than spending time on their implementation.
  • Support heterogeneous computation where applications use both the CPU and GPU. Serial portions of applications are run on the CPU, and parallel portions are offloaded to the GPU. As such, CUDA can be incrementally applied to existing applications. The CPU and GPU are treated as separate devices that have their own memory spaces. This configuration also allows simultaneous computation on the CPU and GPU without contention for memory resources.

CUDA-capable GPUs have hundreds of cores that can collectively run thousands of computing threads. These cores have shared resources including a register file and a shared memory. The on-chip shared memory allows parallel tasks running on these cores to share data without sending it over the system memory bus.

This guide will show you how to install and check the correct operation of the CUDA development tools.

1.1. System Requirements

To use CUDA on your system, you will need the following installed:

  • A CUDA-capable GPU
  • A supported version of Linux with a gcc compiler and toolchain
  • The NVIDIA CUDA Toolkit (available at http://developer.nvidia.com/cuda-downloads)

The CUDA development environment relies on tight integration with the host development environment, including the host compiler and C runtime libraries, and is therefore only supported on distribution versions that have been qualified for this CUDA Toolkit release.

Table 1. Native Linux Distribution Support in CUDA 10.1 Update 1
(Blank cells share the value shown in the first row of their group, as in the original merged table.)

x86_64
  Distribution        | Kernel* | GCC   | GLIBC | ICC  | PGI        | XLC | CLANG
  RHEL 7.6            | 3.10    | 4.8.5 | 2.17  | 19.0 | 18.x, 19.x | NO  | 8.0.0
  RHEL 6.10           | 2.6.32  | 4.4.7 | 2.12  |      |            |     |
  CentOS 7.6          | 3.10    | 4.8.5 | 2.17  |      |            |     |
  CentOS 6.10         | 2.6.32  | 4.4.7 | 2.12  |      |            |     |
  Fedora 29           | 4.16    | 8.0.1 | 2.27  |      |            |     |
  OpenSUSE Leap 15.0  | 4.15.0  | 7.3.1 | 2.26  |      |            |     |
  SLES 15.0           | 4.12.14 | 7.2.1 | 2.26  |      |            |     |
  SLES 12.4           | 4.12.14 | 4.8.5 | 2.22  |      |            |     |
  Ubuntu 18.10        | 4.18.0  | 8.2.0 | 2.28  |      |            |     |
  Ubuntu 18.04.2 (**) | 4.15.0  | 7.3.0 | 2.27  |      |            |     |
  Ubuntu 16.04.6 (**) | 4.4     | 5.4.0 | 2.23  |      |            |     |
  Ubuntu 14.04.6 (**) | 3.13    | 4.8.4 | 2.19  |      |            |     |

POWER8 (***)
  Distribution        | Kernel* | GCC   | GLIBC | ICC | PGI        | XLC            | CLANG
  RHEL 7.6            | 3.10    | 4.8.5 | 2.17  | NO  | 18.x, 19.x | 13.1.x, 16.1.x | 8.0.0
  Ubuntu 18.04.1      | 4.15.0  | 7.3.0 | 2.27  | NO  | 18.x, 19.x | 13.1.x, 16.1.x | 8.0.0

POWER9 (****)
  Distribution          | Kernel* | GCC   | GLIBC | ICC | PGI        | XLC            | CLANG
  Ubuntu 18.04.1        | 4.15.0  | 7.3.0 | 2.27  | NO  | 18.x, 19.x | 13.1.x, 16.1.x | 8.0.0
  RHEL 7.6 IBM Power LE | 4.14.0  | 4.8.5 | 2.17  | NO  | 18.x, 19.x | 13.1.x, 16.1.x | 8.0.0

(*) For specific kernel versions supported on Red Hat Enterprise Linux, visit https://access.redhat.com/articles/3078. A list of kernel versions, including release dates, for SUSE Linux Enterprise Server is available at https://wiki.microfocus.com/index.php/SUSE/SLES/Kernel_versions.

(**) For Ubuntu LTS on x86-64, both the HWE kernel (e.g. 4.13.x for 16.04.4) and the server LTS kernel (e.g. 4.4.x for 16.04) are supported in CUDA 10.1. Visit https://wiki.ubuntu.com/Kernel/Support for more information.

(***) Only the Tesla GP100 GPU is supported for CUDA 10.1 on POWER8.

(****) Only the Tesla GV100 GPU is supported for CUDA 10.1 on POWER9.

1.2. About This Document

This document is intended for readers familiar with the Linux environment and the compilation of C programs from the command line. You do not need previous experience with CUDA or experience with parallel computation. Note: This guide covers installation only on systems with X Windows installed.

Note: Many commands in this document might require superuser privileges. On most distributions of Linux, this will require you to log in as root. For systems that have enabled the sudo package, use the sudo prefix for all necessary commands.

2. Pre-installation Actions

Some actions must be taken before the CUDA Toolkit and Driver can be installed on Linux:

  • Verify the system has a CUDA-capable GPU.
  • Verify the system is running a supported version of Linux.
  • Verify the system has gcc installed.
  • Verify the system has the correct kernel headers and development packages installed.
  • Download the NVIDIA CUDA Toolkit.
  • Handle conflicting installation methods.

Note: You can override the install-time prerequisite checks by running the installer with the -override flag. Remember that the prerequisites will still be required to use the NVIDIA CUDA Toolkit.

2.1. Verify You Have a CUDA-Capable GPU

To verify that your GPU is CUDA-capable, go to your distribution's equivalent of System Properties, or, from the command line, enter:

$ lspci | grep -i nvidia

If you do not see any settings, update the PCI hardware database that Linux maintains by entering update-pciids (generally found in /sbin) at the command line and rerun the previous lspci command.

If your graphics card is from NVIDIA and it is listed in http://developer.nvidia.com/cuda-gpus, your GPU is CUDA-capable.

The Release Notes for the CUDA Toolkit also contain a list of supported products.

2.2. Verify You Have a Supported Version of Linux

The CUDA Development Tools are only supported on some specific distributions of Linux. These are listed in the CUDA Toolkit release notes.

To determine which distribution and release number you're running, type the following at the command line:

$ uname -m && cat /etc/*release

You should see output similar to the following, modified for your particular system:

x86_64
Red Hat Enterprise Linux Workstation release 6.0 (Santiago)

The x86_64 line indicates you are running on a 64-bit system. The remainder gives information about your distribution.
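On newer, systemd-based distributions the same information is also available in /etc/os-release; the sketch below reads it with plain shell (assuming the file exists, which is not the case on older releases such as RHEL 6, which only provide the /etc/*release files queried above):

```shell
# Read the distribution name and version from /etc/os-release
# (systemd-era systems). NAME and VERSION_ID are standard os-release
# variables; fall back to placeholders if the file is absent.
. /etc/os-release 2>/dev/null
sysdesc="${NAME:-unknown} ${VERSION_ID:-?} on $(uname -m)"
echo "$sysdesc"
```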

2.3. Verify the System Has gcc Installed

The gcc compiler is required for development using the CUDA Toolkit. It is not required for running CUDA applications. It is generally installed as part of the Linux installation, and in most cases the version of gcc installed with a supported version of Linux will work correctly.

To verify the version of gcc installed on your system, type the following on the command line:

$ gcc --version

If an error message displays, you need to install the development tools from your Linux distribution or obtain a version of gcc and its accompanying toolchain from the Web.

2.4. Verify the System has the Correct Kernel Headers and Development Packages Installed

The CUDA Driver requires that the kernel headers and development packages for the running version of the kernel be installed at the time of the driver installation, as well as whenever the driver is rebuilt. For example, if your system is running kernel version 3.17.4-301, the 3.17.4-301 kernel headers and development packages must also be installed.

While the Runfile installation performs no package validation, the RPM and Deb installations of the driver will make an attempt to install the kernel header and development packages if no version of these packages is currently installed. However, it will install the latest version of these packages, which may or may not match the version of the kernel your system is using. Therefore, it is best to manually ensure the correct version of the kernel headers and development packages are installed prior to installing the CUDA Drivers, as well as whenever you change the kernel version.

The version of the kernel your system is running can be found by running the following command:

$ uname -r

This is the version of the kernel headers and development packages that must be installed prior to installing the CUDA Drivers. This command will be used multiple times below to specify the version of the packages to install. Note that below are the common-case scenarios for kernel usage. More advanced cases, such as custom kernel branches, should ensure that their kernel headers and sources match the kernel build they are running.

Note: If you perform a system update which changes the version of the linux kernel being used, make sure to rerun the commands below to ensure you have the correct kernel headers and kernel development packages installed. Otherwise, the CUDA Driver will fail to work with the new kernel.

RHEL/CentOS

The kernel headers and development packages for the currently running kernel can be installed with:

$ sudo yum install kernel-devel-$(uname -r) kernel-headers-$(uname -r)
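As a pre-flight sketch of the version-match rule above, the following compares the running kernel against what rpm reports before installing; the messages are illustrative, not part of any NVIDIA tooling:

```shell
# Check whether headers matching the *running* kernel are already
# installed (RHEL/CentOS package names); otherwise print the exact
# install command to run. If rpm is absent or the query fails, we
# fall through to the install hint.
running="$(uname -r)"
if rpm -q "kernel-devel-$running" "kernel-headers-$running" >/dev/null 2>&1; then
  msg="headers already match running kernel $running"
else
  msg="missing: sudo yum install kernel-devel-$running kernel-headers-$running"
fi
echo "$msg"
```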

Fedora

The kernel headers and development packages for the currently running kernel can be installed with:

$ sudo dnf install kernel-devel-$(uname -r) kernel-headers-$(uname -r)

OpenSUSE/SLES

Use the output of the uname command to determine the running kernel's version and variant:

$ uname -r
3.16.6-2-default

In this example, the version is 3.16.6-2 and the variant is default. The kernel headers and development packages can then be installed with the following command, replacing <variant> and <version> with the variant and version discovered from the previous uname command:

$ sudo zypper install kernel-<variant>-devel=<version>
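The variant and version can also be split off mechanically with shell parameter expansion; a minimal sketch using the example string from the uname output above:

```shell
# Split a kernel release string into <version> and <variant>.
# "3.16.6-2-default" is the example value from the uname output above.
kr="3.16.6-2-default"
variant="${kr##*-}"        # text after the last dash  -> "default"
version="${kr%-$variant}"  # everything before it      -> "3.16.6-2"
echo "sudo zypper install kernel-$variant-devel=$version"
```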

Ubuntu

The kernel headers and development packages for the currently running kernel can be installed with:

$ sudo apt-get install linux-headers-$(uname -r)

2.5. Choose an Installation Method

The CUDA Toolkit can be installed using either of two different installation mechanisms: distribution-specific packages (RPM and Deb packages), or a distribution-independent package (runfile packages). The distribution-independent package has the advantage of working across a wider set of Linux distributions, but does not update the distribution's native package management system. The distribution-specific packages interface with the distribution's native package management system. It is recommended to use the distribution-specific packages, where possible.

Note: Standalone installers are not provided for architectures other than x86_64. For both native and cross development, the toolkit must be installed using the distribution-specific installer. See the CUDA Cross-Platform Installation section for more details.

2.6. Download the NVIDIA CUDA Toolkit

The NVIDIA CUDA Toolkit is available at http://developer.nvidia.com/cuda-downloads.

Choose the platform you are using and download the NVIDIA CUDA Toolkit.

The CUDA Toolkit contains the CUDA driver and tools needed to create, build and run a CUDA application as well as libraries, header files, CUDA samples source code, and other resources.

Download Verification

The download can be verified by comparing the MD5 checksum posted at http://developer.nvidia.com/cuda-downloads/checksums with that of the downloaded file. If either of the checksums differ, the downloaded file is corrupt and needs to be downloaded again.

To calculate the MD5 checksum of the downloaded file, run the following:

$ md5sum <file>
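A sketch of the full verification step, comparing the computed sum against the published one; a small temporary file stands in for the real runfile here, and the expected value would normally be copied from the checksums page:

```shell
# Verify a download against a published MD5 sum. The temporary file
# below stands in for the downloaded runfile; substitute your file and
# the checksum posted on the NVIDIA page.
file="$(mktemp)"
printf 'hello' > "$file"
expected="5d41402abc4b2a76b9719d911017c592"   # normally copied from the checksums page
actual="$(md5sum "$file" | awk '{print $1}')"
if [ "$actual" = "$expected" ]; then
  result="checksum OK"
else
  result="checksum MISMATCH - download the file again"
fi
echo "$result"
rm -f "$file"
```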

2.7. Handle Conflicting Installation Methods

Before installing CUDA, any previous installations that could conflict should be uninstalled. This will not affect systems which have not had CUDA installed previously, or systems where the installation method has been preserved (RPM/Deb vs. Runfile). See the following tables for specifics.

Table 2. CUDA Toolkit Installation Compatibility Matrix

  Installing Toolkit X.Y via | Installed == X.Y, RPM/Deb | Installed == X.Y, run | Installed != X.Y, RPM/Deb | Installed != X.Y, run
  RPM/Deb                    | No Action                 | Uninstall Run         | No Action                 | No Action
  run                        | Uninstall RPM/Deb         | Uninstall Run         | No Action                 | No Action

Table 3. NVIDIA Driver Installation Compatibility Matrix

  Installing Driver X.Y via  | Installed == X.Y, RPM/Deb | Installed == X.Y, run | Installed != X.Y, RPM/Deb | Installed != X.Y, run
  RPM/Deb                    | No Action                 | Uninstall Run         | No Action                 | Uninstall Run
  run                        | Uninstall RPM/Deb         | No Action             | Uninstall RPM/Deb         | No Action

Use the following command to uninstall a Toolkit runfile installation:

$ sudo /usr/local/cuda-X.Y/bin/uninstall_cuda_X.Y.pl

Use the following command to uninstall a Driver runfile installation:

$ sudo /usr/bin/nvidia-uninstall

Use the following commands to uninstall a RPM/Deb installation:

$ sudo yum remove <package_name>                      # Redhat/CentOS
$ sudo dnf remove <package_name>                      # Fedora
$ sudo zypper remove <package_name>                   # OpenSUSE/SLES
$ sudo apt-get --purge remove <package_name>          # Ubuntu
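Before choosing one of the commands above, it can help to detect which installation method left artifacts on the system; a minimal sketch using the default paths named in this section (the labels are illustrative, and the driver check takes precedence if both are found):

```shell
# Detect CUDA uninstall artifacts: the Toolkit runfile uninstaller
# under /usr/local/cuda-X.Y, and the driver runfile uninstaller at
# /usr/bin/nvidia-uninstall.
method="none detected"
for u in /usr/local/cuda-*/bin/uninstall_cuda_*.pl; do
  if [ -e "$u" ]; then
    method="toolkit runfile: sudo $u"
    break
  fi
done
if [ -x /usr/bin/nvidia-uninstall ]; then
  method="driver runfile: sudo /usr/bin/nvidia-uninstall"
fi
echo "$method"
```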

3. Package Manager Installation

Basic instructions can be found in the Quick Start Guide. Read on for more detailed instructions.

3.1. Overview

The Package Manager installation interfaces with your system's package management system. When using RPM or Deb, the downloaded package is a repository package. Such a package only informs the package manager where to find the actual installation packages, but will not install them.

If those packages are available in an online repository, they will be automatically downloaded in a later step. Otherwise, the repository package also installs a local repository containing the installation packages on the system. Whether the repository is available online or installed locally, the installation procedure is identical and made of several steps.

Distribution-specific instructions detail how to install CUDA:

Finally, some helpful package manager capabilities are detailed.

These instructions are for native development only. For cross-platform development, see the CUDA Cross-Platform Environment section.

Note: The package "cuda-core" has been deprecated in CUDA 9.1. Please use "cuda-compiler" instead.

3.2. Redhat/CentOS

  1. Perform the pre-installation actions.
  2. Satisfy third-party package dependency
    • Satisfy DKMS dependency: The NVIDIA driver RPM packages depend on other external packages, such as DKMS and libvdpau. Those packages are only available on third-party repositories, such as EPEL. Any such third-party repositories must be added to the package manager repository database before installing the NVIDIA driver RPM packages, or missing dependencies will prevent the installation from proceeding.
    • Enable optional repos:

      On RHEL 7 Linux only, execute the following steps to enable optional repositories.

      • On x86_64 workstation:
        $ subscription-manager repos --enable=rhel-7-workstation-optional-rpms
      • On POWER9 system:
        $ subscription-manager repos --enable=rhel-7-for-power-9-optional-rpms
      • On x86_64 server:
        $ subscription-manager repos --enable=rhel-7-server-optional-rpms 
  3. Address custom xorg.conf, if applicable

    The driver relies on an automatically generated xorg.conf file at /etc/X11/xorg.conf. If a custom-built xorg.conf file is present, this functionality will be disabled and the driver may not work. You can try removing the existing xorg.conf file, or adding the contents of /etc/X11/xorg.conf.d/00-nvidia.conf to the xorg.conf file. The xorg.conf file will most likely need manual tweaking for systems with a non-trivial GPU configuration.

  4. Install repository meta-data
    $ sudo rpm --install cuda-repo-<distro>-<version>.<architecture>.rpm
  5. Clean Yum repository cache
    $ sudo yum clean expire-cache
  6. Install CUDA
    $ sudo yum install cuda
    If the i686 libvdpau package dependency fails to install, try using the following steps to fix the issue:
    $ yumdownloader libvdpau.i686
    $ sudo rpm -U --oldpackage libvdpau*.rpm
  7. Add libcuda.so symbolic link, if necessary

    The libcuda.so library is installed in the /usr/lib{,64}/nvidia directory. For pre-existing projects which use libcuda.so, it may be useful to add a symbolic link from libcuda.so in the /usr/lib{,64} directory.

  8. Perform the post-installation actions.
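Step 7's symbolic link can be sketched as follows; the commands are demonstrated in a temporary directory, while on a real system the source would be the nvidia library directory and the link would be created one level up (with sudo):

```shell
# Create a libcuda.so symlink pointing at the driver's copy. A temporary
# directory stands in for /usr/lib64 here so the sketch is side-effect free.
root="$(mktemp -d)"
mkdir -p "$root/lib64/nvidia"
touch "$root/lib64/nvidia/libcuda.so"           # stand-in for the installed library
ln -s "$root/lib64/nvidia/libcuda.so" "$root/lib64/libcuda.so"
link_target="$(readlink "$root/lib64/libcuda.so")"
echo "$link_target"
rm -rf "$root"
```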

3.3. Fedora

  1. Perform the pre-installation actions.
  2. Address custom xorg.conf, if applicable

    The driver relies on an automatically generated xorg.conf file at /etc/X11/xorg.conf. If a custom-built xorg.conf file is present, this functionality will be disabled and the driver may not work. You can try removing the existing xorg.conf file, or adding the contents of /etc/X11/xorg.conf.d/00-nvidia.conf to the xorg.conf file. The xorg.conf file will most likely need manual tweaking for systems with a non-trivial GPU configuration.

  3. Satisfy Akmods dependency

    The NVIDIA driver RPM packages depend on the Akmods framework which is provided by the RPMFusion free repository. The RPMFusion free repository must be added to the package manager repository database before installing the NVIDIA driver RPM packages, or missing dependencies will prevent the installation from proceeding.

  4. Install repository meta-data
    $ sudo rpm --install cuda-repo-<distro>-<version>.<architecture>.rpm
  5. Clean DNF repository cache
    $ sudo dnf clean expire-cache
  6. Install CUDA
    $ sudo dnf install cuda
    The CUDA driver installation may fail if the RPMFusion non-free repository is enabled. In this case, CUDA installations should temporarily disable the RPMFusion non-free repository:
    $ sudo dnf --disablerepo="rpmfusion-nonfree*" install cuda
    If a system has installed both packages with the same instance of dnf, some driver components may be missing. Such an installation can be corrected by running:
    $ sudo dnf install cuda-drivers
    If the i686 libvdpau package dependency fails to install, try using the following steps to fix the issue:
    $ dnf download libvdpau.i686
    $ sudo rpm -U --oldpackage libvdpau*.rpm
    It may be necessary to rebuild the grub configuration files, particularly if you use a non-default partition scheme. If so, run the following command and reboot the system:
    $ sudo grub2-mkconfig -o /boot/grub2/grub.cfg
    Remember to reboot the system.
  7. Add libcuda.so symbolic link, if necessary

    The libcuda.so library is installed in the /usr/lib{,64}/nvidia directory. For pre-existing projects which use libcuda.so, it may be useful to add a symbolic link from libcuda.so in the /usr/lib{,64} directory.

  8. Perform the post-installation actions.

3.4. SLES

  1. Perform the pre-installation actions.
  2. On SLES12 SP4, install the Mesa-libgl-devel Linux packages before proceeding. See Mesa-libGL-devel.
  3. Install repository meta-data
    $ sudo rpm --install cuda-repo-<distro>-<version>.<architecture>.rpm
  4. Refresh Zypper repository cache
    $ sudo zypper refresh
  5. Install CUDA
    $ sudo zypper install cuda
  6. Add the user to the video group
    $ sudo usermod -a -G video <username>
  7. Install CUDA Samples GL dependencies

    The CUDA Samples package on SLES does not include dependencies on GL and X11 libraries as these are provided in the SLES SDK. These packages must be installed separately, depending on which samples you want to use.

  8. Perform the post-installation actions.

3.5. OpenSUSE

  1. Perform the pre-installation actions.
  2. Install repository meta-data
    $ sudo rpm --install cuda-repo-<distro>-<version>.<architecture>.rpm
  3. Refresh Zypper repository cache
    $ sudo zypper refresh
  4. Install CUDA
    $ sudo zypper install cuda
  5. Add the user to the video group
    $ sudo usermod -a -G video <username>
  6. Perform the post-installation actions.

3.6. Ubuntu

  1. Perform the pre-installation actions.
  2. Install repository meta-data
    $ sudo dpkg -i cuda-repo-<distro>_<version>_<architecture>.deb
  3. Install the CUDA public GPG key

    When installing using the local repo:

    $ sudo apt-key add /var/cuda-repo-<version>/7fa2af80.pub

    When installing using network repo on Ubuntu 18.04/18.10:

    $ sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/<distro>/<architecture>/7fa2af80.pub

    When installing using network repo on Ubuntu 16.04:

    $ sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/<distro>/<architecture>/7fa2af80.pub
  4. Update the Apt repository cache
    $ sudo apt-get update
  5. Install CUDA
    $ sudo apt-get install cuda
  6. Perform the post-installation actions.

3.7. Additional Package Manager Capabilities

Below are some additional capabilities of the package manager that users can take advantage of.

3.7.1. Available Packages

The recommended installation package is the cuda package. This package will install the full set of other CUDA packages required for native development and should cover most scenarios.

The cuda package installs all the available packages for native development. That includes the compiler, the debugger, the profiler, and the math libraries. For x86_64 platforms, this also includes Nsight Eclipse Edition and the Visual Profiler. It also includes the NVIDIA driver package.

On supported platforms, the cuda-cross-armhf, cuda-cross-aarch64, and cuda-cross-ppc64el packages install all the packages required for cross-platform development to ARMv7, ARMv8, and POWER8, respectively. The libraries and header files of the target architecture's display driver package are also installed to enable the cross compilation of driver applications. The cuda-cross-<arch> packages do not install the native display driver.

The packages installed by the packages above can also be installed individually by specifying their names explicitly. The list of available packages can be obtained with:

$ yum --disablerepo="*" --enablerepo="cuda*" list available    # RedHat
$ dnf --disablerepo="*" --enablerepo="cuda*" list available    # Fedora
$ zypper packages -r cuda                                      # OpenSUSE & SLES
$ cat /var/lib/apt/lists/*cuda*Packages | grep "Package:"      # Ubuntu

3.7.2. Package Upgrades

The cuda package points to the latest stable release of the CUDA Toolkit. When a new version is available, use the following commands to upgrade the toolkit and driver:

$ sudo yum install cuda                                        # RedHat
$ sudo dnf install cuda                                        # Fedora
$ sudo zypper install cuda                                     # OpenSUSE & SLES
$ sudo apt-get install cuda                                    # Ubuntu

The cuda-cross-<arch> packages can also be upgraded in the same manner.

The cuda-drivers package points to the latest driver release available in the CUDA repository. When a new version is available, use the following commands to upgrade the driver:

$ sudo yum install cuda-drivers                                # RedHat
$ sudo dnf install cuda-drivers                                # Fedora
$ sudo zypper install cuda-drivers \
                      nvidia-gfxG04-kmp-default                # OpenSUSE & SLES
$ sudo apt-get install cuda-drivers                            # Ubuntu

Some desktop environments, such as GNOME or KDE, will display a notification alert when new packages are available.

To avoid any automatic upgrade, and lock down the toolkit installation to the X.Y release, install the cuda-X-Y or cuda-cross-<arch>-X-Y package.

Side-by-side installations are supported. For instance, to install both the X.Y CUDA Toolkit and the X.Y+1 CUDA Toolkit, install the cuda-X.Y and cuda-X.Y+1 packages.
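With side-by-side installations, the /usr/local/cuda symbolic link selects which toolkit a project sees; repointing it can be sketched as below (demonstrated in a temporary directory rather than /usr/local, where the same ln command would need sudo):

```shell
# Repoint the "cuda" symlink from one installed toolkit to another.
# -s symbolic, -f replace an existing link, -n do not follow it.
root="$(mktemp -d)"
mkdir "$root/cuda-10.0" "$root/cuda-10.1"
ln -sfn "$root/cuda-10.1" "$root/cuda"   # projects keep using $root/cuda
target="$(readlink "$root/cuda")"
echo "$target"
rm -rf "$root"
```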

3.7.3. Meta Packages

Meta packages are RPM/Deb packages which contain no (or few) files but have multiple dependencies. They are used to install many CUDA packages when you may not know the details of the packages you want. Below is the list of meta packages.

Table 4. Meta Packages Available for CUDA 10.1

  Meta Package            | Purpose
  cuda                    | Installs all CUDA Toolkit and Driver packages. Handles upgrading to the next version of the cuda package when it's released.
  cuda-10-1               | Installs all CUDA Toolkit and Driver packages. Remains at version 10.1 until an additional version of CUDA is installed.
  cuda-toolkit-10-1       | Installs all CUDA Toolkit packages required to develop CUDA applications. Does not include the driver.
  cuda-tools-10-1         | Installs all CUDA command line and visual tools.
  cuda-runtime-10-1       | Installs all CUDA Toolkit packages required to run CUDA applications, as well as the Driver packages.
  cuda-compiler-10-1      | Installs all CUDA compiler packages.
  cuda-libraries-10-1     | Installs all runtime CUDA Library packages.
  cuda-libraries-dev-10-1 | Installs all development CUDA Library packages.
  cuda-drivers            | Installs all Driver packages. Handles upgrading to the next version of the Driver packages when they're released.

4. Runfile Installation

Basic instructions can be found in the Quick Start Guide. Read on for more detailed instructions.

This section describes the installation and configuration of CUDA when using the standalone installer. The standalone installer is a ".run" file and is completely self-contained.

4.1. Overview

The Runfile installation installs the NVIDIA Driver, the CUDA Toolkit, and CUDA Samples, via an interactive ncurses-based interface.

The installation steps are listed below. Distribution-specific instructions for disabling the Nouveau drivers, and the steps for verifying device node creation, are also provided.

Finally, the advanced options for the installer and the uninstallation steps are detailed below.

The Runfile installation does not include support for cross-platform development. For cross-platform development, see the CUDA Cross-Platform Environment section.

4.2. Installation

  1. Perform the pre-installation actions.

  2. Disable the Nouveau drivers.

  3. Reboot into text mode (runlevel 3).

    This can usually be accomplished by adding the number "3" to the end of the system's kernel boot parameters.

    Since the NVIDIA drivers are not yet installed, the text terminals may not display correctly. Temporarily adding "nomodeset" to the system's kernel boot parameters may fix this issue.

    Consult your system's bootloader documentation for information on how to make the above boot parameter changes.

    The reboot is required to completely unload the Nouveau drivers and prevent the graphical interface from loading. The CUDA driver cannot be installed while the Nouveau drivers are loaded or while the graphical interface is active.

  4. Verify that the Nouveau drivers are not loaded. If the Nouveau drivers are still loaded, consult your distribution's documentation to see if further steps are needed to disable Nouveau.

  5. Run the installer and follow the on-screen prompts:
    $ sudo sh cuda_<version>_linux.run

    See Installer UI for navigating the ncurses-based installer UI.

    As of CUDA 10.1 some libraries will be installed in the system standard locations rather than in the Toolkit installation directory. Depending on your distribution, the installed location can be /usr/lib/x86_64-linux-gnu, /usr/lib64, or /usr/lib. See the Advanced Options section for how to change this location.

    The default installation locations for the toolkit and samples are:
      Component    | Default Installation Directory
      CUDA Toolkit | /usr/local/cuda-10.1
      CUDA Samples | $(HOME)/NVIDIA_CUDA-10.1_Samples

    The /usr/local/cuda symbolic link points to the location where the CUDA Toolkit was installed. This link allows projects to use the latest CUDA Toolkit without any configuration file update.

    The installer must be executed with sufficient privileges to perform some actions. When the current privileges are insufficient to perform an action, the installer will ask for the user's password to attempt to install with root privileges. Actions that cause the installer to attempt to install with root privileges are:
    • installing the CUDA Driver
    • installing the CUDA Toolkit to a location the user does not have permission to write to
    • installing the CUDA Samples to a location the user does not have permission to write to
    • creating the /usr/local/cuda symbolic link

    Running the installer with sudo, as shown above, will give permission to install to directories that require root permissions. Directories and files created while running the installer with sudo will have root ownership.

    If installing the driver, the installer will also ask if the OpenGL libraries should be installed. If the GPU used for display is not an NVIDIA GPU, the NVIDIA OpenGL libraries should not be installed. Otherwise, the OpenGL libraries used by the graphics driver of the non-NVIDIA GPU will be overwritten and the GUI will not work. If performing a silent installation, the --no-opengl-libs option should be used to prevent the OpenGL libraries from being installed. See the Advanced Options section for more details.

    If the GPU used for display is an NVIDIA GPU, the X server configuration file, /etc/X11/xorg.conf, may need to be modified. In some cases, nvidia-xconfig can be used to automatically generate a xorg.conf file that works for the system. For non-standard systems, such as those with more than one GPU, it is recommended to manually edit the xorg.conf file. Consult the xorg.conf documentation for more information.

    Note: Installing Mesa may overwrite the /usr/lib/libGL.so that was previously installed by the NVIDIA driver, so a reinstallation of the NVIDIA driver might be required after installing these libraries.

  6. Reboot the system to reload the graphical interface.

  7. Verify the device nodes are created properly.

  8. Perform the post-installation actions.

4.3. Installer UI

The installer UI has three main states:

  1. EULA Acceptance.
    1. Scroll through the EULA using the arrow keys, the page up/down keys, or a scroll wheel.
  2. Component Selection.
    1. Navigate the menu using the arrow keys. The left/right keys will expand/collapse sub-elements.
    2. Select or deselect items to install by pressing the spacebar or enter key with the cursor on that item.
    3. With the cursor over an item with advanced options available, press 'A' to see that options menu. This is currently available for CUDA Toolkit and CUDA Samples items only.
  3. Advanced Options.
    1. Options such as setting the install path for a specific component are available here.

4.4. Disabling Nouveau

To install the Display Driver, the Nouveau drivers must first be disabled. Each distribution of Linux has a different method for disabling Nouveau.

The Nouveau drivers are loaded if the following command prints anything:

$ lsmod | grep nouveau

4.4.1. Fedora

  1. Create a file at /usr/lib/modprobe.d/blacklist-nouveau.conf with the following contents:
    blacklist nouveau
    options nouveau modeset=0
  2. Regenerate the kernel initramfs:
    $ sudo dracut --force
  3. Run the below command:
    $ sudo grub2-mkconfig -o /boot/grub2/grub.cfg
  4. Reboot the system.

4.4.2. RHEL/CentOS

  1. Create a file at /etc/modprobe.d/blacklist-nouveau.conf with the following contents:
    blacklist nouveau
    options nouveau modeset=0
  2. Regenerate the kernel initramfs:
    $ sudo dracut --force

4.4.3. OpenSUSE

  1. Create a file at /etc/modprobe.d/blacklist-nouveau.conf with the following contents:
    blacklist nouveau
    options nouveau modeset=0
  2. Regenerate the kernel initrd:
    $ sudo /sbin/mkinitrd

4.4.4. SLES

No actions to disable Nouveau are required as Nouveau is not installed on SLES.

4.4.5. Ubuntu

  1. Create a file at /etc/modprobe.d/blacklist-nouveau.conf with the following contents:
    blacklist nouveau
    options nouveau modeset=0
  2. Regenerate the kernel initramfs:
    $ sudo update-initramfs -u

4.5. Device Node Verification

Check that the device files /dev/nvidia* exist and have the correct (0666) file permissions. These files are used by the CUDA Driver to communicate with the kernel-mode portion of the NVIDIA Driver. Applications that use the NVIDIA driver, such as a CUDA application or the X server (if any), will normally automatically create these files if they are missing using the setuid nvidia-modprobe tool that is bundled with the NVIDIA Driver. However, some systems disallow setuid binaries, so if these files do not exist, you can create them manually by using a startup script such as the one below:

#!/bin/bash

/sbin/modprobe nvidia

if [ "$?" -eq 0 ]; then
  # Count the number of NVIDIA controllers found.
  NVDEVS=`lspci | grep -i NVIDIA`
  N3D=`echo "$NVDEVS" | grep "3D controller" | wc -l`
  NVGA=`echo "$NVDEVS" | grep "VGA compatible controller" | wc -l`

  N=`expr $N3D + $NVGA - 1`
  for i in `seq 0 $N`; do
    mknod -m 666 /dev/nvidia$i c 195 $i
  done

  mknod -m 666 /dev/nvidiactl c 195 255

else
  exit 1
fi

/sbin/modprobe nvidia-uvm

if [ "$?" -eq 0 ]; then
  # Find out the major device number used by the nvidia-uvm driver
  D=`grep nvidia-uvm /proc/devices | awk '{print $1}'`

  mknod -m 666 /dev/nvidia-uvm c $D 0
else
  exit 1
fi
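The final step of the script parses /proc/devices for the major number assigned to nvidia-uvm. That extraction can be sketched against sample data (the major number 510 below is illustrative only; the kernel assigns it dynamically):

```shell
# Sample lines in /proc/devices format; on a real system read /proc/devices.
sample='195 nvidia
510 nvidia-uvm'

# First field of the matching line is the major device number.
D=$(printf '%s\n' "$sample" | awk '/nvidia-uvm/ {print $1}')
echo "nvidia-uvm major number: $D"
# The device node would then be created with: mknod -m 666 /dev/nvidia-uvm c $D 0
```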

4.6. Advanced Options

Silent Installation: --silent
    Required for any silent installation. Performs an installation with no further user input and minimal command-line output based on the options provided below. Silent installations are useful for scripting the installation of CUDA. Using this option implies acceptance of the EULA. The following flags can be used to customize the actions taken during installation. At least one of --driver, --uninstall, --toolkit, and --samples must be passed if running with non-root permissions.
    --driver: Install the CUDA Driver.
    --toolkit: Install the CUDA Toolkit.
    --toolkitpath=<path>: Install the CUDA Toolkit to the <path> directory. If not provided, the default path of /usr/local/cuda-10.1 is used.
    --samples: Install the CUDA Samples.
    --samplespath=<path>: Install the CUDA Samples to the <path> directory. If not provided, the default path of $(HOME)/NVIDIA_CUDA-10.1_Samples is used.
    --defaultroot=<path>: Install libraries to the <path> directory. If <path> is not provided, the default path of your distribution is used. This only applies to the libraries installed outside of the CUDA Toolkit path.

Extraction: --extract=<path>
    Extracts the standalone driver runfile and the raw files of the toolkit and samples to <path>. This is especially useful when one wants to install the driver using one or more of the command-line options provided by the driver installer which are not exposed in this installer.

Overriding Installation Checks: --override
    Ignores compiler, third-party library, and toolkit detection checks which would prevent the CUDA Toolkit and CUDA Samples from installing.

No OpenGL Libraries: --no-opengl-libs
    Prevents the driver installation from installing NVIDIA's GL libraries. Useful for systems where the display is driven by a non-NVIDIA GPU. In such systems, NVIDIA's GL libraries could prevent X from loading properly.

No Man Pages: --no-man-page
    Do not install the man pages under /usr/share/man.

Overriding Kernel Source: --kernel-source-path=<path>
    Tells the driver installation to use <path> as the kernel source directory when building the NVIDIA kernel module. Required for systems where the kernel source is installed to a non-standard location.

Running nvidia-xconfig: --run-nvidia-xconfig
    Tells the driver installation to run nvidia-xconfig to update the system X configuration file so that the NVIDIA X driver is used. The pre-existing X configuration file will be backed up.

No nvidia-drm Kernel Module: --no-drm
    Do not install the nvidia-drm kernel module. This option should only be used to work around failures to build or install the nvidia-drm kernel module on systems that do not need the provided features.

Custom Temporary Directory Selection: --tmpdir=<path>
    Performs any temporary actions within <path> instead of /tmp. Useful in cases where /tmp cannot be used (doesn't exist, is full, is mounted with 'noexec', etc.).

Show Installer Options: --help
    Prints the list of command-line options to stdout.
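As an example, a scripted toolkit-only installation might be invoked as sketched below. The runfile name and install path are placeholders, not the actual installer filename; substitute the runfile you downloaded. The command is shown as a dry run since it requires root and the installer itself:

```shell
# Hypothetical silent-installation command; --silent implies EULA acceptance.
installer=cuda_10.1_linux.run            # placeholder runfile name
args='--silent --toolkit --toolkitpath=/usr/local/cuda-10.1 --no-man-page'
echo "sudo sh $installer $args"          # dry run: remove the echo to execute
```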

4.7. Uninstallation

To uninstall the CUDA Toolkit, run the uninstallation script provided in the bin directory of the toolkit. By default, it is located in /usr/local/cuda-10.1/bin:

$ sudo /usr/local/cuda-10.1/bin/cuda-uninstaller

To uninstall the NVIDIA Driver, run nvidia-uninstall:

$ sudo /usr/bin/nvidia-uninstall

To enable the Nouveau drivers, remove the blacklist file created in the Disabling Nouveau section, and regenerate the kernel initramfs/initrd again as described in that section.

5. Cluster Management Packages

5.1. Overview

Cluster management packages are provided as an alternative set of RPM and Deb packages intended to be used by deployment management tools as standalone packages. These packages are available for RHEL 6, RHEL 7, Ubuntu 14.04, and Ubuntu 16.04 on the x86_64 architecture. There are three parts to the cluster management packages: the CUDA toolkit packages, the NVIDIA driver packages, and the README.

The cluster management toolkit packages are split into a runtime package, cuda-cluster-runtime-10-1, and a development package, cuda-cluster-devel-10-1.
