示例 deviceQuery cuda 程序
Posted
技术标签:
【中文标题】示例 deviceQuery cuda 程序【英文标题】:Sample deviceQuery cuda program 【发布时间】:2019-07-07 04:01:10 【问题描述】:我有一台配置了 NVIDIA GeForce1080 GTX 和 CentOS 7 作为操作系统的 Intel Xeon 机器。我已经安装了 NVIDIA-driver 410.93 和 cuda-toolkit 10.0。编译 cuda-samples 后,我尝试运行 ./deviceQuery。 但它会像这样抛出
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 30
-> unknown error
Result = FAIL
一些命令输出
lspci | grep VGA
01:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1080] (rev a1)
nvidia-smi
Wed Feb 13 16:08:07 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.93 Driver Version: 410.93 CUDA Version: 10.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 1080 Off | 00000000:01:00.0 On | N/A |
| 0% 54C P0 46W / 240W | 175MiB / 8119MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 6275 G /usr/bin/X 94MiB |
| 0 7268 G /usr/bin/gnome-shell 77MiB |
+-----------------------------------------------------------------------------+
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.13
PATH 和 LD_LIBRARY_PATH
PATH =/usr/local/cuda-10.0/bin:/usr/local/cuda/bin:/usr/local/bin:/usr/local/sbin:
LD_LIBRARY_PATH = /usr/local/cuda-10.0/lib64:/usr/local/cuda/lib64:
lsmod | grep 英伟达
nvidia_drm 39819 3
nvidia_modeset 1036573 6 nvidia_drm
nvidia 16628708 273 nvidia_modeset
drm_kms_helper 179394 1 nvidia_drm
drm 429744 6 drm_kms_helper,nvidia_drm
ipmi_msghandler 56032 2 ipmi_devintf,nvidia
lsmod | grep nvidia-uvm 没有输出
dmesg | grep NVRM
[ 8.237489] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 410.93 Thu Dec 20 17:01:16 CST 2018 (using threaded interrupts)
这个问题是否与 modprobe 或 nvidia-uvm 有关? 我在 NVIDIA-devtalk 论坛上问过这个问题,但还没有回复。 请给一些建议。 提前致谢。
【问题讨论】:
错误信息不是很明确...你在视频组吗? 这里的编程问题是? 【参考方案1】:我调试了它。问题是 nvidia-driver(410.93) 和 cuda 之间的版本不匹配(驱动程序 410.48 带有 cuda 运行文件)。自动删除所有驱动程序并从头重新安装。删除 /var/lib/dkms/nvidia/* 中的所有链接文件。 现在它工作正常。并且 nvidia-uvm 也加载了。
lsmod | grep 英伟达
nvidia_uvm 786031 0
nvidia_drm 39819 3
nvidia_modeset 1048491 6 nvidia_drm
nvidia 16805034 274 nvidia_modeset,nvidia_uvm
drm_kms_helper 179394 1 nvidia_drm
drm 429744 6 drm_kms_helper,nvidia_drm
ipmi_msghandler 56032 2 ipmi_devintf,nvidia
nvidia-smi
Fri Feb 15 11:46:24 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.48 Driver Version: 410.48 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 1080 Off | 00000000:01:00.0 On | N/A |
| 0% 45C P8 10W / 240W | 242MiB / 8119MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 6063 G /usr/bin/X 120MiB |
| 0 7502 G /usr/bin/gnome-shell 119MiB |
+-----------------------------------------------------------------------------+
nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX 1080"
CUDA Driver Version / Runtime Version 10.0 / 10.0
CUDA Capability Major/Minor version number: 6.1
Total amount of global memory: 8119 MBytes (8513585152 bytes)
(20) Multiprocessors, (128) CUDA Cores/MP: 2560 CUDA Cores
GPU Max Clock rate: 1797 MHz (1.80 GHz)
Memory Clock rate: 5005 Mhz
Memory Bus Width: 256-bit
L2 Cache Size: 2097152 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.0, CUDA Runtime Version = 10.0, NumDevs = 1
Result = PASS
【讨论】:
以上是关于示例 deviceQuery cuda 程序的主要内容,如果未能解决你的问题,请参考以下文章
为啥 torch.version.cuda 和 deviceQuery 报告不同的版本?