TensorFlow GPU: is cudnn optional? Couldn't open CUDA library libcudnn.so


【Posted】: 2016-08-19 16:36:42
【Question】:

I installed the GPU build of tensorflow-0.8.0, tensorflow-0.8.0-cp27-none-linux_x86_64.whl. The install instructions say it requires CUDA toolkit 7.5 and CuDNN v4:

# Ubuntu/Linux 64-bit, GPU enabled. Requires CUDA toolkit 7.5 and CuDNN v4.  For
# other versions, see "Install from sources" below.

However, I forgot to install CuDNN v4. Apart from the message "Couldn't open CUDA library libcudnn.so", everything works fine, and TensorFlow still reports "Creating TensorFlow device (/gpu:0)".

Messages without CuDNN

I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:99] Couldn't open CUDA library libcudnn.so. LD_LIBRARY_PATH: /usr/local/cuda/lib64:
I tensorflow/stream_executor/cuda/cuda_dnn.cc:1562] Unable to load cuDNN DSO
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
('Extracting', 'MNIST_data/train-images-idx3-ubyte.gz')
/usr/lib/python2.7/gzip.py:268: VisibleDeprecationWarning: converting an array with ndim > 0 to an index will result in an error in the future
  chunk = self.extrabuf[offset: offset + size]
/home/ubuntu/TensorFlow-Tutorials/input_data.py:42: VisibleDeprecationWarning: converting an array with ndim > 0 to an index will result in an error in the future
  data = data.reshape(num_images, rows, cols, 1)
('Extracting', 'MNIST_data/train-labels-idx1-ubyte.gz')
('Extracting', 'MNIST_data/t10k-images-idx3-ubyte.gz')
('Extracting', 'MNIST_data/t10k-labels-idx1-ubyte.gz')
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:900] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GRID K520
major: 3 minor: 0 memoryClockRate (GHz) 0.797
pciBusID 0000:00:03.0
Total memory: 4.00GiB
Free memory: 3.95GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:755] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GRID K520, pci bus id: 0000:00:03.0)
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:244] PoolAllocator: After 1704 get requests, put_count=1321 evicted_count=1000 eviction_rate=0.757002 and unsatisfied allocation rate=0.870305
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:256] Raising pool_size_limit_ from 100 to 110
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:244] PoolAllocator: After 1704 get requests, put_count=1812 evicted_count=1000 eviction_rate=0.551876 and unsatisfied allocation rate=0.536972
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:256] Raising pool_size_limit_ from 256 to 281

Later I installed CuDNN, but I can't see any difference.
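To confirm that an installed CuDNN is actually visible on the path printed in the failing log line, one option is to look for `libcudnn.so*` in each `LD_LIBRARY_PATH` directory. This is only a sketch: `find_cudnn` is a hypothetical helper, and it checks `LD_LIBRARY_PATH` alone, while the dynamic loader may also consult other locations.

```python
import glob
import os


def find_cudnn(ld_library_path=None):
    """Return libcudnn.so* files found in the LD_LIBRARY_PATH directories.

    Hypothetical helper: it only inspects LD_LIBRARY_PATH, not the other
    places the dynamic loader may search.
    """
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")
    matches = []
    for directory in ld_library_path.split(":"):
        if not directory:
            # Skip empty entries, e.g. the trailing ':' seen in the log.
            continue
        pattern = os.path.join(directory, "libcudnn.so*")
        matches.extend(sorted(glob.glob(pattern)))
    return matches


if __name__ == "__main__":
    found = find_cudnn()
    print(found if found else "libcudnn.so not found on LD_LIBRARY_PATH")
```

An empty result for a path such as `/usr/local/cuda/lib64:` would be consistent with the "Couldn't open CUDA library libcudnn.so" message.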

Messages with CuDNN

I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
('Extracting', 'MNIST_data/train-images-idx3-ubyte.gz')
/usr/lib/python2.7/gzip.py:268: VisibleDeprecationWarning: converting an array with ndim > 0 to an index will result in an error in the future
  chunk = self.extrabuf[offset: offset + size]
/home/ubuntu/TensorFlow-Tutorials/input_data.py:42: VisibleDeprecationWarning: converting an array with ndim > 0 to an index will result in an error in the future
  data = data.reshape(num_images, rows, cols, 1)
('Extracting', 'MNIST_data/train-labels-idx1-ubyte.gz')
('Extracting', 'MNIST_data/t10k-images-idx3-ubyte.gz')
('Extracting', 'MNIST_data/t10k-labels-idx1-ubyte.gz')
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:900] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GRID K520
major: 3 minor: 0 memoryClockRate (GHz) 0.797
pciBusID 0000:00:03.0
Total memory: 4.00GiB
Free memory: 3.95GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:755] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GRID K520, pci bus id: 0000:00:03.0)
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:244] PoolAllocator: After 1704 get requests, put_count=1321 evicted_count=1000 eviction_rate=0.757002 and unsatisfied allocation rate=0.870305
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:256] Raising pool_size_limit_ from 100 to 110
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:244] PoolAllocator: After 1704 get requests, put_count=1811 evicted_count=1000 eviction_rate=0.552181 and unsatisfied allocation rate=0.537559
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:256] Raising pool_size_limit_ from 256 to 281

So what is the difference between running with and without CuDNN?
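One way to compare the two runs is to extract the library load status from each log. The following is a minimal sketch, assuming the `dso_loader` lines are worded exactly as in the output above; `library_status` is a hypothetical helper, not part of TensorFlow.

```python
import re

# Patterns assume the dso_loader wording shown in the logs above.
OPENED = re.compile(r"successfully opened CUDA library (\S+) locally")
FAILED = re.compile(r"Couldn't open CUDA library (\S+)\. LD_LIBRARY_PATH")


def library_status(log_text):
    """Map each CUDA library named in the log to True (opened) or False (failed)."""
    status = {}
    for line in log_text.splitlines():
        match = OPENED.search(line)
        if match:
            status[match.group(1)] = True
            continue
        match = FAILED.search(line)
        if match:
            status[match.group(1)] = False
    return status


def status_diff(log_a, log_b):
    """Return the libraries whose load status differs between two logs."""
    a, b = library_status(log_a), library_status(log_b)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }
```

Applied to the two logs in the question, the library load status should differ only for `libcudnn.so`.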

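【Comments】:

Performance improvements in some cases.

@PavanYalamanchili Thanks! Do you know of specific cases where it improves things? If so, shouldn't TF give us an explicit error and stop running?

cuDNN is not public. It looks like TensorFlow may be using a fallback algorithm in case cuDNN is not present on the user's system. There is no reason for it to error out.

【Answer 1】: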

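cuDNN is used to speed up some TensorFlow operations, such as convolution. I noticed in your log that you are training on the MNIST dataset. The reference MNIST model shipped with TensorFlow is built around 2 fully connected layers and a softmax, so TensorFlow never tries to call cuDNN while training this model.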
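To illustrate why such a model has nothing to hand to cuDNN, here is a pure-Python sketch of a forward pass through two fully connected layers followed by a softmax. The ReLU activation and all function names are assumptions made for illustration, not the actual TensorFlow reference model. The only heavy operation is a matrix-vector product; no convolution appears anywhere.

```python
import math


def dense(x, weights, bias):
    """Fully connected layer y = xW + b, with weights stored as [inputs][outputs]."""
    return [
        sum(xi * row[j] for xi, row in zip(x, weights)) + bias[j]
        for j in range(len(bias))
    ]


def relu(values):
    # Assumed activation; the answer does not say which one the model uses.
    return [max(0.0, v) for v in values]


def softmax(values):
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, w1, b1, w2, b2):
    """Two dense layers and a softmax: matrix products only, no convolution."""
    hidden = relu(dense(x, w1, b1))
    return softmax(dense(hidden, w2, b2))
```

A convolutional model would add a convolution step to this pipeline, which is the kind of operation the answer says cuDNN accelerates.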

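I am not sure whether TensorFlow automatically falls back to a slower convolution algorithm when cuDNN is unavailable. If it does not, you can always disable cuDNN by setting the TF_USE_CUDNN environment variable to 0 before running TensorFlow.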
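A minimal sketch of setting that variable from Python before TensorFlow starts is shown below. The helper name `env_without_cudnn` is hypothetical, and whether a given TensorFlow build honors `TF_USE_CUDNN` is taken from the answer above rather than verified here.

```python
import os


def env_without_cudnn(base_env=None):
    """Return a copy of the environment with TF_USE_CUDNN set to "0".

    Hypothetical helper; the variable name comes from the answer above.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["TF_USE_CUDNN"] = "0"
    return env


if __name__ == "__main__":
    # Show the value a child process would inherit.
    print(env_without_cudnn()["TF_USE_CUDNN"])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching the training script, so the variable is in place before TensorFlow is imported. Setting `os.environ["TF_USE_CUDNN"] = "0"` at the very top of the script, before the import, is the in-process alternative.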

【Comments】:

FYI, as of April 2017: Conv2D for GPU is not currently supported without cudnn

【Answer 2】:

Solution: if you run into a cuDNN-related error while working with the MNIST dataset, try this:

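import tensorflow as tf  # the snippet uses tf, so import it rather than sys

# TF 1.x session API; in TF 2.x these names live under tf.compat.v1.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory on demand instead of up front
sess = tf.Session(config=config)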

Then continue with your code:

model.fit(training_images, training_labels, epochs=10, callbacks=[callbacks])

The fit should then run without any errors or exceptions.
