cudnn compile configuration in TensorFlow

Posted 2023-04-15

【Title】cudnn compile configuration in TensorFlow 【Posted】2016-10-06 09:27:14 【Question】:

Ubuntu 14.04, CUDA version 7.5.18, nightly build of tensorflow

When running a tf.nn.max_pool() op in tensorflow, I get the following error:

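E tensorflow/stream_executor/cuda/cuda_dnn.cc:286] Loaded cudnn library: 5005 but source was compiled against 4007. If using a binary install, upgrade your cudnn library to match. If building from sources, make sure the library loaded matches the version you specified during compile configuration.

W tensorflow/stream_executor/stream.cc:577] attempting to perform DNN operation using StreamExecutor without DNN support

Traceback (most recent call last):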

...

How do I specify my cudnn version in tensorflow's compile configuration?

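【Comments】:

I'm confused about what this error even means. Does it mean that you have one version of cudnn but tensorflow expects a different one, or does it mean something else?

【Answer 1】: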

Go to the TensorFlow source directory, then run the configure script: ./configure.

Here is an example from the TensorFlow documentation:

$ ./configure
Please specify the location of python. [Default is /usr/bin/python]:
Do you wish to build TensorFlow with GPU support? [y/N] y
GPU support will be enabled for TensorFlow

Please specify which gcc nvcc should use as the host compiler. [Default is
/usr/bin/gcc]: /usr/bin/gcc-4.9

Please specify the Cuda SDK version you want to use, e.g. 7.0. [Leave
empty to use system default]: 7.5

Please specify the location where CUDA 7.5 toolkit is installed. Refer to
README.md for more details. [default is: /usr/local/cuda]: /usr/local/cuda

Please specify the Cudnn version you want to use. [Leave empty to use system
default]: 4.0.4

Please specify the location where the cuDNN 4.0.4 library is installed. Refer to
README.md for more details. [default is: /usr/local/cuda]: /usr/local/cudnn-r4-rc/

Please specify a list of comma-separated Cuda compute capabilities you want to
build with. You can find the compute capability of your device at:
https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your
build time and binary size. [Default is: "3.5,5.2"]: 3.5

Setting up Cuda include
Setting up Cuda lib64
Setting up Cuda bin
Setting up Cuda nvvm
Setting up CUPTI include
Setting up CUPTI lib64
Configuration finished

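【Comments】:

So there is no way to change the cudnn version without recompiling tensorflow with bazel myself? I can't configure my current installation?

After running ./configure, you have to recompile with bazel. Maybe there is another way that I don't know of!

One question: how do I solve this for a binary install? "If using a binary install, upgrade your cudnn library to match."

If it was installed with pip, is there any way to fix this? Is it the same?

【Answer 2】: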

It looks like you have cudnn 5 installed. When running ./configure, you need to set:

Please specify the Cudnn version you want to use. [Leave empty to use system
default]: 5
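
To decide what to type at that prompt, it can help to read the version macros from the installed cuDNN header. The following is only a sketch: the function name `parse_cudnn_header` is hypothetical, and the header location (for example `/usr/local/cuda/include/cudnn.h`) is an assumption that depends on how cuDNN was installed. Some newer cuDNN releases keep these macros in `cudnn_version.h` instead.

```python
import re


def parse_cudnn_header(header_text):
    """Return (major, minor, patch) parsed from the text of a cudnn.h-style header.

    Hypothetical helper: it only looks for the three #define lines and
    raises ValueError if any of them is missing.
    """
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(
            r"^\s*#define\s+" + name + r"\s+(\d+)", header_text, re.MULTILINE
        )
        if match is None:
            raise ValueError("macro %s not found" % name)
        parts.append(int(match.group(1)))
    return tuple(parts)


# Sample text imitating the relevant lines of a cuDNN 5.0.5 header.
sample = """
#define CUDNN_MAJOR      5
#define CUDNN_MINOR      0
#define CUDNN_PATCHLEVEL 5
#define CUDNN_VERSION    (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
"""
print(parse_cudnn_header(sample))  # (5, 0, 5)
```

On a real machine you would pass the contents of the actual header, for instance `parse_cudnn_header(open(path).read())`, where `path` is wherever your cuDNN headers live.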

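【Comments】:

Can I run ./configure on my current installation, or do I need to recompile with Bazel?

@GunnarNielsen You need to recompile from source.

If it was installed with pip, is there any way to fix this? Is it the same?

【Answer 3】: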

Adding my 2 cents: in my case (TF 0.12.1, installed from pip into anaconda, no sudo rights), CuDNN v5 was installed but not set up by default.

Setting export LD_LIBRARY_PATH="/usr/local/lib/cuda-8.0/lib64:/usr/local/lib/cudann5/lib64/" solved the issue.
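
To see which cuDNN libraries such a path actually exposes, a small script can list the `libcudnn.so*` files in each directory, in search order. This is a rough sketch with a hypothetical helper name (`find_cudnn_libraries`). It only inspects the directories you pass in, whereas the dynamic loader also consults other locations, so treat the result as a partial view.

```python
import glob
import os


def find_cudnn_libraries(ld_library_path):
    """List libcudnn.so* files found in each directory of a path list.

    Directories are visited in the order given (split on os.pathsep,
    which is ":" on Linux); empty entries are skipped.
    """
    found = []
    for directory in ld_library_path.split(os.pathsep):
        if not directory:
            continue
        pattern = os.path.join(directory, "libcudnn.so*")
        found.extend(sorted(glob.glob(pattern)))
    return found


if __name__ == "__main__":
    # Print whatever the current environment exposes; prints nothing if
    # LD_LIBRARY_PATH is unset or contains no cuDNN libraries.
    for path in find_cudnn_libraries(os.environ.get("LD_LIBRARY_PATH", "")):
        print(path)
```

If the first file listed belongs to a different cuDNN release than the one TensorFlow was built against, reordering the directories in LD_LIBRARY_PATH is one thing to try.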

【Comments】:

【Answer 4】:

I also ran into this kind of incompatibility problem:

Loaded runtime CuDNN library: 5005 (compatibility version 5000) but source was compiled with 5110 (compatibility version 5100).  If using a binary install, upgrade your CuDNN library to match.  If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.

So I downloaded CuDNN 5.1 (compatible with CUDA 8.0) and replaced 5.0 with it, and then everything went fine.
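
The numbers in these messages follow the cuDNN header's encoding of that era, `CUDNN_VERSION = major * 1000 + minor * 100 + patch`, so 5005 is 5.0.5 and 5110 is 5.1.10. The sketch below decodes them and reproduces the "compatibility version" shown in the message. The helper names are hypothetical, and treating "same major and minor" as the compatibility rule is an assumption drawn from the message text, not a statement of TensorFlow's exact check.

```python
def decode_cudnn_version(code):
    """Split an integer such as 5110 into (major, minor, patch)."""
    major, rest = divmod(code, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def compatibility_version(code):
    """Drop the patch level, e.g. 5110 -> 5100 (as printed in the message)."""
    major, minor, _ = decode_cudnn_version(code)
    return major * 1000 + minor * 100


def same_compatibility(loaded, compiled):
    """Assumed rule: versions match when major and minor are equal."""
    return compatibility_version(loaded) == compatibility_version(compiled)


print(decode_cudnn_version(5005))      # (5, 0, 5)
print(decode_cudnn_version(5110))      # (5, 1, 10)
print(compatibility_version(5110))     # 5100
print(same_compatibility(5005, 5110))  # False
```

Read this way, the message says that cuDNN 5.0.5 was loaded at runtime while the TensorFlow binary was built against 5.1.10, which is consistent with the fix of swapping 5.0 for 5.1.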

Warning: CuDNN was not available from NVIDIA, but you can find copies shared elsewhere.

【Comments】: