Non-OK-status: GpuLaunchKernel(...) status: Internal: no kernel image is available for execution on the device
【Posted】: 2020-11-25 04:58:40
【Question】: I am running my code on TensorFlow 2.1.0 in Anaconda with CUDA Toolkit 10.1 and cuDNN 7.6.0 (Windows 10), but it aborts with the following error:
F .\tensorflow/core/kernels/random_op_gpu.h:232] Non-OK-status: GpuLaunchKernel(FillPhiloxRandomKernelLaunch<Distribution>, num_blocks, block_size, 0, d.stream(), gen, data, size, dist) status: Internal: no kernel image is available for execution on the device
My GPU: GeForce 940MX, compute capability 5.0
I ran nvcc -V and it returned:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Fri_Feb__8_19:08:26_Pacific_Standard_Time_2019
Cuda compilation tools, release 10.1, V10.1.105
Here is the full output:
2020-08-05 10:05:48.368012: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-08-05 10:06:00.488544: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
2020-08-05 10:06:48.153611: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce 940MX computeCapability: 5.0
coreClock: 0.8605GHz coreCount: 4 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 37.33GiB/s
2020-08-05 10:06:48.164731: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-08-05 10:06:48.245826: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-08-05 10:06:48.296245: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-08-05 10:06:48.338860: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-08-05 10:06:48.439393: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-08-05 10:06:48.489830: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-08-05 10:06:48.941872: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-08-05 10:06:48.946651: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-08-05 10:06:48.951881: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-08-05 10:06:48.979077: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x23d29b660d0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-08-05 10:06:48.985680: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-08-05 10:06:48.990616: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce 940MX computeCapability: 5.0
coreClock: 0.8605GHz coreCount: 4 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 37.33GiB/s
2020-08-05 10:06:49.003356: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-08-05 10:06:49.009869: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-08-05 10:06:49.014858: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-08-05 10:06:49.020699: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-08-05 10:06:49.028876: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-08-05 10:06:49.033607: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-08-05 10:06:49.039192: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-08-05 10:06:49.045288: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-08-05 10:06:49.218497: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-05 10:06:49.223536: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0
2020-08-05 10:06:49.226857: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N
2020-08-05 10:06:49.230413: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1460 MB memory) -> physical GPU (device: 0, name: GeForce 940MX, pci bus id: 0000:01:00.0, compute capability: 5.0)
2020-08-05 10:06:49.244107: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x23d301b8fa0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-08-05 10:06:49.250377: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce 940MX, Compute Capability 5.0
2020-08-05 10:06:49.446601: F .\tensorflow/core/kernels/random_op_gpu.h:232] Non-OK-status: GpuLaunchKernel(FillPhiloxRandomKernelLaunch<Distribution>, num_blocks, block_size, 0, d.stream(), gen, data, size, dist) status: Internal: no kernel image is available for execution on the device
What is the problem, and how can I fix it?
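When comparing setups for this error, it helps to record exactly which TensorFlow build is installed and how it sees the GPU. The following is a minimal diagnostic sketch, not part of the original question. It assumes TensorFlow 2.1 or later; `tf.sysconfig.get_build_info()` and `tf.config.experimental.get_device_details()` only exist in newer releases, so both are looked up defensively. The function name `collect_tf_gpu_info` is made up for this example.

```python
from typing import Any, Dict


def collect_tf_gpu_info() -> Dict[str, Any]:
    """Gather the TensorFlow version, build info and visible GPUs.

    Returns a dict even when TensorFlow cannot be imported, so the
    result can always be pasted into a bug report.
    """
    info: Dict[str, Any] = {
        "tensorflow_version": None,
        "build_info": None,
        "gpus": [],
    }
    try:
        import tensorflow as tf
    except Exception as exc:  # ImportError, including DLL load failures
        info["import_error"] = repr(exc)
        return info

    info["tensorflow_version"] = tf.__version__

    # Only present in newer TensorFlow releases; skipped when missing.
    get_build_info = getattr(tf.sysconfig, "get_build_info", None)
    if callable(get_build_info):
        info["build_info"] = dict(get_build_info())

    # Also only present in newer releases; skipped when missing.
    get_details = getattr(tf.config.experimental, "get_device_details", None)
    for gpu in tf.config.list_physical_devices("GPU"):
        details = dict(get_details(gpu)) if callable(get_details) else {}
        info["gpus"].append({"name": gpu.name, "details": details})
    return info


if __name__ == "__main__":
    for key, value in collect_tf_gpu_info().items():
        print(f"{key}: {value}")
```

The script only queries metadata and does not launch a GPU kernel, so it should not trigger the fatal error itself. The failing check in the log is an `F` (fatal) message, which aborts the process and cannot be caught with a Python `try`/`except`.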
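【Comments】:
The TensorFlow build you have does not support your GPU.
@talonmies I already meet the TensorFlow requirement (CUDA compute capability > 3.5).
I am facing exactly the same problem. @talonmies, if you can, please tell us which TensorFlow version would be compatible, because the TensorFlow website says version 2.3 works with CUDA 10.1 and cuDNN 7.6?
It doesn't matter whether a particular version of TensorFlow can theoretically support your GPU. What matters is whether the people who built the binary you have installed chose to compile it with support for your GPU. That question needs to be put to whoever built the binaries you have.
@talonmies Could I get TensorFlow running by building it from source with bazel?
【Answer 1】: It looks like this is an issue with Python 3.8 and TensorFlow 2.3. I tried TensorFlow 2.3.0 with Python 3.7, but it failed there with an error involving python38.dll (I don't remember the exact error, and I have already deleted the env). In the end I used Python 3.7 in an Anaconda env, installed TensorFlow 2.1.0 with pip, and it works.
I also posted this issue on GitHub, where it was answered: https://github.com/tensorflow/tensorflow/issues/42052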
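The point made in the comments under the question (that what matters is which GPU architectures the installed binary was compiled for) can be illustrated with a small sketch of CUDA's compatibility rules. It assumes the simplified rules from the CUDA documentation: compiled machine code for `sm_XY` runs only on devices with the same major version and an equal or higher minor version, while embedded PTX for `compute_XY` can be JIT-compiled on any device with an equal or higher compute capability. The architecture lists below are hypothetical and are not the actual build flags of any TensorFlow release; `parse_arch` and `can_run` are names invented for this example.

```python
import re
from typing import Iterable, Tuple

# Matches strings such as "sm_52" or "compute_70".
_ARCH_RE = re.compile(r"^(sm|compute)_(\d+)(\d)$")


def parse_arch(arch: str) -> Tuple[str, int, int]:
    """Split 'sm_52' into ('sm', 5, 2)."""
    match = _ARCH_RE.match(arch.strip())
    if match is None:
        raise ValueError(f"unrecognised architecture string: {arch!r}")
    kind, major, minor = match.groups()
    return kind, int(major), int(minor)


def can_run(device_cc: Tuple[int, int], built_archs: Iterable[str]) -> bool:
    """Return True if a binary built for built_archs has a usable kernel image."""
    dev_major, dev_minor = device_cc
    for arch in built_archs:
        kind, major, minor = parse_arch(arch)
        if kind == "sm":
            # Machine code: same major version, built minor <= device minor.
            if major == dev_major and minor <= dev_minor:
                return True
        else:
            # PTX: usable on any device at or above the PTX version.
            if (major, minor) <= (dev_major, dev_minor):
                return True
    return False


# Hypothetical build that skips compute capability 5.0:
hypothetical_build = ["sm_35", "sm_52", "sm_60", "sm_70", "compute_70"]
print(can_run((5, 0), hypothetical_build))              # False
print(can_run((5, 0), hypothetical_build + ["sm_50"]))  # True
```

Under these assumptions, a GPU with compute capability 5.0 can meet the documented minimum and still find no kernel image in a binary that was built only for other architectures, which matches the error text in the question.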
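【Comments】:
Thanks for the tip! The following combination seems to work for me: GPU: GeForce GTX 750 Ti, python 3.7.8, tf version 2.1.1, cuda-v7.6.5.32
【Answer 2】: According to the screenshot from the TensorFlow Documentation, TensorFlow versions 2.1, 2.2 and 2.3 are listed with cuDNN version 7.4, but the cuDNN version installed on your machine is 7.6. This is most likely the cause of the error.
The solution is to downgrade your cuDNN version. The existing cuDNN version can be uninstalled through the Windows Control Panel, using the Programs and Features widget. A new version of cuDNN can then be installed as shown in the NVIDIA Installation Guide.
Also, please refer to this GitHub Issue for more information on how to downgrade the cuDNN version.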
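One way to check which CUDA runtime and cuDNN libraries TensorFlow actually loaded, as opposed to what is installed, is to read them from the DLL names in the startup log shown in the question. The following is a minimal sketch, assuming the Windows naming pattern seen in that log (`<name>64_<version>.dll`); the function name `loaded_libraries` is made up for this example.

```python
import re
from typing import Dict

# Matches e.g. "Successfully opened dynamic library cudart64_101.dll".
_DLL_RE = re.compile(
    r"Successfully opened dynamic library (\w+?)64_(\d+)\.dll"
)


def loaded_libraries(log_text: str) -> Dict[str, str]:
    """Map library name to the version suffix found in its DLL name."""
    libs: Dict[str, str] = {}
    for name, version in _DLL_RE.findall(log_text):
        libs[name] = version
    return libs


sample_log = """
I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
"""

for name, version in loaded_libraries(sample_log).items():
    print(name, version)
```

In this naming scheme `cudart64_101.dll` corresponds to the CUDA 10.1 runtime, while `cudnn64_7.dll` encodes only the cuDNN major version. The log alone therefore cannot distinguish cuDNN 7.4 from 7.6; the log in the question shows every library loading successfully, with the failure occurring later, at kernel launch.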
【Comments】:
But there is no cuDNN 7.4 for CUDA 10.1. @Tensorflow Support
You can try downgrading the cuDNN version to 7.4.
I tried cuDNN 7.4 for CUDA 10.1 (because there is no cuDNN 7.4 for CUDA 10.1) and it returned the same problem: F .\tensorflow/core/kernels/random_op_gpu.h:232] Non-OK-status: GpuLaunchKernel(FillPhiloxRandomKernelLaunch<Distribution>, num_blocks, block_size, 0, d.stream(), gen, data, size, dist) status: Internal: no kernel image is available for execution on the device (D:\Tensorflow\anaconda) PS D:\Tensorflow\TensorFlow-2.x-YOLOv3>
"I tried cuDNN 7.4 for CUDA 10.1 (because there is no cuDNN 7.4 for CUDA 10.1)" ==> this sentence is a bit confusing. Could you rephrase it? Thanks!
Sorry, what I meant was: I tried cuDNN 7.4 for CUDA 10.0 (because there is no cuDNN 7.4 for CUDA 10.1), and it returned the same problem. @Tensorflow Support
【Answer 3】: I had the same problem; my cuDNN was 8.0.2. As you said, there is no cuDNN 7.4 for CUDA 10.1. So I tried cuDNN 7.5 for CUDA 10.1, and it worked! I hope my experience helps someone. :)
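【Answer 4】: It seems that each cuDNN release is only supported by certain specific versions of TensorFlow.
As a Windows user, this is what I did:
- Check which TensorFlow and CUDA version combinations are compatible (you can select other operating systems on the left).
- As Rock Jefferson said, you can use cuDNN 7.5 with CUDA 10.1. It worked for me. Download here.
Give it a try. I hope it works for you.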
【Comments】:
I ran into this problem with CUDA 10.1 and cuDNN 7.6.5 (the latest version listed as compatible with 10.1). I tried downgrading to cuDNN 7.5.1 but hit the same problem. Same with 7.5.0.