cuDNN_STATUS_ALLOC_FAILED when trying to run a tutorial CNN with tensorflow
Posted: 2021-06-09 01:32:30
Question: I am trying to run a simple Python script that uses a convolutional neural network (CNN). Every time I run the script I get the following error message:
2021-03-10 19:47:03.832061: E tensorflow/stream_executor/cuda/cuda_dnn.cc:328] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
Traceback (most recent call last):
  File "CNN_trial.py", line 17, in <module>
    outputs = tf.nn.conv2d(images,filters,strides = 1,padding = "SAME")
  File "D:\miniconda3\envs\tensorflow\lib\site-packages\tensorflow\python\util\dispatch.py", line 201, in wrapper
    return target(*args, **kwargs)
  File "D:\miniconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 2158, in conv2d_v2
    return conv2d(input, # pylint: disable=redefined-builtin
  File "D:\miniconda3\envs\tensorflow\lib\site-packages\tensorflow\python\util\dispatch.py", line 201, in wrapper
    return target(*args, **kwargs)
  File "D:\miniconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 2264, in conv2d
    return gen_nn_ops.conv2d(
  File "D:\miniconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\gen_nn_ops.py", line 942, in conv2d
    return conv2d_eager_fallback(
  File "D:\miniconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\gen_nn_ops.py", line 1031, in conv2d_eager_fallback
    _result = _execute.execute(b"Conv2D", 1, inputs=_inputs_flat, attrs=_attrs,
  File "D:\miniconda3\envs\tensorflow\lib\site-packages\tensorflow\python\eager\execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above. [Op:Conv2D]
My system is as follows:
Windows 10
AMD Ryzen 7 3700X
16 GB RAM
NVIDIA RTX 2060
Python 3.8.5
TensorFlow 2.4.1
My full code:
from sklearn.datasets import load_sample_image
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

# Load two sample images and scale pixel values to [0, 1]
china = load_sample_image("china.jpg") / 255
flower = load_sample_image("flower.jpg") / 255
images = np.array([china, flower])
batch_size, height, width, channels = images.shape

# Build two 7x7 filters, one per output channel
filters = np.zeros(shape=(7, 7, channels, 2), dtype=np.float32)
filters[:, 3, :, 0] = 1  # vertical line
filters[3, :, :, 1] = 1  # horizontal line

outputs = tf.nn.conv2d(images, filters, strides=1, padding="SAME")

# Show the second feature map of the first image
plt.imshow(outputs[0, :, :, 1], cmap="gray")
plt.show()
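Comments:
You may have another instance of the code still running. If so, that earlier instance is still holding the GPU, and you have to terminate it before you can do what you want.
So should I close every other application that might be running code? I am using VS Code, but I don't have another IDE open running any code.
@BrainE A single VS Code window can have multiple terminals open.
I closed every VS Code window except the one I am working in and still get the same error.
Answer 1:
It turns out I needed to enable memory growth. After adding the following two lines at the start of the script, I at least got it to run.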
devices = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(devices[0],True)
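By default TensorFlow reserves nearly all of the GPU's memory when it first initializes the device, and with memory growth enabled it allocates only as much as it needs. A plausible reading of the error is that cuDNN had too little memory left to create its handle, although the thread does not confirm the exact cause. The two lines above index `devices[0]` directly, which fails on a machine with no visible GPU. Below is a minimal sketch of a more defensive variant; the helper name `enable_memory_growth` is hypothetical and not part of TensorFlow.

```python
import tensorflow as tf


def enable_memory_growth():
    """Request memory growth on every visible GPU and return the GPUs found.

    Hypothetical helper for illustration; call it before any op touches the GPU.
    """
    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
        except RuntimeError as err:
            # Raised when the GPU was already initialized by an earlier op
            print(f"Could not set memory growth for {gpu.name}: {err}")
    return gpus


gpus = enable_memory_growth()
print(f"Memory growth requested for {len(gpus)} GPU(s)")
```

This has to run before the first tensor operation, because the setting cannot be changed once the GPU has been initialized. Setting the environment variable `TF_FORCE_GPU_ALLOW_GROWTH` to `true` before starting Python is another way to turn on the same option without editing the script.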