Tensorflow-gpu failed to get convolution algorithm

【Title】: Tensorflow-gpu failed to get convolution algorithm 【Posted】: 2019-06-18 20:30:36 【Question】:

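I am trying to build a convolutional neural network to analyze Microsoft's cats vs. dogs dataset. I am using tensorflow-gpu 1.12.0, Jupyter Notebook, and Anaconda on Windows 10. My GPU is a GTX 1080. I installed CUDA and cuDNN, and I am fairly sure I set them up correctly; I checked the versions. Here is my code (in Jupyter it is split across several cells).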

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten, Conv2D, MaxPooling2D
import pickle


import sys
print(sys.executable)
print(tf.__version__)


gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction = 0.4)
session = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
print('GPU Settings set')


X = pickle.load(open('X.pickle','rb')) # Brings in the "pictures" of the training set
y = pickle.load(open('y.pickle','rb')) # Brings in the answers


X = X/255.0 # Scales the pixel values so each number is between 0 and 1

print('Data Loaded')

model = Sequential()

model.add(Conv2D(64, (3,3), input_shape = X.shape[1:]))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Conv2D(64, (3,3)))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Flatten())
model.add(Dense(64))

model.add(Dense(1))
model.add(Activation('sigmoid'))

model.compile(loss="binary_crossentropy", optimizer='adam', metrics = ['accuracy'])

model.fit(X, y, batch_size=25, epochs=3, validation_split=0.1)

I get this error:

Train on 22451 samples, validate on 2495 samples
Epoch 1/3
---------------------------------------------------------------------------
UnknownError                              Traceback (most recent call last)
<ipython-input-6-9cef6147c3c5> in <module>
     17 model.compile(loss="binary_crossentropy", optimizer='adam', metrics = ['accuracy'])
     18 
---> 19 model.fit(X, y, batch_size=25, epochs=3, validation_split=0.1)

~\Anaconda3\envs\learning\lib\site-packages\tensorflow\python\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, max_queue_size, workers, use_multiprocessing, **kwargs)
   1637           initial_epoch=initial_epoch,
   1638           steps_per_epoch=steps_per_epoch,
-> 1639           validation_steps=validation_steps)
   1640 
   1641   def evaluate(self,

~\Anaconda3\envs\learning\lib\site-packages\tensorflow\python\keras\engine\training_arrays.py in fit_loop(model, inputs, targets, sample_weights, batch_size, epochs, verbose, callbacks, val_inputs, val_targets, val_sample_weights, shuffle, initial_epoch, steps_per_epoch, validation_steps)
    213           ins_batch[i] = ins_batch[i].toarray()
    214 
--> 215         outs = f(ins_batch)
    216         if not isinstance(outs, list):
    217           outs = [outs]

~\Anaconda3\envs\learning\lib\site-packages\tensorflow\python\keras\backend.py in __call__(self, inputs)
   2984 
   2985     fetched = self._callable_fn(*array_vals,
-> 2986                                 run_metadata=self.run_metadata)
   2987     self._call_fetch_callbacks(fetched[-len(self._fetches):])
   2988     return fetched[:len(self.outputs)]

~\Anaconda3\envs\learning\lib\site-packages\tensorflow\python\client\session.py in __call__(self, *args, **kwargs)
   1437           ret = tf_session.TF_SessionRunCallable(
   1438               self._session._session, self._handle, args, status,
-> 1439               run_metadata_ptr)
   1440         if run_metadata:
   1441           proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

~\Anaconda3\envs\learning\lib\site-packages\tensorflow\python\framework\errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
    526             None, None,
    527             compat.as_text(c_api.TF_Message(self.status.status)),
--> 528             c_api.TF_GetCode(self.status.status))
    529     # Delete the underlying status object from memory otherwise it stays alive
    530     # as there is a reference to status from this from the traceback due to

UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[node conv2d_3/Conv2D = Conv2D[T=DT_FLOAT, _class=["loc:@training_2/Adam/gradients/conv2d_3/Conv2D_grad/Conv2DBackpropFilter"], data_format="NCHW", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](training_2/Adam/gradients/conv2d_3/Conv2D_grad/Conv2DBackpropFilter-0-TransposeNHWCToNCHW-LayoutOptimizer, conv2d_3/Conv2D/ReadVariableOp)]]
     [[node loss_2/activation_7_loss/broadcast_weights/assert_broadcastable/AssertGuard/Assert/Switch/_329 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_321_l...ert/Switch", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

【Comments】:

You may need to add from tensorflow.keras.layers import LSTM, but have you tried running it without per_process_gpu_memory_fraction? Comment out lines 8-10 of the code and test whether it runs, although it may be slower.

@Suleiman I tried it with and without per_process_gpu_memory_fraction and with the LSTM import, but I still get the same error.

Have you tried config.gpu_options.allow_growth = True? That solved my problem.

【Answer 1】:

Hopefully this link solves your problem, because the cuDNN version you installed is incompatible with the cuDNN version TensorFlow was compiled against.

Copy in a new cuDNN library and it should work.
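
Before replacing anything, it may help to see which cuDNN library files are actually visible. The following is a minimal sketch, not part of the original answer: the function name find_cudnn_libraries is hypothetical, and it assumes the usual file naming (for example cudnn64_7.dll for cuDNN 7 on Windows). By default it searches the directories on PATH, which is where Windows looks for DLLs.

```python
import glob
import os


def find_cudnn_libraries(search_dirs=None):
    """Return sorted paths of cuDNN library files found in search_dirs.

    If search_dirs is None, the directories listed in PATH are searched.
    """
    if search_dirs is None:
        search_dirs = os.environ.get("PATH", "").split(os.pathsep)
    # Assumed naming conventions: cudnn64_<major>.dll on Windows,
    # libcudnn.so.<version> on Linux.
    patterns = ("cudnn*.dll", "libcudnn*.so*")
    found = set()
    for directory in search_dirs:
        if not directory:
            continue  # skip empty PATH entries
        for pattern in patterns:
            # glob.escape keeps characters like "[" in the path literal.
            found.update(glob.glob(os.path.join(glob.escape(directory), pattern)))
    return sorted(found)


if __name__ == "__main__":
    for path in find_cudnn_libraries():
        print(path)
```

The version number in the file name can then be compared with the cuDNN version that the installed TensorFlow build expects.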

【Comments】:

That did not work for me. I made sure I have the correct cuDNN version.
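
The allow_growth suggestion from the comments on the question can be sketched as follows for the TF 1.x session API used in the question. This is an illustrative sketch, not a confirmed fix for this error. The tf1 alias is an assumption added so the snippet also runs on later releases, where these names are available under tf.compat.v1. Where per_process_gpu_memory_fraction caps the share of GPU memory the process may use, allow_growth makes TensorFlow allocate GPU memory on demand rather than reserving it up front.

```python
import tensorflow as tf

# Assumption: on TF 1.12 these names live directly under tf; on later
# releases the same API is exposed under tf.compat.v1.
tf1 = tf.compat.v1 if hasattr(tf.compat, "v1") else tf

config = tf1.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory on demand
session = tf1.Session(config=config)

# Register the session with Keras so that model.fit() runs in it.
tf1.keras.backend.set_session(session)
```

This would replace the tf.GPUOptions and tf.Session lines in the question and needs to run before the model is built.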
