OOM when allocating tensor with shape[96,3,299,299] and type float on /job:localhost/replica:0/task:

Posted by lixiaolun

 

Launching the training job on a single GPU fails with an OOM error:

tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[96,3,299,299] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: InceptionV3/InceptionV3/Conv2d_1a_3x3/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer = Transpose[T=DT_FLOAT, Tperm=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](fifo_queue_Dequeue/_1557, PermConstNHWCToNCHW-LayoutOptimizer)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

[[Node: train_op/_1567 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_6943_train_op", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
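To put the failing allocation in perspective, the size of the tensor named in the error can be worked out from its shape. The sketch below is only an illustration: the helper `tensor_bytes` is a hypothetical name, and it assumes "type float" means 4-byte float32, which is what `DT_FLOAT` denotes in the log.

```python
from functools import reduce
from operator import mul


def tensor_bytes(shape, bytes_per_element=4):
    """Return the bytes needed for a dense tensor of the given shape.

    bytes_per_element defaults to 4, assuming float32 (DT_FLOAT).
    """
    return reduce(mul, shape, 1) * bytes_per_element


# Shape taken from the error message: batch of 96 images, 3 channels, 299x299.
shape = [96, 3, 299, 299]
size = tensor_bytes(shape)
print(f"{size} bytes = {size / 2**20:.1f} MiB")
# prints: 102989952 bytes = 98.2 MiB
```

A single tensor of roughly 98 MiB is small next to the total memory of a typical GPU, which suggests the device was already almost full when this allocation was attempted, rather than this one tensor being unusually large.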

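Since the error says the GPU ran out of memory, I switched to 2 GPUs. With 2 GPUs, however, one of them showed idle utilization while its memory was completely full. By default TensorFlow reserves nearly all of the memory on every GPU visible to the process, even a GPU it does no computation on. Adding the following code makes it allocate memory on demand instead: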

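import tensorflow as tf
from keras import backend as K

# Allocate GPU memory on demand instead of reserving all of it at startup
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
K.set_session(sess)  # make Keras use this session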

Reference: https://github.com/keras-team/keras/issues/6031

 
