TensorFlow allocating all memory for any program
Posted: 2018-09-22 23:34:04

Problem description:

OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
TensorFlow installed from (source or binary): binary
TensorFlow version: v1.4.0-rc1
Python version: 3.5.5
CUDA/cuDNN version: CUDA 8.0 / cuDNN 6
GPU model and memory: NVIDIA GTX 1080

I am new to TensorFlow, so this could easily be some silly installation mistake that I am not seeing.
I open Python to test the TF installation:
import tensorflow as tf
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
which results in:
I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX
2018-04-11 21:39:44.830140: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties:
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.8475
pciBusID: 0000:01:00.0
totalMemory: 7.92GiB freeMemory: 78.94MiB
2018-04-11 21:39:44.830178: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-04-11 21:39:44.832231: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 78.94M (82771968 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-04-11 21:39:44.834394: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 71.04M (74494976 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-04-11 21:39:44.835825: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 63.94M (67045632 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-04-11 21:39:44.837560: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 57.55M (60341248 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-04-11 21:39:44.839233: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 51.79M (54307328 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-04-11 21:39:44.841757: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 46.61M (48876800 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-04-11 21:39:44.843632: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 41.95M (43989248 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-04-11 21:39:44.845588: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 37.76M (39590400 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-04-11 21:39:44.847229: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 33.98M (35631360 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-04-11 21:39:44.849278: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 30.58M (32068352 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-04-11 21:39:44.850967: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 27.52M (28861696 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 6037705122138393497
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 82771968
locality {
  bus_id: 1
}
incarnation: 11403601020071115295
physical_device_desc: "device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1"
]
Comments:
What is your actual question? Do you want the memory to be allocated, or do you want to limit how much gets allocated?

Answer 1:

Assuming your question is "Why does TensorFlow allocate all available GPU memory even though far less would be enough for my program?", the answer is that it does this by design to reduce GPU memory fragmentation. You can change this default behavior with settings such as config.gpu_options.allow_growth and config.gpu_options.per_process_gpu_memory_fraction, which reduce TensorFlow's memory consumption at the cost of allowing some potential memory fragmentation. See the "Using GPUs" chapter of the TensorFlow Programmer's Guide for a detailed explanation.
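A minimal sketch of how these two options are set in the TF 1.x API that matches the versions above; the 0.4 fraction is only an illustrative value, and normally you would pick one of the two options rather than both:

import tensorflow as tf

config = tf.ConfigProto()
# Grow the GPU allocation on demand instead of grabbing all memory up front.
config.gpu_options.allow_growth = True
# Alternatively, cap this process at a fixed fraction of the GPU's total memory.
config.gpu_options.per_process_gpu_memory_fraction = 0.4

with tf.Session(config=config) as sess:
    # Any small op will do; this just confirms the session starts with the options applied.
    print(sess.run(tf.constant("session created with custom GPU options")))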