tensorflow memory consumption keeps increasing
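【Posted】: 2021-03-10 12:20:17
【Question】: I'm currently optimizing the hyperparameters of a CNN in tensorflow.keras. I iteratively create models, train them, record the results, and discard them. This works for a few hours and lets me train more than 30 models without failure. However, if I let it run long enough, each iteration consumes more and more memory, which eventually causes a crash. Is there a way to mitigate this?
Example snippet: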
import datetime
import os
import time

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten
from tensorflow.keras.layers import Conv1D, MaxPooling1D

# trainX, trainy, testX, testy are assumed to be loaded elsewhere (not shown in the question)
verbose, epochs, batch_size = 1, 15, 32

# hyperparameter grid
CONV_QUANTS = [2, 4, 6]
DENSE_QUANTS = [0, 1, 2]
DENSE_SIZES = [16, 32, 64]
KERNAL_SIZES = [3, 9, 15]
FILT_QUANTS = [16, 32, 64]
POOL_SIZES = [2, 4, 6]

testName = 'test_{}'.format(round(time.time()))
for convQuant in CONV_QUANTS:
    for denseQuant in DENSE_QUANTS:
        for denseSize in DENSE_SIZES:
            for kernalSize in KERNAL_SIZES:
                for filtQuant in FILT_QUANTS:
                    for poolSize in POOL_SIZES:
                        # defining name
                        name = 'conv{}_dense{}_dSize{}_kSize{}_filtQuant{}_pSize{}_dt{}'.format(
                            convQuant,
                            denseQuant,
                            denseSize,
                            kernalSize,
                            filtQuant,
                            poolSize,
                            datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
                        print(name)

                        # defining log
                        logdir = os.path.join("logs", testName, name)
                        tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)

                        # initializing model
                        model = Sequential()

                        # input convolutional layer
                        model.add(Conv1D(filters=filtQuant, kernel_size=kernalSize, activation='relu', input_shape=trainX[0].shape))
                        model.add(Dropout(0.1))
                        model.add(MaxPooling1D(pool_size=poolSize))

                        # additional convolutional layers
                        for _ in range(convQuant - 1):
                            model.add(Conv1D(filters=filtQuant, kernel_size=kernalSize, activation='relu'))
                            model.add(Dropout(0.1))
                            model.add(MaxPooling1D(pool_size=poolSize))

                        # dense layers
                        model.add(Flatten())
                        for _ in range(denseQuant):
                            model.add(Dense(denseSize, activation='relu'))
                            model.add(Dropout(0.5))

                        # output
                        model.add(Dense(2, activation='softmax'))

                        # training
                        model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
                        model.fit(trainX, trainy, epochs=epochs, batch_size=batch_size, verbose=verbose,
                                  validation_data=(testX, testy), callbacks=[tensorboard_callback])

                        # calculating accuracy
                        _, accuracy = model.evaluate(testX, testy, batch_size=batch_size, verbose=0)
                        accuracy = accuracy * 100.0
                        print('accuracy: {}'.format(accuracy))
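【Answer 1】: Keras manages a global state, which it uses, among other things, to generate unique layer names. If you create many models in a loop, this global state consumes more and more memory over time, and you may want to clear it. Calling clear_session() releases the global state, which helps avoid clutter from old models and layers, especially when memory is limited.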
import tensorflow as tf

for _ in range(100):
    # Without `clear_session()`, each iteration of this loop will
    # slightly increase the size of the global state managed by Keras
    model = tf.keras.Sequential([tf.keras.layers.Dense(10) for _ in range(10)])

for _ in range(100):
    # With `clear_session()` called at the beginning,
    # Keras starts with a blank state at each iteration
    # and memory consumption is constant over time.
    tf.keras.backend.clear_session()
    model = tf.keras.Sequential([tf.keras.layers.Dense(10) for _ in range(10)])
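One visible trace of that global state is the counter Keras uses for auto-generated layer names. The following is a minimal sketch, not part of the original answer, that shows the counter advancing and then starting over after `clear_session()`. The helper `next_dense_name` is a hypothetical name introduced only for this illustration.

```python
import tensorflow as tf


def next_dense_name():
    """Return the auto-generated name Keras assigns to a new Dense layer.

    Hypothetical helper, used here only to observe the global name counter.
    """
    return tf.keras.layers.Dense(10).name


# Create a few layers so the global name counter advances.
for _ in range(3):
    tf.keras.layers.Dense(10)

name_before = next_dense_name()  # carries a numeric suffix at this point

# Reset the global state managed by Keras.
tf.keras.backend.clear_session()

name_after = next_dense_name()  # the counter has started over

print("before clear_session():", name_before)
print("after clear_session(): ", name_after)
```

The name counter itself is tiny. It is shown here only as an easy way to see that the state persists across models until it is cleared.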
More details about this function can be found in the Keras documentation for tf.keras.backend.clear_session().
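Applied to the question's search, the same idea could look roughly like the sketch below. It assumes the per-model work (build, compile, fit, evaluate) is wrapped in a callable supplied by the caller. `run_grid` and `train_one` are hypothetical names, not part of TensorFlow or of the original post. The six nested loops are flattened with `itertools.product`, and the `gc.collect()` call is an optional extra that the answer does not mention.

```python
import gc
import itertools

import tensorflow as tf

# Same search space as in the question.
CONV_QUANTS = [2, 4, 6]
DENSE_QUANTS = [0, 1, 2]
DENSE_SIZES = [16, 32, 64]
KERNAL_SIZES = [3, 9, 15]
FILT_QUANTS = [16, 32, 64]
POOL_SIZES = [2, 4, 6]

# Every combination as a tuple:
# (convQuant, denseQuant, denseSize, kernalSize, filtQuant, poolSize)
GRID = list(itertools.product(CONV_QUANTS, DENSE_QUANTS, DENSE_SIZES,
                              KERNAL_SIZES, FILT_QUANTS, POOL_SIZES))


def run_grid(configs, train_one):
    """Call train_one(config) for each config, resetting Keras state first.

    `train_one` is a hypothetical callable that builds, trains and
    evaluates one model and returns whatever should be recorded.
    """
    results = []
    for config in configs:
        # Drop the global state left behind by the previous model.
        tf.keras.backend.clear_session()
        results.append((config, train_one(config)))
        # Optional: ask Python to reclaim unreachable objects right away.
        gc.collect()
    return results
```

A call such as `run_grid(GRID, train_one)` would then visit all 3^6 = 729 configurations, with `train_one` containing the model-building and training code from the question. Returning only plain numbers (for example the accuracy) from `train_one`, rather than the model itself, keeps old models from being held alive by the results list.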