How to use Tensorflow dataset API with training and validation sets
Posted: 2018-05-01 14:11:44
Question: A simple task at hand: run training for N epochs, computing the exact validation accuracy after each epoch. The epoch size can be either the full training set or some predefined number of iterations. During validation, every validation input must be evaluated exactly once.
What is the best way to combine one_shot_iterators, initializable iterators, and/or handles for this task?
Here is a scaffold of how I think it should work:
import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

def build_training_dataset():
    pass

def build_validation_dataset():
    pass

def construct_train_op(dataset):
    pass

def magic(training_dataset, validation_dataset):
    pass

USE_CUSTOM_EPOCH_SIZE = True
CUSTOM_EPOCH_SIZE = 60
MAX_EPOCHS = 100

training_dataset = build_training_dataset()
validation_dataset = build_validation_dataset()

# Magic goes here to build a nice one-instance dataset
dataset = magic(training_dataset, validation_dataset)
train_op = construct_train_op(dataset)

# Run N epochs in which the training dataset is traversed, followed by the
# validation dataset.
with tf.Session() as sess:
    for epoch in range(MAX_EPOCHS):
        # train
        if USE_CUSTOM_EPOCH_SIZE:
            for _ in range(CUSTOM_EPOCH_SIZE):
                sess.run(train_op)
        else:
            while True:
                # I guess something like this:
                try:
                    sess.run(train_op)
                except tf.errors.OutOfRangeError:
                    break  # we are done with the epoch
        # validation
        validation_predictions = []
        while True:
            try:
                # same op, but fed with validation data this time
                validation_predictions = np.append(validation_predictions, sess.run(train_op))
            except tf.errors.OutOfRangeError:
                print('epoch %d finished with accuracy: %f' % (epoch, validation_predictions.mean()))
                break
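A note on the "exact validation accuracy" requirement: when the validation set size is not a multiple of the batch size, averaging per-batch accuracies over-weights the last, smaller batch. Below is a minimal framework-free sketch, assuming each validation step yields a (correct count, batch size) pair. The helper `exact_accuracy` and the sample numbers are hypothetical and not part of the original post.

```python
def exact_accuracy(batch_results):
    """Return accuracy over all examples from (num_correct, batch_size) pairs."""
    total_correct = 0
    total_seen = 0
    for num_correct, batch_size in batch_results:
        total_correct += num_correct
        total_seen += batch_size
    if total_seen == 0:
        raise ValueError("no validation examples were evaluated")
    return total_correct / total_seen


# Hypothetical per-batch results for 145 examples in batches of 64, 64 and 17.
results = [(48, 64), (48, 64), (17, 17)]

# Mean of per-batch accuracies: treats the 17-example batch like a full one.
naive = sum(correct / size for correct, size in results) / len(results)

# Accuracy over individual examples: every input counts exactly once.
exact = exact_accuracy(results)
```

In this made-up case the naive mean of batch means is higher than the per-example accuracy, because the small final batch happens to be fully correct.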
Answer 1: Since the solution turned out to be much messier than I expected, it comes in 2 pieces:
0) Helper code shared by both examples:
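import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

USE_CUSTOM_EPOCH_SIZE = True
CUSTOM_EPOCH_SIZE = 60
MAX_EPOCHS = 100
TRAIN_SIZE = 500
VALIDATION_SIZE = 145
BATCH_SIZE = 64

def construct_train_op(batch):
    # Placeholder "model": simply returns the input batch.
    return batch

def build_train_dataset():
    # Integers 0..TRAIN_SIZE-1 with random noise added, grouped into batches.
    return tf.data.Dataset.range(TRAIN_SIZE) \
        .map(lambda x: x + tf.random_uniform([], -10, 10, tf.int64)) \
        .batch(BATCH_SIZE)

def build_test_dataset():
    # Integers 0..VALIDATION_SIZE-1, grouped into batches.
    return tf.data.Dataset.range(VALIDATION_SIZE) \
        .batch(BATCH_SIZE)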
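With these constants, `.batch(BATCH_SIZE)` keeps the final partial batch by default, which is what lets the assertions on `TRAIN_SIZE` and `VALIDATION_SIZE` in the examples hold. A small plain-Python sketch of the resulting batch sizes follows; `batch_sizes` is a hypothetical helper for illustration, not a TensorFlow function.

```python
def batch_sizes(num_examples, batch_size):
    """List the sizes of consecutive batches when the remainder is kept."""
    full_batches, remainder = divmod(num_examples, batch_size)
    sizes = [batch_size] * full_batches
    if remainder:
        sizes.append(remainder)  # final, smaller batch
    return sizes


train_batches = batch_sizes(500, 64)        # TRAIN_SIZE, BATCH_SIZE
validation_batches = batch_sizes(145, 64)   # VALIDATION_SIZE, BATCH_SIZE
```

So one pass over the training data takes 8 steps (seven batches of 64 and one of 52), and one validation pass takes 3 steps (64, 64, 17).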
1) For an epoch equal to the size of the training dataset:
# datasets construction
training_dataset = build_train_dataset()
validation_dataset = build_test_dataset()

# handle construction. A handle lets us feed data from different datasets
# by passing a value in feed_dict
handle = tf.placeholder(tf.string, shape=[])
iterator = tf.data.Iterator.from_string_handle(
    handle, training_dataset.output_types, training_dataset.output_shapes)
next_element = iterator.get_next()
train_op = construct_train_op(next_element)

training_iterator = training_dataset.make_initializable_iterator()
validation_iterator = validation_dataset.make_initializable_iterator()

with tf.Session() as sess:
    training_handle = sess.run(training_iterator.string_handle())
    validation_handle = sess.run(validation_iterator.string_handle())
    for epoch in range(MAX_EPOCHS):
        # train
        sess.run(training_iterator.initializer)
        total_in_train = 0
        while True:
            try:
                train_output = sess.run(train_op, feed_dict={handle: training_handle})
                total_in_train += len(train_output)
            except tf.errors.OutOfRangeError:
                assert total_in_train == TRAIN_SIZE
                break  # we are done with the epoch
        # validation
        validation_predictions = []
        sess.run(validation_iterator.initializer)
        while True:
            try:
                pred = sess.run(train_op, feed_dict={handle: validation_handle})
                validation_predictions = np.append(validation_predictions, pred)
            except tf.errors.OutOfRangeError:
                assert len(validation_predictions) == VALIDATION_SIZE
                print('Epoch %d finished with accuracy: %f' % (epoch, np.mean(validation_predictions)))
                break
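The control flow of this approach can be mimicked without TensorFlow, which may make the role of the handle easier to see: a single consumer reads from whichever iterator the handle selects, and each iterator is re-created at the start of its pass. This is only an analogy in plain Python. `FeedableSource`, `initialize`, `get_next`, and `run_pass` are hypothetical names, not TensorFlow APIs.

```python
class FeedableSource:
    """Toy stand-in for a handle-selected iterator (not a TensorFlow class)."""

    def __init__(self, datasets):
        self._datasets = datasets   # handle -> list of batches
        self._iterators = {}        # handle -> live Python iterator

    def initialize(self, handle):
        # Comparable in spirit to running an iterator's initializer.
        self._iterators[handle] = iter(self._datasets[handle])

    def get_next(self, handle):
        # Raises StopIteration once the selected iterator is exhausted.
        return next(self._iterators[handle])


def run_pass(source, handle):
    """Initialize the selected iterator and drain it exactly once."""
    source.initialize(handle)
    seen = []
    while True:
        try:
            seen.extend(source.get_next(handle))
        except StopIteration:
            return seen


source = FeedableSource({
    "train": [[0, 1, 2], [3, 4]],
    "validation": [[10, 11], [12]],
})
for epoch in range(2):
    train_seen = run_pass(source, "train")
    validation_seen = run_pass(source, "validation")
```

Each pass visits every element of the selected dataset once, and the same `get_next` call serves both datasets depending on the handle passed in.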
2) For a custom epoch size:
# datasets construction
training_dataset = build_train_dataset().repeat()  # CHANGE 1
validation_dataset = build_test_dataset()

# handle construction. A handle lets us feed data from different datasets
# by passing a value in feed_dict
handle = tf.placeholder(tf.string, shape=[])
iterator = tf.data.Iterator.from_string_handle(
    handle, training_dataset.output_types, training_dataset.output_shapes)
next_element = iterator.get_next()
train_op = construct_train_op(next_element)

training_iterator = training_dataset.make_one_shot_iterator()  # CHANGE 2
validation_iterator = validation_dataset.make_initializable_iterator()

with tf.Session() as sess:
    training_handle = sess.run(training_iterator.string_handle())
    validation_handle = sess.run(validation_iterator.string_handle())
    for epoch in range(MAX_EPOCHS):
        # train
        # CHANGE 3: no initialization, no try/except
        for _ in range(CUSTOM_EPOCH_SIZE):
            train_output = sess.run(train_op, feed_dict={handle: training_handle})
        # validation
        validation_predictions = []
        sess.run(validation_iterator.initializer)
        while True:
            try:
                pred = sess.run(train_op, feed_dict={handle: validation_handle})
                validation_predictions = np.append(validation_predictions, pred)
            except tf.errors.OutOfRangeError:
                assert len(validation_predictions) == VALIDATION_SIZE
                print('Epoch %d finished with accuracy: %f' % (epoch, np.mean(validation_predictions)))
                break
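A side effect of `.repeat()` combined with a fixed step count is that epochs no longer line up with passes over the training data, because the iterator keeps its position between epochs. The plain-Python sketch below uses `itertools.cycle` as a rough stand-in for a repeated dataset. The batch indices and the helper name `epoch_batches` are illustrative assumptions.

```python
import itertools

BATCHES_PER_PASS = 8     # ceil(500 / 64) with the constants above
CUSTOM_EPOCH_SIZE = 60   # training steps per epoch

# Endless stream of batch indices 0..7, 0..7, ...
stream = itertools.cycle(range(BATCHES_PER_PASS))


def epoch_batches(stream, steps):
    """Take the next `steps` batch indices without resetting the stream."""
    return list(itertools.islice(stream, steps))


first_epoch = epoch_batches(stream, CUSTOM_EPOCH_SIZE)
second_epoch = epoch_batches(stream, CUSTOM_EPOCH_SIZE)
```

Here the second epoch starts at batch index 4 rather than 0, since 60 is not a multiple of 8. Whether that matters depends on the training setup; it is usually harmless when the data is shuffled.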
Comments:
Are you sure the training iterator needs to be reinitialized at every epoch (approach 1)?
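On the re-initialization question: in TensorFlow 1.x, once an initializable iterator has raised `tf.errors.OutOfRangeError` it stays exhausted until its initializer is run again, so approach 1 does appear to need the per-epoch `sess.run(training_iterator.initializer)`. Python's own iterators behave in a comparable way, which the following analogy (not TensorFlow code; `drain` is a hypothetical helper) illustrates.

```python
batches = [[0, 1], [2, 3], [4]]


def drain(iterator):
    """Consume an iterator to the end and return how many batches it yielded."""
    count = 0
    for _ in iterator:
        count += 1
    return count


iterator = iter(batches)
first_pass = drain(iterator)    # yields every batch
second_pass = drain(iterator)   # same exhausted iterator: yields nothing

iterator = iter(batches)        # "re-initialize" by creating a fresh iterator
third_pass = drain(iterator)    # yields every batch again
```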