两个tf.data.Dataset可以共存并由tf.cond()控制

Posted

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了两个tf.data.Dataset可以共存并由tf.cond()控制相关的知识,希望对你有一定的参考价值。

我在我的Graph中设置了两个Dataset管道,用于train / test = 9:1设置,并且控制流量为tf.cond。我遇到了一个问题,即在训练期间,每个步骤都会激活两个管道。测试装置在列车组之前耗尽,因为它在训练期间较少。

OutOfRangeError(参见上面的回溯):序列结束

首先,将输入管道嵌套在一个函数中:

def input_pipeline(*args):
    ...
    # construct iterator
    it = batch.make_initializable_iterator()
    iter_init_op = it.initializer

    # get next img and label
    X_it, y_it = it.get_next()
    inputs = {'img': X_it, 'label': y_it, 'iterator_init_op': iter_init_op}
    return inputs

然后发起:

train_input = input_pipeline(args)
test_input = input_pipeline(args)

在模型中,我们使用feed_dict放置一个占位符来填充条件,这不会影响性能:

...
def f1(): return train_input
def f2(): return test_input
cond_pl = tf.placeholder(tf.string, name='cond_pl')
input = tf.cond(tf.equal(cond_pl, 'train'), lambda: f1(), lambda: f2())
...

在会议中:

for ep in range(nb_ep):
   ...
   for step in range(ep_len):
       print('step:{}\r'.format(step))
       try:
           sess.run([train_op], feed_dict={cond_pl: 'train'})
           if step % step_len == (step_len - 1):
               sess.run([test_op], feed_dict={cond_pl: 'test'})
       except tf.errors.OutOfRangeError:
           raise('drop the remainder')
   ...

如果条件适合,我怎样才能让输入管道get_next()被调用?


根据@sharky回答更新的片段:

def write_h5(args):
    x, is_training = args
    with h5py.File('./{}_{}.h5'.format('train' if is_training else 'test', x), 'w') as f:
        h = w = np.arange(-1, 1, 0.02)
        hh, _ = np.meshgrid(h, w)
        a = hh ** 2
        b = np.add(a + 1, np.random.randn(100, 100))  #do something and add gaussian noise
        f.create_dataset('X', shape=(100, 100), dtype='float32', data=a)
        f.create_dataset('y', shape=(100, 100), dtype='float32', data=b)


def input_pipeline(window_size, batch_size, is_train=True, ncores=mp.cpu_count()):
    flist = []
    for dirpath, _, fnames in os.walk('./'):
        for fname in fnames:
            if fname.startswith('train' if is_train else 'test') and fname.endswith('.h5'):
                print(fname)
                flist.append((os.path.abspath(os.path.join(dirpath, fname)), str(window_size)))
    f_len = len(flist)
    print(f_len)
    # init list of files
    batch = tf.data.Dataset.from_tensor_slices((tf.constant(flist)))
    batch = batch.map(_pyfn_wrapper, num_parallel_calls=ncores)
    batch = batch.shuffle(batch_size).batch(batch_size, drop_remainder=True).prefetch(ncores).repeat()

    # construct iterator
    it = batch.make_initializable_iterator()
    iter_init_op = it.initializer

    # get next img and label
    X_it, y_it = it.get_next()
    inputs = {'img': X_it, 'label': y_it, 'iterator_init_op': iter_init_op}
    return inputs, f_len


def _pyfn_wrapper(args):
    return tf.py_func(parse_h5,  #wrapped pythonic function
                      [args],
                      [tf.float32, tf.float32]  #[input, output] dtype
                      )

def parse_h5(args):
    name, window_size = args
    window_size = int(window_size.decode('utf-8'))
    with h5py.File(name, 'r') as f:
        X = f['X'][:].reshape(window_size, window_size, 1)
        y = f['y'][:].reshape(window_size, window_size, 1)
        return X, y


# init data
p = mp.Pool(mp.cpu_count())
p.map(write_h5, zip(range(9000), repeat(True)))
p.map(write_h5, zip(range(1000), repeat(False)))

# hparam
ep_len = 90
step_len = 9  # run test_op after 9 steps

# create tf.data.Dataset
train_input, train_len = input_pipeline(100, 5, is_train=True)
test_input, test_len = input_pipeline(100, 5, is_train=False)


# draw graph
def f1(): return train_input
def f2(): return test_input


cond_pl = tf.placeholder(tf.string, shape=None, name='cond_pl')
input = tf.cond(tf.equal(cond_pl, 'train'), lambda: f1(), lambda: f2())  # I thou

with tf.name_scope("Conv1"):
    W = tf.get_variable("W", shape=[3, 3, 1, 1],
                         initializer=tf.contrib.layers.xavier_initializer())
    b = tf.get_variable("b", shape=[1], initializer=tf.contrib.layers.xavier_initializer())
    layer1 = tf.nn.conv2d(input['img'], W, strides=[1, 1, 1, 1], padding='SAME') + b
    logits = tf.nn.relu(layer1)

loss = tf.reduce_mean(tf.losses.mean_squared_error(labels=input['label'], predictions=logits))
train_op = tf.train.AdamOptimizer(learning_rate=0.0001).minimize(loss)
test_op = print(loss)
#

# session
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for ep in range(5):
        print('ep:{}'.format(ep))
        sess.run(input['iterator_init_op'], feed_dict={cond_pl: 'train'})
        sess.run(input['iterator_init_op'], feed_dict={cond_pl: 'test'})
        for step in range(ep_len):
            print('step:{}\r'.format(step))
            try:
                sess.run([train_op], feed_dict={cond_pl: 'train'})
                if step % step_len == (step_len - 1):
                    sess.run([test_op], feed_dict={cond_pl: 'test'})
            except tf.errors.OutOfRangeError:
                raise('drop the remainder')
答案

考虑示例:

train = np.arange(90)
test = np.arange(10)

train_ds = tf.data.Dataset.from_tensor_slices(train).shuffle(10).batch(10).repeat()
test_ds = tf.data.Dataset.from_tensor_slices(test).shuffle(10).batch(10).repeat()

train_iterator = train_ds.make_initializable_iterator()
test_iterator = test_ds.make_initializable_iterator()

with tf.Session() as sess:
    sess.run(train_iterator.initializer)
    sess.run(test_iterator.initializer)
    for i in range(len(train) + 1):
        print(sess.run(train_iterator.get_next()))
        if i % 9 == 8:
            print(sess.run(test_iterator.get_next()))

两个数据集,两个迭代器,都在启动时初始化。当i超过数据集的长度时,由于repeat(),它开始重复它们。如果它将被num_epochs调用或根本没有被调用,那么你将得到序列的结束。如果由于某种原因你需要/想要使用cond,也许这个答案会有所帮助

How to use Tensorflow's tf.cond() with two different Dataset iterators without iterating both?

以上是关于两个tf.data.Dataset可以共存并由tf.cond()控制的主要内容,如果未能解决你的问题,请参考以下文章

提供给 `tf.data.Dataset.from_generator(...)` 的 map 函数可以解析张量对象吗?

如何在 tf.data.Dataset 中输入不同大小的列表列表

如何将 tf.data.Dataset 与 kedro 一起使用?

建议在 tensorflow 2.0 中调试 `tf.data.Dataset` 操作

tf.keras 模型 多个输入 tf.data.Dataset

如何在 tf.data.Dataset.map 中使用 sklearn.preprocessing?