How to combine feature_columns, model_to_estimator and the dataset API in TensorFlow

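I have a toy example that uses the high-level TensorFlow APIs: tf.estimator, tf.data and tf.feature_column. I want to swap the canned estimator for a Keras model using tf.keras.estimator.model_to_estimator. I can build an estimator from the Keras model, but then I get an error about the names and shapes of the inputs. I think the input shape of the Keras model is wrong, because input_fn passes all of the data rather than the feature columns. In other words, I am not sure how to connect the feature columns to the Keras model.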

Here are the relevant parts of the working code:

...
col1 = categorical_column_with_vocabulary_list('col1', [1, 2, 3])
col1_ind = C.indicator_column(col1)

col2 = numeric_column('col2')

...

estimator = E.DNNClassifier(
    feature_columns=[col1_ind, col2],
    hidden_units=[10])

...

def input_fn(features, labels, batch_size):
    dataset = D.Dataset.from_tensor_slices((dict(features),
                                            labels))
    dataset = dataset.shuffle(1000).repeat().batch(batch_size)
    return dataset

...

train_and_evaluate(estimator, train_spec, eval_spec)

I run into problems when I try to replace the DNNClassifier with the following:

model = tf.keras.models.Sequential()
model.add(L.Dense(10, activation='relu', input_dim=9))
....

model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

estimator = model_to_estimator(keras_model=model)

In that case I get the following error message:

INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after 600 secs (eval_spec.throttle_secs) or training is finished.
INFO:tensorflow:Calling model_fn.
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-82-0242f6f379fc> in <module>()
----> 1 E.train_and_evaluate(estimator, train_spec, eval_spec)

~/.local/lib/python3.5/site-packages/tensorflow/python/estimator/training.py in train_and_evaluate(estimator, train_spec, eval_spec)
    437         '(with task id 0).  Given task id {}'.format(config.task_id))
    438 
--> 439   executor.run()
    440 
    441 

~/.local/lib/python3.5/site-packages/tensorflow/python/estimator/training.py in run(self)
    516         config.task_type != run_config_lib.TaskType.EVALUATOR):
    517       logging.info('Running training and evaluation locally (non-distributed).')
--> 518       self.run_local()
    519       return
    520 

~/.local/lib/python3.5/site-packages/tensorflow/python/estimator/training.py in run_local(self)
    648           input_fn=self._train_spec.input_fn,
    649           max_steps=self._train_spec.max_steps,
--> 650           hooks=train_hooks)
    651 
    652       # Final export signal: For any eval result with global_step >= train

~/.local/lib/python3.5/site-packages/tensorflow/python/estimator/estimator.py in train(self, input_fn, hooks, steps, max_steps, saving_listeners)
    353 
    354     saving_listeners = _check_listeners_type(saving_listeners)
--> 355     loss = self._train_model(input_fn, hooks, saving_listeners)
    356     logging.info('Loss for final step: %s.', loss)
    357     return self

~/.local/lib/python3.5/site-packages/tensorflow/python/estimator/estimator.py in _train_model(self, input_fn, hooks, saving_listeners)
    822       worker_hooks.extend(input_hooks)
    823       estimator_spec = self._call_model_fn(
--> 824           features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
    825 
    826       if self._warm_start_settings:

~/.local/lib/python3.5/site-packages/tensorflow/python/estimator/estimator.py in _call_model_fn(self, features, labels, mode, config)
    803 
    804     logging.info('Calling model_fn.')
--> 805     model_fn_results = self._model_fn(features=features, **kwargs)
    806     logging.info('Done calling model_fn.')
    807 

~/.local/lib/python3.5/site-packages/tensorflow/python/keras/_impl/keras/estimator.py in model_fn(features, labels, mode)
    317     """model_fn for keras Estimator."""
    318     model = _clone_and_build_model(mode, keras_model, custom_objects, features,
--> 319                                    labels)
    320     # Get inputs to EstimatorSpec
    321     predictions = dict(zip(model.output_names, model.outputs))

~/.local/lib/python3.5/site-packages/tensorflow/python/keras/_impl/keras/estimator.py in _clone_and_build_model(mode, keras_model, custom_objects, features, labels)
    251     input_tensors = _create_ordered_io(keras_model,
    252                                        estimator_io=features,
--> 253                                        is_input=True)
    254   # Get list of outputs.
    255   if labels is None:

~/.local/lib/python3.5/site-packages/tensorflow/python/keras/_impl/keras/estimator.py in _create_ordered_io(keras_model, estimator_io, is_input)
     94             'It needs to match one '
     95             'of the following: %s' % ('input' if is_input else 'output', key,
---> 96                                       ', '.join(keras_io_names)))
     97       tensors = [_cast_tensor_to_floatx(estimator_io[io_name])
     98                  for io_name in keras_io_names]

ValueError: Cannot find input with name "col1" in Keras Model. It needs to match one of the following: dense_1_input
Answer

To connect feature_columns to an estimator created with model_to_estimator(keras_model=model), the keys of the features dict returned by input_fn must match the names of the model's input layers.
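The name matching behind the ValueError in the question can be pictured with a small pure-Python sketch. It is a simplified illustration, not TensorFlow's implementation: check_feature_keys and sequential_input_name are hypothetical helpers, and the error text is copied from the traceback above.

```python
# Simplified illustration of the key/name check; NOT TensorFlow's own code.
# Both helpers below are hypothetical names introduced for this sketch.

def sequential_input_name(first_layer_name):
    """Input name a Sequential model derives from its first layer's name."""
    return first_layer_name + "_input"


def check_feature_keys(feature_keys, keras_input_names):
    """Raise ValueError for the first feature key the model does not know."""
    for key in feature_keys:
        if key not in keras_input_names:
            raise ValueError(
                'Cannot find input with name "%s" in Keras Model. '
                'It needs to match one of the following: %s'
                % (key, ', '.join(keras_input_names)))
    return True


# Keys that match the model's input names pass the check.
check_feature_keys(["dense_1_input"], [sequential_input_name("dense_1")])

# The question's keys ("col1", "col2") do not match, so the check fails.
try:
    check_feature_keys(["col1", "col2"], ["dense_1_input"])
except ValueError as err:
    print(err)
```

In the question, input_fn yields the keys col1 and col2 while the model only knows dense_1_input, which is why the lookup fails on col1.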

For example, your input_fn() might look like this:

def input_fn(features, labels, batch_size):
    dataset = D.Dataset.from_tensor_slices((dict(features), labels))
    dataset = dataset.shuffle(1000).repeat().batch(batch_size)
    iterator = dataset.make_initializable_iterator()
    tf.add_to_collection(
        tf.GraphKeys.TABLE_INITIALIZERS, iterator.initializer)
    features, labels = iterator.get_next()
    return {"dense_1_input": features}, labels

So whatever the input layer is called in your Keras model, the key returned by input_fn must be that name with _input appended:

model = tf.keras.models.Sequential()
model.add(L.Dense(10, activation='relu', input_dim=9, name="MY_NAME"))

def input_fn(features, labels, batch_size):
    ...
    return {"MY_NAME_input": features}, labels
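Matching the key is only half of the asker's concern: the Dense layer expects one dense vector per example, whereas the feature dict holds raw columns. As a rough pure-Python sketch (not TensorFlow code), this is what the question's two columns amount to once an indicator column and a numeric column are applied and concatenated. The helpers one_hot, to_dense_row and pack_for_keras are hypothetical. The sketch covers only the two columns shown, so it yields 4 values per row; the question's input_dim=9 suggests the elided columns add more width.

```python
# Pure-Python illustration of densifying raw columns; NOT TensorFlow code.
# All helper names are hypothetical and exist only for this sketch.

VOCAB = [1, 2, 3]  # vocabulary of col1, taken from the question


def one_hot(value, vocabulary):
    """Indicator encoding; values outside the vocabulary become all zeros."""
    return [1.0 if value == v else 0.0 for v in vocabulary]


def to_dense_row(row):
    """Concatenate the one-hot encoding of col1 with the numeric col2."""
    return one_hot(row["col1"], VOCAB) + [float(row["col2"])]


def pack_for_keras(rows, input_name="dense_1_input"):
    """Return a features dict keyed by the Keras input name."""
    return {input_name: [to_dense_row(r) for r in rows]}


print(pack_for_keras([{"col1": 2, "col2": 0.5}]))
```

In a real pipeline this transformation would be applied to tensors inside input_fn or inside the model, not to Python lists.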
Another answer

Some example code:

import tensorflow as tf
from tensorflow.python.feature_column import feature_column_v2 as fc

feature_layer = fc.FeatureLayer(your_feature_columns)

model = tf.keras.Sequential([
  feature_layer,
  tf.keras.layers.Dense(128, activation=tf.nn.relu),
  tf.keras.layers.Dense(64, activation=tf.nn.relu),
  tf.keras.layers.Dense(1, activation=tf.nn.sigmoid)
])

See feature_cols_keras for reference.
