Tensorflow 2.0 - LSTM statefulness and input size
Posted: 2020-09-10 17:42:36
Question: For a specific problem in reinforcement learning (inspired by this paper), I am using an RNN with LSTM cells. It is fed data of shape (batch_size, time_steps, features) = (1,1,1) for L data points, after which a "cycle" ends. I set lstm.stateful = True, and once the L points have been fed to the network I call lstm.reset_states().
Between one cycle and the next, after calling lstm.reset_states(), I want to evaluate the network's output on input data of shape (batch_size, time_steps, features) = (L,1,1), and then go back to using the RNN with batch_size = 1 inputs.
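For illustration, the per-step cycle described above (batch size 1, one time step at a time, then a state reset) can be sketched in eager mode as follows. This is a minimal sketch, not the asker's actual training loop: the cycle length `L = 4`, the input values, and the helper `run_cycle` are assumptions made for the example, and it deliberately leaves out both @tf.function and the batch-size change.

```python
import numpy as np
import tensorflow as tf

L = 4  # hypothetical cycle length, chosen only for this sketch

# Stateful LSTM: the hidden and cell states persist between calls.
lstm = tf.keras.layers.LSTM(5, return_sequences=True, stateful=True)


def run_cycle(layer, values):
    """Feed one value at a time with shape (1, 1, 1), then reset the states."""
    outputs = []
    for v in values:
        x = tf.constant([[[v]]], dtype=tf.float32)  # (batch, time, features) = (1, 1, 1)
        outputs.append(layer(x).numpy())            # state carries over to the next step
    layer.reset_states()                            # end of the cycle: states back to zero
    return np.concatenate(outputs, axis=1)          # shape (1, L, 5)


data = np.linspace(0.1, 1.0, L)
first = run_cycle(lstm, data)
second = run_cycle(lstm, data)

print(first.shape)
print(np.allclose(first, second))
```

Because the states are reset at the end of each cycle and the weights do not change in between, two cycles over the same data should produce the same outputs.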
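In addition, I want the code to be as optimized as possible, so I am trying to use AutoGraph through the @tf.function decorator.
The problem is that I run into an error, which can be reproduced with the following example (note that everything works if @tf.function is removed, but I do not understand why):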
import tensorflow as tf
import numpy as np

class Actor(tf.keras.Model):
    def __init__(self):
        super(Actor,self).__init__()
        self.lstm = tf.keras.layers.LSTM(5, return_sequences=True, stateful=True, input_shape=(None,None,1))#, input_shape=(None,None,1))

    def call(self, inputs):
        feat= self.lstm(inputs)
        return feat

actor = Actor()

@tf.function
def g(actor):
    context1 = tf.reshape(np.array([0.]*10),(10,1,1))
    actor(context1)
    actor.reset_states()
    actor.lstm.stateful=False
    context = tf.reshape(np.array([0.]),(1,1,1))
    actor(context)

g(actor)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-28-4487772bee64> in <module>
23 actor(context)
24
---> 25 g(actor)
~/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py in __call__(self, *args, **kwds)
578 xla_context.Exit()
579 else:
--> 580 result = self._call(*args, **kwds)
581
582 if tracing_count == self._get_tracing_count():
~/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py in _call(self, *args, **kwds)
625 # This is the first call of __call__, so we have to initialize.
626 initializers = []
--> 627 self._initialize(args, kwds, add_initializers_to=initializers)
628 finally:
629 # At this point we know that the initialization is complete (or less
~/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py in _initialize(self, args, kwds, add_initializers_to)
504 self._concrete_stateful_fn = (
505 self._stateful_fn._get_concrete_function_internal_garbage_collected( # pylint: disable=protected-access
--> 506 *args, **kwds))
507
508 def invalid_creator_scope(*unused_args, **unused_kwds):
~/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py in _get_concrete_function_internal_garbage_collected(self, *args, **kwargs)
2444 args, kwargs = None, None
2445 with self._lock:
-> 2446 graph_function, _, _ = self._maybe_define_function(args, kwargs)
2447 return graph_function
2448
~/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py in _maybe_define_function(self, args, kwargs)
2775
2776 self._function_cache.missed.add(call_context_key)
-> 2777 graph_function = self._create_graph_function(args, kwargs)
2778 self._function_cache.primary[cache_key] = graph_function
2779 return graph_function, args, kwargs
~/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py in _create_graph_function(self, args, kwargs, override_flat_arg_shapes)
2665 arg_names=arg_names,
2666 override_flat_arg_shapes=override_flat_arg_shapes,
-> 2667 capture_by_value=self._capture_by_value),
2668 self._function_attributes,
2669 # Tell the ConcreteFunction to clean up its graph once it goes out of
~/.local/lib/python3.6/site-packages/tensorflow/python/framework/func_graph.py in func_graph_from_py_func(name, python_func, args, kwargs, signature, func_graph, autograph, autograph_options, add_control_dependencies, arg_names, op_return_value, collections, capture_by_value, override_flat_arg_shapes)
979 _, original_func = tf_decorator.unwrap(python_func)
980
--> 981 func_outputs = python_func(*func_args, **func_kwargs)
982
983 # invariant: `func_outputs` contains only Tensors, CompositeTensors,
~/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py in wrapped_fn(*args, **kwds)
439 # __wrapped__ allows AutoGraph to swap in a converted function. We give
440 # the function a weak reference to itself to avoid a reference cycle.
--> 441 return weak_wrapped_fn().__wrapped__(*args, **kwds)
442 weak_wrapped_fn = weakref.ref(wrapped_fn)
443
~/.local/lib/python3.6/site-packages/tensorflow/python/framework/func_graph.py in wrapper(*args, **kwargs)
966 except Exception as e: # pylint:disable=broad-except
967 if hasattr(e, "ag_error_metadata"):
--> 968 raise e.ag_error_metadata.to_exception(e)
969 else:
970 raise
ValueError: in user code:
<ipython-input-28-4487772bee64>:23 g *
actor(context)
<ipython-input-28-4487772bee64>:11 call *
feat= self.lstm(inputs)
/home/cooper-cooper/.local/lib/python3.6/site-packages/tensorflow/python/keras/layers/recurrent.py:654 __call__ **
return super(RNN, self).__call__(inputs, **kwargs)
/home/cooper-cooper/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py:886 __call__
self.name)
/home/cooper-cooper/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/input_spec.py:227 assert_input_compatibility
', found shape=' + str(shape))
ValueError: Input 0 is incompatible with layer lstm_7: expected shape=(10, None, 1), found shape=[1, 1, 1]
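The last line of the traceback reports a shape mismatch: the layer expects a batch size of 10, which matches the first call inside g, while the second call passes a batch of 1. As a rough illustration of that comparison (a pure-Python sketch, not the Keras implementation; shapes_compatible is a hypothetical helper introduced only for this example), None matches any size and every other dimension must be equal:

```python
from typing import Optional, Sequence


def shapes_compatible(expected: Sequence[Optional[int]],
                      found: Sequence[Optional[int]]) -> bool:
    """Hypothetical helper: None matches any size, other dimensions must be equal."""
    if len(expected) != len(found):
        return False
    return all(e is None or f is None or e == f
               for e, f in zip(expected, found))


expected = (10, None, 1)  # shape reported as "expected" in the error message

print(shapes_compatible(expected, (10, 1, 1)))  # True: first call, batch of 10
print(shapes_compatible(expected, (1, 1, 1)))   # False: second call, batch of 1
```

Only the batch dimension differs between the two calls; the time dimension is None in the expected shape, so any number of time steps would pass this check.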
Comments:
Answer 1: In case anyone is interested, I found the answer in the following post, and my workaround is as follows:
import tensorflow as tf
import numpy as np

class Actor(tf.keras.Model):
    def __init__(self):
        super(Actor,self).__init__()
        self.lstm = tf.keras.layers.LSTM(5, return_sequences=True, stateful=True,input_shape=(1,1))#, input_shape=(None,None,1))

    def call(self, inputs):
        feat= self.lstm(inputs)
        return feat

    def reset_states_workaround(self, new_batch_size=1):
        self.lstm.states = [tf.Variable(tf.random.uniform((new_batch_size,5))), tf.Variable(tf.random.uniform((new_batch_size,5)))]
        self.lstm.input_spec = [tf.keras.layers.InputSpec(shape=(new_batch_size,None,1), ndim=3)]
Then, between two different calls that use @tf.function, I do this:
actor = Actor()

@tf.function
def g(actor):
    context1 = tf.reshape(np.array([0.]*10),(10,1,1))
    preds = actor(context1)
    return preds

g(actor)
actor.reset_states_workaround(new_batch_size=1)

@tf.function
def g2(actor):
    context1 = tf.reshape(np.array([0.]*1),(1,1,1))
    preds = actor(context1)
    return preds

g2(actor)
Calling actor.reset_states_workaround(new_batch_size=1) inside a @tf.function raises "ValueError: tf.function-decorated function tried to create variables on non-first call.", which is why I call it outside.
Comments:
Can you create the variables in __init__ instead of in reset_states_workaround? Then reset_states would only assign to those variables, and you would avoid the "function tried to create variables" error.
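The pattern suggested in this comment (create the variables once, and only assign to them afterwards) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the commenter's code: StateStore is a hypothetical class, and the variables are declared with an unknown batch dimension so that values for different batch sizes can be assigned. It does not show how to wire such variables into the Keras LSTM layer, which may depend on the TensorFlow version.

```python
import tensorflow as tf


class StateStore(tf.Module):
    """Hypothetical holder for LSTM-like states (h, c), created once in __init__."""

    def __init__(self, units=5):
        super().__init__()
        self.units = units
        # Unknown batch dimension: lets assign() accept (batch_size, units) for any batch_size.
        state_shape = tf.TensorShape([None, units])
        self.h = tf.Variable(tf.zeros((1, units)), shape=state_shape, trainable=False)
        self.c = tf.Variable(tf.zeros((1, units)), shape=state_shape, trainable=False)

    @tf.function
    def reset(self, batch_size):
        # Only assigns to existing variables, so no variable is created inside tf.function.
        zeros = tf.zeros((batch_size, self.units))
        self.h.assign(zeros)
        self.c.assign(zeros)


store = StateStore()
store.reset(10)
print(store.h.numpy().shape)
store.reset(1)
print(store.h.numpy().shape)
```

Passing batch_size as a Python integer makes tf.function trace the method once per distinct value, which is acceptable here because only two batch sizes are used.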