model.summary() does not print output shapes when using a subclassed model
【Posted】2019-08-09 15:19:49
【Question】Here are two ways to create a Keras model, but the output shapes in their summary() results are not the same. Clearly, the first one prints more information, which makes it easier to check that the network is correct.
import tensorflow as tf
from tensorflow.keras import Input, layers, Model

class subclass(Model):
    def __init__(self):
        super(subclass, self).__init__()
        self.conv = layers.Conv2D(28, 3, strides=1)

    def call(self, x):
        return self.conv(x)

def func_api():
    x = Input(shape=(24, 24, 3))
    y = layers.Conv2D(28, 3, strides=1)(x)
    return Model(inputs=[x], outputs=[y])

if __name__ == '__main__':
    func = func_api()
    func.summary()

    sub = subclass()
    sub.build(input_shape=(None, 24, 24, 3))
    sub.summary()
Output:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 24, 24, 3) 0
_________________________________________________________________
conv2d (Conv2D) (None, 22, 22, 28) 784
=================================================================
Total params: 784
Trainable params: 784
Non-trainable params: 0
_________________________________________________________________
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) multiple 784
=================================================================
Total params: 784
Trainable params: 784
Non-trainable params: 0
_________________________________________________________________
So, how can I get the output shapes in summary() when using the subclassing approach?
【Question comments】:
【Answer 1】: I solved the problem with the method below; I don't know whether there is an easier way.
class subclass(Model):
    def __init__(self):
        ...

    def call(self, x):
        ...

    def model(self):
        x = Input(shape=(24, 24, 3))
        return Model(inputs=[x], outputs=self.call(x))

if __name__ == '__main__':
    sub = subclass()
    sub.model().summary()
【Comments】:
Could you explain why this works, especially the outputs=self.call(x) part?
@Samuel Evaluating outputs=self.call(x) invokes the subclass.call(self, x) method. This triggers the shape computation in the wrapped instance. In addition, the returned Model instance computes its own shapes, which are the ones reported in .summary(). The main problem with this approach is that the input shape is fixed to shape=(24, 24, 3), so if you need a dynamic solution, it will not work.
Could you explain what goes into the ...? Is this a generic solution, or do you need model-specific code in those calls?
@GuySoft The ... in __init__ instantiates your layers, while the ... in call connects the different layers to build the network. It is generic for all subclassed Keras models; a filled-in sketch is shown below.
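For concreteness, here is a minimal filled-in version of this pattern (my own illustration, not part of the original answer), using the Conv2D model from the question; the class name is made up, and the input shape is taken as an argument, which also addresses the fixed shape=(24, 24, 3) caveat raised above:

from tensorflow.keras import Input, layers, Model

class SubclassWithSummary(Model):
    def __init__(self):
        super(SubclassWithSummary, self).__init__()
        # "... in __init__": instantiate the layers
        self.conv = layers.Conv2D(28, 3, strides=1)

    def call(self, x):
        # "... in call": connect the layers to build the network
        return self.conv(x)

    def model(self, input_shape=(24, 24, 3)):
        # Wrap the subclassed model in a functional Model so summary() reports shapes
        x = Input(shape=input_shape)
        return Model(inputs=[x], outputs=self.call(x))

if __name__ == '__main__':
    sub = SubclassWithSummary()
    sub.model(input_shape=(48, 48, 3)).summary()  # shapes are shown for the chosen input size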
【Answer 2】: I guess the key point is the _init_graph_network method of the Network class, which is the parent class of Model. _init_graph_network is called if the inputs and outputs arguments are specified when __init__ is called.
So there are two possible approaches:

1. Manually call the _init_graph_network method to build the graph of the model.
2. Reinitialize with the input layer and the outputs.

Both approaches need the input layer and the outputs (required by self.call).
Now calling summary will give the exact output shapes. However, it will also show the Input layer, which is not part of the subclassed model.
from tensorflow import keras
from tensorflow.keras import layers as klayers

class MLP(keras.Model):
    def __init__(self, input_shape=(32), **kwargs):
        super(MLP, self).__init__(**kwargs)
        # Add input layer
        self.input_layer = klayers.Input(input_shape)
        self.dense_1 = klayers.Dense(64, activation='relu')
        self.dense_2 = klayers.Dense(10)
        # Get output layer with `call` method
        self.out = self.call(self.input_layer)
        # Reinitialize with inputs and outputs
        super(MLP, self).__init__(
            inputs=self.input_layer,
            outputs=self.out,
            **kwargs)

    def build(self):
        # Initialize the graph
        self._is_graph_network = True
        self._init_graph_network(
            inputs=self.input_layer,
            outputs=self.out
        )

    def call(self, inputs):
        x = self.dense_1(inputs)
        return self.dense_2(x)

if __name__ == '__main__':
    mlp = MLP(16)
    mlp.summary()
The output will be:
Model: "mlp_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 16)] 0
_________________________________________________________________
dense (Dense) (None, 64) 1088
_________________________________________________________________
dense_1 (Dense) (None, 10) 650
=================================================================
Total params: 1,738
Trainable params: 1,738
Non-trainable params: 0
_________________________________________________________________
【Comments】:
【Answer 3】: The way I solved the problem is very similar to what Elazar mentioned: override summary() in the class subclass. Then summary() can be called directly while still using model subclassing:
class subclass(Model):
    def __init__(self):
        ...

    def call(self, x):
        ...

    def summary(self):
        x = Input(shape=(24, 24, 3))
        model = Model(inputs=[x], outputs=self.call(x))
        return model.summary()

if __name__ == '__main__':
    sub = subclass()
    sub.summary()
【Comments】:
【Answer 4】: I analysed Adi Shumely's answer:

- Adding an input_shape should not be needed, since you set it as an argument of build().
- Adding an Input layer does nothing for the model; it is only brought in as an argument to the call() method.
- Adding the so-called output is not the way I see it; the only and most important thing it does is invoke the call() method.

So I propose this solution, which needs no modification to the model at all: it simply finishes building the model before summary() is called, by adding a call to the model's call() method with an Input tensor. I tried it on my own model and on the three models presented in this thread, and so far it works.
From the first post in this thread:
import tensorflow as tf
from tensorflow.keras import Input, layers, Model

class subclass(Model):
    def __init__(self):
        super(subclass, self).__init__()
        self.conv = layers.Conv2D(28, 3, strides=1)

    def call(self, x):
        return self.conv(x)

if __name__ == '__main__':
    sub = subclass()
    sub.build(input_shape=(None, 24, 24, 3))

    # Adding this call to the call() method solves it all
    sub.call(Input(shape=(24, 24, 3)))

    # And the summary() outputs all the information
    sub.summary()
From the second post:
from tensorflow import keras
from tensorflow.keras import layers as klayers

class MLP(keras.Model):
    def __init__(self, **kwargs):
        super(MLP, self).__init__(**kwargs)
        self.dense_1 = klayers.Dense(64, activation='relu')
        self.dense_2 = klayers.Dense(10)

    def call(self, inputs):
        x = self.dense_1(inputs)
        return self.dense_2(x)

if __name__ == '__main__':
    mlp = MLP()
    mlp.build(input_shape=(None, 16))
    mlp.call(klayers.Input(shape=(16)))
    mlp.summary()
From the last post in this thread:
import tensorflow as tf

class MyModel(tf.keras.Model):
    def __init__(self, **kwargs):
        super(MyModel, self).__init__(**kwargs)
        self.dense10 = tf.keras.layers.Dense(10, activation=tf.keras.activations.softmax)
        self.dense20 = tf.keras.layers.Dense(20, activation=tf.keras.activations.softmax)

    def call(self, inputs):
        x = self.dense10(inputs)
        y_pred = self.dense20(x)
        return y_pred

model = MyModel()
model.build(input_shape=(None, 32, 32, 1))
model.call(tf.keras.layers.Input(shape=(32, 32, 1)))
model.summary()
【Comments】:
【Answer 5】: I ran into the same problem and fixed it in 3 steps:

1. Add input_shape in __init__.
2. Add an input layer.
3. Add the layers.
import tensorflow as tf

class MyModel(tf.keras.Model):
    def __init__(self, input_shape=(32, 32, 1), **kwargs):
        super(MyModel, self).__init__(**kwargs)
        self.input_layer = tf.keras.layers.Input(input_shape)
        self.dense10 = tf.keras.layers.Dense(10, activation=tf.keras.activations.softmax)
        self.dense20 = tf.keras.layers.Dense(20, activation=tf.keras.activations.softmax)
        self.out = self.call(self.input_layer)

    def call(self, inputs):
        x = self.dense10(inputs)
        y_pred = self.dense20(x)
        return y_pred

model = MyModel()
# x_test comes from the author's dataset; it is not defined here (see the note after the output)
model(x_test[:99])
print('x_test[:99].shape:', x_test[:99].shape)
model.summary()
Output:
x_test[:99].shape: (99, 32, 32, 1)
Model: "my_model_32"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_79 (Dense) (None, 32, 32, 10) 20
_________________________________________________________________
dense_80 (Dense) (None, 32, 32, 20) 220
=================================================================
Total params: 240
Trainable params: 240
Non-trainable params: 0
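Note that x_test is not defined in the code above; it is assumed to come from the author's dataset. A minimal stand-in (my assumption, only to make the example reproducible) could be:

import numpy as np

# Hypothetical placeholder for the dataset used in this answer:
# 100 grayscale 32x32 images, matching input_shape=(32, 32, 1)
x_test = np.random.rand(100, 32, 32, 1).astype('float32')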
【Comments】:
【Answer 6】: I solved this problem with the approach below, tested on TensorFlow 2.1 and TensorFlow 2.4.1: declare the InputLayer with model.inputs_layer.
import tensorflow as tf
from tensorflow.keras import layers

class Logistic(tf.keras.models.Model):
    def __init__(self, hidden_size=5, output_size=1, dynamic=False, **kwargs):
        '''
        name: String name of the model.
        dynamic: (Subclassed models only) Set this to `True` if your model should
          only be run eagerly, and should not be used to generate a static
          computation graph. This attribute is automatically set for Functional API
          models.
        trainable: Boolean, whether the model's variables should be trainable.
        dtype: (Subclassed models only) Default dtype of the model's weights (
          default of `None` means use the type of the first input). This attribute
          has no effect on Functional API models, which do not have weights of their
          own.
        '''
        super().__init__(dynamic=dynamic, **kwargs)
        self.inputs_ = tf.keras.Input(shape=(2,), name="hello")
        self._set_input_layer(self.inputs_)
        self.hidden_size = hidden_size
        self.dense = layers.Dense(hidden_size, name="linear")
        self.outlayer = layers.Dense(output_size,
                                     activation='sigmoid', name="out_layer")
        self.build()

    def _set_input_layer(self, inputs):
        """add InputLayer to model and display InputLayers in model.summary()

        Args:
            inputs ([dict]): the result from `tf.keras.Input`
        """
        if isinstance(inputs, dict):
            self.inputs_layer = {n: tf.keras.layers.InputLayer(input_tensor=i, name=n)
                                 for n, i in inputs.items()}
        elif isinstance(inputs, (list, tuple)):
            self.inputs_layer = [tf.keras.layers.InputLayer(input_tensor=i, name=i.name)
                                 for i in inputs]
        elif tf.is_tensor(inputs):
            self.inputs_layer = tf.keras.layers.InputLayer(input_tensor=inputs, name=inputs.name)

    def build(self):
        super(Logistic, self).build(self.inputs_.shape if tf.is_tensor(self.inputs_) else self.inputs_)
        _ = self.call(self.inputs_)

    def call(self, X):
        X = self.dense(X)
        Y = self.outlayer(X)
        return Y
model = Logistic()
model.summary()
Model: "logistic"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
hello:0 (InputLayer) [(None, 2)] 0
_________________________________________________________________
linear (Dense) (None, 5) 15
_________________________________________________________________
out_layer (Dense) (None, 1) 6
=================================================================
Total params: 21
Trainable params: 21
Non-trainable params: 0
_________________________________________________________________
【Comments】:
【Answer 7】: Gary's answer works. However, for more convenience, I wanted to access the summary method of keras.Model transparently from my custom class objects. This can easily be done by implementing the built-in __getattr__ method, as shown below:
from tensorflow.keras import Input, layers, Model

class MyModel(Model):
    def __init__(self):
        self.model = self.get_model()

    def get_model(self):
        # here we use the usual Keras functional API
        x = Input(shape=(24, 24, 3))
        y = layers.Conv2D(28, 3, strides=1)(x)
        return Model(inputs=[x], outputs=[y])

    def __getattr__(self, name):
        """
        This method enables access to an attribute/method of self.model.
        Thus, any method of keras.Model() can be used transparently from a MyModel object.
        """
        return getattr(self.model, name)

if __name__ == '__main__':
    mymodel = MyModel()
    mymodel.summary()  # underlyingly calls MyModel.model.summary()
【Comments】: