Inheritance of Build in Custom Layer (super().build(input_shape))

Posted 2023-02-16

Published: 2021-10-17 01:02:30

Question:

I am trying to understand the concept of custom layers in TensorFlow Keras. When the SimpleDense layer is created without an activation, the code looks like this:

import tensorflow as tf
from tensorflow.keras.layers import Layer


class SimpleDense(Layer):

    def __init__(self, units=32):
        '''Initializes the instance attributes'''
        super(SimpleDense, self).__init__()
        self.units = units

    def build(self, input_shape):
        '''Create the state of the layer (weights)'''
        # initialize the weights
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(name="kernel",
            initial_value=w_init(shape=(input_shape[-1], self.units),
                                 dtype='float32'),
            trainable=True)

        # initialize the biases
        b_init = tf.zeros_initializer()
        self.b = tf.Variable(name="bias",
            initial_value=b_init(shape=(self.units,), dtype='float32'),
            trainable=True)

    def call(self, inputs):
        '''Defines the computation from inputs to outputs'''
        return tf.matmul(inputs, self.w) + self.b

But once an activation function is introduced, the code becomes:

class SimpleDense(Layer):

    # add an activation parameter
    def __init__(self, units=32, activation=None):
        super(SimpleDense, self).__init__()
        self.units = units
        
        # define the activation to get from the built-in activation layers in Keras
        self.activation = tf.keras.activations.get(activation)


    def build(self, input_shape):
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(name="kernel",
            initial_value=w_init(shape=(input_shape[-1], self.units), 
                                 dtype='float32'),
            trainable=True)
        # input_shape[-1] is the size of the last input dimension, i.e. the
        # number of units in the previous layer (as shown in the model summary)
        b_init = tf.zeros_initializer()
        self.b = tf.Variable(name="bias",
            initial_value=b_init(shape=(self.units,), dtype='float32'),
            trainable=True)
        super().build(input_shape)


    def call(self, inputs):
        
        # pass the computation to the activation layer
        return self.activation(tf.matmul(inputs, self.w) + self.b)

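I understand the changes to the __init__ and call functions. What I don't understand is why super().build(input_shape) was added to the build function.

I have seen this in several other places where calling the parent's build function is treated as a requirement. For example, here (How to build this custom layer in Keras?) it is written:

Be sure to call this at the end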

Answer 1:

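In the past, in standalone Keras, you had to call super().build(input_shape) in your custom build function. In some older versions of TF2, you instead had to set self.built = True in your custom build function.

But they keep changing it. In recent versions of TensorFlow (v2.5.0 or later), you no longer need to do either. It works whether or not you call super().build(input_shape) in your custom build function.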

Comments:

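My assumption about calling super().build(input_shape) was that we are trying to inherit something from the parent Layer class. Could you explain what we were trying to inherit in earlier versions, and what has changed in the newer ones?

The build method is executed the first time the layer is called. To make that work, there is an attribute named self.built that tracks whether the layer has been built: if self.built == True, the build method is not called again. So the first thing super().build(input_shape) does is set self.built to True. Second, it also stores input_shape as an attribute on the layer, so that when the layer is saved and reloaded, this attribute is used to rebuild the layer automatically. The reason you no longer need to call it yourself is that the framework now calls super().build(input_shape) for you after it calls your custom build method.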
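The mechanism described in this comment can be sketched in plain Python. This is only a toy model written for illustration, not Keras's actual code: ToyLayer, build_input_shape, WithSuper and WithoutSuper are hypothetical names, and the "shape" is simply the length of a list. It shows a built flag that makes build run once, and a base class that runs its own build after the subclass's build, so the result is the same with or without the explicit super().build(input_shape) call.

```python
class ToyLayer:
    """Hypothetical stand-in for a Keras-style base layer; not the real implementation."""

    def __init__(self):
        self.built = False
        self.build_input_shape = None  # hypothetical attribute name

    def build(self, input_shape):
        # Base-class build: remember the shape and mark the layer as built.
        self.build_input_shape = input_shape
        self.built = True

    def __call__(self, inputs):
        if not self.built:
            input_shape = (len(inputs),)
            self.build(input_shape)            # the subclass's custom build
            ToyLayer.build(self, input_shape)  # base build, run on the subclass's behalf
        return self.call(inputs)


class WithSuper(ToyLayer):
    def __init__(self):
        super().__init__()
        self.build_calls = 0

    def build(self, input_shape):
        self.build_calls += 1
        self.factor = 2.0
        super().build(input_shape)

    def call(self, inputs):
        return [self.factor * x for x in inputs]


class WithoutSuper(WithSuper):
    def build(self, input_shape):
        # Same state creation, but no super().build(input_shape) call.
        self.build_calls += 1
        self.factor = 2.0


results = {}
for cls in (WithSuper, WithoutSuper):
    layer = cls()
    layer([1.0, 2.0])           # first call triggers build
    output = layer([3.0, 4.0])  # second call skips build because built is True
    results[cls.__name__] = (layer.built, layer.build_calls,
                             layer.build_input_shape, output)

print(results)
```

In this sketch both variants end up with built set to True, a recorded input shape, and exactly one call to the custom build, which is the behaviour the comment attributes to recent TensorFlow versions.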
