How to get the weights of each layer

【Title】How to get weight on each layers 【Posted】2021-12-24 20:38:33 【Question】:

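I am trying to get the input weights of each layer, including LSTM 1, LSTM 2, and the weights after the attention layer, and I want to display them as heat maps. But when I run the code, the error below appears. What is going on? The layer does exist. Here is the code: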

model.add(LSTM(32, input_shape=(n_timesteps,n_features), return_sequences=True))
#print weights
print(model.get_layer(LSTM).get_weights()[0])
model.add(LSTM(32, input_shape=(n_timesteps,n_features), return_sequences=True))
model.add(Dropout(0.1))
model.add(attention(return_sequences=False)) # receive 3D and output 2D
model.add(Dense(n_outputs, activation='softmax'))
model.summary()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit network
model.fit(trainX, trainy, epochs=epochs, batch_size=batch_size, verbose=verbose)
# evaluate model
_, accuracy = model.evaluate(testX, testy, batch_size=batch_size, verbose=0)

The attention layer:

class attention(Layer):
    def __init__(self, return_sequences=True):
        self.return_sequences = return_sequences
        super(attention,self).__init__()
    def build(self, input_shape):
        self.W=self.add_weight(name="att_weight", shape=(input_shape[-1],1),
                               initializer="normal")
        self.b=self.add_weight(name="att_bias", shape=(input_shape[1],1),
                               initializer="zeros")
        super(attention,self).build(input_shape)
    def call(self, x):
        e = K.tanh(K.dot(x,self.W)+self.b)
        a = K.softmax(e, axis=1)
        output = x*a
        if self.return_sequences:
            return output
        return K.sum(output, axis=1)

This is the error that appears:

ValueError: No such layer: <class 'keras.layers.recurrent_v2.LSTM'>. Existing layers are [<keras.layers.recurrent_v2.LSTM object at 0x7f7b5c215910>].

【Comments on the question】:

【Answer 1】:

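The error occurs because model.get_layer expects a layer name (a string) or an index, not the LSTM class itself. Once the whole model is defined, you can use model.layers to get the weights of specific layers: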

import tensorflow as tf
import seaborn as sb
import matplotlib.pyplot as plt

class attention(tf.keras.layers.Layer):
  def __init__(self, return_sequences=True):
      self.return_sequences = return_sequences
      super(attention,self).__init__()
  def build(self, input_shape):
      self.W=self.add_weight(name="att_weight", shape=(input_shape[-1],1),
                            initializer="normal")
      self.b=self.add_weight(name="att_bias", shape=(input_shape[1],1),
                            initializer="zeros")
      super(attention,self).build(input_shape)
  def call(self, x):
      e = tf.keras.backend.tanh(tf.keras.backend.dot(x,self.W)+self.b)
      a = tf.keras.backend.softmax(e, axis=1)
      output = x*a
      if self.return_sequences:
          return output
      return tf.keras.backend.sum(output, axis=1)

model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(32, input_shape=(5,10), return_sequences=True))
model.add(tf.keras.layers.LSTM(32, return_sequences=True))
model.add(tf.keras.layers.Dropout(0.1))
model.add(attention(return_sequences=False)) # receive 3D and output 2D
model.add(tf.keras.layers.Dense(3, activation='softmax'))
model.summary()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

trainx = tf.random.normal((25, 5, 10))
trainy = tf.random.uniform((25, 3), maxval=3)
model.fit(trainx, trainy, epochs=5, batch_size=4)


lstm1_weights = model.layers[0].get_weights()[0]
lstm2_weights = model.layers[1].get_weights()[0]
attention_weights = model.layers[3].get_weights()[0]

heat_map = sb.heatmap(lstm1_weights)
plt.show()

Output:

Model: "sequential_16"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 lstm_24 (LSTM)              (None, 5, 32)             5504      
                                                                 
 lstm_25 (LSTM)              (None, 5, 32)             8320      
                                                                 
 dropout_12 (Dropout)        (None, 5, 32)             0         
                                                                 
 attention_12 (attention)    (None, 32)                37        
                                                                 
 dense_12 (Dense)            (None, 3)                 99        
                                                                 
=================================================================
Total params: 13,960
Trainable params: 13,960
Non-trainable params: 0
_________________________________________________________________
Epoch 1/5
7/7 [==============================] - 4s 10ms/step - loss: 5.5033 - accuracy: 0.4400
Epoch 2/5
7/7 [==============================] - 0s 8ms/step - loss: 5.4899 - accuracy: 0.5200
Epoch 3/5
7/7 [==============================] - 0s 9ms/step - loss: 5.4771 - accuracy: 0.4800
Epoch 4/5
7/7 [==============================] - 0s 9ms/step - loss: 5.4701 - accuracy: 0.5200
Epoch 5/5
7/7 [==============================] - 0s 8ms/step - loss: 5.4569 - accuracy: 0.5200
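
As a complement to the answer's code, here is a minimal sketch of looking a layer up with get_layer by name or by index, and of what get_weights() returns for an LSTM layer. The layer names "lstm_1", "lstm_2" and "out" and the gate labels are chosen here for illustration and do not appear in the original post. The attention layer is left out to keep the sketch short.

```python
import tensorflow as tf

# Small stand-in model; the explicit layer names are an assumption of this sketch.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(5, 10)),
    tf.keras.layers.LSTM(32, return_sequences=True, name="lstm_1"),
    tf.keras.layers.LSTM(32, name="lstm_2"),
    tf.keras.layers.Dense(3, activation="softmax", name="out"),
])

# get_layer takes a name (string) or an index, not a layer class.
by_name = model.get_layer("lstm_1")
by_index = model.get_layer(index=0)

# An LSTM layer holds three arrays: input kernel, recurrent kernel, bias.
kernel, recurrent_kernel, bias = by_name.get_weights()
print(kernel.shape, recurrent_kernel.shape, bias.shape)

# The last axis concatenates the four gates in the order
# input, forget, cell candidate, output, each `units` columns wide.
units = by_name.units
gate_names = ["input", "forget", "cell", "output"]
gates = {
    name: kernel[:, i * units:(i + 1) * units]
    for i, name in enumerate(gate_names)
}
```

Here get_weights()[0], as used in the answer, is the input kernel. Each entry of gates is a (features, units) block that can be passed to sb.heatmap on its own, which is often easier to read than the full kernel.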

                             

If you want to see how a layer's weights change during training, you should define a callback, as shown in this post.
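
The linked post is not reproduced here, so the following is only a sketch of what such a callback might look like, not the code from that post. The class name WeightHistory and its layer_index parameter are hypothetical. The model and the random data are small stand-ins.

```python
import numpy as np
import tensorflow as tf

class WeightHistory(tf.keras.callbacks.Callback):
    """Hypothetical helper: stores one layer's first weight array after each epoch."""

    def __init__(self, layer_index=0):
        super().__init__()
        self.layer_index = layer_index
        self.history = []

    def on_epoch_end(self, epoch, logs=None):
        layer = self.model.layers[self.layer_index]
        # Copy so that later training steps cannot alter the stored snapshot.
        self.history.append(layer.get_weights()[0].copy())

model = tf.keras.Sequential([
    tf.keras.Input(shape=(5, 10)),
    tf.keras.layers.LSTM(8),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(loss="categorical_crossentropy", optimizer="adam")

# Random stand-in data with one-hot labels for three classes.
x = np.random.normal(size=(16, 5, 10)).astype("float32")
y = tf.keras.utils.to_categorical(np.random.randint(0, 3, size=16), num_classes=3)

recorder = WeightHistory(layer_index=0)
model.fit(x, y, epochs=3, batch_size=4, verbose=0, callbacks=[recorder])
print(len(recorder.history), recorder.history[0].shape)
```

After training, recorder.history holds one snapshot per epoch, so recorder.history[-1] - recorder.history[0] can be drawn as a heat map to see which weights moved most.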

【Comments】:

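Wow, thank you very much for your help. I had a hard time finding the cause and fixing it. I will try your suggestion. Thanks again.

I would like to ask one more thing: could you tell me what each label means?

In the example I posted, the labels have no special meaning. They are just three different classes.

Thanks again for your help. I hope I can ask again if I run into trouble.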
