如何获得每一层的重量
Posted
技术标签:
【中文标题】如何获得每一层的重量【英文标题】:How to get weight on each layers 【发布时间】:2021-12-24 20:38:33 【问题描述】:我正在尝试获取每一层的输入权重,包括 lstm 1、lstm 2 和注意力层之后的权重,并希望使用热图显示它们。但是当我运行代码时,会出现以下错误。发生了什么?因为层存在。 代码如下:
model.add(LSTM(32, input_shape=(n_timesteps,n_features), return_sequences=True))
#print weights
print(model.get_layer(LSTM).get_weights()[0])
model.add(LSTM(32, input_shape=(n_timesteps,n_features), return_sequences=True))
model.add(Dropout(0.1))
model.add(attention(return_sequences=False)) # receive 3D and output 2D
model.add(Dense(n_outputs, activation='softmax'))
model.summary()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit network
model.fit(trainX, trainy, epochs=epochs, batch_size=batch_size, verbose=verbose)
# evaluate model
_, accuracy = model.evaluate(testX, testy, batch_size=batch_size, verbose=0)
注意层:
class attention(Layer):
def __init__(self, return_sequences=True):
self.return_sequences = return_sequences
super(attention,self).__init__()
def build(self, input_shape):
self.W=self.add_weight(name="att_weight", shape=(input_shape[-1],1),
initializer="normal")
self.b=self.add_weight(name="att_bias", shape=(input_shape[1],1),
initializer="zeros")
super(attention,self).build(input_shape)
def call(self, x):
e = K.tanh(K.dot(x,self.W)+self.b)
a = K.softmax(e, axis=1)
output = x*a
if self.return_sequences:
return output
return K.sum(output, axis=1)
这是出现的错误:
ValueError: No such layer: <class 'keras.layers.recurrent_v2.LSTM'>. Existing layers are [<keras.layers.recurrent_v2.LSTM object at 0x7f7b5c215910>].
【问题讨论】:
【参考方案1】:在定义整个模型后,您可以使用model.layers
获得某些层权重:
import tensorflow as tf
import seaborn as sb
import matplotlib.pyplot as plt
class attention(tf.keras.layers.Layer):
def __init__(self, return_sequences=True):
self.return_sequences = return_sequences
super(attention,self).__init__()
def build(self, input_shape):
self.W=self.add_weight(name="att_weight", shape=(input_shape[-1],1),
initializer="normal")
self.b=self.add_weight(name="att_bias", shape=(input_shape[1],1),
initializer="zeros")
super(attention,self).build(input_shape)
def call(self, x):
e = tf.keras.backend.tanh(tf.keras.backend.dot(x,self.W)+self.b)
a = tf.keras.backend.softmax(e, axis=1)
output = x*a
if self.return_sequences:
return output
return tf.keras.backend.sum(output, axis=1)
model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(32, input_shape=(5,10), return_sequences=True))
model.add(tf.keras.layers.LSTM(32, return_sequences=True))
model.add(tf.keras.layers.Dropout(0.1))
model.add(attention(return_sequences=False)) # receive 3D and output 2D
model.add(tf.keras.layers.Dense(3, activation='softmax'))
model.summary()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
trainx = tf.random.normal((25, 5, 10))
trainy = tf.random.uniform((25, 3), maxval=3)
model.fit(trainx, trainy, epochs=5, batch_size=4)
lstm1_weights = model.layers[0].get_weights()[0]
lstm2_weights = model.layers[1].get_weights()[0]
attention_weights = model.layers[3].get_weights()[0]
heat_map = sb.heatmap(lstm1_weights)
plt.show()
Model: "sequential_16"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm_24 (LSTM) (None, 5, 32) 5504
lstm_25 (LSTM) (None, 5, 32) 8320
dropout_12 (Dropout) (None, 5, 32) 0
attention_12 (attention) (None, 32) 37
dense_12 (Dense) (None, 3) 99
=================================================================
Total params: 13,960
Trainable params: 13,960
Non-trainable params: 0
_________________________________________________________________
Epoch 1/5
7/7 [==============================] - 4s 10ms/step - loss: 5.5033 - accuracy: 0.4400
Epoch 2/5
7/7 [==============================] - 0s 8ms/step - loss: 5.4899 - accuracy: 0.5200
Epoch 3/5
7/7 [==============================] - 0s 9ms/step - loss: 5.4771 - accuracy: 0.4800
Epoch 4/5
7/7 [==============================] - 0s 9ms/step - loss: 5.4701 - accuracy: 0.5200
Epoch 5/5
7/7 [==============================] - 0s 8ms/step - loss: 5.4569 - accuracy: 0.5200
如果您想查看训练期间层的权重如何变化,您应该定义一个回调,如post 所示。
【讨论】:
哇,非常感谢您的帮助。我很难找到原因并解决它。我会试试你的建议。再次感谢您 我想再问一下,请问可以知道每个标签是什么意思吗? 在我发布的示例中,标签没有任何特殊意义。只是三个不同的类。 再次感谢您的帮助。如果遇到麻烦,希望可以再问。以上是关于如何获得每一层的重量的主要内容,如果未能解决你的问题,请参考以下文章
如何避免 Objective C 中每一层的大量错误处理代码
如何使用 dynamic_rnn 获取多层 RNN 中每一步和每一层的状态