How to get the weights of each layer in every epoch and save them to a file

Posted 2022-01-05 01:31:38

I am trying to get the weight values of each layer in every epoch and then save them to a file. I am trying to implement the code proposed by Eric M on this page, but while trying to get the weight values I get the following error:

<ipython-input-15-81ab617ec631> in on_epoch_end(self, epoch, logs)
w = self.model.layers[layer_i].get_weights()[0]
IndexError: list index out of range

What is going on? layer_i only iterates over the number of layers I use. Is it because I use an attention layer? I also cannot save the results to a file, because I do not know what the code produces.

Here are the callback and the model I use:

import numpy as np
import seaborn as sb
from matplotlib import pyplot
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

class GetWeights(keras.callbacks.Callback):
  def __init__(self):
    super(GetWeights, self).__init__()
    self.weight_dict = {}  # one entry per layer, stacked across epochs
  def on_epoch_end(self, epoch, logs=None):
    for layer_i in range(len(self.model.layers)):
      w = self.model.layers[layer_i].get_weights()[0]
      b = self.model.layers[layer_i].get_weights()[1]
      heat_map = sb.heatmap(w)
      pyplot.show()
      print('Layer %s has weights of shape %s and biases of shape %s' %(layer_i, np.shape(w), np.shape(b)))
      if epoch == 0:
        # create array to hold weights and biases
        self.weight_dict['w_'+str(layer_i+1)] = w
        self.weight_dict['b_'+str(layer_i+1)] = b
      else:
        # append new weights to previously-created weights array
        self.weight_dict['w_'+str(layer_i+1)] = np.dstack(
            (self.weight_dict['w_'+str(layer_i+1)], w))
        # append new weights to previously-created weights array
        self.weight_dict['b_'+str(layer_i+1)] = np.dstack(
            (self.weight_dict['b_'+str(layer_i+1)], b))

gw = GetWeights()
model = Sequential() 
model.add(LSTM(hidden_units_masukan, input_shape=(n_timesteps,n_features), return_sequences=True))
model.add(LSTM(hidden_units_masukan, input_shape=(n_timesteps,n_features), return_sequences=True))
model.add(Dropout(dropout_masukan))
model.add(attention(return_sequences=False)) # receive 3D and output 2D
model.add(Dense(n_outputs, activation=activation_masukan))
model.compile(loss='categorical_crossentropy', optimizer=optimizer_masukan, metrics=['accuracy'])
model.fit(trainX, trainy, epochs=epochs, batch_size=batch_size_masukan, verbose=verbose, callbacks=[gw],)
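
As a quick diagnostic (a sketch added here for illustration, not part of the original question), you can print how many weight arrays each layer actually returns; a layer whose get_weights() list is empty will make get_weights()[0] raise exactly this IndexError:

for i, layer in enumerate(model.layers):
    # Dropout (and other parameter-free layers) return an empty list here.
    print(i, layer.name, len(layer.get_weights()))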

Comments:

Answer 1:

The problem is that you are trying to extract weights and biases from every layer in the model, but the Dropout layer does not have any weights. That is why you get this error message; you need to exclude that layer. Here is a working example:

import tensorflow as tf
import seaborn as sb
import matplotlib.pyplot as plt
import numpy as np

class attention(tf.keras.layers.Layer):
  def __init__(self, return_sequences=True):
      self.return_sequences = return_sequences
      super(attention,self).__init__()
  def build(self, input_shape):
      self.W=self.add_weight(name="att_weight", shape=(input_shape[-1],1),
                            initializer="normal")
      self.b=self.add_weight(name="att_bias", shape=(input_shape[1],1),
                            initializer="zeros")
      super(attention,self).build(input_shape)
  def call(self, x):
      e = tf.keras.backend.tanh(tf.keras.backend.dot(x,self.W)+self.b)
      a = tf.keras.backend.softmax(e, axis=1)
      output = x*a
      if self.return_sequences:
          return output
      return tf.keras.backend.sum(output, axis=1)

class GetWeights(tf.keras.callbacks.Callback):
  def __init__(self):
    super(GetWeights, self).__init__()
    self.weight_dict = {}
  def on_epoch_end(self, epoch, logs=None):
    drop_out_index = 2
    for i, layer in enumerate(self.model.layers):
      if drop_out_index != i:
        w = layer.get_weights()[0]
        b = layer.get_weights()[1]
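        # Note: for LSTM layers, get_weights() returns
        # [kernel, recurrent_kernel, bias], so index 1 here is the
        # recurrent kernel rather than the bias vector (hence the
        # (32, 128) "bias" shapes in the output below).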
        heat_map = sb.heatmap(w)
        plt.show()
        print('Layer %s has weights of shape %s and biases of shape %s' %(i, np.shape(w), np.shape(b)))
        if epoch == 0:
          # create array to hold weights and biases
          self.weight_dict['w_'+str(i+1)] = w
          self.weight_dict['b_'+str(i+1)] = b
        else:
          # append new weights to previously-created weights array
          self.weight_dict['w_'+str(i+1)] = np.dstack(
              (self.weight_dict['w_'+str(i+1)], w))
          # append new weights to previously-created weights array
          self.weight_dict['b_'+str(i+1)] = np.dstack(
              (self.weight_dict['b_'+str(i+1)], b))

gw = GetWeights()
model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(32, input_shape=(5,10), return_sequences=True))
model.add(tf.keras.layers.LSTM(32, return_sequences=True))
model.add(tf.keras.layers.Dropout(0.1))
model.add(attention(return_sequences=False)) # receive 3D and output 2D
model.add(tf.keras.layers.Dense(3, activation='softmax'))
model.summary()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

trainx = tf.random.normal((25, 5, 10))
trainy = tf.random.uniform((25, 3), maxval=3)
model.fit(trainx, trainy, epochs=1, batch_size=4, callbacks=[gw])

Output:

Model: "sequential_11"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 lstm_22 (LSTM)              (None, 5, 32)             5504      
                                                                 
 lstm_23 (LSTM)              (None, 5, 32)             8320      
                                                                 
 dropout_11 (Dropout)        (None, 5, 32)             0         
                                                                 
 attention_11 (attention)    (None, 32)                37        
                                                                 
 dense_11 (Dense)            (None, 3)                 99        
                                                                 
=================================================================
Total params: 13,960
Trainable params: 13,960
Non-trainable params: 0
_________________________________________________________________
7/7 [==============================] - ETA: 0s - loss: 4.4367 - accuracy: 0.3200     

Layer 0 has weights of shape (10, 128) and biases of shape (32, 128)

Layer 1 has weights of shape (32, 128) and biases of shape (32, 128)

Layer 3 has weights of shape (32, 1) and biases of shape (5, 1)

Layer 4 has weights of shape (32, 3) and biases of shape (3,)
7/7 [==============================] - 5s 265ms/step - loss: 4.4367 - accuracy: 0.3200
<keras.callbacks.History at 0x7f3914737b10>
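
To cover the save-to-file part of the original question (a minimal sketch, not part of the answer above; the file name weights_per_epoch.npz is an arbitrary choice): since weight_dict maps string keys to NumPy arrays, it can be written out with np.savez once training has finished:

import numpy as np

# Save every stacked array collected by the callback; each key,
# e.g. 'w_1', becomes a named array inside the .npz archive.
np.savez('weights_per_epoch.npz', **gw.weight_dict)

# Reload later:
data = np.load('weights_per_epoch.npz')
print(data['w_1'].shape)  # 2-D after one epoch, 3-D once np.dstack has run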

Discussion:

Thank you very much for the explanation and for including the code. I just learned from you that dropout only disables hidden neurons and has no weights or biases. May I ask one more thing? When I plot the weights with a heatmap, I get the same image in every epoch. How does that happen? The seaborn code should produce a different heatmap in every epoch.

What do you mean, you get the same image? Look at your model summary: the Dropout layer does not have any trainable parameters.

I mean that when I run the program, the output looks like this: Link. It is always the same in every epoch, while the accuracy obtained in each epoch is different. I am confused: is the code wrong, or is my setup wrong?

There is nothing wrong with your code. The changes in the weights are simply too small to see in the heatmap. You have to adjust the parameters of the heatmap, such as vmin and vmax (see the sketch after this discussion). However, this is beyond the scope of your original question, so if you need more help please ask a new question.

Ah, so it happens because the changes are very small; now I understand. Sorry, I am new to deep learning and Python. I really appreciate your help and explanation. Thank you very much.
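
For illustration, a minimal sketch of the vmin/vmax suggestion from the discussion above (the stand-in data and the limits are placeholder values):

import numpy as np
import seaborn as sb
import matplotlib.pyplot as plt

# Stand-in weight matrix; in the callback this would be `w` for one layer.
w = np.random.normal(scale=0.01, size=(32, 128))

# Fixing vmin/vmax keeps the color scale identical across epochs, so
# small epoch-to-epoch weight changes become visible.
sb.heatmap(w, vmin=-0.05, vmax=0.05)
plt.show()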
