Fine tuning deep autoencoder model for MNIST
Posted: 2019-10-01 12:39:03

Question: I have developed a 3-layer deep autoencoder model for the MNIST dataset. I am just practicing on this toy dataset, as I am a beginner in this fine-tuning paradigm.
Below is the code:
from keras import layers
from keras.layers import Input, Dense
from keras.models import Model,Sequential
from keras.datasets import mnist
import numpy as np
# Deep Autoencoder
# this is the size of our encoded representations
encoding_dim = 32 # 32 floats -> compression factor 24.5, assuming the input is 784 floats
# this is our input placeholder; 784 = 28 x 28
input_img = Input(shape=(784, ))
my_epochs = 100
# "encoded" is the encoded representation of the inputs
encoded = Dense(encoding_dim * 4, activation='relu')(input_img)
encoded = Dense(encoding_dim * 2, activation='relu')(encoded)
encoded = Dense(encoding_dim, activation='relu')(encoded)
# "decoded" is the lossy reconstruction of the input
decoded = Dense(encoding_dim * 2, activation='relu')(encoded)
decoded = Dense(encoding_dim * 4, activation='relu')(decoded)
decoded = Dense(784, activation='sigmoid')(decoded)
# this model maps an input to its reconstruction
autoencoder = Model(input_img, decoded)
# Separate Encoder model
# this model maps an input to its encoded representation
encoder = Model(input_img, encoded)
# Separate Decoder model
# create a placeholder for an encoded (32-dimensional) input
encoded_input = Input(shape=(encoding_dim, ))
# retrieve the layers of the autoencoder model
decoder_layer1 = autoencoder.layers[-3]
decoder_layer2 = autoencoder.layers[-2]
decoder_layer3 = autoencoder.layers[-1]
# create the decoder model
decoder = Model(encoded_input, decoder_layer3(decoder_layer2(decoder_layer1(encoded_input))))
# Train to reconstruct MNIST digits
# configure model to use a per-pixel binary crossentropy loss, and the Adadelta optimizer
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
# prepare input data
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# normalize all values between 0 and 1 and flatten the 28x28 images into vectors of size 784
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
# Train autoencoder for 100 epochs
autoencoder.fit(x_train, x_train, epochs=my_epochs, batch_size=256, shuffle=True, validation_data=(x_test, x_test),
verbose=2)
# after 100 epochs the autoencoder seems to reach a stable train/test loss value
# Visualize the reconstructed encoded representations
# encode and decode some digits
# note that we take them from the *test* set
encodedTrainImages=encoder.predict(x_train)
encoded_imgs = encoder.predict(x_test)
decoded_imgs = decoder.predict(encoded_imgs)
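# (Editor's sketch, not in the original post: one common way to eyeball the
# reconstructions, assuming matplotlib is installed.)
import matplotlib.pyplot as plt
n = 10  # number of digits to display
plt.figure(figsize=(20, 4))
for i in range(n):
    # original digit on the top row
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28), cmap='gray')
    ax.axis('off')
    # reconstruction on the bottom row
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28), cmap='gray')
    ax.axis('off')
plt.show()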
# From here I want to fine tune just the encoder model
model = Sequential()
for layer in encoder.layers:
    model.add(layer)
model.add(layers.Flatten())
model.add(layers.Dense(20, activation='relu'))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(10, activation='softmax'))
Below is the encoder model that I want to fine-tune.
encoder.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 784) 0
_________________________________________________________________
dense_1 (Dense) (None, 128) 100480
_________________________________________________________________
dense_2 (Dense) (None, 64) 8256
_________________________________________________________________
dense_3 (Dense) (None, 32) 2080
=================================================================
Total params: 110,816
Trainable params: 110,816
Non-trainable params: 0
_________________________________________________________________
Question 1:
After building the autoencoder model, I want to use only the encoder model and fine-tune it for the classification task on the MNIST dataset, but I am getting errors.
Error:
Traceback (most recent call last):
File "C:\Users\samer\Anaconda3\envs\tensorflow-gpu\lib\site-packages\IPython\core\interactiveshell.py", line 3267, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-15-528c079e5325>", line 3, in <module>
model.add(layers.Flatten())
File "C:\Users\samer\Anaconda3\envs\tensorflow-gpu\lib\site-packages\keras\engine\sequential.py", line 181, in add
output_tensor = layer(self.outputs[0])
File "C:\Users\samer\Anaconda3\envs\tensorflow-gpu\lib\site-packages\keras\engine\base_layer.py", line 414, in __call__
self.assert_input_compatibility(inputs)
File "C:\Users\samer\Anaconda3\envs\tensorflow-gpu\lib\site-packages\keras\engine\base_layer.py", line 327, in assert_input_compatibility
str(K.ndim(x)))
ValueError: Input 0 is incompatible with layer flatten_4: expected min_ndim=3, found ndim=2
Question 2:
Similarly, I will later use pre-trained models where each autoencoder is trained in a greedy manner and the final model is then fine-tuned. Can someone guide me on how to proceed with these two tasks?
Regards
Comments:

I'm not sure what you mean by fine-tune; it looks to me like you are trying to use the encoder "as is" and add layers on top of it. That is transfer learning.

You are right, that's pretty much it, but I believe transfer learning is about moving a model from one domain to another where there isn't much training data, or where you think the learned model has already picked up features that will be useful for your new domain. In my case I am using the same dataset in the same domain, just practicing how these things work in code.

Could you explain in more detail what you want to achieve in question 2? Do you intend to stack several autoencoders one after another, or do you want a "parallel" structure where each autoencoder specializes in one task and some kind of voting/concatenation happens at the end? Or something else?

Could you show us the code you are running that produces the error in question 1? I think the second question is too broad; you should try to narrow it down and clarify it.

Answer 1:

Question 1
The problem is that you are trying to flatten a layer that is already flat: your encoder is made of one-dimensional Dense layers, whose shape is (batch_size, dim).
The Flatten layer expects at least a 2D input, i.e. one with a 3-dimensional shape (batch_size, dim1, dim2) (for example the output of a Conv2D layer). By removing it, the model builds correctly:
encoding_dim = 32
input_img = layers.Input(shape=(784, ))
encoded = layers.Dense(encoding_dim * 4, activation='relu')(input_img)
encoded = layers.Dense(encoding_dim * 2, activation='relu')(encoded)
encoded = layers.Dense(encoding_dim, activation='relu')(encoded)
encoder = Model(input_img, encoded)
[...]
model = Sequential()
for layer in encoder.layers:
    print(layer.name)
    model.add(layer)
model.add(layers.Dense(20, activation='relu'))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(10, activation='softmax'))
model.summary()
Which outputs:
input_1
dense_1
dense_2
dense_3
Model: "sequential_1"
________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 128) 100480
_________________________________________________________________
dense_2 (Dense) (None, 64) 8256
_________________________________________________________________
dense_3 (Dense) (None, 32) 2080
_________________________________________________________________
dense_4 (Dense) (None, 20) 660
_________________________________________________________________
dropout_1 (Dropout) (None, 20) 0
_________________________________________________________________
dense_5 (Dense) (None, 10) 210
=================================================================
Total params: 111,686
Trainable params: 111,686
Non-trainable params: 0
_________________________________________________________________
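The answer stops at the assembled architecture; as a minimal sketch (my addition, not part of the original answer — the optimizer, epoch count, and the choice to freeze the encoder first are assumptions), the classifier could then be compiled and trained like this:

# Minimal sketch (assumptions: optimizer, epochs, freezing the encoder first)
from keras.utils import np_utils
one_hot_train = np_utils.to_categorical(y_train)
one_hot_test = np_utils.to_categorical(y_test)
# Optionally freeze the three pre-trained encoder layers so that only the new
# classification head is trained at first:
for layer in model.layers[:3]:
    layer.trainable = False
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, one_hot_train, epochs=10, batch_size=256,
          validation_data=(x_test, one_hot_test), verbose=2)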
___
EDIT: integrating answers to the questions asked in the comments
Q: How can I be sure that the new model will use the same weights as the previously trained encoder?
A: In your code, what you do is iterate over the layers contained in the encoder and pass each of them to model.add(). You are passing a reference to each layer directly, so the new model will contain exactly the same layer objects. Here is a proof of concept using the layer names:
encoding_dim = 32
input_img = Input(shape=(784, ))
encoded = Dense(encoding_dim * 4, activation='relu')(input_img)
encoded = Dense(encoding_dim * 2, activation='relu')(encoded)
encoded = Dense(encoding_dim, activation='relu')(encoded)
decoded = Dense(encoding_dim * 2, activation='relu')(encoded)
decoded = Dense(encoding_dim * 4, activation='relu')(decoded)
decoded = Dense(784, activation='sigmoid')(decoded)
autoencoder = Model(input_img, decoded)
print("autoencoder first Dense layer reference:", autoencoder.layers[1])
encoder = Model(input_img, encoded)
print("encoder first Dense layer reference:", encoder.layers[1])
new_model = Sequential()
for i, layer in enumerate(encoder.layers):
    print("Before: ", layer.name)
    new_model.add(layer)
    if i != 0:
        new_model.layers[i-1].name = "new_model_"+layer.name
        print("After: ", layer.name)
Which outputs:
autoencoder first Dense layer reference: <keras.layers.core.Dense object at
0x7fb5f138e278>
encoder first Dense layer reference: <keras.layers.core.Dense object at
0x7fb5f138e278>
Before: input_1
Before: dense_1
After: new_model_dense_1
Before: dense_2
After: new_model_dense_2
Before: dense_3
After: new_model_dense_3
As you can see, the layer references in the encoder and in the autoencoder are the same. Moreover, by changing the layer names inside the new model, we are also changing the layer names inside the encoder's corresponding layers. For more details on Python arguments being passed by reference, check out this answer.
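A quick way to verify this yourself (a small sketch of my own, not from the original answer) is to compare the layer objects and their weight arrays directly:

import numpy as np
# The first Dense layer of the encoder and of the new model are literally the
# same object, so their weight arrays are necessarily identical:
w_encoder = encoder.layers[1].get_weights()[0]
w_new = new_model.layers[0].get_weights()[0]
print(encoder.layers[1] is new_model.layers[0])  # True: same object
print(np.array_equal(w_encoder, w_new))          # True: same weights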
Q: Does my data need to be one-hot encoded? If so, how?
A: You do need one-hot encoding, since you are dealing with a multi-class classification problem. The encoding can be done simply by using a handy Keras function:
from keras.utils import np_utils
one_hot = np_utils.to_categorical(y_train)
Here is a link to the documentation.
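For instance, a quick sanity check (the shapes assume the standard 60000-sample MNIST training split, whose first label happens to be the digit 5):

one_hot = np_utils.to_categorical(y_train)
print(one_hot.shape)  # (60000, 10)
print(y_train[0])     # 5
print(one_hot[0])     # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]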
___
Question 2
Regarding your second question, your goal is not very clear; however, it seems to me that you want to build an architecture containing several parallel autoencoders specialized in different tasks, and then concatenate their outputs by adding some final, common layers.
In any case, what I can do so far is suggest that you take a look at this guide, which explains how to build multi-input and multi-output models, and use it as a baseline to start your custom implementation.
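As a very rough illustration of the "parallel" idea (entirely a sketch under assumptions: the branch sizes are arbitrary, and in a real setup each branch would reuse layers from a separately pre-trained encoder):

from keras.layers import Input, Dense, concatenate
from keras.models import Model

inp = Input(shape=(784,))
# Two hypothetical encoder branches, each meant to specialize on its own task
branch_a = Dense(64, activation='relu')(inp)
branch_a = Dense(32, activation='relu')(branch_a)
branch_b = Dense(64, activation='relu')(inp)
branch_b = Dense(32, activation='relu')(branch_b)
# Merge the two encodings and add a common classification head
merged = concatenate([branch_a, branch_b])
out = Dense(10, activation='softmax')(merged)
parallel_model = Model(inp, out)
parallel_model.compile(optimizer='adam', loss='categorical_crossentropy',
                       metrics=['accuracy'])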
___
EDIT 2: integrating the answer to question 2
Regarding the greedy training task, the approach is to train one layer at a time, freezing all the previous layers as you append a new one. Here is an example of a network with 3(+1) greedily trained layers, which is later used as the base for a new model:
from keras.datasets import mnist
from keras.layers import Dense, Dropout
from keras.models import Sequential
from keras.optimizers import SGD
from keras.utils import np_utils
import numpy as np

(x_train, y_train), (x_test, y_test) = mnist.load_data()
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
x_train = np.reshape(x_train, (x_train.shape[0], -1))
x_test = np.reshape(x_test, (x_test.shape[0], -1))
model = Sequential()
model.add(Dense(256, activation="relu", kernel_initializer="he_uniform", input_shape=(28*28,)))
model.add(Dense(10, activation="softmax"))
model.compile(optimizer=SGD(lr=0.01, momentum=0.9), loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=64, epochs=50, verbose=1)
# Remove last layer
model.pop()
# 'Freeze' previous layers, so to single-train the new one
for layer in model.layers:
    layer.trainable = False
# Append new layer + classification layer
model.add(Dense(64, activation="relu", kernel_initializer="he_uniform"))
model.add(Dense(10, activation="softmax"))
model.fit(x_train, y_train, batch_size=64, epochs=50, verbose=0)
# Remove last layer
model.pop()
# 'Freeze' previous layers, so to single-train the new one
for layer in model.layers:
    layer.trainable = False
# Append new layer + classification layer
model.add(Dense(32, activation="relu", kernel_initializer="he_uniform"))
model.add(Dense(10, activation="softmax"))
model.fit(x_train, y_train, batch_size=64, epochs=50, verbose=0)
# Create new model which will use the pre-trained layers
new_model = Sequential()
# Discard the last layer from the previous model
model.pop()
# Optional: you can decide to set the pre-trained layers as trainable, in
# which case it would be like having initialized their weights, or not.
for l in model.layers:
    l.trainable = True
new_model.add(model)
new_model.add(Dense(20, activation='relu'))
new_model.add(Dropout(0.5))
new_model.add(Dense(10, activation='softmax'))
new_model.compile(optimizer=SGD(lr=0.01, momentum=0.9), loss="categorical_crossentropy", metrics=["accuracy"])
new_model.fit(x_train, y_train, batch_size=64, epochs=100, verbose=1)
That is roughly it, but I must say that greedy layer-wise training may no longer be an appropriate solution: nowadays ReLU, Dropout and other regularization techniques make greedy layer-wise training an obsolete and time-consuming form of weight initialization, so you might want to look at other possibilities as well before going for greedy training.
___
Comments:

You are right in the second case; I was looking for layer-wise training. So, if you have time, you could do that; I am also working on mine. A question related to question 1: how can I be sure that the fine-tuned encoder is using the trained weights I just obtained from the autoencoder, and that it does not initialize the weights from scratch again? Also, does the softmax layer on the output side require one-hot encoding of my data, and if so, how?

You can be sure of this because you pass the layers to model.add(), which is done by reference, not as a new layer or even a copy. Also, yes, you do need to one-hot encode your data, which can be done simply with the to_categorical Keras utility function. I will edit my answer to add a proof of concept for both questions. Regarding the second question, I will try to find time to provide a small example, but you should really clarify what you want to do; maybe you could give an example?

Please also complete question 2, with just 3 hidden layers of sizes 256, 64 and 32, finally classifying on top of the greedily fine-tuned network, so that the answer is complete and I can award the bounty to you.

I integrated answer 2 with an example, hope it helps!