Keras autoencoder classification

【Posted】2018-05-11 15:12:22 【Question】

I'm trying to find working code that uses an autoencoder to improve classification. I followed this example, keras autoencoder vs PCA, but instead of the MNIST data I tried to use it with cifar-10.

So I made some changes, but something seems off. Can someone help me? If you have another example that runs on a different dataset, that would also help.

The validation data in reduced.fit, i.e. (X_test, Y_test), is not being learned, so .evaluate() gives poor accuracy: it always reports val_loss: 2.3026 - val_acc: 0.1000. Here is the code, followed by the output:

from keras.datasets import cifar10
from keras.models import Model
from keras.layers import Input, Dense
from keras.utils import np_utils
import numpy as np

num_train = 50000
num_test = 10000

height, width, depth = 32, 32, 3 # CIFAR-10 images are 32x32x3
num_classes = 10 # there are 10 classes

(X_train, y_train), (X_test, y_test) = cifar10.load_data()

X_train = X_train.reshape(num_train, height * width * depth)
X_test = X_test.reshape(num_test, height * width * depth)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')

X_train /= 255 # Normalise data to [0, 1] range
X_test /= 255 # Normalise data to [0, 1] range

Y_train = np_utils.to_categorical(y_train, num_classes) # One-hot encode the labels
Y_test = np_utils.to_categorical(y_test, num_classes) # One-hot encode the labels

input_img = Input(shape=(height * width * depth,))
s=height * width * depth
x = Dense(s, activation='relu')(input_img)

encoded = Dense(s//2, activation='relu')(x)
encoded = Dense(s//8, activation='relu')(encoded)

y = Dense(s//256, activation='relu')(x)

decoded = Dense(s//8, activation='relu')(y)
decoded = Dense(s//2, activation='relu')(decoded)

z = Dense(s, activation='sigmoid')(decoded)
model = Model(input_img, z)

model.compile(optimizer='adadelta', loss='mse') # mean-squared-error reconstruction loss

model.fit(X_train, X_train,
          epochs=10,
          batch_size=128,
          shuffle=True,
          validation_data=(X_test, X_test))

mid = Model(input_img, y)
reduced_representation = mid.predict(X_test)


out = Dense(num_classes, activation='softmax')(y)
reduced = Model(input_img, out)
reduced.compile(loss='categorical_crossentropy',
                optimizer='adam',
                metrics=['accuracy'])

reduced.fit(X_train, Y_train,
            epochs=10,
            batch_size=128,
            shuffle=True,
            validation_data=(X_test, Y_test))


scores = reduced.evaluate(X_test, Y_test, verbose=0)
print("Accuracy: ", scores[1])
Train on 50000 samples, validate on 10000 samples
Epoch 1/10
50000/50000 [==============================] - 5s - loss: 0.0639 - val_loss: 0.0633
Epoch 2/10
50000/50000 [==============================] - 5s - loss: 0.0610 - val_loss: 0.0568
Epoch 3/10
50000/50000 [==============================] - 5s - loss: 0.0565 - val_loss: 0.0558
Epoch 4/10
50000/50000 [==============================] - 5s - loss: 0.0557 - val_loss: 0.0545
Epoch 5/10
50000/50000 [==============================] - 5s - loss: 0.0536 - val_loss: 0.0518
Epoch 6/10
50000/50000 [==============================] - 5s - loss: 0.0502 - val_loss: 0.0461
Epoch 7/10
50000/50000 [==============================] - 5s - loss: 0.0443 - val_loss: 0.0412
Epoch 8/10
50000/50000 [==============================] - 5s - loss: 0.0411 - val_loss: 0.0397
Epoch 9/10
50000/50000 [==============================] - 5s - loss: 0.0391 - val_loss: 0.0371
Epoch 10/10
50000/50000 [==============================] - 5s - loss: 0.0377 - val_loss: 0.0403
Train on 50000 samples, validate on 10000 samples
Epoch 1/10
50000/50000 [==============================] - 3s - loss: 2.3605 - acc: 0.0977 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 2/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0952 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 3/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0978 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 4/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0980 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 5/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0974 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 6/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.1000 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 7/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0992 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 8/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0982 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 9/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0965 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 10/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0978 - val_loss: 2.3026 - val_acc: 0.1000
 9856/10000 [============================>.] - ETA: 0s
('Accuracy: ', 0.10000000000000001)

【Comments】:

***.com/questions/47842931/… Any suggestions?

【Answer 1】:

There are several problems with your code.

Your autoencoder is not fully trained; if you plot the training loss you will see that the model has not yet converged. With

history = model.fit(X_train, X_train,
                    epochs=10,
                    batch_size=128,
                    shuffle=True,
                    validation_data=(X_test, X_test))

you get the loss values recorded during training. If you plot them, e.g. with matplotlib,

import matplotlib.pyplot as plt
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model train vs validation loss 1')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper right')
plt.show()

you will see that it needs more epochs to converge.

The autoencoder architecture is built incorrectly: the line y = Dense(s//256, activation='relu')(x) contains a typo. You probably meant y = Dense(s//256, activation='linear')(encoded), so that the bottleneck is fed from the previous layer rather than directly from the input. You also don't want a relu activation in the latent space, because it prevents you from subtracting one latent variable from another, which makes the autoencoder much less effective. With these fixes the model trains without the problem you described.
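To see why a relu bottleneck hurts, here is a minimal numpy sketch (the arrays `a` and `b` are made-up latent codes, not values from the model): relu zeroes out negative components, so differences between latent codes are no longer preserved, while a linear latent space keeps them intact.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

# Two hypothetical latent codes with a negative component.
a = np.array([-1.5, 0.5, 2.0])
b = np.array([-0.5, 1.5, 2.0])

# With a linear latent space, the difference is preserved exactly.
linear_diff = a - b          # [-1., -1., 0.]

# With relu, negative components are clipped to zero before the
# difference is taken, so that part of the signal is lost.
relu_diff = relu(a) - relu(b)  # [ 0., -1., 0.]
```

The first component of the difference vanishes entirely under relu, which is exactly the kind of information loss the answer is pointing at.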

I increased the number of epochs to 30 for both networks so that they train longer. At the end of training, the classification model reports loss: 1.2881 - acc: 0.5397 - val_loss: 1.3841 - val_acc: 0.5126, which is far better than what you were getting.
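Putting the fixes together, the autoencoder definition from the question would become something like the following sketch (same layer sizes as the original code; only the bottleneck line changes, and no training is shown here):

```python
from keras.models import Model
from keras.layers import Input, Dense

s = 32 * 32 * 3  # flattened CIFAR-10 image size, 3072

input_img = Input(shape=(s,))
x = Dense(s, activation='relu')(input_img)
encoded = Dense(s // 2, activation='relu')(x)
encoded = Dense(s // 8, activation='relu')(encoded)

# Fixed bottleneck: fed from `encoded` (not `input_img`/`x`),
# with a linear activation so the latent space is unconstrained.
y = Dense(s // 256, activation='linear')(encoded)

decoded = Dense(s // 8, activation='relu')(y)
decoded = Dense(s // 2, activation='relu')(decoded)
z = Dense(s, activation='sigmoid')(decoded)

autoencoder = Model(input_img, z)
autoencoder.compile(optimizer='adadelta', loss='mse')
```

From here you can train with `autoencoder.fit(X_train, X_train, epochs=30, ...)` and then build the classifier head on `y` as in the original code.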

【Comments】:
