Autoencoder for CIFAR-10 with low accuracy

Posted 2023-02-16

Original title: autoencoder for cifar 10 with low accuracy
Asked: 2021-11-10 14:07:37

Question:

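I am building a convolutional autoencoder whose goal is to encode an image and then decode it. However, I always end up with around 61% accuracy and a loss of about 0.0159. My code is below. I am not using batch normalization or dropout. I am not sure how to improve the accuracy.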

from tensorflow.keras.datasets import cifar10
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from tensorflow.keras.models import Model

# Load CIFAR-10; the labels are not needed for an autoencoder
(x_train, _), (x_test, _) = cifar10.load_data()
img_width, img_height, img_channels = x_train.shape[1:]

# Convert to float32 format
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')

# Normalize the data to [0, 1]
x_train = x_train / 255
x_test = x_test / 255

# Define the input shape
input_img = Input(shape=(img_width, img_height, img_channels))

# Encoder: four conv + pool stages, 32x32x3 -> 2x2x8
x = Conv2D(64, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2))(x)

# Decoder: four conv + upsample stages, 2x2x8 -> 32x32x3
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(3, (3, 3), padding='same')(x)

cae = Model(input_img, decoded)
cae.compile(optimizer='adam', loss='mse', metrics=['accuracy'])
cae.summary()

# The input doubles as the target: the network learns to reconstruct it
history = cae.fit(x_train, x_train,
                  epochs=25,
                  batch_size=50,
                  validation_data=(x_test, x_test))

Comments:

Have you considered that your autoencoder performs regression, whereas accuracy is a metric that is only valid for classification?

Answer 1:

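Depending on the particular images you are encoding, it may be unreasonable to expect a much higher accuracy than this. You perform 4 downsampling steps (MaxPooling2D), each of which halves both spatial dimensions, so the 32×32×3 input (3072 values) is squeezed into a 2×2×8 bottleneck (32 values), a reduction of roughly 96×. An autoencoder is essentially a compression algorithm in which the compression strategy / encoding space is learned. In general, lossless compression algorithms can only hope to achieve a ratio of around 3:1, so expecting faithful reconstruction from an autoencoder at a far higher ratio is probably unreasonable.

That said, your use case may be a tightly constrained set of images (e.g. a static camera, so every image shares the same background). In that case you might still expect high accuracy despite the relatively large compression factor. My guess is that the input space of CIFAR-10 is a bit too large for images to be reconstructed faithfully at your compression level.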
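To make the compression factor concrete, here is a minimal sketch that tracks the tensor shape through the pooling stages. The helper names `bottleneck_shape` and `compression_factor` are made up for this illustration, and the sketch counts values rather than bits (the input is uint8 while the bottleneck is float32), so treat the result as a rough figure only.

```python
def bottleneck_shape(height, width, pool_stages, bottleneck_channels):
    """Shape of the encoded tensor after `pool_stages` 2x2 max-pooling layers."""
    for _ in range(pool_stages):
        # Each 2x2 pooling layer halves both spatial dimensions
        height //= 2
        width //= 2
    return height, width, bottleneck_channels


def compression_factor(height, width, channels, pool_stages, bottleneck_channels):
    """Ratio of input values to bottleneck values (hypothetical helper)."""
    h, w, c = bottleneck_shape(height, width, pool_stages, bottleneck_channels)
    return (height * width * channels) / (h * w * c)


# The question's network: 32x32x3 input, 4 pooling stages, 8 bottleneck channels
print(bottleneck_shape(32, 32, 4, 8))       # (2, 2, 8)
print(compression_factor(32, 32, 3, 4, 8))  # 96.0

# For comparison, the same network with only 3 pooling stages
print(compression_factor(32, 32, 3, 3, 8))  # 24.0
```

Dropping a single pooling stage quadruples the size of the bottleneck, which is why even a small architectural change can noticeably change reconstruction quality.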

Answer 2:

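Higher accuracy can be obtained by lowering the compression ratio. Starting from my earlier code, I removed one MaxPooling2D and one UpSampling2D, and my accuracy rose to 70%. Higher accuracy does not necessarily mean better performance, though: since I am only compressing and decoding images, it all depends on the compression ratio and the end goal. The modified code snippet is below.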

from tensorflow.keras.datasets import cifar10
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from tensorflow.keras.models import Model

# Load CIFAR-10; the labels are not needed for an autoencoder
(x_train, _), (x_test, _) = cifar10.load_data()
img_width, img_height, img_channels = x_train.shape[1:]

# Convert to float32 format
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')

# Normalize the data to [0, 1]
x_train = x_train / 255
x_test = x_test / 255

# Define the input shape
input_img = Input(shape=(img_width, img_height, img_channels))

# Encoder: three pooling stages instead of four, 32x32x3 -> 4x4x8
x = Conv2D(64, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
# removed
# x = MaxPooling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2))(x)

# Decoder: three upsampling stages to match, 4x4x8 -> 32x32x3
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
# removed
# x = UpSampling2D((2, 2))(x)
decoded = Conv2D(3, (3, 3), padding='same')(x)

cae = Model(input_img, decoded)
cae.compile(optimizer='adam', loss='mse', metrics=['accuracy'])
cae.summary()

# The input doubles as the target: the network learns to reconstruct it
history = cae.fit(x_train, x_train,
                  epochs=25,
                  batch_size=50,
                  validation_data=(x_test, x_test))
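As the comment on the question points out, accuracy is not really meaningful for a reconstruction task. The sketch below illustrates why, under one assumption that is worth verifying against the Keras documentation: with an MSE loss and a 3-channel output, the string 'accuracy' appears to resolve to categorical accuracy, i.e. it only checks whether the strongest colour channel of each pixel matches. The function names and the two toy pixels are invented for this illustration. PSNR, derived directly from the MSE, is one commonly used alternative for judging reconstruction quality.

```python
import math


def psnr_from_mse(mse, max_value=1.0):
    """Peak signal-to-noise ratio in dB for pixel values in [0, max_value]."""
    return 10.0 * math.log10(max_value ** 2 / mse)


def mean_squared_error(true_pixels, pred_pixels):
    """Average squared difference over every channel of every pixel."""
    total = sum(
        (tv - pv) ** 2
        for t, p in zip(true_pixels, pred_pixels)
        for tv, pv in zip(t, p)
    )
    count = sum(len(t) for t in true_pixels)
    return total / count


def channel_argmax_agreement(true_pixels, pred_pixels):
    """Fraction of pixels whose strongest channel matches.

    ASSUMPTION: this mimics what 'accuracy' is believed to compute for a
    multi-channel regression output; it is not taken from the original post.
    """
    matches = sum(
        1 for t, p in zip(true_pixels, pred_pixels)
        if t.index(max(t)) == p.index(max(p))
    )
    return matches / len(true_pixels)


# Two toy RGB pixels and a washed-out, clearly inaccurate reconstruction
true_pixels = [(0.9, 0.1, 0.1), (0.2, 0.8, 0.3)]
pred_pixels = [(0.4, 0.3, 0.3), (0.1, 0.35, 0.3)]

# The dominant channel still matches everywhere, so "accuracy" looks perfect
print(channel_argmax_agreement(true_pixels, pred_pixels))  # 1.0

# ...while the reconstruction error is large
mse = mean_squared_error(true_pixels, pred_pixels)
print(mse, psnr_from_mse(mse))

# The loss reported in the question (MSE of about 0.0159) is roughly 18 dB
print(psnr_from_mse(0.0159))
```

If that reading of 'accuracy' is right, then both the 61% and the 70% figures say little about how faithful the reconstructions are, and the MSE (or a PSNR computed from it) is the number to compare between the two architectures.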
