How to increase the validation accuracy of a CNN model
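【Title】How to increase the validation accuracy in CNN model 【Posted】2022-01-12 03:39:43
【Question】: I want to build a CNN model that classifies faces of people with Down syndrome against normal faces, and then classify gender with a second model. I have tried changing the number of layers, the number of nodes, the number of epochs, and the optimizer. I have also tried both color and grayscale images. The dataset contains 799 images covering both the normal and the Down syndrome class. Here is my code: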
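from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, BatchNormalization, MaxPooling2D,
                                     Dropout, Flatten, Dense)

model = Sequential()
model.add(Conv2D(filters=16, kernel_size=(5,5), activation="relu",
                 input_shape=X_train[0].shape))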
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2)))
model.add(Dropout(0.2))
model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2)))
model.add(Dropout(0.3))
model.add(Conv2D(filters=64, kernel_size=(5,5), activation="relu"))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2)))
model.add(Dropout(0.3))
model.add(Conv2D(filters=64, kernel_size=(5,5), activation="relu"))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2)))
model.add(Dropout(0.2))
model.add(Flatten())
#Two dense layers and then output layer
model.add(Dense(256, activation='relu'))
model.add(BatchNormalization())
# Use dropout to make sure that the model isn't overfitting
model.add(Dropout(0.5))
model.add(Dense(128, activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(2, activation='softmax'))
I tried changing the last activation from softmax to sigmoid and back again, without success. The input images are 200x200.
Model: "sequential_4"
____________________________________________________________________________
Layer (type)                                 Output Shape          Param #
============================================================================
conv2d_16 (Conv2D)                           (None, 196, 196, 16)  416
batch_normalization_24 (BatchNormalization)  (None, 196, 196, 16)  64
max_pooling2d_16 (MaxPooling2D)              (None, 98, 98, 16)    0
dropout_24 (Dropout)                         (None, 98, 98, 16)    0
conv2d_17 (Conv2D)                           (None, 94, 94, 32)    12832
batch_normalization_25 (BatchNormalization)  (None, 94, 94, 32)    128
max_pooling2d_17 (MaxPooling2D)              (None, 47, 47, 32)    0
dropout_25 (Dropout)                         (None, 47, 47, 32)    0
conv2d_18 (Conv2D)                           (None, 43, 43, 64)    51264
batch_normalization_26 (BatchNormalization)  (None, 43, 43, 64)    256
max_pooling2d_18 (MaxPooling2D)              (None, 21, 21, 64)    0
dropout_26 (Dropout)                         (None, 21, 21, 64)    0
conv2d_19 (Conv2D)                           (None, 17, 17, 64)    102464
batch_normalization_27 (BatchNormalization)  (None, 17, 17, 64)    256
max_pooling2d_19 (MaxPooling2D)              (None, 8, 8, 64)      0
dropout_27 (Dropout)                         (None, 8, 8, 64)      0
flatten_4 (Flatten)                          (None, 4096)          0
dense_12 (Dense)                             (None, 256)           1048832
batch_normalization_28 (BatchNormalization)  (None, 256)           1024
dropout_28 (Dropout)                         (None, 256)           0
dense_13 (Dense)                             (None, 128)           32896
batch_normalization_29 (BatchNormalization)  (None, 128)           512
dropout_29 (Dropout)                         (None, 128)           0
dense_14 (Dense)                             (None, 2)             258
============================================================================
Total params: 1,251,202
Trainable params: 1,250,082
Non-trainable params: 1,120
____________________________________________________________________________
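The shapes and parameter counts in this summary can be checked by hand. The sketch below recomputes them, assuming "valid" padding with stride 1 for each convolution and a single-channel 200x200 input (the 416 parameters of the first layer imply one input channel, i.e. grayscale). The helper name `conv_block` is purely illustrative.

```python
def conv_block(size, in_ch, out_ch, kernel=5, pool=2):
    """Return (size after conv, size after pooling, conv params, batchnorm params).

    Assumes a 'valid' convolution with stride 1, followed by a
    pool x pool max-pool whose stride equals the pool size.
    """
    conv_size = size - kernel + 1                              # valid padding
    pooled_size = conv_size // pool                            # odd edge is dropped
    conv_params = kernel * kernel * in_ch * out_ch + out_ch    # weights + biases
    bn_params = 4 * out_ch          # gamma, beta, moving mean, moving variance
    return conv_size, pooled_size, conv_params, bn_params


size, channels, total = 200, 1, 0
for filters in (16, 32, 64, 64):
    conv_size, size, conv_params, bn_params = conv_block(size, channels, filters)
    print(f"conv -> {conv_size}x{conv_size}x{filters} ({conv_params} params), "
          f"pool -> {size}x{size}x{filters}")
    total += conv_params + bn_params
    channels = filters

flat_features = size * size * channels        # what Flatten produces
inputs = flat_features
for units in (256, 128):
    # dense weights + biases, plus the batchnorm layer that follows
    total += inputs * units + units + 4 * units
    inputs = units
total += inputs * 2 + 2                       # final Dense(2)

print(flat_features, total)                   # prints 4096 1251202
```

The first dense layer alone (4096 x 256 weights plus biases) holds 1,048,832 of the 1,251,202 parameters, which is a lot of capacity for a dataset of 799 images.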
model.compile(optimizer='Adam', loss='binary_crossentropy', metrics=['accuracy'])
# split train and VALID data
X_train, X_valid, y_train, y_valid = train_test_split(X_train, y_train, test_size=0.15)
I want to raise the accuracy to at least 70%, but the highest score I have reached is 47%.
history = model.fit(X_train, y_train, epochs=50, validation_data=(X_valid, y_valid), batch_size=64)
Epoch 1/50
5/5 [==============================] - 23s 4s/step - loss: 0.9838 - accuracy: 0.5390 - val_loss: 0.6931 - val_accuracy: 0.4800
Epoch 2/50
5/5 [==============================] - 21s 4s/step - loss: 0.8043 - accuracy: 0.6348 - val_loss: 0.7109 - val_accuracy: 0.4800
Epoch 3/50
5/5 [==============================] - 21s 4s/step - loss: 0.6745 - accuracy: 0.6915 - val_loss: 0.7554 - val_accuracy: 0.4800
Epoch 4/50
5/5 [==============================] - 21s 4s/step - loss: 0.6429 - accuracy: 0.7589 - val_loss: 0.8261 - val_accuracy: 0.4800
Epoch 5/50
5/5 [==============================] - 21s 4s/step - loss: 0.5571 - accuracy: 0.8014 - val_loss: 0.9878 - val_accuracy: 0.4800
Is there any way to improve this further? And how can I combine the two models? Any help would be greatly appreciated. Thank you very much.
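【Answer 1】: Try image augmentation. The log makes it clear that the model is overfitting the data: training accuracy climbs from 0.54 to 0.80 while validation accuracy stays at 0.48. You could also change the train_test_split ratio (increase it).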
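A minimal sketch of image augmentation with Keras preprocessing layers, applied to the training data only. It assumes TensorFlow 2.6 or later and 200x200 single-channel images scaled to [0, 1]. The random arrays, the choice of layers, and the augmentation strengths are placeholders, not values taken from the question.

```python
import numpy as np
import tensorflow as tf

# Placeholder data standing in for the real training images and labels.
X_train = np.random.rand(8, 200, 200, 1).astype("float32")
y_train = np.random.randint(0, 2, size=(8,))

# Random transformations. Small values are an assumption: faces should stay
# recognisable, so only mild rotations and zooms are used.
augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.05),   # fraction of a full turn, about +/- 18 degrees
    tf.keras.layers.RandomZoom(0.1),
])

# Augment each training batch on the fly; every epoch sees new variants.
train_ds = (
    tf.data.Dataset.from_tensor_slices((X_train, y_train))
    .shuffle(len(X_train))
    .batch(4)
    .map(lambda x, y: (augment(x, training=True), y))
)

for images, labels in train_ds.take(1):
    print(images.shape, labels.shape)
```

The validation data is passed to `model.fit` unchanged, for example `model.fit(train_ds, validation_data=(X_valid, y_valid), epochs=...)`, so validation accuracy is still measured on unmodified images.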
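【Answer 2】: I think one of two things is happening. The training log would point to overfitting, but given the amount of dropout in the model I would not suspect that is the case. I think it is more likely that the probability distribution of the training data differs significantly from that of the validation data. This can happen when you have few training samples. So how many training samples do you have in each of your 2 classes? If there are fewer than about 120 samples per class, use image augmentation to create more training samples.
How did you generate the validation images? If you have a separate set of validation images, it is best to combine the training set with the validation set and then use sklearn's train_test_split to split the combined data randomly into a training set and a validation set. Note: apply augmentation only to the training set, not to the validation set.
I also recommend an adjustable learning rate, using the Keras callback ReduceLROnPlateau (see the Keras documentation for that callback). The code below shows the settings I use: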
rlronp=tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5,
patience=1, verbose=1)
I also recommend the Keras callback EarlyStopping (see the Keras documentation for that callback). The code below shows my implementation:
estop=tf.keras.callbacks.EarlyStopping( monitor="val_loss", patience=3,
verbose=1,
restore_best_weights=True)
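Include the callbacks in model.fit:
history=model.fit(....., callbacks=[estop, rlronp])
Set the number of epochs to a fairly large value; early stopping will end training once val_loss stops improving.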
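To see roughly what these two callbacks would do, the sketch below replays the five val_loss values from the question's log through a simplified pure-Python model of their logic. It assumes Adam's default learning rate of 0.001 and ignores options such as min_delta, cooldown, and min_lr. The function name `replay` is illustrative, and this is not the Keras implementation.

```python
def replay(val_losses, lr=0.001, factor=0.5, lr_patience=1, stop_patience=3):
    """Simplified replay of ReduceLROnPlateau and EarlyStopping on val_loss.

    Returns (learning rate after each epoch, epoch at which training stops
    or None, epoch with the best val_loss). Epochs are 1-indexed.
    """
    best, best_epoch = float("inf"), None
    lr_wait = stop_wait = 0
    lrs, stop_epoch = [], None
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:                    # improvement: reset both counters
            best, best_epoch = loss, epoch
            lr_wait = stop_wait = 0
        else:
            lr_wait += 1
            stop_wait += 1
            if lr_wait >= lr_patience:     # plateau: shrink the learning rate
                lr *= factor
                lr_wait = 0
        lrs.append(lr)
        if stop_wait >= stop_patience:     # no improvement for too long
            stop_epoch = epoch
            break
    return lrs, stop_epoch, best_epoch


# val_loss for epochs 1-5, copied from the training log in the question
log = [0.6931, 0.7109, 0.7554, 0.8261, 0.9878]
lrs, stop_epoch, best_epoch = replay(log)
print(lrs, stop_epoch, best_epoch)
```

On this log the sketch halves the learning rate after epochs 2, 3, and 4 and stops at epoch 4. The best val_loss is at epoch 1, which is the state that restore_best_weights=True would roll back to.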