2021-10-22 Computer Vision 4: A Deeper Convolutional Neural Network: MiniVGGNet
Posted by 俯仰天地
MiniVGGNet: A Deeper Convolutional Neural Network
VGGNet was first introduced by Simonyan and Zisserman in their paper Very Deep Convolutional Networks for Large-Scale Image Recognition.
Before VGGNet, the networks used in deep learning typically mixed convolution kernels of several sizes: the first layer often used kernels between 7×7 and 11×11, later layers shrank to 5×5, and the deepest layers were usually 3×3.
VGGNet is different: it uses only 3×3 kernels throughout the entire network, and this exclusive use of small kernels is widely credited with helping VGGNet generalize well.
The 3×3 kernel has become VGG's signature: if an architecture uses only 3×3 convolutions, it almost certainly drew its inspiration from VGGNet. The small kernels also cost little in receptive field, since two stacked 3×3 convolutions cover the same 5×5 region as a single 5×5 kernel while using fewer parameters and adding an extra non-linearity.
However, the full VGG16 and VGG19 architectures are still too advanced for our current level.
So we will first discuss the VGG family of networks and the characteristics they must have, then study the architecture by implementing and training a small, VGG-like network. Along the way we will use two layer types we have not needed before: batch normalization (BN), which normalizes a layer's activations over each mini-batch to stabilize and speed up training, and dropout, which randomly disconnects a fraction of the connections between layers during training to reduce overfitting.
The VGG Family
Convolutional neural networks in the VGG family share two key characteristics:
- Every convolutional layer uses 3 × 3 kernels exclusively.
- Several CONV => ACT stages are stacked before each pooling operation (see the sketch after this list).
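As a minimal sketch of that second point (the filter count of 32 and the 32 × 32 × 3 input here are illustrative choices, not requirements), a VGG-style block stacks two CONV => RELU stages and only then pools:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Activation, MaxPooling2D

# one VGG-style block: (CONV => RELU) * 2 => POOL
model = Sequential()
model.add(Conv2D(32, (3, 3), padding="same", input_shape=(32, 32, 3)))
model.add(Activation("relu"))
model.add(Conv2D(32, (3, 3), padding="same"))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))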
MiniVGGNet
Let's start by laying out the MiniVGGNet architecture, shown in the table below:
Layer Type | Output Size | Filter Size / Stride
--- | --- | ---
INPUT IMAGE | 32 × 32 × 3 |
CONV | 32 × 32 × 32 | 3 × 3, K = 32
ACT | 32 × 32 × 32 |
BN | 32 × 32 × 32 |
CONV | 32 × 32 × 32 | 3 × 3, K = 32
ACT | 32 × 32 × 32 |
BN | 32 × 32 × 32 |
POOL | 16 × 16 × 32 | 2 × 2
DROPOUT | 16 × 16 × 32 |
CONV | 16 × 16 × 64 | 3 × 3, K = 64
ACT | 16 × 16 × 64 |
BN | 16 × 16 × 64 |
CONV | 16 × 16 × 64 | 3 × 3, K = 64
ACT | 16 × 16 × 64 |
BN | 16 × 16 × 64 |
POOL | 8 × 8 × 64 | 2 × 2
DROPOUT | 8 × 8 × 64 |
FC | 512 |
ACT | 512 |
BN | 512 |
DROPOUT | 512 |
FC | 10 |
SOFTMAX | 10 |

Because every CONV layer uses "same" padding, only the 2 × 2 POOL layers change the spatial dimensions, halving them each time; the CONV layers change only the depth (the number of filters K).
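A quick back-of-the-envelope check ties the table to the code that follows: after the second POOL the volume is 8 × 8 × 64, so flattening produces a 4096-dimensional vector, and the FC layer with 512 units therefore holds 4096 × 512 weights plus 512 biases:

# parameters in the FC => 512 layer: flattened POOL volume times units, plus biases
flattened = 8 * 8 * 64            # 4096 values out of the final POOL/DROPOUT
fc_params = flattened * 512 + 512
print(fc_params)                  # 2097664 trainable parameters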
Code implementation
Directory structure:
----pyimagesearch
| |----__init__.py
| |----nn
| | |----__init__.py
| | |----conv
| | | |----__init__.py
| | | |----lenet.py
| | | |----minivggnet.py
| | | |----shallownet.py
Open minivggnet.py and write the following code:
# import the necessary packages (using tensorflow.keras consistently,
# since the training script below uses the tensorflow.keras optimizers)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import Activation
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Dropout
from tensorflow.keras import backend as K

class MiniVGGNet:
    @staticmethod
    def build(width, height, depth, classes):
        # initialize the model, the input shape, and the channel
        # dimension, assuming "channels last" ordering
        model = Sequential()
        inputShape = (height, width, depth)
        chanDim = -1

        # switch to "channels first" if the backend is configured that
        # way (note: image_data_format is a function and must be called)
        if K.image_data_format() == "channels_first":
            inputShape = (depth, height, width)
            chanDim = 1

        # first (CONV => RELU => BN) * 2 => POOL => DROPOUT block
        model.add(Conv2D(32, (3, 3), padding="same", input_shape=inputShape))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(Conv2D(32, (3, 3), padding="same"))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(MaxPooling2D(pool_size=(2, 2)))
        model.add(Dropout(0.25))

        # second (CONV => RELU => BN) * 2 => POOL => DROPOUT block,
        # doubling the number of filters to 64
        model.add(Conv2D(64, (3, 3), padding="same"))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(Conv2D(64, (3, 3), padding="same"))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(MaxPooling2D(pool_size=(2, 2)))
        model.add(Dropout(0.25))

        # first (and only) FC => RELU => BN => DROPOUT block
        model.add(Flatten())
        model.add(Dense(512))
        model.add(Activation("relu"))
        model.add(BatchNormalization())
        model.add(Dropout(0.5))

        # softmax classifier
        model.add(Dense(classes))
        model.add(Activation("softmax"))

        # return the constructed network architecture
        return model
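As a quick sanity check, the class can be imported and built on its own; model.summary() prints each layer with its output shape, which should match the table above (this snippet is just for verification and is not part of the files used below):

# build the network and inspect its layers
from pyimagesearch.nn.conv.minivggnet import MiniVGGNet

model = MiniVGGNet.build(width=32, height=32, depth=3, classes=10)
model.summary()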
Using MiniVGGNet on the CIFAR-10 dataset
Create a file named minivggnet_cifar10.py and write the following code:
# set the matplotlib backend to "Agg" *before* pyplot is imported,
# so figures can be saved to disk without a display
import matplotlib
matplotlib.use("Agg")

# import the remaining packages
from sklearn.preprocessing import LabelBinarizer
from sklearn.metrics import classification_report
from pyimagesearch.nn.conv.minivggnet import MiniVGGNet
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.datasets import cifar10
import matplotlib.pyplot as plt
import numpy as np

# load the training and testing data, then scale pixels to [0, 1]
print("[INFO] loading CIFAR-10 data...")
((trainX, trainY), (testX, testY)) = cifar10.load_data()
trainX = trainX.astype("float") / 255.0
testX = testX.astype("float") / 255.0

# convert the integer labels to one-hot encoded vectors
lb = LabelBinarizer()
trainY = lb.fit_transform(trainY)
testY = lb.transform(testY)

# initialize the label names for the CIFAR-10 dataset
labelNames = ["airplane", "automobile", "bird", "cat", "deer", "dog",
    "frog", "horse", "ship", "truck"]

# initialize the optimizer and model
print("[INFO] compiling model...")
opt = SGD(learning_rate=0.01, decay=0.01 / 40, momentum=0.9, nesterov=True)
model = MiniVGGNet.build(width=32, height=32, depth=3, classes=10)
model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=["accuracy"])

# train the network for 40 epochs
print("[INFO] training network...")
H = model.fit(trainX, trainY, validation_data=(testX, testY), batch_size=64, epochs=40, verbose=1)

# evaluate the network on the test set
print("[INFO] evaluating network...")
predictions = model.predict(testX, batch_size=64)
print(classification_report(testY.argmax(axis=1), predictions.argmax(axis=1), target_names=labelNames))

# plot the training loss and accuracy
plt.style.use("ggplot")
plt.figure()
plt.plot(np.arange(0, 40), H.history["loss"], label="train_loss")
plt.plot(np.arange(0, 40), H.history["val_loss"], label="val_loss")
plt.plot(np.arange(0, 40), H.history["accuracy"], label="train_accuracy")
plt.plot(np.arange(0, 40), H.history["val_accuracy"], label="val_accuracy")
plt.title("Training Loss And Accuracy On CIFAR-10")
plt.xlabel("Epoch")
plt.ylabel("Loss/Accuracy")
plt.legend()
plt.savefig(r"E:\PycharmProjects\DLstudy\result\MiniVGGNet_On_Cifar10.png")
Fetching the CIFAR-10 dataset may fail with a download error. If it does, download https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz manually, place it in the .keras/datasets/ directory, and rename cifar-10-python.tar.gz to cifar-10-batches-py.tar.gz. On subsequent runs Keras will detect the local copy and skip the network download.
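To confirm the cached archive will be picked up before kicking off training, a quick check like this works (assuming the default cache location in the user's home directory):

import os

# Keras caches downloaded datasets under ~/.keras/datasets by default
path = os.path.expanduser("~/.keras/datasets/cifar-10-batches-py.tar.gz")
print("cached archive present:", os.path.exists(path))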
Output
E:\DLstudy\Scripts\python.exe E:/PycharmProjects/DLstudy/run/minivggnet_cifar10.py
2021-10-22 19:44:58.806381: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2021-10-22 19:44:58.806748: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
[INFO] loading CIFAR-10 data...
[INFO] compiling model...
2021-10-22 19:45:22.204239: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
2021-10-22 19:45:22.204718: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2021-10-22 19:45:22.407002: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: DESKTOP-VBBSMRF
2021-10-22 19:45:22.407654: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: DESKTOP-VBBSMRF
2021-10-22 19:45:22.454679: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
[INFO] training network...
2021-10-22 19:45:31.153369: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
Epoch 1/40
782/782 [==============================] - 324s 410ms/step - loss: 1.6101 - accuracy: 0.4619 - val_loss: 1.4320 - val_accuracy: 0.5093
Epoch 2/40
782/782 [==============================] - 297s 380ms/step - loss: 1.1239 - accuracy: 0.6119 - val_loss: 1.1982 - val_accuracy: 0.5898
Epoch 3/40
782/782 [==============================] - 296s 378ms/step - loss: 0.9459 - accuracy: 0.6684 - val_loss: 0.9293 - val_accuracy: 0.6719
Epoch 4/40
782/782 [==============================] - 273s 348ms/step - loss: 0.8489 - accuracy: 0.7030 - val_loss: 0.7934 - val_accuracy: 0.7203
Epoch 5/40
782/782 [==============================] - 252s 323ms/step - loss: 0.7798 - accuracy: 0.7266 - val_loss: 0.7070 - val_accuracy: 0.7471
Epoch 6/40
782/782 [==============================] - 257s 329ms/step - loss: 0.7207 - accuracy: 0.7451 - val_loss: 0.7138 - val_accuracy: 0.7534
Epoch 7/40
782/782 [==============================] - 328s 419ms/step - loss: 0.6782 - accuracy: 0.7621 - val_loss: 0.6627 - val_accuracy: 0.7709
Epoch 8/40
782/782 [==============================] - 286s 366ms/step - loss: 0.6377 - accuracy: 0.7759 - val_loss: 0.6518 - val_accuracy: 0.7737
Epoch 9/40
782/782 [==============================] - 268s 343ms/step - loss: 0.6082 - accuracy: 0.7863 - val_loss: 0.6610 - val_accuracy: 0.7720
Epoch 10/40
782/782 [==============================] - 271s 347ms/step - loss: 0.5835 - accuracy: 0.7935 - val_loss: 0.6093 - val_accuracy: 0.7878
Epoch 11/40
782/782 [==============================] - 270s 345ms/step - loss: 0.5516 - accuracy: 0.8035 - val_loss: 0.6036 - val_accuracy: 0.7903
Epoch 12/40
782/782 [==============================] - 251s 321ms/step - loss: 0.5255 - accuracy: 0.8129 - val_loss: 0.5873 - val_accuracy: 0.7979
Epoch 13/40
782/782 [==============================] - 251s 321ms/step - loss: 0.5093 - accuracy: 0.8178 - val_loss: 0.5878 - val_accuracy: 0.7981
Epoch 14/40
782/782 [==============================] - 284s 363ms/step - loss: 0.4881 - accuracy: 0.8274 - val_loss: 0.5716 - val_accuracy: 0.8056
Epoch 15/40
782/782 [==============================] - 289s 370ms/step - loss: 0.4730 - accuracy: 0.8321 - val_loss: 0.5920 - val_accuracy: 0.8014
Epoch 16/40
782/782 [==============================] - 331s 423ms/step - loss: 0.4581 - accuracy: 0.8374 - val_loss: 0.5892 - val_accuracy: 0.8005
Epoch 17/40
782/782 [==============================] - 272s 348ms/step - loss: 0.4394 - accuracy: 0.8434 - val_loss: 0.5592 - val_accuracy: 0.8095
Epoch 18/40
782/782 [==============================] - 269s 344ms/step - loss: 0.4253 - accuracy: 0.8488 - val_loss: 0.5580 - val_accuracy: 0.8139
Epoch 19/40
782/782 [==============================] - 296s 378ms/step - loss: 0.4098 - accuracy: 0.8548 - val_loss: 0.5629 - val_accuracy: 0.8128
Epoch 20/40
782/782 [==============================] - 290s 371ms/step - loss: 0.3983 - accuracy: 0.8574 - val_loss: 0.5820 - val_accuracy: 0.8075
Epoch 21/40
782/782 [==============================] - 270s 345ms/step - loss: 0.3898 - accuracy: 0.8616 - val_loss: 0.5691 - val_accuracy: 0.8119
Epoch 22/40
782/782 [==============================] - 307s 392ms/step - loss: 0.3791 - accuracy: 0.8642 - val_loss: 0.5596 - val_accuracy: 0.8137
Epoch 23/40
782/782 [==============================] - 308s 393ms/step - loss: 0.3712 - accuracy: 0.8687 - val_loss: 0.5546 - val_accuracy: 0.8186
Epoch 24/40
782/782 [==============================] - 285s 364ms/step - loss: 0.3537 - accuracy: 0.8734 - val_loss: 0.5523 - val_accuracy: 0.8210
Epoch 25/40
782/782 [==============================] - 265s 339ms/step - loss: 0.3509 - accuracy: 0.8742 - val_loss: 0.5577 - val_accuracy: 0.8182
Epoch 26/40
782/782 [==============================] - 268s 343ms/step - loss: 0.3405 - accuracy: 0.8776 - val_loss: 0.5586 - val_accuracy: 0.8193
Epoch 27/40
782/782 [==============================] - 255s 326ms/step - loss: 0.3295 - accuracy: 0.8825 - val_loss: 0.5367 - val_accuracy: 0.8214
Epoch 28/40
782/782 [==============================] - 257s 329ms/step - loss: 0.3228 - accuracy: 0.8850 - val_loss: 0.5467 - val_accuracy: 0.8218
Epoch 29/40
782/782 [============================