Keras: Fine-tuning a Pre-trained Network Model

Posted by 起床oO


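When working with deep learning, you will often use models that have already been trained, such as AlexNet, GoogLeNet, VGG, and ResNet. How can we fine-tune these pre-trained models to improve accuracy?

Reference article: https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html

We use an already-trained VGG16 model to help with this classification task. The objects to classify are cats and dogs, and VGG was trained on ImageNet, which already contains both of these classes.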

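Method

First, load the VGG-16 weights.

Next, add our previously trained classifier model on top of the initialized VGG network.

Finally, freeze the layers up to the last convolutional block and train with a very low learning rate. We train only the last convolutional block because there are few training samples and the VGG model has many layers; training all of them would not converge well and would overfit. In addition, fine-tuning starts from a model that is already trained, so the weight updates should stay small to avoid destroying the pre-trained features.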
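
To get a feel for how much of the network stays frozen, the sketch below counts the convolutional parameters per block. It assumes the standard VGG16 layout (3x3 kernels, RGB input, the same channel widths as the model built later in this post); the helper names are made up for illustration and are not part of Keras.

```python
# Output channels of the 13 conv layers in the standard VGG16 layout,
# grouped by block. All kernels are 3x3; the input has 3 channels (RGB).
VGG16_BLOCKS = [
    [64, 64],
    [128, 128],
    [256, 256, 256],
    [512, 512, 512],
    [512, 512, 512],
]


def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus biases of one 2D convolution layer."""
    return in_channels * out_channels * kernel * kernel + out_channels


def params_per_block(blocks, in_channels=3):
    """Return a list with the parameter count of each conv block."""
    totals = []
    for block in blocks:
        total = 0
        for out_channels in block:
            total += conv_params(in_channels, out_channels)
            in_channels = out_channels
        totals.append(total)
    return totals


block_params = params_per_block(VGG16_BLOCKS)
total_params = sum(block_params)
trainable_params = block_params[-1]      # only the last conv block is trained
frozen_params = total_params - trainable_params

print("per block:", block_params)
print("total:", total_params)
print("trainable (last block):", trainable_params)
print("frozen:", frozen_params)
```

Under these assumptions, a little over half of the roughly 14.7 million convolutional parameters stay frozen, which reduces the capacity available for overfitting a small dataset.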

First, construct the VGG16 model:

# Keras 1.x API with channels-first (Theano-style) input ordering.
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, ZeroPadding2D

# img_width and img_height must be defined beforehand
# (the referenced article uses 150 x 150 images).

# build the VGG16 network
model = Sequential()
model.add(ZeroPadding2D((1, 1), input_shape=(3, img_width, img_height)))

# block 1
model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_2'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))

# block 2
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_2'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))

# block 3
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))

# block 4
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))

# block 5 (the only convolutional block that will be fine-tuned)
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
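
Two properties of this stack matter later: how many layers it has (the freezing step indexes into model.layers) and what shape it outputs (the classifier is flattened from it). The following is a rough sketch in plain Python, not Keras code. The padding and pooling labels are hypothetical names, and the 150 x 150 input size is an assumption taken from the referenced article.

```python
# Number of conv layers in each of the five VGG16 blocks, matching the
# model definition above.
CONVS_PER_BLOCK = [2, 2, 3, 3, 3]


def build_layer_names(convs_per_block):
    """List layer names in the order they are added to the Sequential model."""
    names = []
    for block, n_convs in enumerate(convs_per_block, start=1):
        for i in range(1, n_convs + 1):
            names.append("zeropad{}_{}".format(block, i))  # hypothetical label
            names.append("conv{}_{}".format(block, i))
        names.append("pool{}".format(block))               # hypothetical label
    return names


def output_size(size, n_blocks=5):
    """Spatial size after the conv base.

    Padding by 1 followed by an unpadded 3x3 convolution keeps the size,
    and each 2x2 max pooling with stride 2 halves it (rounding down).
    """
    for _ in range(n_blocks):
        size //= 2
    return size


layer_names = build_layer_names(CONVS_PER_BLOCK)
first_conv5 = layer_names.index("conv5_1")

print("number of layers:", len(layer_names))
print("index of conv5_1:", first_conv5)
print("spatial output size for a 150 px input:", output_size(150))
```

The stack has 31 layers and conv5_1 sits at index 25, so slicing model.layers[:25] covers everything before the first convolution of the last block. With a 150 x 150 input the conv base outputs a 512 x 4 x 4 feature map.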

Load the trained VGG16 weights (we only need the weights before the fully connected layers):

import os
import h5py

# load the weights of the VGG16 networks
# (trained on ImageNet, won the ILSVRC competition in 2014)
# note: when there is a complete match between your model definition
# and your weight savefile, you can simply call model.load_weights(filename)
assert os.path.exists(weights_path), 'Model weights not found (see "weights_path" variable in script).'
f = h5py.File(weights_path, 'r')  # open read-only
for k in range(f.attrs['nb_layers']):
    if k >= len(model.layers):
        # we don't look at the last (fully-connected) layers in the savefile
        break
    g = f['layer_{}'.format(k)]
    weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])]
    model.layers[k].set_weights(weights)
f.close()
print('Model loaded.')
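
The key idea in the loop above is that the savefile holds more layers than our truncated model, and the loop stops once the model runs out of layers. Here is a minimal sketch of that logic with plain dictionaries standing in for the HDF5 file; the function name and the toy data are invented for illustration.

```python
def load_partial_weights(savefile, layers):
    """Copy saved weights into `layers`, skipping trailing layers.

    `savefile` is a plain dict that imitates the layout the loop above reads:
    savefile["nb_layers"] is the layer count, and savefile["layer_k"] is a
    list of parameter arrays for layer k (empty for layers without weights).
    `layers` is a list of dicts, one per layer of the truncated model.
    """
    loaded = 0
    for k in range(savefile["nb_layers"]):
        if k >= len(layers):
            # The remaining entries belong to the fully connected layers.
            break
        layers[k]["weights"] = list(savefile["layer_{}".format(k)])
        loaded += 1
    return loaded


# Toy savefile: two conv-like layers, one pooling layer, one dense layer.
savefile = {
    "nb_layers": 4,
    "layer_0": [[0.1, 0.2], [0.0]],
    "layer_1": [[0.3, 0.4], [0.1]],
    "layer_2": [],                  # pooling layer, no parameters
    "layer_3": [[0.5], [0.2]],      # dense layer, not part of the conv base
}

# Truncated model: only the first three layers exist.
conv_base = [{"weights": None} for _ in range(3)]

n_loaded = load_partial_weights(savefile, conv_base)
print("layers loaded:", n_loaded)
print("first layer weights:", conv_base[0]["weights"])
```

This only works because the layers in the model are defined in the same order as in the savefile, which is why the VGG16 definition has to match the saved architecture layer for layer.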

Then add a simple classifier, together with its previously trained weights, on top of the VGG16 structure:

from keras.models import Sequential
from keras.layers import Flatten, Dense, Dropout

# build a classifier model to put on top of the convolutional model
top_model = Sequential()
top_model.add(Flatten(input_shape=model.output_shape[1:]))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(1, activation='sigmoid'))

# note that it is necessary to start with a fully-trained
# classifier, including the top classifier,
# in order to successfully do fine-tuning
top_model.load_weights(top_model_weights_path)

# add the model on top of the convolutional base
model.add(top_model)
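
One way to see why the classifier should already be trained: with a sigmoid output and binary cross-entropy, the gradient of the loss with respect to the pre-activation is simply the prediction minus the label. An untrained classifier predicts near 0.5, so it sends large gradients back into the convolutional layers. The sketch below illustrates this with hypothetical logit values; it is not taken from the original post.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce_grad_wrt_logit(z, y):
    """d(loss)/dz for loss = binary cross-entropy(sigmoid(z), y)."""
    return sigmoid(z) - y


# Hypothetical logits for a positive example (label 1).
untrained_logit = 0.0   # a freshly initialized classifier is near chance
trained_logit = 4.0     # a trained classifier is confident and correct

g_untrained = bce_grad_wrt_logit(untrained_logit, 1)
g_trained = bce_grad_wrt_logit(trained_logit, 1)

print("gradient magnitude, untrained:", abs(g_untrained))
print("gradient magnitude, trained:", abs(g_trained))
```

In this toy case the error signal from the untrained classifier is more than twenty times larger, which is the kind of update that could wreck the pre-trained convolutional features.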

Set the weights before the last convolutional block to non-trainable:

from keras import optimizers

# set the first 25 layers (up to the last conv block)
# to non-trainable (weights will not be updated)
for layer in model.layers[:25]:
    layer.trainable = False

# compile the model with a SGD/momentum optimizer
# and a very slow learning rate.
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])
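
To illustrate what freezing and a low learning rate mean for the weights, here is a simplified sketch of SGD with momentum on scalar parameters. It mimics the usual update rule (velocity = momentum * velocity - lr * gradient), but the data structures and gradient values are invented for illustration, and Keras handles frozen layers internally rather than with an explicit flag check like this one.

```python
def sgd_momentum_step(params, grads, velocities, lr=1e-4, momentum=0.9):
    """One SGD-with-momentum update over a list of scalar parameters.

    Each entry of `params` is a dict with a "value" and a "trainable" flag.
    Frozen parameters are skipped, so their values never change.
    """
    for i, (p, g) in enumerate(zip(params, grads)):
        if not p["trainable"]:
            continue
        velocities[i] = momentum * velocities[i] - lr * g
        p["value"] += velocities[i]


# Toy setup: one frozen weight and one trainable weight, same gradient.
params = [
    {"value": 1.0, "trainable": False},   # e.g. a weight in blocks 1 to 4
    {"value": 1.0, "trainable": True},    # e.g. a weight in block 5
]
grads = [2.0, 2.0]
velocities = [0.0, 0.0]

for _ in range(3):
    sgd_momentum_step(params, grads, velocities)

print("frozen weight:", params[0]["value"])
print("trainable weight:", params[1]["value"])
```

After three steps the frozen weight is untouched and the trainable weight has moved by only about 0.001, which is the small, cautious update that fine-tuning aims for.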

With this very simple fine-tuning setup, the model reaches an accuracy of roughly 0.94 after 50 epochs.
