Adding Conv Layer in front of pretrained model gives ValueError

Posted: 2019-01-21 13:06:13

Question:

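I want to combine a pretrained VGG16 model with a custom input block consisting of an input layer and a convolutional layer. The goal is to use the RGB VGG16 model pretrained on ImageNet with grayscale images: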

from keras.applications.vgg16 import VGG16
from keras.layers.convolutional import Conv2D
from keras.layers import Input
from keras.models import Model

img_height = 299
img_width = 299

def input_block(img_height = 299, img_width = 299):
    input_shape = (img_height, img_width, 1)
    img_input = Input(shape=input_shape, name = 'grayscale_input_layer')
    x = Conv2D(3, (3,3),  padding= 'same', name = 'grayscale_RGB_layer')(img_input)
    return x

pretrained_model = VGG16(weights = 'imagenet', include_top=False, input_tensor = input_block(img_height, img_width))

When I set the weight initialization of VGG16() to None, the model builds correctly, with the desired structure:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
grayscale_input_layer (Input (None, 299, 299, 1)       0         
_________________________________________________________________
grayscale_RGB_layer (Conv2D) (None, 299, 299, 3)       30        
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 299, 299, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 299, 299, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 149, 149, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 149, 149, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 149, 149, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 74, 74, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 74, 74, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 74, 74, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 74, 74, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 37, 37, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 37, 37, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 37, 37, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 37, 37, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 18, 18, 512)       0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 18, 18, 512)       2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 18, 18, 512)       2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 18, 18, 512)       2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 9, 9, 512)         0         
=================================================================
Total params: 14,714,718
Trainable params: 14,714,718
Non-trainable params: 0
_________________________________________________________________
None

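However, when I set the weight initialization to 'imagenet', I get the following error:

ValueError: You are trying to load a weight file containing 13 layers into a model with 14 layers.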

This error makes sense, since I added two layers in front of the VGG16 model instead of a single one.
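The 13 and 14 in the message can be read as counts of layers that carry weights: with include_top=False, VGG16 has 13 convolutional layers, and the extra grayscale_RGB_layer makes 14. The sketch below is a simplified, hypothetical stand-in for a positional weight-loading check (it is not the Keras implementation); `check_layer_counts` is an illustrative name.

```python
# Convolutional layers of VGG16 without the top, as listed in the summary above.
VGG16_CONV_LAYERS = [
    'block1_conv1', 'block1_conv2',
    'block2_conv1', 'block2_conv2',
    'block3_conv1', 'block3_conv2', 'block3_conv3',
    'block4_conv1', 'block4_conv2', 'block4_conv3',
    'block5_conv1', 'block5_conv2', 'block5_conv3',
]


def check_layer_counts(saved_layers, model_layers):
    """Hypothetical check: positional loading needs equal layer counts."""
    if len(saved_layers) != len(model_layers):
        raise ValueError(
            'You are trying to load a weight file containing '
            f'{len(saved_layers)} layers into a model with '
            f'{len(model_layers)} layers.')


# Layers with weights in the saved file versus in the extended model.
saved = VGG16_CONV_LAYERS
extended_model = ['grayscale_RGB_layer'] + VGG16_CONV_LAYERS

try:
    check_layer_counts(saved, extended_model)
    message = None
except ValueError as err:
    message = str(err)

print(message)
```

Input and pooling layers have no parameters (Param # is 0 in the summary), so they do not appear in either count here.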

As a workaround, I tried the following:

def input_block_model(img_height = 299, img_width = 299):
    input_shape = (img_height, img_width, 1)
    img_input = Input(shape=input_shape, name = 'grayscale_input_layer')
    x = Conv2D(3, (3,3),  padding= 'same', name = 'grayscale_RGB_layer')(img_input)
    model = Model(img_input, x, name='input_block_model')
    return model

input_model = input_block_model(299,299)
pretrained_model = VGG16(weights = "imagenet", include_top=False)
combined_model = Model(input_model.input,
                       pretrained_model(input_model.output))
print(combined_model.summary())

The model structure is then:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
grayscale_input_layer (Input (None, 299, 299, 1)       0         
_________________________________________________________________
grayscale_RGB_layer (Conv2D) (None, 299, 299, 3)       30        
_________________________________________________________________
vgg16 (Model)                multiple                  14714688  
=================================================================
Total params: 14,714,718
Trainable params: 14,714,718
Non-trainable params: 0
_________________________________________________________________
None
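The parameter counts in this summary can be checked by hand: a Conv2D layer with bias has kernel_height * kernel_width * input_channels * filters weights plus one bias per filter. A small sketch of that arithmetic (`conv2d_params` is an illustrative helper, not a Keras function):

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters, use_bias=True):
    """Number of trainable parameters in a standard 2D convolution."""
    weights = kernel_h * kernel_w * in_channels * filters
    biases = filters if use_bias else 0
    return weights + biases


# grayscale_RGB_layer: 3x3 kernel, 1 input channel, 3 filters
adapter_params = conv2d_params(3, 3, 1, 3)

# block1_conv1 of VGG16: 3x3 kernel, 3 input channels, 64 filters
block1_conv1_params = conv2d_params(3, 3, 3, 64)

# Nested vgg16 model (14714688, from the summary) plus the adapter layer
total_params = 14714688 + adapter_params

print(adapter_params, block1_conv1_params, total_params)
```

The nested and the flat model therefore hold the same parameters; only the way the layers are grouped differs.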

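The drawback of this structure is that I cannot set properties on the layers inside the VGG16 model. For example, I want to freeze some of its layers, but I cannot reach them through combined_model.layers. Does anyone have a working solution that gives me the same model structure as with the None initialization, but with the pretrained ImageNet weights?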

Comments:

Of course you can access the layers of the VGG16 model, using combined_model.layers[2].layers.

Answer 1:

You can freeze or train the layers using combined_model.layers[2].layers, as mentioned in the comment above. The model can be simplified as follows:

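```
from keras.applications.vgg16 import VGG16
from keras.layers import Conv2D, Input
from keras.models import Model

img_height, img_width = 299, 299

img_input = Input(shape=(img_height, img_width, 1), name='grayscale_input_layer')
x = Conv2D(3, (3, 3), padding='same', name='grayscale_RGB_layer')(img_input)
# Load the pretrained ImageNet weights, as the question requires
x = VGG16(weights='imagenet', include_top=False)(x)
model = Model(img_input, x)
model.summary()

# model.layers[2] is the nested VGG16 model; freeze all of its layers
for layer in model.layers[2].layers:
    layer.trainable = False
```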
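To freeze only part of VGG16, the same nested-model access can be filtered by layer name. A minimal sketch, assuming the block names shown in the question's summary (block1_ through block5_); `build_grayscale_vgg16` and `freeze_blocks` are hypothetical helper names, and weights=None is used here only to avoid the weight download (pass 'imagenet' for the pretrained model):

```python
from keras.applications.vgg16 import VGG16
from keras.layers import Conv2D, Input
from keras.models import Model


def build_grayscale_vgg16(img_height=299, img_width=299, weights='imagenet'):
    """Grayscale input -> 3-channel conv adapter -> nested VGG16 (no top)."""
    img_input = Input(shape=(img_height, img_width, 1),
                      name='grayscale_input_layer')
    x = Conv2D(3, (3, 3), padding='same',
               name='grayscale_RGB_layer')(img_input)
    vgg = VGG16(weights=weights, include_top=False)
    model = Model(img_input, vgg(x))
    # Return the nested model too, so callers need not rely on layers[2].
    return model, vgg


def freeze_blocks(vgg, prefixes=('block1_', 'block2_', 'block3_')):
    """Mark every VGG16 layer whose name starts with a prefix as frozen."""
    for layer in vgg.layers:
        if layer.name.startswith(prefixes):
            layer.trainable = False


# weights=None keeps this sketch offline; it is an assumption for the demo.
model, vgg = build_grayscale_vgg16(weights=None)
freeze_blocks(vgg)

for layer in vgg.layers:
    print(layer.name, layer.trainable)
```

Changes to trainable generally take effect the next time the model is compiled, so call model.compile(...) after freezing.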

Comments:

Could you explain why (img_input) is written right after Conv2D(...)? What Python syntax is that?
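On the syntax question: Conv2D(...) first constructs a layer object, and the trailing (img_input) then calls that object, which Python allows for any class that defines `__call__`. A minimal sketch with a hypothetical `ToyLayer` class (plain Python, not Keras code):

```python
class ToyLayer:
    """Hypothetical stand-in for a Keras layer; not the real Conv2D."""

    def __init__(self, filters, name):
        # Step 1: the constructor only stores the configuration.
        self.filters = filters
        self.name = name

    def __call__(self, inputs):
        # Step 2: calling the instance applies the layer to its input.
        return f'{self.name}({inputs})'


# Two explicit steps: build the layer, then call it on the input.
layer = ToyLayer(3, name='grayscale_RGB_layer')
x = layer('img_input')

# The same thing chained in one expression, as in the answer's code.
same = ToyLayer(3, name='grayscale_RGB_layer')('img_input')

print(x)
print(same)
```

In Keras the call returns a tensor connected to the input rather than a string, but the two-step pattern is the same.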
