Convolutional Neural Net Architecture - correct?
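【Title】: Convolutional Neural Net Architecture - correct?
【Posted】: 2020-08-14 16:15:22
【Question】:
I am trying to train a convolutional neural network. My dataset consists of 646 images of license plates, each containing 8 characters (0-9 and A-Z without the letter "O", plus the blank space, for 36 possible characters in total). These images are my training data X_train. Their shape is (646, 40, 200, 3), where 3 is the number of color channels; I resized them all to the same shape.
I also have a dataset with the labels for these images, which I one-hot encoded into a numpy array of shape (646, 8, 36). This is my y_train data.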
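The label layout described above can be sketched with numpy. The ordering of the alphabet, the padding of short plates with spaces, and the helper names `encode_plates` and `decode_plates` are assumptions made for illustration; the question does not say how the encoding was actually done.

```python
import numpy as np

# Assumed alphabet: digits, A-Z without "O", plus the space character (36 symbols).
ALPHABET = "0123456789ABCDEFGHIJKLMNPQRSTUVWXYZ "
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}
PLATE_LENGTH = 8


def encode_plates(plates):
    """One-hot encode plate strings into an array of shape (N, 8, 36)."""
    encoded = np.zeros((len(plates), PLATE_LENGTH, len(ALPHABET)), dtype=np.float32)
    for n, plate in enumerate(plates):
        # Assumption: plates shorter than 8 characters are right-padded with spaces.
        padded = plate.ljust(PLATE_LENGTH)
        if len(padded) != PLATE_LENGTH:
            raise ValueError(f"plate {plate!r} is longer than {PLATE_LENGTH} characters")
        for pos, ch in enumerate(padded):
            encoded[n, pos, CHAR_TO_INDEX[ch]] = 1.0
    return encoded


def decode_plates(encoded):
    """Map one-hot labels (or per-position probabilities) back to strings."""
    indices = np.argmax(encoded, axis=-1)
    return ["".join(ALPHABET[i] for i in row) for row in indices]


y_example = encode_plates(["AB 123CD", "XY9"])
print(y_example.shape)           # (2, 8, 36)
print(decode_plates(y_example))
```

The same `decode_plates` helper could be applied to a model's (N, 8, 36) predictions, since it only takes the argmax over the last axis.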
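Now I am trying to apply a neural network that looks like this. The architecture is taken from this paper: https://ieeexplore.ieee.org/abstract/document/8078501
I left out the batch normalization part, because it is not the most interesting part for me. But I am very unsure about the top of the network, that is, the part after the last pooling layer, starting with model.add(Flatten()):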
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), input_shape = (40, 200, 3), activation = "relu"))
model.add(Conv2D(32, kernel_size=(3, 3), activation = "relu"))
model.add(Conv2D(32, kernel_size=(3, 3), activation = "relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=(3, 3), activation = "relu"))
model.add(Conv2D(64, kernel_size=(3, 3), activation = "relu"))
model.add(Conv2D(64, kernel_size=(3, 3), activation = "relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, kernel_size=(3, 3), activation = "relu"))
model.add(Conv2D(128, kernel_size=(3, 3), activation = "relu"))
model.add(Conv2D(128, kernel_size=(3, 3), activation = "relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(16000, activation = "relu"))
model.add(Dense(128, activation = "relu"))
model.add(Dense(36, activation = "relu"))
model.add(Dense(8*36, activation="Softmax"))
model.add(keras.layers.Reshape((8, 36)))
Thank you very much!
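As a quick check on the layer stack above, the spatial sizes can be traced by hand. The sketch below assumes the Keras defaults (Conv2D with padding="valid" and stride 1, MaxPooling2D with stride equal to the pool size); the helper names `conv_out`, `pool_out` and `trace` are hypothetical. Under those assumptions the height appears to run out inside the third convolution block, which would explain why padding matters here.

```python
def conv_out(size, kernel=3, padding="valid"):
    """Output length of a stride-1 convolution along one spatial axis."""
    return size if padding == "same" else size - kernel + 1


def pool_out(size, pool=2):
    """Output length of max pooling whose stride equals the pool size."""
    return size // pool


def trace(height, width, layers):
    """Return (height, width) after each layer.

    `layers` holds "valid" or "same" for a 3x3 convolution and "pool" for 2x2 pooling.
    """
    shapes = []
    for layer in layers:
        if layer == "pool":
            height, width = pool_out(height), pool_out(width)
        else:
            height = conv_out(height, padding=layer)
            width = conv_out(width, padding=layer)
        shapes.append((height, width))
    return shapes


# Three blocks of three convolutions plus one pooling layer, as in the question.
all_valid = trace(40, 200, (["valid"] * 3 + ["pool"]) * 3)
all_same = trace(40, 200, (["same"] * 3 + ["pool"]) * 3)

# Index of the first layer whose output would have a non-positive size.
first_bad = next(i for i, (h, w) in enumerate(all_valid) if h < 1 or w < 1)

print(all_valid[:first_bad + 1])
print(first_bad)      # 10, i.e. the third convolution of the third block
print(all_same[-1])   # (5, 25)
```

With padding="same" on the convolutions the same trace ends at 5 x 25, so the stack fits a 40 x 200 input.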
【Comments on the question】:
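【Answer 1】: Assuming the figure below matches your model architecture, this code can be used to create the model. Make sure the input images have some padding.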
import tensorflow as tf
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Conv2D, Flatten, MaxPooling2D, Dense, Input, Reshape, Concatenate

def create_model(input_shape = (40, 200, 3)):
    # The Input layer fixes the input shape, so Conv2D needs no input_shape argument
    input_img = Input(shape=input_shape)
    # Block 1: three 3x3 convolutions with 32 filters, then 2x2 max pooling
    model = Conv2D(32, kernel_size=(3, 3), activation = "relu")(input_img)
    model = Conv2D(32, kernel_size=(3, 3), padding="same", activation = "relu")(model)
    model = Conv2D(32, kernel_size=(3, 3), padding="same", activation = "relu")(model)
    model = MaxPooling2D(pool_size=(2, 2))(model)
    # Block 2: 64 filters
    model = Conv2D(64, kernel_size=(3, 3), padding="same", activation = "relu")(model)
    model = Conv2D(64, kernel_size=(3, 3), padding="same", activation = "relu")(model)
    model = Conv2D(64, kernel_size=(3, 3), padding="same", activation = "relu")(model)
    model = MaxPooling2D(pool_size=(2, 2))(model)
    # Block 3: 128 filters
    model = Conv2D(128, kernel_size=(3, 3), padding="same", activation = "relu")(model)
    model = Conv2D(128, kernel_size=(3, 3), padding="same", activation = "relu")(model)
    model = Conv2D(128, kernel_size=(3, 3), padding="same", activation = "relu")(model)
    model = MaxPooling2D(pool_size=(2, 2))(model)
    # Shared feature vector that feeds every character branch
    backbone = Flatten()(model)
    # One independent classifier branch per character position
    branches = []
    for i in range(8):
        branches.append(backbone)
        branches[i] = Dense(16000, activation = "relu", name="branch_"+str(i)+"_Dense_16000")(branches[i])
        branches[i] = Dense(128, activation = "relu", name="branch_"+str(i)+"_Dense_128")(branches[i])
        branches[i] = Dense(36, activation = "softmax", name="branch_"+str(i)+"_output")(branches[i])
    # Join the eight 36-way outputs and reshape to (8, 36) to match y_train
    output = Concatenate(axis=1)(branches)
    output = Reshape((8, 36))(output)
    model = Model(input_img, output)
    return model
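One difference between the code above and the question's model is where the softmax is applied: here each of the 8 character positions gets its own 36-way softmax, whereas the question applies a single softmax across all 8*36 values before reshaping. The numpy sketch below only illustrates that normalization difference; the random `logits` are a stand-in for real pre-activations, and the `softmax` helper is written here for illustration rather than taken from Keras.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


rng = np.random.default_rng(0)
logits = rng.normal(size=(1, 8 * 36))  # stand-in values, one sample

# Question's layout: one softmax over all 288 values, then reshape to (8, 36).
joint = softmax(logits, axis=-1).reshape(1, 8, 36)

# Per-position layout: reshape first, then a separate softmax for each character.
per_position = softmax(logits.reshape(1, 8, 36), axis=-1)

print(joint.sum(axis=-1))         # eight values that together sum to 1
print(per_position.sum(axis=-1))  # eight values, each approximately 1
```

Only in the per-position layout do the 36 values for each character form a probability distribution, which is what one-hot targets of shape (8, 36) describe.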
【Comments】:
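Thank you very much! How computationally expensive is it to train a model like this?
I have a similar question here, also with a bounty. I think the architecture would be very similar? ***.com/questions/62695857/…
The model looks quite large, especially the flatten layer feeding 16,000*8 fully connected nodes. It will need a fair amount of GPU memory to train effectively.
OK, thanks! I don't think my project needs numbers that large. One more question: for the model you described above, we need to use the functional API and the Sequential model, right?
Yes, this does use the sequential model defined in keras. It is just a different way of writing the code.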
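To put a rough number on the size concern raised in the comments, the Dense parameters of the answer's model can be counted by hand. This is a back-of-the-envelope sketch: it assumes the first Conv2D uses the default "valid" padding and all others use "same" as in the answer's code, it ignores the convolution weights, and `dense_params` is a hypothetical helper.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


# Only the first 3x3 convolution shrinks the 40 x 200 input (by 2 per axis);
# the "same"-padded convolutions keep the size.
height, width = 40 - 2, 200 - 2
for _ in range(3):  # three 2x2 max-pooling layers
    height, width = height // 2, width // 2
flatten_size = height * width * 128  # 128 filters in the last block

# Each branch is Flatten -> Dense(16000) -> Dense(128) -> Dense(36).
layer_sizes = [flatten_size, 16000, 128, 36]
per_branch = sum(dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:]))
total = 8 * per_branch

print(flatten_size)        # 12288
print(per_branch)          # 198676772
print(total)               # 1589414176
print(total * 4 / 1e9)     # size of the float32 weights in GB
```

Under these assumptions the eight branches hold about 1.6 billion parameters, or roughly 6.4 GB as float32 weights before gradients and optimizer state are counted. Nearly all of that comes from the Dense(16000) layers, so shrinking that layer would be the first thing to try.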