How to implement multi-class semantic segmentation using tensorflow
【Posted】: 2020-04-16 07:25:50
【Question】: I am trying to perform multi-class semantic segmentation using tensorflow with either tflearn or Keras (I tried both APIs). My question is similar to this one (How to load Image Masks (Labels) for Image Segmentation in Keras).
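I have to segment different parts of an image into 3 classes: sea (class 0), boat (class 1), and sky (class 2).
I have 100 grayscale images (size 400x400). For each image I have the corresponding label with the 3 classes. In the end, I have images of shape (100, 400, 400) and labels of shape (100, 400, 400, 3). (As explained here: How to implement multi-class semantic segmentation?)
To be able to use semantic segmentation, I used one-hot encoding (as described here: https://www.jeremyjordan.me/semantic-segmentation/), and I ended up with this: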
train_images.shape: (100,400,400,1)
train_labels.shape: (100,400,400,3)
where the labels are encoded as follows: sea [1,0,0]; boat [0,1,0]; sky [0,0,1]
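The encoding described above can be sketched with NumPy. This is a minimal illustration, assuming the masks start out as integer class ids (0 = sea, 1 = boat, 2 = sky); the helper name `one_hot_masks` and the toy arrays are hypothetical and stand in for the real 100 images of 400x400 pixels.

```python
import numpy as np

NUM_CLASSES = 3  # sea = 0, boat = 1, sky = 2 (class ids from the question)


def one_hot_masks(masks, num_classes=NUM_CLASSES):
    """Turn integer masks (N, H, W) into one-hot labels (N, H, W, num_classes)."""
    masks = np.asarray(masks)
    # Row k of the identity matrix is the one-hot vector of class k;
    # indexing with the mask looks that row up for every pixel.
    return np.eye(num_classes, dtype=np.float32)[masks]


# Toy stand-in for the real data: 2 random masks of 4x4 pixels.
rng = np.random.default_rng(0)
toy_masks = rng.integers(0, NUM_CLASSES, size=(2, 4, 4))
toy_labels = one_hot_masks(toy_masks)
print(toy_labels.shape)  # (2, 4, 4, 3)

# Grayscale images need an explicit channel axis: (N, H, W) -> (N, H, W, 1).
toy_images = rng.random((2, 4, 4)).astype(np.float32)[..., np.newaxis]
print(toy_images.shape)  # (2, 4, 4, 1)
```

Taking `argmax` over the last axis of the one-hot labels recovers the original integer mask, which is a quick way to check the encoding.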
However, every time I try to train, I get this error:
ValueError: Cannot feed value of shape (22, 240, 240, 3) for Tensor 'TargetsData/Y:0', which has shape '(?, 240, 240, 2)'
I load the model with this:
model = TheNet(input_shape=(None, 400, 400, 1))
EDIT: here is the model I use
Using TFlearn:
def TheNet(input_size = (80, 400, 400, 2), feature_map=8, kernel_size=5, keep_rate=0.8, lr=0.001, log_dir ="logs",savedir="Results/Session_Dump"):
    # level 0 input
    layer_0a_input = tflearn.layers.core.input_data(input_size) #shape=[None,n1,n2,n3,1])
    # level 1 down
    layer_1a_conv = tflearn_conv_2d(net=layer_0a_input, nb_filter=feature_map, kernel=5, stride=1, activation=False)
    layer_1a_stack = tflearn_merge_2d([layer_0a_input]*feature_map, "concat")
    layer_1a_stack = tflearn.activations.prelu(layer_1a_stack)
    layer_1a_add = tflearn_merge_2d([layer_1a_conv,layer_1a_stack], "elemwise_sum")
    layer_1a_down = tflearn_conv_2d(net=layer_1a_add, nb_filter=feature_map*2, kernel=2, stride=2, dropout=keep_rate)
    # level 2 down
    layer_2a_conv = tflearn_conv_2d(net=layer_1a_down, nb_filter=feature_map*2, kernel=kernel_size, stride=1, dropout=keep_rate)
    layer_2a_conv = tflearn_conv_2d(net=layer_2a_conv, nb_filter=feature_map*2, kernel=kernel_size, stride=1, dropout=keep_rate)
    layer_2a_add = tflearn_merge_2d([layer_1a_down,layer_2a_conv], "elemwise_sum")
    layer_2a_down = tflearn_conv_2d(net=layer_2a_add, nb_filter=feature_map*4, kernel=2, stride=2, dropout=keep_rate)
    # level 3 down
    layer_3a_conv = tflearn_conv_2d(net=layer_2a_down, nb_filter=feature_map*4, kernel=kernel_size, stride=1, dropout=keep_rate)
    layer_3a_conv = tflearn_conv_2d(net=layer_3a_conv, nb_filter=feature_map*4, kernel=kernel_size, stride=1, dropout=keep_rate)
    layer_3a_conv = tflearn_conv_2d(net=layer_3a_conv, nb_filter=feature_map*4, kernel=kernel_size, stride=1, dropout=keep_rate)
    layer_3a_add = tflearn_merge_2d([layer_2a_down,layer_3a_conv], "elemwise_sum")
    layer_3a_down = tflearn_conv_2d(net=layer_3a_add, nb_filter=feature_map*8, kernel=2, stride=2, dropout=keep_rate)
    # level 4 down
    layer_4a_conv = tflearn_conv_2d(net=layer_3a_down, nb_filter=feature_map*8, kernel=kernel_size, stride=1, dropout=keep_rate)
    layer_4a_conv = tflearn_conv_2d(net=layer_4a_conv, nb_filter=feature_map*8, kernel=kernel_size, stride=1, dropout=keep_rate)
    layer_4a_conv = tflearn_conv_2d(net=layer_4a_conv, nb_filter=feature_map*8, kernel=kernel_size, stride=1, dropout=keep_rate)
    layer_4a_add = tflearn_merge_2d([layer_3a_down,layer_4a_conv], "elemwise_sum")
    layer_4a_down = tflearn_conv_2d(net=layer_4a_add, nb_filter=feature_map*16,kernel=2,stride=2,dropout=keep_rate)
    # level 5
    layer_5a_conv = tflearn_conv_2d(net=layer_4a_down, nb_filter=feature_map*16, kernel=kernel_size, stride=1, dropout=keep_rate)
    layer_5a_conv = tflearn_conv_2d(net=layer_5a_conv, nb_filter=feature_map*16, kernel=kernel_size, stride=1, dropout=keep_rate)
    layer_5a_conv = tflearn_conv_2d(net=layer_5a_conv, nb_filter=feature_map*16, kernel=kernel_size, stride=1, dropout=keep_rate)
    layer_5a_add = tflearn_merge_2d([layer_4a_down,layer_5a_conv], "elemwise_sum")
    layer_5a_up = tflearn_deconv_2d(net=layer_5a_add, nb_filter=feature_map*8, kernel=2, stride=2, dropout=keep_rate)
    # level 4 up
    layer_4b_concat = tflearn_merge_2d([layer_4a_add,layer_5a_up], "concat")
    layer_4b_conv = tflearn_conv_2d(net=layer_4b_concat, nb_filter=feature_map*16, kernel=kernel_size, stride=1, dropout=keep_rate)
    layer_4b_conv = tflearn_conv_2d(net=layer_4b_conv, nb_filter=feature_map*16, kernel=kernel_size, stride=1, dropout=keep_rate)
    layer_4b_conv = tflearn_conv_2d(net=layer_4b_conv, nb_filter=feature_map*16, kernel=kernel_size, stride=1, dropout=keep_rate)
    layer_4b_add = tflearn_merge_2d([layer_4b_conv,layer_4b_concat], "elemwise_sum")
    layer_4b_up = tflearn_deconv_2d(net=layer_4b_add, nb_filter=feature_map*4, kernel=2, stride=2, dropout=keep_rate)
    # level 3 up
    layer_3b_concat = tflearn_merge_2d([layer_3a_add,layer_4b_up], "concat")
    layer_3b_conv = tflearn_conv_2d(net=layer_3b_concat, nb_filter=feature_map*8, kernel=kernel_size, stride=1, dropout=keep_rate)
    layer_3b_conv = tflearn_conv_2d(net=layer_3b_conv, nb_filter=feature_map*8, kernel=kernel_size, stride=1, dropout=keep_rate)
    layer_3b_conv = tflearn_conv_2d(net=layer_3b_conv, nb_filter=feature_map*8, kernel=kernel_size, stride=1, dropout=keep_rate)
    layer_3b_add = tflearn_merge_2d([layer_3b_conv,layer_3b_concat], "elemwise_sum")
    layer_3b_up = tflearn_deconv_2d(net=layer_3b_add, nb_filter=feature_map*2, kernel=2, stride=2, dropout=keep_rate)
    # level 2 up
    layer_2b_concat = tflearn_merge_2d([layer_2a_add,layer_3b_up], "concat")
    layer_2b_conv = tflearn_conv_2d(net=layer_2b_concat, nb_filter=feature_map*4, kernel=kernel_size, stride=1, dropout=keep_rate)
    layer_2b_conv = tflearn_conv_2d(net=layer_2b_conv, nb_filter=feature_map*4, kernel=kernel_size, stride=1, dropout=keep_rate)
    layer_2b_add = tflearn_merge_2d([layer_2b_conv,layer_2b_concat], "elemwise_sum")
    layer_2b_up = tflearn_deconv_2d(net=layer_2b_add, nb_filter=feature_map, kernel=2, stride=2, dropout=keep_rate)
    # level 1 up
    layer_1b_concat = tflearn_merge_2d([layer_1a_add,layer_2b_up], "concat")
    layer_1b_conv = tflearn_conv_2d(net=layer_1b_concat, nb_filter=feature_map*2, kernel=kernel_size, stride=1, dropout=keep_rate)
    layer_1b_add = tflearn_merge_2d([layer_1b_conv,layer_1b_concat], "elemwise_sum")
    # level 0 classifier
    layer_0b_conv = tflearn_conv_2d(net=layer_1b_add, nb_filter=2, kernel=5, stride=1, dropout=keep_rate)
    layer_0b_clf = tflearn.layers.conv.conv_2d(layer_0b_conv, 2, 1, 1, activation="softmax")
    # Optimizer
    regress = tflearn.layers.estimator.regression(layer_0b_clf, optimizer='adam', loss=dice_loss_2d, learning_rate=lr) # categorical_crossentropy/dice_loss_3d
    model = tflearn.models.dnn.DNN(regress, tensorboard_dir=log_dir)
    # Saving the model
    if not os.path.lexists(savedir+"weights"):
        os.makedirs(savedir+"weights")
    model.save(savedir+"weights/weights_session")
    return model
Using Keras:
def TheNet(input_shape, nb_kernel, kernel_size, dropout, lr, log_dir ="logs",savedir="Results/Session_Dump"):
    layer_0 = keras.Input(shape = input_shape)
    #LVL 1 Down
    layer_1_conv = Cust_2D_Conv(layer_0, nb_kernel, kernel_size, stride=1)
    layer_1_stak = keras.layers.concatenate([layer_0,layer_0,layer_0,layer_0,layer_0,layer_0,layer_0,layer_0])
    layer_1_stak = keras.layers.PReLU()(layer_1_stak)
    layer_1_addd = keras.layers.Multiply()([layer_1_conv,layer_1_stak])
    layer_1_down = Cust_2D_Conv(layer_1_addd, nb_kernel=nb_kernel*2, kernel_size=3, stride=2, dropout=0.2)
    #LVL 2 Down
    layer_2_conv = Cust_2D_Conv(layer_1_down, nb_kernel=nb_kernel*2, kernel_size=5, stride=1, dropout=0.2)
    layer_2_conv = Cust_2D_Conv(layer_2_conv, nb_kernel=nb_kernel*2, kernel_size=5, stride=1, dropout=0.2)
    layer_2_addd = keras.layers.Multiply()([layer_2_conv,layer_1_down])
    layer_2_down = Cust_2D_Conv(layer_2_addd, nb_kernel=nb_kernel*4, kernel_size=3, stride=2, dropout=0.2)
    #LVL 3 Down
    layer_3_conv = Cust_2D_Conv(layer_2_down, nb_kernel=nb_kernel*4, kernel_size=5, stride=1, dropout=0.2)
    layer_3_conv = Cust_2D_Conv(layer_3_conv, nb_kernel=nb_kernel*4, kernel_size=5, stride=1, dropout=0.2)
    layer_3_conv = Cust_2D_Conv(layer_3_conv, nb_kernel=nb_kernel*4, kernel_size=5, stride=1, dropout=0.2)
    layer_3_addd = keras.layers.Multiply()([layer_3_conv,layer_2_down])
    layer_3_down = Cust_2D_Conv(layer_3_addd, nb_kernel=nb_kernel*8, kernel_size=3, stride=2, dropout=0.2)
    #LVL 4 Down
    layer_4_conv = Cust_2D_Conv(layer_3_down, nb_kernel=nb_kernel*8, kernel_size=5, stride=1, dropout=0.2)
    layer_4_conv = Cust_2D_Conv(layer_4_conv, nb_kernel=nb_kernel*8, kernel_size=5, stride=1, dropout=0.2)
    layer_4_conv = Cust_2D_Conv(layer_4_conv, nb_kernel=nb_kernel*8, kernel_size=5, stride=1, dropout=0.2)
    layer_4_addd = keras.layers.Multiply()([layer_4_conv,layer_3_down])
    layer_4_down = Cust_2D_Conv(layer_4_addd, nb_kernel=nb_kernel*16, kernel_size=3, stride=2, dropout=0.2)
    #LVL 5 Down
    layer_5_conv = Cust_2D_Conv(layer_4_down, nb_kernel=nb_kernel*16, kernel_size=5, stride=1, dropout=0.2)
    layer_5_conv = Cust_2D_Conv(layer_5_conv, nb_kernel=nb_kernel*16, kernel_size=5, stride=1, dropout=0.2)
    layer_5_conv = Cust_2D_Conv(layer_5_conv, nb_kernel=nb_kernel*16, kernel_size=5, stride=1, dropout=0.2)
    layer_5_addd = keras.layers.Multiply()([layer_5_conv,layer_4_down])
    layer_5_up = Cust_2D_DeConv(layer_5_addd, nb_kernel=nb_kernel*8, kernel_size=3, stride=2, dropout=0.2)
    #LVL 4 Up
    layer_4b_concat = keras.layers.concatenate([layer_5_up, layer_4_addd])
    layer_4b_conv = Cust_2D_Conv(layer_4b_concat, nb_kernel=nb_kernel*16, kernel_size=5, stride=1, dropout=0.2)
    layer_4b_conv = Cust_2D_Conv(layer_4b_conv, nb_kernel=nb_kernel*16, kernel_size=5, stride=1, dropout=0.2)
    layer_4b_conv = Cust_2D_Conv(layer_4b_conv, nb_kernel=nb_kernel*16, kernel_size=5, stride=1, dropout=0.2)
    layer_4b_addd = keras.layers.Multiply()([layer_4b_conv,layer_4b_concat])
    layer_4b_up = Cust_2D_DeConv(layer_4b_addd, nb_kernel=nb_kernel*4, kernel_size=3, stride=2, dropout=0.2)
    #LVL 3 Up
    layer_3b_concat = keras.layers.concatenate([layer_4b_up, layer_3_addd])
    layer_3b_conv = Cust_2D_Conv(layer_3b_concat, nb_kernel=nb_kernel*8, kernel_size=5, stride=1, dropout=0.2)
    layer_3b_conv = Cust_2D_Conv(layer_3b_conv, nb_kernel=nb_kernel*8, kernel_size=5, stride=1, dropout=0.2)
    layer_3b_conv = Cust_2D_Conv(layer_3b_conv, nb_kernel=nb_kernel*8, kernel_size=5, stride=1, dropout=0.2)
    layer_3b_addd = keras.layers.Multiply()([layer_3b_conv,layer_3b_concat])
    layer_3b_up = Cust_2D_DeConv(layer_3b_addd, nb_kernel=nb_kernel*2, kernel_size=3, stride=2, dropout=0.2)
    #LVL 2 Up
    layer_2b_concat = keras.layers.concatenate([layer_3b_up, layer_2_addd])
    layer_2b_conv = Cust_2D_Conv(layer_2b_concat, nb_kernel=nb_kernel*4, kernel_size=5, stride=1, dropout=0.2)
    layer_2b_conv = Cust_2D_Conv(layer_2b_conv, nb_kernel=nb_kernel*4, kernel_size=5, stride=1, dropout=0.2)
    layer_2b_addd = keras.layers.Multiply()([layer_2b_conv,layer_2b_concat])
    layer_2b_up = Cust_2D_DeConv(layer_2b_addd, nb_kernel=nb_kernel, kernel_size=3, stride=2, dropout=0.2)
    #LVL 1 Up
    layer_1b_concat = keras.layers.concatenate([layer_2b_up, layer_1_addd])
    layer_1b_conv = Cust_2D_Conv(layer_1b_concat, nb_kernel=nb_kernel*2, kernel_size=5, stride=1, dropout=0.2)
    layer_1b_addd = keras.layers.Multiply()([layer_1b_conv,layer_1b_concat])
    #LVL 0
    layer_0b_conv = Cust_2D_Conv(layer_1b_addd, nb_kernel=2, kernel_size=5, stride=1, dropout=0.2)
    layer_0b_clf= keras.layers.Conv2D(2, 1, 1, activation="softmax")(layer_0b_conv)
    model = keras.Model(inputs=layer_0, outputs=layer_0b_clf, name='Keras_model')
    model.compile(loss=dice_loss_2d,
                  optimizer=keras.optimizers.Adam(),
                  metrics=['accuracy','categorical_accuracy'])
    return model
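Both models pass `dice_loss_2d` as the loss, but the post never shows its definition. The following NumPy sketch shows one common soft Dice formulation, purely as an assumption about what such a loss might compute; the name `soft_dice_loss` and the `eps` smoothing term are hypothetical and this is not the code used in the post. It also rejects mismatched shapes explicitly, since a Dice loss only makes sense when predictions and labels have the same number of channels.

```python
import numpy as np


def soft_dice_loss(y_true, y_pred, eps=1e-6):
    """Soft Dice loss averaged over classes (illustrative, not the post's code).

    y_true: one-hot labels, shape (N, H, W, C)
    y_pred: per-pixel class probabilities, same shape
    """
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    if y_true.shape != y_pred.shape:
        raise ValueError(f"shape mismatch: {y_true.shape} vs {y_pred.shape}")
    axes = (0, 1, 2)  # sum over batch and spatial axes, keep the class axis
    intersection = np.sum(y_true * y_pred, axis=axes)
    denominator = np.sum(y_true, axis=axes) + np.sum(y_pred, axis=axes)
    # eps keeps the ratio defined for classes absent from the batch.
    dice_per_class = (2.0 * intersection + eps) / (denominator + eps)
    return 1.0 - dice_per_class.mean()


# Toy labels: one 2x2 image with classes [[0, 1], [2, 0]], one-hot encoded.
toy_labels = np.eye(3)[np.array([[[0, 1], [2, 0]]])]
print(soft_dice_loss(toy_labels, toy_labels))  # close to 0 for a perfect match
```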
I have been looking around for a solution, but nothing I found is very clear.
Does anyone have an idea or a suggestion?
【Comments】:
Your train_labels.shape is (100,400,400,3); shouldn't it be (100, 3)?
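Please provide your model. Also, Merry Christmas to you!
@AhmadBaracat, the labels are images too. I want to perform pixel-wise segmentation, so I have 100 images of width 400, height 400, and 3 channels (one for each thing I want to label).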
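【Answer 1】:
For anyone who might face the same problem, I found the solution.
The problem is not with the input shapes per se. The shapes of the input images and of the labels must be (100, 400, 400, 1) and (100, 400, 400, 3), respectively.
The problem comes from the model: its output shape must match the shape of the labels it is trained against. In the code shown in the original post, the output shape comes directly from this line: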
layer_0b_clf = tflearn.layers.conv.conv_2d(layer_0b_conv, 2, 1, 1, activation="softmax")
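This yields an output shape of (?, 400, 400, 2), which does not match the shape of the labels it is evaluated against, i.e. (100, 400, 400, 3). The solution is to change the number of output channels in the model as follows: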
- For TFlearn: conv_2d(layer_0b_conv, 3, 1, 1, activation="softmax")
layer_0b_clf = tflearn.layers.conv.conv_2d(layer_0b_conv, 3, 1, 1, activation="softmax")
- For Keras: Conv2D(3, 1, 1, activation="softmax")
layer_0b_clf= keras.layers.Conv2D(3, 1, 1, activation="softmax")(layer_0b_conv)
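Why the filter count of the last layer fixes the channel dimension of the output can be seen in a small NumPy stand-in. This is only a sketch, assuming a 1x1 convolution without bias followed by a per-pixel softmax; `pointwise_softmax_head` and the toy arrays are hypothetical names, not part of TFlearn or Keras.

```python
import numpy as np


def pointwise_softmax_head(features, weights):
    """Apply a 1x1 convolution followed by a per-pixel softmax.

    features: (N, H, W, C_in) feature maps
    weights:  (C_in, n_filters) kernel of the 1x1 convolution (bias omitted)
    returns:  (N, H, W, n_filters) class probabilities
    """
    logits = features @ weights  # a 1x1 conv is a matrix product over channels
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=-1, keepdims=True)


rng = np.random.default_rng(0)
features = rng.normal(size=(2, 8, 8, 16))  # toy feature maps
labels_shape = (2, 8, 8, 3)                # one-hot labels for 3 classes

two_filters = pointwise_softmax_head(features, rng.normal(size=(16, 2)))
three_filters = pointwise_softmax_head(features, rng.normal(size=(16, 3)))

print(two_filters.shape == labels_shape)    # False: 2 channels vs 3 classes
print(three_filters.shape == labels_shape)  # True
```

With 2 filters the last axis has length 2, which cannot be compared against 3-channel one-hot labels; with 3 filters the shapes agree and each pixel gets one probability per class.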
Hope this helps someone.
Thanks for your comments and for reading.