Imbalanced data for semantic segmentation in Keras?

【Posted】: 2019-09-05 12:48:37 【Question】:

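I am new to Keras and have been learning it for about three weeks. I apologize if my question sounds a bit silly.

I am currently doing semantic segmentation of 512x512 medical images. I am using the UNet from https://github.com/zhixuhao/unet . Basically, I want to segment the brain out of the image (so a two-class segmentation, background vs. foreground).

I made some modifications to the network and got results I am happy with. But I think I can improve the segmentation by putting more weight on the foreground, because brain pixels are far fewer than background pixels. In some cases the brain does not appear in the image at all, especially in the bottom slices.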
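The idea of putting more weight on the foreground can be illustrated with a weighted binary cross-entropy. The following is a minimal NumPy sketch, not code from the zhixuhao/unet repository: the function name `weighted_binary_crossentropy` and the toy arrays are hypothetical, and the default weights simply reuse the 0.10/0.90 values from main.py.

```python
import numpy as np

def weighted_binary_crossentropy(y_true, y_pred, w_background=0.10,
                                 w_foreground=0.90, eps=1e-7):
    """Mean per-pixel binary cross-entropy with one weight per class.

    Hypothetical helper for illustration; the weights are assumptions.
    """
    y_true = np.asarray(y_true, dtype=np.float64)
    # Clip predictions so that log() never sees 0 or 1 exactly.
    y_pred = np.clip(np.asarray(y_pred, dtype=np.float64), eps, 1.0 - eps)
    # Standard per-pixel binary cross-entropy.
    bce = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    # Foreground pixels get w_foreground, background pixels get w_background.
    weights = y_true * w_foreground + (1.0 - y_true) * w_background
    return float(np.mean(weights * bce))

# Toy 2x2 mask with a single foreground (brain) pixel.
mask = np.array([[0, 0], [0, 1]])
# Prediction that misses the foreground pixel.
missed_brain = np.array([[0.1, 0.1], [0.1, 0.1]])
# Prediction that finds it but raises one false alarm in the background.
false_alarm = np.array([[0.9, 0.1], [0.1, 0.9]])

loss_missed = weighted_binary_crossentropy(mask, missed_brain)
loss_false_alarm = weighted_binary_crossentropy(mask, false_alarm)
print(loss_missed, loss_false_alarm)
```

With equal weights the two predictions receive the same loss, because each makes exactly one confident mistake. With the foreground weighted more heavily, missing the brain pixel costs more than a background false alarm. The same arithmetic could be written with Keras backend operations and passed as `loss=` to `model.compile`.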

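I don't know which part of the code in https://github.com/zhixuhao/unet needs to be modified.

I would really appreciate it if someone could help me with this. Thanks a lot in advance!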

import numpy as np
import os
import skimage.io as io
import skimage.transform as trans
from keras.models import *
from keras.layers import *
from keras.optimizers import *
from keras.callbacks import ModelCheckpoint, LearningRateScheduler
from keras import backend as keras


def unet(pretrained_weights=None, input_size=(256, 256, 1)):
  inputs = Input(input_size)
  conv1 = Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal')(inputs)
  conv1 = BatchNormalization()(conv1)
  conv1 = Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv1)
  conv1 = BatchNormalization()(conv1)
  pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)

  conv2 = Conv2D(128, 3, activation='relu', padding='same', kernel_initializer='he_normal')(pool1)
  conv2 = BatchNormalization()(conv2)
  conv2 = Conv2D(128, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv2)
  conv2 = BatchNormalization()(conv2)
  pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)

  conv3 = Conv2D(256, 3, activation='relu', padding='same', kernel_initializer='he_normal')(pool2)
  conv3 = BatchNormalization()(conv3)
  conv3 = Conv2D(256, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv3)
  conv3 = BatchNormalization()(conv3)
  pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)

  conv4 = Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal')(pool3)
  conv4 = BatchNormalization()(conv4)
  conv4 = Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv4)
  conv4 = BatchNormalization()(conv4)
  drop4 = Dropout(0.5)(conv4)
  pool4 = MaxPooling2D(pool_size=(2, 2))(drop4)

  conv5 = Conv2D(1024, 3, activation='relu', padding='same', kernel_initializer='he_normal')(pool4)
  conv5 = BatchNormalization()(conv5)
  conv5 = Conv2D(1024, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv5)
  conv5 = BatchNormalization()(conv5)
  drop5 = Dropout(0.5)(conv5)

  up6 = Conv2D(512, 2, activation='relu', padding='same', kernel_initializer='he_normal')(
      UpSampling2D(size=(2, 2))(drop5))
  merge6 = concatenate([drop4, up6], axis=3)
  conv6 = Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal')(merge6)
  conv6 = BatchNormalization()(conv6)
  conv6 = Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv6)
  conv6 = BatchNormalization()(conv6)

  up7 = Conv2D(256, 2, activation='relu', padding='same', kernel_initializer='he_normal')(UpSampling2D(size=(2, 2))(conv6))
  merge7 = concatenate([conv3, up7], axis=3)
  conv7 = Conv2D(256, 3, activation='relu', padding='same', kernel_initializer='he_normal')(merge7)
  conv7 = BatchNormalization()(conv7)
  conv7 = Conv2D(256, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv7)
  conv7 = BatchNormalization()(conv7)

  up8 = Conv2D(128, 2, activation='relu', padding='same', kernel_initializer='he_normal')(UpSampling2D(size=(2, 2))(conv7))
  merge8 = concatenate([conv2, up8], axis=3)
  conv8 = Conv2D(128, 3, activation='relu', padding='same', kernel_initializer='he_normal')(merge8)
  conv8 = BatchNormalization()(conv8)
  conv8 = Conv2D(128, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv8)
  conv8 = BatchNormalization()(conv8)

  up9 = Conv2D(64, 2, activation='relu', padding='same', kernel_initializer='he_normal')(UpSampling2D(size=(2, 2))(conv8))
  merge9 = concatenate([conv1, up9], axis=3)
  conv9 = Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal')(merge9)
  conv9 = BatchNormalization()(conv9)
  conv9 = Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv9)
  conv9 = BatchNormalization()(conv9)
  conv9 = Conv2D(2, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv9)
  conv9 = BatchNormalization()(conv9)


  conv10 = Conv2D(1, 1, activation='sigmoid')(conv9)

  model = Model(inputs=inputs, outputs=conv10)

  model.compile(optimizer=Adam(lr=1e-4), loss='binary_crossentropy', metrics=['accuracy'])

  # model.summary()

  if (pretrained_weights):
      model.load_weights(pretrained_weights)

  return model

This is main.py:

from model2 import *
from data2 import *
from keras.models import load_model

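# Intended class weights (0 = background, 1 = foreground).
# Note: this dict is never passed to fit_generator below.
class_weight = {0: 0.10, 1: 0.90}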
myGene = trainGenerator(2,'data/brainTIF/trainNew','image','label',save_to_dir = None)
model = unet()
model_checkpoint = ModelCheckpoint('unet_brainTest_e10_s5.hdf5', monitor='loss')
model.fit_generator(myGene,steps_per_epoch=5,epochs=10,callbacks = [model_checkpoint])

testGene = testGenerator("data/brainTIF/test3")
results = model.predict_generator(testGene,18,verbose=1)
saveResult("data/brainTIF/test_results3",results)
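The class weights in main.py are hard-coded. One possible way to derive them from the training labels instead is to weight each class inversely to its pixel frequency. This is only a sketch under assumptions: the helper `class_weights_from_masks` and the toy mask stack are hypothetical, and normalising the two weights to sum to 1 is one convention among several.

```python
import numpy as np

def class_weights_from_masks(masks):
    """Return {0: w_background, 1: w_foreground} from binary label masks.

    Weights are inversely proportional to pixel frequency and sum to 1.
    For two classes this reduces to swapping the two frequencies.
    Hypothetical helper for illustration only.
    """
    masks = np.asarray(masks)
    fg_fraction = float(np.mean(masks > 0))  # share of foreground pixels
    bg_fraction = 1.0 - fg_fraction          # share of background pixels
    return {0: fg_fraction, 1: bg_fraction}

# Toy stack of four 10x10 slices; only the first slice contains "brain".
masks = np.zeros((4, 10, 10), dtype=np.uint8)
masks[0, :5, :8] = 1  # 40 foreground pixels out of 400 in total

weights = class_weights_from_masks(masks)
print(weights)
```

In this toy stack 10% of the pixels are foreground, so the sketch arrives at the same 0.10/0.90 split used in main.py. If no mask contains any foreground at all, the background weight becomes 0, so the fractions should be computed over the whole training set rather than per slice.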

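【Comments】:

Can you add some of the code you have already tried and the specific snippets you need help with? See how to ask for an MCVE here: ***.com/help/mcve

I have edited the original post and included my code.

You can try using a custom loss function, like here.

Use focal_loss. It is meant exactly for this!

Thanks @Nain, I found this code:

import tensorflow as tf
from keras import backend as K

def focal_loss(gamma=2., alpha=.25):
  def focal_loss_fixed(y_true, y_pred):
    # pt_1: predictions on foreground pixels (1 elsewhere, so they add almost nothing)
    pt_1 = tf.where(tf.equal(y_true, 1), y_pred, tf.ones_like(y_pred))
    # pt_0: predictions on background pixels (0 elsewhere)
    pt_0 = tf.where(tf.equal(y_true, 0), y_pred, tf.zeros_like(y_pred))
    pt_1 = K.clip(pt_1, 1e-3, .999)
    pt_0 = K.clip(pt_0, 1e-3, .999)
    return -K.sum(alpha * K.pow(1. - pt_1, gamma) * K.log(pt_1)) - K.sum((1 - alpha) * K.pow(pt_0, gamma) * K.log(1. - pt_0))
  return focal_loss_fixed

What are y_true and y_pred? Are they the class weights?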
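On the last question: in a Keras loss function, `y_true` is the ground-truth target (here the binary label mask) and `y_pred` is the model output (here the sigmoid probabilities). Neither is a class weight; in the focal loss the weighting comes from `alpha` and `gamma`. The sketch below restates the same formula in NumPy so the effect can be checked by hand. It is not the commenter's exact code: the name `focal_loss_np` is hypothetical, and it masks with multiplication instead of `tf.where`.

```python
import numpy as np

def focal_loss_np(y_true, y_pred, gamma=2.0, alpha=0.25, eps=1e-3):
    """Summed binary focal loss (NumPy restatement, for illustration)."""
    y_true = np.asarray(y_true, dtype=np.float64)   # ground-truth mask, 0 or 1
    y_pred = np.clip(np.asarray(y_pred, dtype=np.float64), eps, 1.0 - eps)
    # Foreground term: scaled down by (1 - p)^gamma when p is already high.
    fg = -alpha * (1.0 - y_pred) ** gamma * np.log(y_pred) * y_true
    # Background term: scaled down by p^gamma when p is already low.
    bg = -(1.0 - alpha) * y_pred ** gamma * np.log(1.0 - y_pred) * (1.0 - y_true)
    return float(np.sum(fg + bg))

# One foreground pixel, predicted confidently right vs. confidently wrong.
easy = focal_loss_np([1], [0.9])
hard = focal_loss_np([1], [0.1])
print(easy, hard)
```

The confidently wrong pixel dominates the loss by a much larger factor than it would under plain cross-entropy, which is why focal loss is often suggested when easy background pixels heavily outnumber the foreground. Setting `gamma=0` removes the modulating factor and leaves an alpha-weighted cross-entropy.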
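【Answer 1】:

As an alternative to class_weight for binary classes, you can also handle imbalanced classes with the Synthetic Minority Over-sampling Technique (SMOTE), which increases the size of the minority class: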

from imblearn.over_sampling import SMOTE

sm = SMOTE()
# fit_resample is the current name of the older fit_sample method
x, y = sm.fit_resample(X_train, Y_train)

【Comments】:

Thanks @Rubens_Zimbres, but is this for image segmentation or for image classification? In my case it is semantic image segmentation.
