Although I'm using StratifiedKFold, accuracy is always 0.5

Posted: 2019-09-25 06:13:04

Question:

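I'm using a pretrained ResNet50 model to classify the malaria dataset. On top of it I added two dense layers with 1024 and 2048 units, followed by a classification layer that uses softmax (results with sigmoid were worse). I validate the model with StratifiedKFold, but after the first fold the accuracy is always 0.5.

After the first fold, every epoch looks like this: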

22047/22047  [==============================] - 37s 3ms/step - loss: 8.0596 - acc: 0.5000
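
A loss of about 8.06 paired with an accuracy of 0.5 is a recognizable pattern. Keras clips predicted probabilities to [1e-7, 1 - 1e-7] before taking the log in categorical_crossentropy, so a fully confident wrong prediction costs -ln(1e-7), roughly 16.12. If the network has collapsed to predicting a single class with probability close to 1 on a balanced dataset, about half the samples pay that cost and the mean lands near 8.06. The sketch below only reproduces that arithmetic; the collapse is an inference from the log, not something the log proves, and the helper name collapsed_loss is made up for illustration.

```python
import math

EPSILON = 1e-7  # default fuzz factor of the Keras backend (keras.backend.epsilon())

def collapsed_loss(wrong_fraction, eps=EPSILON):
    """Mean categorical cross-entropy when every prediction is a saturated
    one-hot vector and `wrong_fraction` of the samples are misclassified."""
    loss_wrong = -math.log(eps)        # true-class probability clipped up to eps
    loss_right = -math.log(1.0 - eps)  # true-class probability clipped down to 1 - eps
    return wrong_fraction * loss_wrong + (1.0 - wrong_fraction) * loss_right

print(round(collapsed_loss(0.5), 2))  # 8.06
```

The small gap between this value and the logged 8.0596 would be explained by the wrong fraction being slightly above one half, which the rounded accuracy of 0.5000 cannot show.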

Here is my model:

from keras.applications.resnet50 import ResNet50
from keras.layers import Dense, Dropout, Flatten
from keras.models import Model

height = 100  # image dimensions
width = 100
channel = 3  # RGB
classes = 2

batch_size = 64  # vary depending on the GPU
epochs = 10
folds = 5
optimizer = "Adam"
metrics = ["accuracy"]
loss = 'categorical_crossentropy'

random_state = 1377
chanDim = -1

# Build function passed to KerasClassifier in the validation code
def customResnetBuild():
    base = ResNet50(include_top=False, weights="imagenet", input_shape=(height, width, channel))

    # Get the ResNet50 layers up to res5c_branch2c
    # (Keras 2 expects the keyword arguments inputs/outputs, not input/output)
    base = Model(inputs=base.input, outputs=base.get_layer('res5c_branch2c').output)

    # Freeze the pretrained layers so that only the new head is trained
    for layer in base.layers:
        layer.trainable = False

    Flatten1 = Flatten()(base.output)

    F1 = Dense(1024, activation='relu')(Flatten1)
    D1 = Dropout(0.5)(F1)

    F2 = Dense(2048, activation='relu')(D1)
    D2 = Dropout(0.2)(F2)

    F3 = Dense(classes, activation='softmax')(D2)

    model = Model(inputs=base.input, outputs=F3)

    # Compile the model
    model.compile(loss=loss, optimizer=optimizer, metrics=metrics)
    return model

Here is the validation part:

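from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.metrics import accuracy_score, classification_report, make_scorer
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Create a model compatible with sklearn
model = KerasClassifier(build_fn=customResnetBuild, epochs=epochs, batch_size=batch_size)
# random_state only has an effect when shuffle=True; recent scikit-learn
# versions raise a ValueError if it is set together with shuffle=False
kfold = StratifiedKFold(n_splits=folds, shuffle=True, random_state=random_state)

# Accumulators for the true and predicted labels of every fold
originalclass = []
predictedclass = []

# Custom score that collects the labels of all folds for the classification report
def classification_report_with_accuracy_score(y_true, y_pred):
    originalclass.extend(y_true)
    predictedclass.extend(y_pred)
    return accuracy_score(y_true, y_pred)  # return accuracy score

scores = cross_val_score(model, data, labels, cv=kfold, error_score="raise", scoring=make_scorer(classification_report_with_accuracy_score))
print("Mean of results: ", scores.mean())
print(classification_report(originalclass, predictedclass))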
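
Stratification itself is unlikely to be the culprit, because StratifiedKFold keeps the class ratio in every fold. A minimal sketch on synthetic labels makes that visible; the 50/50 toy_labels array and the all-zero toy_data features are stand-ins invented for this check, not the malaria data.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Synthetic, perfectly balanced labels (assumption: stand-in for the real labels)
toy_labels = np.array([0] * 50 + [1] * 50)
toy_data = np.zeros((len(toy_labels), 1))  # features are irrelevant for the split

kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=1377)

# Count how many samples of each class end up in every held-out fold
fold_counts = [
    np.bincount(toy_labels[test_idx], minlength=2).tolist()
    for _, test_idx in kfold.split(toy_data, toy_labels)
]
print(fold_counts)  # [[10, 10], [10, 10], [10, 10], [10, 10], [10, 10]]
```

Every held-out fold contains both classes in equal numbers, so a constant 0.5 accuracy points at the model or its training rather than at the split.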

Results:

Mean of results:  0.6404469896025613
          precision    recall  f1-score   support

       0       0.86      0.34      0.48     13781
       1       0.59      0.94      0.72     13779

   micro avg       0.64      0.64      0.64     27560
   macro avg       0.72      0.64      0.60     27560
weighted avg       0.72      0.64      0.60     27560

Answer 1:

This is the answer. To summarize, the problem is that the model has more parameters than the dataset has samples, and that trainable=False is used incorrectly.
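
The first point can be checked with simple arithmetic. The sketch below counts the trainable parameters of the dense head from the question and compares them with the 22,047 training samples per fold reported in the training log. It assumes the truncated ResNet50 emits a 4 x 4 x 2048 feature map for a 100 x 100 input; the exact spatial size depends on the padding used by the installed Keras version, so treat 4 x 4 as an assumption. The helper dense_params is hypothetical, written only for this illustration.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

# Assumed output shape of res5c_branch2c for a 100x100 input: 4 x 4 x 2048
flattened = 4 * 4 * 2048

head_params = (
    dense_params(flattened, 1024)  # F1
    + dense_params(1024, 2048)     # F2
    + dense_params(2048, 2)        # F3, softmax over 2 classes
)
train_samples = 22047  # samples per training fold, as shown in the training log

print(head_params)                   # 35658754
print(head_params // train_samples)  # 1617 trainable parameters per training sample
```

Under these assumptions almost all of the parameters sit in the first dense layer, because Flatten feeds it 32,768 inputs. A head this large relative to the data can easily memorize or saturate, which is consistent with the answer's summary.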
