Grid search and KerasClassifier using class weights
[Posted]: 2019-06-06 04:47:57
[Question]: I am trying to run a hyperparameter search with the scikit-learn RandomizedSearchCV function and the Keras KerasClassifier wrapper for an imbalanced multi-class classification problem. However, when I pass class_weight as an argument to the wrapper, the fit method raises the following error:
RuntimeError: Cannot clone object <keras.wrappers.scikit_learn.KerasClassifier object at 0x000002AA3C676710>, as the constructor either does not set or modifies parameter class_weight
Below are the function I use to build the KerasClassifier and the script that runs RandomizedSearchCV:
build_fn:
import keras as k

def build_keras_model(loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'], optimiser = 'adam',
                      learning_rate = 0.001, n_neurons = 30, n_layers = 1, n_classes = 3,
                      l1_reg = 0.001, l2_reg = 0.001, batch_norm = False, dropout = None,
                      input_shape = (8,)):
    model = k.models.Sequential()
    # First hidden layer: the only one that declares the input shape.
    model.add(k.layers.Dense(n_neurons,
                             input_shape = input_shape,
                             kernel_regularizer = k.regularizers.l1_l2(l1 = l1_reg, l2 = l2_reg),
                             activation = 'relu'))
    if batch_norm is True:
        model.add(k.layers.BatchNormalization())
    if dropout is not None:
        model.add(k.layers.Dropout(dropout))
    # Remaining hidden layers (none when n_layers == 1).
    for _ in range(n_layers - 1):
        model.add(k.layers.Dense(n_neurons,
                                 kernel_regularizer = k.regularizers.l1_l2(l1 = l1_reg, l2 = l2_reg),
                                 activation = 'relu'))
        if batch_norm is True:
            model.add(k.layers.BatchNormalization())
        if dropout is not None:
            model.add(k.layers.Dropout(dropout))
    # Output layer: one softmax unit per class.
    model.add(k.layers.Dense(n_classes, activation = 'softmax'))
    if optimiser == 'adam':
        koptimiser = k.optimizers.Adam(lr = learning_rate)
    elif optimiser == 'adamax':
        koptimiser = k.optimizers.Adamax(lr = learning_rate)
    elif optimiser == 'nadam':
        koptimiser = k.optimizers.Nadam(lr = learning_rate)
    else:
        # Fail early instead of reaching compile() with koptimiser undefined.
        raise ValueError('Unknown optimiser type: %s' % optimiser)
    model.compile(optimizer = koptimiser, loss = loss, metrics = metrics)
    model.summary()
    return model
Script:
import numpy as np
import scipy as sp
import scipy.stats  # ensures sp.stats is loaded
from sklearn.utils.class_weight import compute_class_weight
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import RandomizedSearchCV

# training_features, training_targets, target_label and random_state
# are defined earlier in my code (not shown here).
parameters = {
    'optimiser': ['adam', 'adamax', 'nadam'],
    'learning_rate': sp.stats.uniform(0.0005, 0.0015),
    'epochs': sp.stats.randint(500, 1501),
    'n_neurons': sp.stats.randint(20, 61),
    'n_layers': sp.stats.randint(1, 3),
    'n_classes': [3],
    'batch_size': sp.stats.randint(1, 11),
    'l1_reg': sp.stats.reciprocal(1e-3, 1e1),
    'l2_reg': sp.stats.reciprocal(1e-3, 1e1),
    'batch_norm': [False],
    'dropout': [None],
    'metrics': [['accuracy']],
    'loss': ['sparse_categorical_crossentropy'],
    'input_shape': [(training_features.shape[1],)]
}

# Map each class index to its 'balanced' weight.
class_weights = compute_class_weight('balanced', classes = np.unique(training_targets),
                                     y = training_targets[target_label[0]])
class_weights = dict(enumerate(class_weights))

# Passing class_weight to the constructor is what triggers the RuntimeError above.
keras_model = KerasClassifier(build_fn = build_keras_model, verbose = 0, class_weight = class_weights)
clf = RandomizedSearchCV(keras_model, parameters, n_iter = 1, scoring = 'f1_micro',
                         n_jobs = 1, cv = 5, random_state = random_state)
clf.fit(training_features, training_targets.values[:, 0])
model = clf.best_estimator_
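For reference, the 'balanced' heuristic in compute_class_weight gives each class the weight n_samples / (n_classes * class_count), so rarer classes get larger weights. Here is a minimal sketch of that formula in plain Python. The helper name balanced_class_weights and the class counts (25, 70 and 37) are assumptions made for illustration; the counts were picked because they appear to reproduce the weights shown in the TypeError quoted in the comments, not because the question states them.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: n_samples / (n_classes * class_count)} (hypothetical helper)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count)
            for cls, count in sorted(counts.items())}


# Hypothetical label vector: 25, 70 and 37 samples of classes 0, 1 and 2.
labels = [0] * 25 + [1] * 70 + [2] * 37
class_weights = balanced_class_weights(labels)
print(class_weights)
```

The result is already a dict keyed by class index, which is the shape the script builds with dict(enumerate(...)) before handing the weights to Keras.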
[Comments]:
Ah, have you tried passing class_weights in the fit method: grid_result = clf.fit(training_features, training_targets.values[:, 0], clf__class_weight=class_weights)
When I try that, I get the following error: TypeError: Unrecognized keyword arguments: {'clf__class_weight': {0: 1.76, 1: 0.6285714285714286, 2: 1.1891891891891893}}
And without the clf__ prefix?
Yes, I just tried that and it worked. Thanks!
OK, I'll post it as an answer, just to complete the question.
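[Answer 1]:
To pass class weights when using KerasClassifier in this setup, supply class_weight as a keyword argument to the fit method rather than to the constructor; it is then forwarded to the Keras model.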
grid_result = clf.fit(training_features, training_targets.values[:, 0], class_weight=class_weights)
In older versions, you had to pass them with the clf__ prefix:
grid_result = clf.fit(training_features, training_targets.values[:, 0], clf__class_weight=class_weights)
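One plausible reading of why the constructor route fails while the fit route works: as far as I understand, scikit-learn's clone rebuilds the estimator from get_params() and then checks, by object identity, that each parameter comes back unchanged, while the Keras wrapper's get_params() returns a deep copy of its stored keyword arguments. A dict such as class_weight therefore comes back as a different object. The sketch below models that interaction in plain Python; ToyWrapper and toy_clone are hypothetical stand-ins, not scikit-learn or Keras APIs, and the real implementations contain more logic.

```python
import copy


class ToyWrapper:
    """Hypothetical stand-in for a wrapper that stores constructor kwargs."""

    def __init__(self, **sk_params):
        self.sk_params = sk_params

    def get_params(self):
        # Deep copy: mutable values come back as new objects.
        return copy.deepcopy(self.sk_params)


def toy_clone(estimator):
    """Simplified model of a clone-time sanity check based on object identity."""
    params = estimator.get_params()
    new_estimator = type(estimator)(**params)
    params_set = new_estimator.get_params()
    for name, value in params.items():
        if value is not params_set[name]:
            raise RuntimeError(
                "Cannot clone object: constructor either does not set "
                "or modifies parameter %s" % name)
    return new_estimator


# Immutable values pass: deepcopy hands back the same int and str objects.
toy_clone(ToyWrapper(verbose=0, optimiser="adam"))

# A dict is copied into a distinct object, so the identity check fails.
try:
    toy_clone(ToyWrapper(verbose=0, class_weight={0: 1.76, 1: 0.63}))
    clone_failed = False
    message = ""
except RuntimeError as err:
    clone_failed = True
    message = str(err)
print(clone_failed, message)
```

Under this reading, arguments given to fit never pass through get_params, so they are not subject to the identity check and reach the model unchanged on every fold.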
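[Answer 2]:
To use class weights with KerasClassifier, including under a grid search, pass them through fit_params, which can carry several fit-time arguments at once. build_fn only calls the model-building function and does not accept these arguments.
from keras.callbacks import EarlyStopping
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score

classifier = KerasClassifier(build_fn = build_classifier, epochs = 20, batch_size = 128)
accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 3,
                             n_jobs = -1, verbose = 0,
                             fit_params = {'callbacks': [EarlyStopping()],
                                           'class_weight': class_weights})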