How to put KerasClassifier, Hyperopt and Sklearn cross-validation together

Posted: 2019-11-10 22:01:40

【Question】:

I am performing a hyperparameter tuning (hyperopt) task on a Keras model using sklearn. I am trying to optimize KerasClassifiers with Sklearn cross-validation. Some of the code follows:

def create_model():
    model = Sequential()
    model.add(
        Dense(output_dim=params['units1'],
              input_dim=features_.shape[1],
              kernel_initializer="glorot_uniform"))
    model.add(Activation(params['activation']))
    model.add(Dropout(params['dropout1']))
    model.add(BatchNormalization())
    ...
    model.compile(loss='binary_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])

    return model

What I am trying to do now is pass the Hyperopt parameters to KerasClassifier as follows:

def objective(params, n_folds=N_FOLDS):
    """Objective function for Hyperparameter Optimization"""

    # Keep track of evals
    global ITERATION

    ITERATION += 1

    clf = KerasClassifier(build_fn=create_model,**params)

    start = timer()

    # Perform n_folds cross validation
    cv_results = cross_val_score(clf,
                                 features_,
                                 labels,
                                 cv=5
                                 ).mean()

    run_time = timer() - start

    # Loss must be minimized
    loss = -cv_results

    # Dictionary with information for evaluation
    return {
        'loss': loss,
        'params': params,
        'iteration': ITERATION,
        'train_time': run_time,
        'status': STATUS_OK
    }

I define the search space as:

space = {'units1': hp.choice('units1', [64, 128, 256, 512]),
    'units2': hp.choice('units2', [64, 128, 256, 512]),
    'dropout1': hp.choice('dropout1', [0.25, 0.5, 0.75]),
    'dropout2': hp.choice('dropout2', [0.25, 0.5, 0.75]),
    'batch_size': hp.choice('batch_size', [10, 20, 40, 60, 80, 100]),
    'nb_epochs': hp.choice('nb_epochs', [10, 50, 100]),
    'optimizer': opt_search_space,
    'activation': 'relu'
    }

Running the optimization:

best = fmin(fn = objective, space = space, algo = tpe.suggest, 
            max_evals = MAX_EVALS, trials = bayes_trials, rstate = np.random.RandomState(50))

But it fails with this error:

ValueError: activation is not a legal parameter

What is the correct way to do this?

【Answer 1】:

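Make the hyperparameters input arguments of the create_model function; then you can pass in the params dict. KerasClassifier only accepts keyword arguments that match an argument of build_fn or of the model's fit, predict or evaluate methods, which is why activation is rejected when create_model takes no arguments. Also change the key nb_epochs to epochs in the search space. Read more about other valid parameters here.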
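
The parameter check behind that error can be imitated with plain Python. The following is a rough sketch only, not the Keras source: `fit` is a hypothetical stand-in that lists just two of the real fit arguments, and `illegal_params` and `create_model_no_args` are illustrative names that do not exist in Keras.

```python
import inspect

def fit(x, y, batch_size=None, epochs=1):
    """Hypothetical stand-in for the model's fit signature (two arguments shown)."""

def create_model_no_args():
    """Build function as in the question: takes no hyperparameters."""

def create_model(units1, activation, dropout):
    """Build function as in the answer: hyperparameters are arguments."""

def illegal_params(build_fn, params):
    """Return the keys in params that neither build_fn nor fit accepts."""
    legal = set(inspect.signature(build_fn).parameters)
    legal |= set(inspect.signature(fit).parameters)
    return sorted(key for key in params if key not in legal)

params = {'units1': 64, 'dropout': 0.5, 'activation': 'relu',
          'batch_size': 10, 'epochs': 2}

print(illegal_params(create_model_no_args, params))   # ['activation', 'dropout', 'units1']
print(illegal_params(create_model, params))           # []
print(illegal_params(create_model, {'nb_epochs': 10}))  # ['nb_epochs']
```

With the argument-free build function, every model hyperparameter is rejected; once they are arguments of create_model, only the misnamed nb_epochs key would still fail.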

Try the following simplified example.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from tensorflow.keras import Sequential
# Note: this wrapper module is only available in older TensorFlow 2.x releases
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier
from tensorflow.keras.layers import Dense, Dropout

# High-resolution timer; the difference of two calls is the elapsed time in seconds
from timeit import default_timer as timer


X, y = make_classification(n_samples=1000, n_classes=2,
                           n_informative=4, weights=[0.7, 0.3],
                           random_state=0)

Define the Keras model:

def create_model(units1, activation, dropout):
    model = Sequential()
    model.add(Dense(units1,
                    input_dim=X.shape[1],
                    kernel_initializer="glorot_uniform",
                    activation=activation))
    model.add(Dropout(dropout))
    model.add(Dense(1,activation='sigmoid'))

    model.compile(loss='binary_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])

    return model

Define the objective function:

def objective(params, n_folds=2):
    """Objective function for Hyperparameter Optimization"""

    # Keep track of evals
    global ITERATION

    ITERATION += 1

    clf = KerasClassifier(build_fn=create_model,**params)

    start = timer()

    # Perform n_folds cross validation
    cv_results = cross_val_score(clf, X, y,
                                 cv=n_folds,
                                 ).mean()

    run_time = timer() - start

    # Loss must be minimized
    loss = -cv_results

    # Dictionary with information for evaluation
    return {
        'loss': loss,
        'params': params,
        'iteration': ITERATION,
        'train_time': run_time,
        'status': STATUS_OK
    }

from hyperopt import fmin, tpe, hp, Trials, STATUS_OK

space = {'units1': hp.choice('units1', [12, 64]),
         'dropout': hp.choice('dropout1', [0.25, 0.5]),
         'batch_size': hp.choice('batch_size', [10, 20]),
         'epochs': hp.choice('nb_epochs', [2, 3]),
         'activation': 'relu'
         }

# Trial counter, incremented inside objective()
ITERATION = 0

bayes_trials = Trials()

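# Note: recent hyperopt releases may expect a Generator instead,
# i.e. rstate=np.random.default_rng(50)
best = fmin(fn = objective, space = space, algo = tpe.suggest, 
            max_evals = 5, trials = bayes_trials, rstate = np.random.RandomState(50))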
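
Note that for hp.choice dimensions, the dict returned by fmin holds the index of the selected option, keyed by the label passed to hp.choice (so 'dropout1' and 'nb_epochs' here, not the dict keys), and constants such as 'activation' do not appear in it. A minimal sketch of turning those indices back into values follows. `decode_choices` is a hypothetical helper and `best_example` is a made-up result for illustration, not the output of an actual run; hyperopt's own space_eval(space, best) serves the same purpose.

```python
# Option lists copied from the search space above, keyed by hp.choice label.
choices = {
    'units1': [12, 64],
    'dropout1': [0.25, 0.5],
    'batch_size': [10, 20],
    'nb_epochs': [2, 3],
}

def decode_choices(best, choices):
    """Map each hp.choice index in `best` back to the option it selects."""
    return {label: choices[label][index] for label, index in best.items()}

# Made-up fmin result, for illustration only (not from a real run).
best_example = {'units1': 1, 'dropout1': 0, 'batch_size': 1, 'nb_epochs': 0}

print(decode_choices(best_example, choices))
# {'units1': 64, 'dropout1': 0.25, 'batch_size': 20, 'nb_epochs': 2}
```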

【Comments】:

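Thanks for your answer. Please see my other question below.

Glad it helped! When you have a new question, please ask it as a new question. Don't post questions as answers.

I have asked my new question (***.com/questions/56855499/…), please take a look, thanks.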
