Hyperparameter optimization using skopt on Neural Networks with sklearn/Keras

Posted: 2020-09-03 05:31:47

Question:

I am having a problem with the skopt library. I am trying to optimize the size of a neural network, i.e. the number of neurons and the number of layers, but I get the opposite of the expected results. The optimizer dislikes these parameters and produces strange results, such as an NN with 4 layers of 1 neuron each and an RMSE of 8*1e-4. I know that the best RMSE, obtained with a Gaussian process and a Pearson coefficient of 0.999, is 4*1e-4, but I cannot reproduce that result. Any ideas?

import numpy as np
from skopt import gbrt_minimize
from skopt.space import Real, Integer
from skopt.utils import use_named_args
from skopt.callbacks import DeltaXStopper
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import mean_squared_error

#--------- Hyperparameters -------------------
space = [
         Real(1e-6, 1e-1,    name='alpha'),
         Real(1e-6, 1e-2,    name='learning_rate_init', prior='log-uniform'),
         Real(0.9, 0.9999,   name='beta_1'),
         Real(0.95, 1,       name='beta_2'),
         Integer(1, 4,       name='n_hidden_layer'),
         Integer(1, 700,     name='n_neurons_per_layer')  ]
#------------------------------------------------------
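
A quick way to sanity-check the space is to sample it directly; a minimal sketch (rvs() is the standard sampling method on skopt Dimension objects):

# draw a few samples per dimension to verify the bounds and the log-uniform prior
for dim in space:
    print(dim.name, dim.rvs(n_samples=3, random_state=0))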

#---------Initialization --------------------

alpha_0 = 0.0001
learning_rate_init_0 = 0.001
beta_1_0 = 0.9001
beta_2_0 = 0.999
n_hidden_layer_0 = 3
n_neurons_per_layer_0 = 400

default_parameters = [ alpha_0,
                      learning_rate_init_0,
                      beta_1_0,
                      beta_2_0,
                      n_hidden_layer_0,
                      n_neurons_per_layer_0,  ]
#------------------------------------------------------

#---------MLPRegressor model --------------------

EPOCHS = 400

mlp = MLPRegressor(hidden_layer_sizes=(400,400,400),
                      activation='relu',
                      solver='adam',
                      alpha=alpha_0, #L2 penalty (regularization term) parameter
                      batch_size=16,
                      learning_rate='invscaling', #Only used when solver='sgd'.
                      learning_rate_init=learning_rate_init_0,
                      power_t=0.5, #Only used when solver='sgd'.
                      max_iter=EPOCHS,
                      shuffle=True,
                      random_state=42,
                      tol=1e-12,
                      verbose=0,
                      warm_start=False,
                      momentum=0.9,    #only used when sgd
                      nesterovs_momentum=True, #only used when sgd
                      early_stopping=False,    #if True, automatically sets aside 10% of training data as validation (10% is too much here)
                      validation_fraction=0.1,
                      beta_1=beta_1_0,
                      beta_2=beta_2_0,
                      epsilon=1e-08,
                      n_iter_no_change=20,
                      max_fun=15000   ) #Only used when solver='lbfgs'.
#------------------------------------------------------

@use_named_args(space)
def objective(**params):
    n_neurons=params['n_neurons_per_layer']
    n_layers=params['n_hidden_layer']

    # create the hidden layers as a tuple with length n_layers and n_neurons per layer
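    # e.g. n_neurons=50 and n_layers=3 yield hidden_layer_sizes=(50, 50, 50)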
    params['hidden_layer_sizes']=(n_neurons,)*n_layers

    # the parameters are deleted to avoid an error from the MLPRegressor
    params.pop('n_neurons_per_layer')
    params.pop('n_hidden_layer')

    mlp.set_params(**params)


    y_pred = cross_val_predict(mlp,
                             x_main,
                             y_main,
                             cv=5,       # 5-fold cross-validation
                             n_jobs=-1)

    # y_main_o and correlation are helpers defined elsewhere in the asker's code
    rmse = np.sqrt(mean_squared_error(y_main_o, y_pred))
    pearson = float(correlation.result(y_pred, y_main_o))
    obj_pear = 1.0 - pearson
    print(f"Cross-validated Pearson's: {pearson}")
    print(f"Cross-validated score (RMSE): {rmse}")
    print('-'*40)

    return rmse + obj_pear   # objective: RMSE plus (1 - Pearson)
#------------------------------------------------------

n_calls = 110

#verbose = VerboseCallback(n_total=n_calls)
delta = DeltaXStopper(1e-6)

mlp_gbrt = gbrt_minimize(objective, 
                       space,
                       n_calls=n_calls,
                       n_random_starts=100,
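                       # with n_calls=110, 100 random starts leave only 10 model-guided evaluations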
                       acq_func='EI',
                       acq_optimizer='auto',
                       x0=default_parameters,
                       y0=None,
                       random_state=42,
                       verbose=1,
                       callback=[delta],
                       n_points=10000,
                       xi=1e-1,
                       kappa=1.96,
                       n_jobs=-1,
                       model_queue_size=None)
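
Once gbrt_minimize returns, the best point found can be read off the result object; a minimal sketch using skopt's standard OptimizeResult fields (the convergence plot is optional):

# res.x is the best parameter vector (same order as `space`), res.fun the best objective value
best = dict(zip([dim.name for dim in space], mlp_gbrt.x))
print("Best parameters:", best)
print("Best objective value:", mlp_gbrt.fun)

# optional: plot how the best objective evolved over the calls
from skopt.plots import plot_convergence
plot_convergence(mlp_gbrt)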

Comments:

Can you share your optimization results?

Answer 1:

Your objective function needs to return -RMSE.
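
As a minimal sketch, this suggestion amounts to changing the final line of the objective above (note that gbrt_minimize minimizes the returned value, so negating the RMSE reverses the optimization direction):

    # answerer's suggestion: return the negated RMSE instead of rmse + obj_pear
    return -rmse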

