XGBoost early stopping cv versus GridSearchCV

Posted 2023-03-12

【Posted】: 2017-09-18 10:23:45

【Question】:

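I am trying XGBoost on a regression problem. During hyperparameter tuning, XGBoost's early-stopping cv never stops for my code/data, no matter what num_boost_round is set to. It also produces a worse RMSE score than GridSearchCV. What am I doing wrong here? And if I am not doing anything wrong, what advantage does early-stopping cv have over GridSearchCV?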

GridSearchCV：

import math
import xgboost as xgb
from sklearn.metrics import make_scorer, mean_squared_error
from sklearn.model_selection import GridSearchCV

def RMSE(y_true, y_pred):
    rmse = math.sqrt(mean_squared_error(y_true, y_pred))
    print('RMSE: %2.3f' % rmse)
    return rmse
# greater_is_better=False negates the score, so the reported means are negative
scorer = make_scorer(RMSE, greater_is_better=False)

cv_params = {'max_depth': [2,8], 'min_child_weight': [1,5]}
ind_params = {'learning_rate': 0.01, 'n_estimators': 1000,
              'seed':0, 'subsample': 0.8, 'colsample_bytree': 0.8,
              'reg_alpha':0, 'reg_lambda':1} #regularization => L1 : alpha, L2 : lambda
optimized_GBM = GridSearchCV(xgb.XGBRegressor(**ind_params),
                             cv_params,
                             scoring = scorer,
                             cv = 5, verbose=1,
                             n_jobs = 1)
optimized_GBM.fit(train_X, train_Y)
# grid_scores_ exists only in scikit-learn < 0.20; newer versions expose cv_results_
optimized_GBM.grid_scores_

Output:

[mean: -62.42736, std: 5.18004, params: {'max_depth': 2, 'min_child_weight': 1},
 mean: -62.42736, std: 5.18004, params: {'max_depth': 2, 'min_child_weight': 5},
 mean: -57.11358, std: 3.62918, params: {'max_depth': 8, 'min_child_weight': 1},
 mean: -57.12148, std: 3.64145, params: {'max_depth': 8, 'min_child_weight': 5}]

XGBoost CV:

our_params = {'eta': 0.01, 'max_depth':8, 'min_child_weight':1,
              'seed':0, 'subsample': 0.8, 'colsample_bytree': 0.8,
              'objective': 'reg:linear', 'booster':'gblinear',
              'eval_metric':'rmse',
              'silent':False}
num_rounds=1000

# train_mat is not defined in the question; presumably an xgb.DMatrix built from train_X and train_Y
cv_xgb = xgb.cv(params = our_params,
                dtrain = train_mat,
                num_boost_round = num_rounds,
                nfold = 5,
                metrics = ['rmse'], # Make sure you enter metrics inside a list or you may encounter issues!
                early_stopping_rounds = 100, # Look for early stopping that minimizes error
                verbose_eval = True)

print(cv_xgb.shape)
print(cv_xgb.tail(5))

Output:

(1000, 4)
     test-rmse-mean  test-rmse-std  train-rmse-mean  train-rmse-std
995       89.937926       0.263546        89.932823        0.062540
996       89.937773       0.263537        89.932671        0.062537
997       89.937622       0.263526        89.932517        0.062535
998       89.937470       0.263516        89.932364        0.062532
999       89.937317       0.263510        89.932210        0.062525

【Answer 1】:

I had the same problem: XGBoost ignored num_boost_round (when early stopping was specified) and kept on fitting. I would bet this is a bug.

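As for the advantage of early stopping over GridSearchCV:

The advantage is that you do not have to try a series of values for num_boost_round; training stops automatically near the best one.

Early stopping is designed to find the optimal number of boosting iterations. If you specify a very large number for num_boost_round (e.g. 10000) and the best number of trees turns out to be 5261, it will stop at 5261+early_stopping_rounds, giving you a model that is pretty close to the optimum.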
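To make the stopping rule concrete, here is a minimal pure-Python sketch of a patience-based rule of the kind described above. It is not xgboost's actual implementation; the function name `early_stop_round` and the two synthetic error curves are made up for illustration. The second curve mimics the output in the question, where the test RMSE is still falling at round 999, which may be why no early stop was triggered there.

```python
def early_stop_round(val_errors, early_stopping_rounds):
    """Return (stop_index, best_index) for a patience-based stopping rule.

    val_errors[i] is the validation error after boosting round i.
    Training stops once the error has not improved for
    `early_stopping_rounds` rounds, or when the rounds run out.
    """
    best_index = 0
    best_error = float("inf")
    for i, err in enumerate(val_errors):
        if err < best_error:
            best_error, best_index = err, i
        elif i - best_index >= early_stopping_rounds:
            return i, best_index
    return len(val_errors) - 1, best_index


# Hypothetical U-shaped validation curve with its minimum at round 50.
u_shaped = [abs(i - 50) + 10 for i in range(200)]
print(early_stop_round(u_shaped, early_stopping_rounds=20))

# Hypothetical curve that keeps improving very slowly: the rule never fires,
# so all 1000 rounds are used.
still_falling = [100 - 0.001 * i for i in range(1000)]
print(early_stop_round(still_falling, early_stopping_rounds=100))
```

On the U-shaped curve the rule stops at the best round plus the patience, which is the "5261+early_stopping_rounds" behaviour described above, just with smaller numbers.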

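If you wanted to find the same optimum with GridSearchCV and without early stopping, you would have to try many different values for the number of boosting rounds (e.g. 100, 200, 300, ..., 5000, 5100, 5200, 5300, ... etc.). This would take much longer.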
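A rough back-of-the-envelope count shows the size of the gap. The sketch below assumes that every grid candidate is trained from scratch, that one tree is built per boosting round, and that the grid runs from 100 to 10000 in steps of 100; the 5261 and 10000 come from the example above, while 5 folds and early_stopping_rounds = 100 are taken from the question. The helper names are made up for illustration.

```python
def trees_fit_by_grid(candidates, nfold):
    # Each candidate value is trained from scratch once per fold.
    return nfold * sum(candidates)


def trees_fit_with_early_stopping(best_round, early_stopping_rounds, nfold):
    # One run per fold, stopping early_stopping_rounds after the best round.
    return nfold * (best_round + early_stopping_rounds)


candidates = range(100, 10001, 100)  # 100, 200, ..., 10000
grid_cost = trees_fit_by_grid(candidates, nfold=5)
es_cost = trees_fit_with_early_stopping(5261, 100, nfold=5)

print(grid_cost, es_cost, round(grid_cost / es_cost, 1))
```

Under these assumptions the grid fits about 2.5 million trees against roughly 27 thousand with early stopping, and the grid still only locates the optimum to within its step size of 100.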

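The property that early stopping exploits is that there is an optimal number of boosting steps, after which the validation error starts to increase. So...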

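Why doesn't it work in your case?

It is impossible to say precisely without the data, but it is probably a combination of the following:

- num_boost_round is too small (you run into the bug where xgboost resets and starts over, creating a never-ending loop)
- early_stopping_rounds is too large (maybe your data has a strongly oscillating convergence behaviour; try a smaller value and see whether the CV error is good enough)
- there may be something odd about your validation data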

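Why do you see different results between GridSearchCV and xgboost.cv?

It is hard to tell without a complete example, but have you checked all the defaults of the variables that you specify in only one of the two interfaces (e.g. 'reg_alpha':0, 'reg_lambda':1, 'objective': 'reg:linear', 'booster':'gblinear'), and whether your definition of RMSE exactly matches xgboost's?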
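One way to run that check is to put the two parameter sets side by side. The sketch below is only a plain-dictionary comparison, not an xgboost API: `compare_params` and the name mapping are written for this example, and the mapping covers only the aliases relevant here (learning_rate/eta, reg_alpha/alpha, reg_lambda/lambda). The scikit-learn side combines ind_params with the best grid point from the question.

```python
# Names used by the scikit-learn wrapper -> names used by the native interface.
SKLEARN_TO_NATIVE = {
    "learning_rate": "eta",
    "reg_alpha": "alpha",
    "reg_lambda": "lambda",
}


def compare_params(sklearn_params, native_params, ignore=("n_estimators",)):
    """Return (only_sklearn, only_native, differing) lists of parameter names."""
    a = {SKLEARN_TO_NATIVE.get(k, k): v
         for k, v in sklearn_params.items() if k not in ignore}
    b = dict(native_params)
    only_sklearn = sorted(set(a) - set(b))
    only_native = sorted(set(b) - set(a))
    differing = sorted(k for k in set(a) & set(b) if a[k] != b[k])
    return only_sklearn, only_native, differing


sklearn_side = {"learning_rate": 0.01, "n_estimators": 1000, "seed": 0,
                "subsample": 0.8, "colsample_bytree": 0.8,
                "reg_alpha": 0, "reg_lambda": 1,
                "max_depth": 8, "min_child_weight": 1}
native_side = {"eta": 0.01, "max_depth": 8, "min_child_weight": 1, "seed": 0,
               "subsample": 0.8, "colsample_bytree": 0.8,
               "objective": "reg:linear", "booster": "gblinear",
               "eval_metric": "rmse", "silent": False}

only_sklearn, only_native, differing = compare_params(sklearn_side, native_side)
print("only in XGBRegressor call:", only_sklearn)
print("only in xgb.cv call:", only_native)
print("different values:", differing)
```

The parameter that stands out is 'booster': it is set to 'gblinear' only in the xgb.cv call, whereas XGBRegressor uses the tree booster ('gbtree') unless told otherwise, so the two runs are most likely not fitting the same kind of model.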
