XGBoost Best Iteration

Posted 2023-03-12

【Title】: XGBoost Best Iteration 【Posted】: 2019-01-28 00:43:33 【Question】:

I am using the XGBoost algorithm for regression:

clf = XGBRegressor(eval_set = [(X_train, y_train), (X_val, y_val)],
                       early_stopping_rounds = 10, 
                       n_estimators = 10,                    
                       verbose = 50)

clf.fit(X_train, y_train, verbose=False)
print("Best Iteration: ".format(clf.booster().best_iteration))

The model trains correctly, but the print call raises the following error:

TypeError: 'str' object is not callable

How do I get the model's best iteration?

Also, how do I print the training error of each round?

【Answer 1】:

For your TypeError: use get_booster() instead of booster()

print("Best Iteration: {}".format(clf.get_booster().best_iteration))
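
One more detail about this print call: str.format only substitutes into {} placeholders, so a format string without one silently drops its argument and prints just the label. A minimal sketch of the difference, where the value 9 is only a stand-in for a real best_iteration:

```python
# Stand-in value; in practice this would come from clf.get_booster().best_iteration
best_iteration = 9

# No {} placeholder: the argument is accepted but never inserted
without_placeholder = "Best Iteration: ".format(best_iteration)

# With a {} placeholder the value appears in the output
with_placeholder = "Best Iteration: {}".format(best_iteration)

print(repr(without_placeholder))
print(repr(with_placeholder))
```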

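To use the best iteration at prediction time, there is a parameter called ntree_limit, which specifies how many boosters to use. The value produced during training is best_ntree_limit, which you can read after training your model: clf.get_booster().best_ntree_limit. More specifically, when you predict, use:

best_iteration = clf.get_booster().best_ntree_limit
clf.predict(data, ntree_limit=best_iteration)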

If you pass these parameters to .fit(), the training and evaluation progress is printed:

clf.fit(X_train, y_train,
        eval_set = [(X_train, y_train), (X_val, y_val)],
        eval_metric = 'rmse',
        early_stopping_rounds = 10, verbose=True)
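
Beyond the console output, the scikit-learn wrapper also keeps the per-round metrics: after fitting with eval_set, clf.evals_result() returns a nested dict keyed by evaluation set ('validation_0', 'validation_1', ...) and then by metric name. The sketch below formats such a dict one round per line. The helper format_round_errors is hypothetical (not part of XGBoost), and the numbers in sample are made up purely to show the shape:

```python
def format_round_errors(evals_result, metric="rmse"):
    """Return one text line per boosting round with the metric of every eval set."""
    names = list(evals_result)
    n_rounds = len(evals_result[names[0]][metric])
    lines = []
    for i in range(n_rounds):
        parts = [
            f"{name}-{metric}: {evals_result[name][metric][i]:.4f}"
            for name in names
        ]
        lines.append(f"[{i}] " + "  ".join(parts))
    return lines


# Made-up values, shaped like the dict returned by clf.evals_result()
sample = {
    "validation_0": {"rmse": [0.50, 0.40, 0.35]},  # first eval_set entry (training data)
    "validation_1": {"rmse": [0.55, 0.47, 0.45]},  # second eval_set entry (validation data)
}

for line in format_round_errors(sample):
    print(line)
```

With a fitted model, the call would be format_round_errors(clf.evals_result()), assuming eval_set was supplied and 'rmse' was the metric.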

Note: the early_stopping_rounds parameter should be passed to .fit(), not to the XGBRegressor() constructor.
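
To make the link between early_stopping_rounds and the best iteration concrete, here is a simplified pure-Python model of patience-based early stopping. It is an illustration under assumptions, not XGBoost's actual implementation: the function name best_iteration_with_patience is hypothetical, the error values are made up, and lower error is assumed to be better.

```python
def best_iteration_with_patience(val_errors, patience):
    """Return the 0-based index of the best round seen before stopping.

    Training stops once `patience` consecutive rounds pass without
    improving on the best validation error seen so far.
    """
    best_idx = 0
    best_err = float("inf")
    for i, err in enumerate(val_errors):
        if err < best_err:
            best_err = err
            best_idx = i
        elif i - best_idx >= patience:
            break  # no improvement for `patience` rounds: stop early
    return best_idx


# Made-up validation errors per round
errors = [5.0, 4.0, 3.5, 3.6, 3.7, 3.4]

print(best_iteration_with_patience(errors, patience=2))
print(best_iteration_with_patience(errors, patience=10))
```

With a small patience the loop stops before reaching the last round, so a later, better round is never seen; a larger patience lets training continue long enough to find it.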

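Another note: verbose = 50 in XGBRegressor() is redundant. The verbose argument belongs in your .fit() call and is either True or False. For what verbose=True does, see the description of verbose in the .fit() documentation. It bears directly on your last question (printing the error of each round).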

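【Answer 2】:

Your error is that the booster attribute of XGBRegressor is a string specifying the type of booster to use, not the actual booster instance. From the documentation:

booster: string. Specify which booster to use: gbtree, gblinear or dart.

To get the actual booster, you can call get_booster():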

>>> clf.booster
'gbtree'
>>> clf.get_booster()
<xgboost.core.Booster object at 0x118c40cf8>
>>> clf.get_booster().best_iteration
9
>>> print("Best Iteration: {}".format(clf.get_booster().best_iteration))
Best Iteration: 9
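
The TypeError itself can be reproduced without XGBoost. The sketch below uses a hypothetical stand-in class, FakeRegressor, which is not part of any library and only mimics the one relevant detail: booster is a plain string attribute, while get_booster() is a method.

```python
class FakeRegressor:
    """Hypothetical stand-in that mimics only the booster/get_booster() split."""

    def __init__(self):
        self.booster = "gbtree"  # a string naming the booster type

    def get_booster(self):
        # The real method returns a Booster object; a placeholder is enough here
        return object()


clf = FakeRegressor()

try:
    clf.booster()  # attempts to call the string 'gbtree'
except TypeError as exc:
    message = str(exc)

print(message)
print(callable(clf.booster), callable(clf.get_booster))
```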

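I'm not sure about the second half of your question, namely:

Also, how do I print the training error of **each round**?

but hopefully you're unblocked!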
