Python sklearn NotFittedError after XGBoost .fit has been called
Asked: 2020-10-19 00:02:45
Question:
I am trying to use the sklearn plot_partial_dependence function on a fitted XGBoost model, i.e. after .fit has been called. But I keep getting this error:
NotFittedError: This XGBRegressor instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.
Here are the steps I took, using a dummy dataset.
Full example with dummy data:
import numpy as np
# dummy dataset
from sklearn.datasets import make_regression
X_train, y_train = make_regression(n_samples = 1000, n_features = 10)
# Import xgboost
import xgboost as xgb
# Initialize the model
model_xgb_1 = xgb.XGBRegressor(max_depth = 5,
                               learning_rate = 0.01,
                               n_estimators = 100,
                               objective = 'reg:squarederror',
                               booster = 'gbtree')
# Fit the model
# Not assigning to a new variable
model_xgb_1.fit(X_train, y_train)
# Just to check that .predict can be called and works
# without error
print(np.sum(model_xgb_1.predict(X_train)))
# the above works ok and prints the output
#This next step throws an error:
from sklearn.inspection import plot_partial_dependence
plot_partial_dependence(model_xgb_1, X_train, [0])
Output:
662.3468
NotFittedError: This XGBRegressor instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.
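A quick way to investigate an error like this is to look at what scikit-learn's default fitted check can actually see. The sketch below assumes that, when an estimator defines no custom hook, the check looks for instance attributes whose names end in an underscore. `trailing_underscore_attrs` is a hypothetical helper written for this illustration, and `LinearRegression` is only a stand-in so the snippet runs without xgboost.

```python
from sklearn.linear_model import LinearRegression


def trailing_underscore_attrs(estimator):
    """List instance attributes that look like fitted state to scikit-learn.

    Hypothetical helper: it mirrors the default heuristic only. It ignores
    class-level properties and any custom __sklearn_is_fitted__ hook.
    """
    return sorted(
        name
        for name in vars(estimator)
        if name.endswith("_") and not name.startswith("__")
    )


# Before fitting, only constructor parameters live on the instance.
unfitted = LinearRegression()
names_before = trailing_underscore_attrs(unfitted)

# After fitting, learned attributes such as coef_ appear on the instance.
fitted = LinearRegression().fit([[0.0], [1.0], [2.0]], [0.0, 1.0, 2.0])
names_after = trailing_underscore_attrs(fitted)

print(names_before)
print(names_after)
```

Calling the same helper on the fitted `model_xgb_1` would show whether the installed xgboost version exposes any such attribute. An empty list there would be consistent with the NotFittedError, even though `.predict` works.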
Update
Workaround when booster = 'gblinear':
# CHANGE 1/2: Use booster = 'gblinear'
# as no coef are returned for the case of 'gbtree'
model_xgb_1 = xgb.XGBRegressor(max_depth = 5,
                               learning_rate = 0.01,
                               n_estimators = 100,
                               objective = 'reg:squarederror',
                               booster = 'gblinear')
# Fit the model
# Not assigning to a new variable
model_xgb_1.fit(X_train, y_train)
# Just to check that .predict can be called and works
# without error
print(np.sum(model_xgb_1.predict(X_train)))
# the above works ok and prints the output
# Calling plot_partial_dependence at this point would still raise
# NotFittedError, so the call is commented out here:
from sklearn.inspection import plot_partial_dependence
# plot_partial_dependence(model_xgb_1, X_train, [0])
# CHANGE 2/2
# Add the following:
model_xgb_1.coef__ = model_xgb_1.coef_
model_xgb_1.intercept__ = model_xgb_1.intercept_
# Now call plot_partial_dependence --- It works ok
from sklearn.inspection import plot_partial_dependence
plot_partial_dependence(model_xgb_1, X_train, [0])
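One possible reading of why this workaround helps: assigning `coef__` and `intercept__` creates instance attributes whose names end in an underscore, which is what scikit-learn's default fitted check looks for. If that reading is right, any such attribute would do, and the booster type would not matter. The sketch below illustrates the idea with `PrivateStateRegressor`, a hypothetical stand-in that (like the behaviour assumed for the affected XGBRegressor) keeps its fitted state in a private attribute. It uses `partial_dependence` instead of the plotting function so that no matplotlib is needed. The attribute name `fitted_marker_` is made up for this example.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.exceptions import NotFittedError
from sklearn.inspection import partial_dependence


class PrivateStateRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical stand-in: fitted state lives in a private attribute only."""

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        design = np.column_stack([X, np.ones(len(X))])
        # Least-squares weights, stored WITHOUT a trailing underscore.
        self._weights = np.linalg.lstsq(
            design, np.asarray(y, dtype=float), rcond=None
        )[0]
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        return np.column_stack([X, np.ones(len(X))]) @ self._weights


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 3.0

model = PrivateStateRegressor().fit(X, y)

# predict works, but the fitted check does not see any fitted attribute.
try:
    partial_dependence(model, X, features=[0], kind="average")
    raised_before = False
except NotFittedError:
    raised_before = True

# Add an instance attribute with a trailing underscore (hypothetical name).
model.fitted_marker_ = True
result = partial_dependence(model, X, features=[0], kind="average")
pd_curve = result["average"][0]
print(raised_before, pd_curve.shape)
```

With a real fitted XGBRegressor the equivalent step would be a single assignment such as `model_xgb_1.fitted_marker_ = True` before calling the partial dependence function. Treat this as a stopgap: upgrading xgboost, as suggested in the discussion, is the cleaner route if it resolves the check.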
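Comments:
It may be that xgboost does not properly account for the way sklearn checks whether a model is fitted. If so, this might be resolved by using a newer version of xgb.
Answer 1:
To avoid this error, do not assign the fitted model to a variable.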
# Import xgboost
import xgboost as xgb
# Initialize the model
model_xgb_1 = xgb.XGBRegressor(max_depth = max_depth,
                               learning_rate = shrinkage,
                               n_estimators = nTrees,
                               objective = 'reg:squarederror',
                               booster = 'gbtree')
# Fit the model
model_xgb_1.fit(X_train, y_train)
# Just to check that .predict can be called and works
# without error
model_xgb_1.predict(X_train)
# the above works ok and prints the output
#This next step throws an error:
from sklearn.inspection import plot_partial_dependence
plot_partial_dependence(model_xgb_1, X_train, [0])
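Comments:
I made the change, but it does not resolve the error. I will also edit the question to reflect it.
Answer 2:
from sklearn.ensemble import VotingRegressor
from sklearn.inspection import PartialDependenceDisplay
# XGB is the XGBRegressor; x_train and features come from your own code
XGB_v = VotingRegressor([("reg", XGB)])
# The wrapper itself must be fitted before plotting;
# y_train is assumed to be the target that matches x_train
XGB_v.fit(x_train, y_train)
XGB_RMR = PartialDependenceDisplay.from_estimator(
    XGB_v, x_train, features,
    feature_names=["a"], line_kw={"color": "blue"}
)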
This should help you solve the problem.
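The wrapper approach in this answer plausibly works because `VotingRegressor` sets its own fitted attributes (such as `estimators_`) when it is fitted, so the fitted check passes on the wrapper regardless of how the inner model stores its state. A minimal sketch under that assumption follows. `PrivateStateRegressor` is a hypothetical stand-in for the XGBRegressor, and `partial_dependence` replaces the plotting call so the snippet runs without matplotlib.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.ensemble import VotingRegressor
from sklearn.inspection import partial_dependence


class PrivateStateRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical stand-in: fitted state lives in a private attribute only."""

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        design = np.column_stack([X, np.ones(len(X))])
        self._weights = np.linalg.lstsq(
            design, np.asarray(y, dtype=float), rcond=None
        )[0]
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        return np.column_stack([X, np.ones(len(X))]) @ self._weights


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 3.0

# Wrap the single model; fitting the wrapper fits a clone of the inner model.
wrapped = VotingRegressor([("reg", PrivateStateRegressor())])
wrapped.fit(X, y)

# The fitted check now runs against the wrapper, which has estimators_.
result = partial_dependence(wrapped, X, features=[0], kind="average")
print(result["average"].shape)
```

Note that the wrapper trains a fresh clone rather than reusing an already fitted model, so the training cost is paid again. With a single wrapped estimator the averaged prediction is just that estimator's prediction.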