GridSearchCV 没有属性 best_estimator_

Posted

技术标签:

【中文标题】GridSearchCV 没有属性 best_estimator_【英文标题】:GridSearchCV has no attribute best_estimator_ 【发布时间】:2020-09-18 19:40:27 【问题描述】:

我在 20 分钟前工作时不断收到有关该属性的错误。我不确定会出现什么问题,当我在单独的笔记本上设置代码时,它运行并且 GridSearchCV 顺利移动。我需要更新 Scikit-Learn 吗?我发布了整个代码,因为我相信它是必不可少的,以防缺少一些细节。任何帮助表示赞赏。

import pandas as pd
train_data = pd.read_csv("~/Desktop/Personal/Data/train.csv")
test_features = pd.read_csv("~/Desktop/Personal/Data/test.csv")
test_survived = pd.read_csv("~/Desktop/Personal/Data/gender_submission.csv")

from sklearn.preprocessing import LabelEncoder
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV

def data_process(data):
    data = data.drop("Cabin", 1)
    data = data.drop("Embarked", 1)
    data = data.drop("Ticket",1)
    data = data.drop("Name", 1)
    data = data.drop("PassengerId", 1)
    data["Sex"] = LabelEncoder().fit_transform(data["Sex"])

    numerical_attr = ["Age", "Pclass", "SibSp", "Parch", "Fare"]

    for attr in numerical_attr:
        data[attr].fillna(round(data[attr].mean(), 0), inplace=True)
    return data

train_data = data_process(train_data)

test_features = data_process(test_features).to_numpy()

test_survived = test_survived.drop("PassengerId", 1).to_numpy()

full_train_features = train_data.drop("Survived", 1).to_numpy()

full_train_survived = train_data.drop(["Age", "Pclass", "SibSp", "Parch", "Fare", "Sex"], 1).to_numpy().ravel()

train_set,test_set = train_test_split(train_data, test_size = 0.3, random_state = 1)

part_train_set_features = train_set.drop("Survived", 1).to_numpy()
part_train_set_survived = train_set.drop(["Age", "Pclass", "SibSp", "Parch", "Fare", "Sex"], 1).to_numpy().ravel()

val_set_features = test_set.drop("Survived", 1).to_numpy()
val_set_survived = test_set.drop(["Age", "Pclass", "SibSp", "Parch", "Fare", "Sex"], 1).to_numpy().ravel()

log_reg = LogisticRegression(solver = 'liblinear')

log_reg.fit(part_train_set_features, part_train_set_survived)

predict_log_reg_base = log_reg.predict(val_set_features)
accuracy_log_reg_base = accuracy_score(predict_log_reg_base, val_set_survived)
print(accuracy_log_reg_base)


fixed_range1 = range(1,21)
c_values = [i/10 for i in fixed_range1]

fixed_range2 = range(10,21)
max_iter_values = [i*10 for i in fixed_range2]

parameters_log_reg = 'C' : c_values, 'penalty' : ['l1', 'l2'], 'max_iter' : max_iter_values

log_reg_best = GridSearchCV(LogisticRegression(solver = 'liblinear'), parameters_log_reg, return_train_score = True)

final_log_reg = log_reg_best.best_estimator_

【问题讨论】:

这能回答你的问题吗? How to get Best Estimator on GridSearchCV (Random Forest Classifier Scikit) 【参考方案1】:

你需要先适应它:

# define
log_reg_best = GridSearchCV(LogisticRegression(solver = 'liblinear'), parameters_log_reg, return_train_score = True)
# fit
log_reg_best.fit(part_train_set_features, part_train_set_survived)
# get best estimator
final_log_reg = log_reg_best.best_estimator_

【讨论】:

以上是关于GridSearchCV 没有属性 best_estimator_的主要内容,如果未能解决你的问题,请参考以下文章

Python:Ridge 回归 - ''Ridge' 对象在使用 GridSearchCV 后没有属性 'coef_'

AttributeError:“GridSearchCV”对象没有属性“best_params_”

如何从 gridsearchcv 绘制决策树?

使用 GridSearchCV 制作自定义记分器

使用 GridsearchCV 为管道中的最佳模型提取 MLPRegressor 属性 (n_iter_)?

为啥 GridSearchCV 没有给出最好的分数? - Scikit 学习