Grid search: automatic hyperparameter tuning during modeling

Posted 2021-02-07 chengziaichiyu


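1. Gradient descent (SGDClassifier)

First, store the parameters to be tuned as dictionaries in the param_grid list. For the parameters of the gradient descent classifier that can be tuned, see the following link: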

https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDClassifier.html#sklearn.linear_model.SGDClassifier

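# Import GridSearchCV and the estimator
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV

sgd = SGDClassifier(max_iter=1000)

# Parameters to tune. Recent scikit-learn releases spell these values
# 'log_loss' and None; older releases used 'log' and 'none'.
param_grid = [
    {'loss': ['hinge', 'log_loss', 'modified_huber', 'squared_hinge', 'perceptron'],
     'penalty': [None, 'l2', 'l1', 'elasticnet']},
]

# Use 10-fold cross-validation and choose the scoring metric
sgd_grid_search = GridSearchCV(sgd, param_grid, cv=10,
                               scoring='precision_macro')

# Fit the search on the training data
sgd_grid_search.fit(tfidf_train_features, train_label)

# Show the tuning results; each score is the mean over the 10 folds
cvres = sgd_grid_search.cv_results_
for mean_score, params in zip(cvres["mean_test_score"], cvres["params"]):
    print(mean_score, params)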
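To get a feel for how much work such a grid implies before running it, the candidates can be enumerated with ParameterGrid. This is only a sketch: it reuses the grid shape from the text (with the value spellings of recent scikit-learn releases) and fits no model, so the values act as plain labels here.

```python
from sklearn.model_selection import ParameterGrid

# Same grid shape as in the text: 5 loss values x 4 penalty values.
param_grid = [
    {
        "loss": ["hinge", "log_loss", "modified_huber", "squared_hinge", "perceptron"],
        "penalty": [None, "l2", "l1", "elasticnet"],
    },
]

candidates = list(ParameterGrid(param_grid))
n_folds = 10  # cv=10 in the text

print(len(candidates))            # 20 parameter combinations
print(len(candidates) * n_folds)  # 200 model fits during cross-validation
print(candidates[0])              # one candidate, as a plain dict
```

With the default refit=True, GridSearchCV performs one additional fit of the best candidate on the full training set after the search.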

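2. LogisticRegression

For the logistic regression parameters, see:

https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html

When choosing parameters, note that the solver restricts the penalty:

'newton-cg', 'lbfgs' and 'sag' only support the L2 penalty, while 'liblinear' and 'saga' also support L1.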

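from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# penalty='l1' is fixed here, so only L1-capable solvers go into the grid
LR = LogisticRegression(max_iter=100, random_state=0, penalty='l1')

param_grid = [
    {'C': [1, 2, 3, 4, 5],
     'solver': ['liblinear', 'saga']},
]

# 10-fold cross-validation, scored by macro-averaged precision
LR_grid_search = GridSearchCV(LR, param_grid, cv=10,
                              scoring='precision_macro')

LR_grid_search.fit(tfidf_train_features, train_label)

# Mean cross-validated score for each parameter combination
cvres = LR_grid_search.cv_results_
for mean_score, params in zip(cvres["mean_test_score"], cvres["params"]):
    print(mean_score, params)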
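The solver/penalty restriction noted above can also be encoded in param_grid itself. Because param_grid is a list of dictionaries and each dictionary is expanded separately, incompatible pairs are never tried. The sketch below is an illustration under assumptions: the data comes from make_classification as a stand-in for tfidf_train_features and train_label, and the C values, cv=5 and max_iter=5000 are arbitrary choices, not values from the original post.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for tfidf_train_features / train_label (assumption).
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

# One dict per penalty, each listing only the solvers that support it.
param_grid = [
    {"penalty": ["l1"], "solver": ["liblinear", "saga"], "C": [0.1, 1, 10]},
    {"penalty": ["l2"], "solver": ["lbfgs", "newton-cg", "sag"], "C": [0.1, 1, 10]},
]

search = GridSearchCV(
    LogisticRegression(max_iter=5000, random_state=0),
    param_grid,
    cv=5,
    scoring="precision_macro",
)
search.fit(X, y)

# 2 solvers x 3 C values + 3 solvers x 3 C values = 15 candidates
print(len(search.cv_results_["params"]))
print(search.best_params_)
```

Compared with fixing penalty='l1' on the estimator, this lets a single search compare L1 and L2 regularization.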

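Other attributes

Here grid stands for any fitted GridSearchCV object, such as sgd_grid_search or LR_grid_search above; Test_X and y are the test features and their true labels.

from sklearn import metrics

grid.best_score_   # best mean cross-validated score (f1_score here; in general, whatever was passed as scoring)
grid.best_params_  # parameter combination that achieved the best score
best_model = grid.best_estimator_  # best model, refit on the full training set

predict_y = best_model.predict(Test_X)  # predict on the test set
metrics.f1_score(y, predict_y)  # score the predictions; for multiclass labels also pass average, e.g. average='macro'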
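Put together, these attributes might be used as follows. This is a minimal sketch, not the original experiment: the binary dataset comes from make_classification, and the split, the C grid and scoring="f1" are assumptions chosen so that best_score_ and the final test metric are both F1.

```python
from sklearn import metrics
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic binary data, split into training and held-out test sets (assumption).
X, y = make_classification(n_samples=300, n_features=20, random_state=0)
train_X, test_X, train_y, test_y = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

grid = GridSearchCV(
    LogisticRegression(solver="liblinear"),
    {"C": [0.1, 1, 10]},
    cv=5,
    scoring="f1",
)
grid.fit(train_X, train_y)

best_model = grid.best_estimator_        # refit on all of train_X
predict_y = best_model.predict(test_X)   # predictions on unseen data
test_f1 = metrics.f1_score(test_y, predict_y)

print(grid.best_params_)  # best parameter combination
print(grid.best_score_)   # mean cross-validated F1 of that combination
print(test_f1)            # F1 on the held-out test set
```

With the default refit=True, calling grid.predict(test_X) delegates to best_estimator_, so extracting best_model is optional.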
