Invalid parameter for Pipeline


【Title】: Invalid parameter for Pipeline 【Posted】: 2020-02-22 05:52:33 【Question】:

I am implementing my first pipeline and it returns this error: "Invalid parameter randomforestclassifier for estimator Pipeline"

skf = StratifiedKFold(n_splits=3, shuffle=True, random_state=SEED)

classifier = Pipeline([
    ('vectorizer', CountVectorizer(max_features=8000, ngram_range=(1, 5))),
    ('clf', RandomForestClassifier(n_estimators=10, random_state=15, n_jobs=-1))])

min_samples_leaf = [5, 6, 7, 8]
max_features = [0.3, 0.4, 0.5, 0.6, 0.7]
rfc_params = {'randomforestclassifier__min_samples_leaf': min_samples_leaf,
              'randomforestclassifier__max_features': max_features}

class_grid = GridSearchCV(classifier, param_grid = rfc_params, 
                          cv=skf, scoring='roc_auc', n_jobs=-1)
class_grid.fit(X_text, y_text)

【Comments】:

【Answer 1】:

This helped. If anyone knows the reason, or has another solution, please share.

from sklearn.pipeline import make_pipeline

vectorizer = CountVectorizer(max_features=8000, ngram_range=(1, 5))
clf = RandomForestClassifier(n_estimators=10, random_state=15, n_jobs=-1)
classifier = make_pipeline(vectorizer, clf)
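
A likely reason this works: make_pipeline names each step after the lowercased class name of the estimator, so the random forest step ends up being called "randomforestclassifier", which matches the prefix used in rfc_params. The sketch below compares the step names of the two construction styles. It uses default estimator settings for brevity, and the variable names are illustrative rather than taken from the question.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import Pipeline, make_pipeline

# Steps are named automatically from the lowercased class names.
auto_named = make_pipeline(CountVectorizer(), RandomForestClassifier())

# Steps are named explicitly by the caller.
manual = Pipeline([
    ('vectorizer', CountVectorizer()),
    ('clf', RandomForestClassifier()),
])

auto_step_names = [name for name, _ in auto_named.steps]
manual_step_names = [name for name, _ in manual.steps]

print(auto_step_names)    # ['countvectorizer', 'randomforestclassifier']
print(manual_step_names)  # ['vectorizer', 'clf']
```

Grid-search keys take the form "<step name>__<parameter>", so the prefix has to be whichever name the pipeline actually uses.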

【Comments】:

【Answer 2】:

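This error is caused by a mismatch in how you named the classifier. In the pipeline you called it "clf", but in "rfc_params" you refer to it as "randomforestclassifier".

Just replace "randomforestclassifier" with "clf" in the "rfc_params" dictionary, so that each key reads "clf__<parameter>".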

Working example:

from sklearn.model_selection import StratifiedKFold, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer

X_text = ["first text", "text", "text", "last text"]
y_text = [0, 1, 0, 1]

skf = StratifiedKFold(n_splits=2, shuffle=True, random_state=42)

classifier = Pipeline([
    ('vectorizer', CountVectorizer(max_features=8000, ngram_range=(1, 5))),
    ('clf', RandomForestClassifier(n_estimators=10, random_state=15, n_jobs=-1))])

min_samples_leaf = [5, 6, 7, 8]
max_features = [0.3, 0.4, 0.5, 0.6, 0.7]

rfc_params = {
    'clf__min_samples_leaf': min_samples_leaf,
    'clf__max_features': max_features
}

class_grid = GridSearchCV(classifier, param_grid = rfc_params, 
                          cv=skf, scoring='roc_auc', n_jobs=-1)
class_grid.fit(X_text, y_text)
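
One way to catch this kind of mismatch before running a grid search is to compare the grid keys with the names returned by the pipeline's get_params(). This is a minimal sketch: invalid_grid_keys is a hypothetical helper written for illustration, not part of scikit-learn, and the estimators use default settings.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import Pipeline

classifier = Pipeline([
    ('vectorizer', CountVectorizer()),
    ('clf', RandomForestClassifier()),
])


def invalid_grid_keys(estimator, param_grid):
    """Hypothetical helper: return the grid keys the estimator does not expose."""
    valid = estimator.get_params(deep=True)  # includes nested 'step__param' names
    return sorted(key for key in param_grid if key not in valid)


bad_grid = {'randomforestclassifier__min_samples_leaf': [5, 6]}
good_grid = {'clf__min_samples_leaf': [5, 6]}

bad_keys = invalid_grid_keys(classifier, bad_grid)
good_keys = invalid_grid_keys(classifier, good_grid)

print(bad_keys)   # keys that would trigger "Invalid parameter ..."
print(good_keys)  # empty when every key matches a pipeline parameter
```

Printing sorted(classifier.get_params().keys()) directly also lists every name that GridSearchCV will accept for this pipeline.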

【Comments】:
