ValueError: Unknown label type
【Title】: ValueError: Unknown label type
【Posted】: 2018-12-16 07:03:54
【Question】: I am trying to run a grid search over an SVR, following the tutorial "Parameter estimation using grid search with cross-validation", but I get this error:
ValueError: Unknown label type: (array([[0.0970681 ],
[0.04160906],
[0.00209168],
...,
[0.92857565],
[0.64930691],
[0.20325924]]), array([6.38226813, 6.18596882, 6.03850002, ..., 4.68553846, 7.06541915,
7.8636379 ]))
My code is:
from sklearn import preprocessing as pre
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.svm import SVR

# Hyperparameter grid for the RBF-kernel SVR
param = {'kernel': ['rbf'], 'gamma': [1e-2, 1e-3, 1e-4, 1e-5],
         'C': [1, 10, 100, 1000]}
regressor_1 = SVR(C=1)
TS_split = TimeSeriesSplit(n_splits=3)
scores = ['neg_mean_squared_error']
clf = GridSearchCV(regressor_1, param, cv=TS_split, verbose=True)
# Scale features and targets to [0, 1]
X_gridsearch = pre.MinMaxScaler(feature_range=(0, 1)).fit(X_feature)
scaled_X_gridsearch = X_gridsearch.transform(X_feature)
y_gridsearch = pre.MinMaxScaler(feature_range=(0, 1)).fit(y_label)
scaled_y_gridsearch = y_gridsearch.transform(y_label)
for score in scores:
    print("Hyperparameters for %s" % score)
    clf.fit(scaled_X_gridsearch, scaled_y_gridsearch)
    print(scaled_y_gridsearch)
    print(clf.best_params_)
    means = clf.cv_results_['mean_test_score']
    stds = clf.cv_results_['std_test_score']
    for mean, std, params in zip(means, stds, clf.cv_results_['params']):
        print("%0.3f (+/-%0.03f) for %r"
              % (mean, std * 2, params))
    print("Detailed classification report:")
    y_true, y_pred = y_test, clf.predict(X_test)
    # This is the call that raises the ValueError shown in the traceback
    print(classification_report(y_true, y_pred))
My scaled_y_gridsearch data is:
[0.11321139]
[0.07218848]
...
[0.64844211]
[0.4926122 ]
[0.4030334 ]]
My scaled_X_gridsearch data is:
[[0.2681013 ]
[0.03454225]
[0.02062136]
...
[0.92857565]
[0.64930691]
[0.20325924]]
The full traceback is:
50 y_true, y_pred= y_test, clf.predict(X_test)
---> 51 print(classification_report(y_true, y_pred))
52
53
~/anaconda3_501/lib/python3.6/site-packages/sklearn/metrics/classification.py in classification_report(y_true, y_pred, labels, target_names, sample_weight, digits)
1419
1420 if labels is None:
-> 1421 labels = unique_labels(y_true, y_pred)
1422 else:
1423 labels = np.asarray(labels)
~/anaconda3_501/lib/python3.6/site-packages/sklearn/utils/multiclass.py in unique_labels(*ys)
95 _unique_labels = _FN_UNIQUE_LABELS.get(label_type, None)
96 if not _unique_labels:
---> 97 raise ValueError("Unknown label type: %s" % repr(ys))
98
99 ys_labels = set(chain.from_iterable(_unique_labels(y) for y in ys))
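I am not sure why this happens, since I followed the scikit-learn example as closely as I could. Any help would be greatly appreciated.
【Comments】:
Can you show the values you get in y_pred and print the shape of y_pred?
@MohammedKashif Hi, my y_pred is: [6.38226813 6.18596882 6.03850002 ... 4.68553846 7.06541915 7.8636379 ] and my y_shape is (4200)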
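【Answer 1】:
classification_report is meant for classification problems, not regression. Your targets and predictions are continuous values, so unique_labels cannot treat them as class labels and raises the error. Check out this link and look under "Regression metrics", e.g. r2_score, mean_squared_error, mean squared log error, etc.
If this really is a classification problem, change the estimator from SVR to SVC, and it should work.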
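As a rough illustration of the first suggestion, here is a minimal sketch that evaluates the grid-searched SVR with regression metrics instead of classification_report. The synthetic data, the 160/40 split, and the reduced parameter grid are assumptions made for the example, since the asker's X_feature and y_label are not shown. It also uses type_of_target from sklearn.utils.multiclass to show how scikit-learn categorises the continuous targets.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.svm import SVR
from sklearn.utils.multiclass import type_of_target

# Synthetic stand-in data (assumption: the question's real data is not available).
rng = np.random.RandomState(0)
X = rng.rand(200, 1)
y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.randn(200)

# Keep the last 40 samples as a hold-out set (split point is an assumption).
split = 160
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Float targets are categorised as continuous, which classification_report rejects.
target_kind = type_of_target(y_test)

# Reduced grid so the sketch runs quickly.
param = {'kernel': ['rbf'], 'gamma': [1e-2, 1e-3], 'C': [1, 10]}
clf = GridSearchCV(SVR(), param, cv=TimeSeriesSplit(n_splits=3),
                   scoring='neg_mean_squared_error')
clf.fit(X_train, y_train)

# Regression metrics in place of classification_report.
y_pred = clf.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print("target type:", target_kind)
print("best params:", clf.best_params_)
print("MSE: %0.3f  R^2: %0.3f" % (mse, r2))
```

Passing scoring='neg_mean_squared_error' to GridSearchCV also makes the search itself rank parameter combinations by a regression metric rather than the estimator's default score.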
【Comments】:
@Mohammed kashif Thanks!