SelectKBest error: ValueError: Unknown label type: (array([0.55, 0.84, 0.72, 0.54, 0.59, 0.77, 0.85, 1.03, 1.62, 3.04, 3.6]),)


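import pandas as pd
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

label_ds = pd.read_csv("D:/intern/bll_beijing.csv")
array = label_ds.values

# Features are every column after the first; the target is the first column.
label_X = array[:, 1:]
label_y = array[:, 0]

# chi2 treats label_y as class labels, so this fit raises the error below.
test = SelectKBest(score_func=chi2, k=4)
fit = test.fit(label_X, label_y)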

I get this:

Traceback (most recent call last):

    fit = test.fit(label_X, label_y)
  File "C:\Users\TOSHIBA\AppData\Local\Programs\Python\Python35\lib\site-packages\sklearn\feature_selection\univariate_selection.py", line 349, in fit
    score_func_ret = self.score_func(X, y)
  File "C:\Users\TOSHIBA\AppData\Local\Programs\Python\Python35\lib\site-packages\sklearn\feature_selection\univariate_selection.py", line 217, in chi2
    Y = LabelBinarizer().fit_transform(y)
  File "C:\Users\TOSHIBA\AppData\Local\Programs\Python\Python35\lib\site-packages\sklearn\preprocessing\label.py", line 307, in fit_transform
    return self.fit(y).transform(y)
  File "C:\Users\TOSHIBA\AppData\Local\Programs\Python\Python35\lib\site-packages\sklearn\preprocessing\label.py", line 284, in fit
    self.classes_ = unique_labels(y)
  File "C:\Users\TOSHIBA\AppData\Local\Programs\Python\Python35\lib\site-packages\sklearn\utils\multiclass.py", line 97, in unique_labels
    raise ValueError("Unknown label type: %s" % repr(ys))

ValueError: Unknown label type: (array([0.55, 0.84, 0.72, 0.54, 0.59, 0.77, 0.85, 1.03, 1.62, 3.04, 3.6 ]),)

[Finished in 3.4s]

[0.55, 0.84, 0.72, 0.54, 0.59, 0.77, 0.85, 1.03, 1.62, 3.04, 3.6] is the first column of the CSV file.

What is going wrong?

Answer

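label_y contains continuous values.

However, you have specified chi2 as the score function. According to the documentation of chi2, it applies only to classification tasks:

Compute chi-squared stats between each non-negative feature and class.

As the traceback shows, chi2 passes y through LabelBinarizer, which expects discrete class labels and rejects a continuous target.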
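One way to check how scikit-learn classifies a target before choosing a score function is the type_of_target helper in sklearn.utils.multiclass. The sketch below is only illustrative: it re-enters the values quoted in the question by hand instead of reading the CSV, and the integer array is a made-up example of discrete labels.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# The first CSV column quoted in the question, typed in by hand.
label_y = np.array([0.55, 0.84, 0.72, 0.54, 0.59, 0.77,
                    0.85, 1.03, 1.62, 3.04, 3.6])

# Float values that are not whole numbers are reported as "continuous",
# which is the kind of target chi2 cannot handle.
continuous_type = type_of_target(label_y)
print(continuous_type)

# Hypothetical discrete labels, for comparison: these are "binary".
class_labels = np.array([0, 1, 1, 0, 1])
discrete_type = type_of_target(class_labels)
print(discrete_type)
```

If the reported type is "continuous", the problem is a regression problem and a classification score function such as chi2 will fail in the way shown above.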

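For regression tasks, you can use a regression score function instead, such as f_regression or mutual_info_regression from sklearn.feature_selection.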
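A minimal sketch of that swap, assuming the goal is still to keep the 4 best features. The CSV from the question is not available here, so the feature matrix is random stand-in data (hypothetical: 11 rows and 6 columns); only the target values come from the question.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression

# Hypothetical stand-in for the feature columns of the CSV.
rng = np.random.RandomState(0)
label_X = rng.rand(11, 6)

# Continuous target: the first CSV column quoted in the question.
label_y = np.array([0.55, 0.84, 0.72, 0.54, 0.59, 0.77,
                    0.85, 1.03, 1.62, 3.04, 3.6])

# f_regression scores each feature with a univariate linear F-test,
# so it accepts a continuous target where chi2 does not.
selector = SelectKBest(score_func=f_regression, k=4)
fit = selector.fit(label_X, label_y)

# Column indices of the selected features, and the reduced matrix.
selected = fit.get_support(indices=True)
reduced_X = fit.transform(label_X)

print(fit.scores_)
print(selected)
```

With the real data, label_X and label_y would come from the CSV exactly as in the question; only score_func changes. mutual_info_regression can be passed the same way when the relationship between features and target may be non-linear.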
