There is an error with sklearn (LogisticRegression model selection)
【Title】: There is an error with sklearn (LogisticRegression model selection) 【Posted】: 2021-02-27 05:55:34 【Question】:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
Dt = pd.read_csv("D:\wisc_bc_data.csv")
'''
print(Dt.shape)
print(Dt.head())
'''
def changer(x):
    if x == 'B':
        return 0
    else:
        return 1
Dt['diagnosis'] = Dt['diagnosis'].map(lambda x: changer(x))
features = Dt[2:12]
Diagnosis = Dt['diagnosis']
train_features, test_features, train_labels, test_labels = train_test_split(features, Diagnosis)  # this line raises the error
'''
this is my code and i used dataset from here: https://gomguard.tistory.com/52
'''
I want to split the data for logistic regression. However, I get this error:
ValueError                                Traceback (most recent call last)
----> 1 train_features, test_features, train_labels, test_labels = train_test_split(features, Diagnosis)

D:\python\lib\site-packages\sklearn\model_selection\_split.py in train_test_split(*arrays, **options)
   2116         raise TypeError("Invalid parameters passed: %s" % str(options))
   2117
-> 2118     arrays = indexable(*arrays)
   2119
   2120     n_samples = _num_samples(arrays[0])

D:\python\lib\site-packages\sklearn\utils\validation.py in indexable(*iterables)
    246     """
    247     result = [_make_indexable(X) for X in iterables]
--> 248     check_consistent_length(*result)
    249     return result
    250

D:\python\lib\site-packages\sklearn\utils\validation.py in check_consistent_length(*arrays)
    210     if len(uniques) > 1:
    211         raise ValueError("Found input variables with inconsistent numbers of"
--> 212                          " samples: %r" % [int(l) for l in lengths])
    213
    214

ValueError: Found input variables with inconsistent numbers of samples: [10, 569]
How can I fix this?
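【Answer 1】: I think features = Dt[2:12] is causing your error.
You meant to slice the feature columns, but pandas interprets Dt[2:12] as a row slice. As a result, features holds only 10 records (rows 2 through 11) while Diagnosis holds all 569, which is the [10, 569] mismatch reported in the error message.
So change the code to features = Dt.iloc[:, 2:12].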
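The difference between the two slices can be checked on a small stand-in table. The sketch below does not read the asker's CSV. It assumes a hypothetical frame with 569 rows, an id column, a diagnosis column, and ten feature columns named feature_0 to feature_9 (all invented for illustration), and only compares the resulting shapes.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real CSV (column names are assumptions):
# 569 rows, an "id" column, a "diagnosis" column, and 10 numeric features.
rng = np.random.default_rng(0)
n_rows = 569
df = pd.DataFrame({
    "id": np.arange(n_rows),
    "diagnosis": rng.choice(["B", "M"], size=n_rows),
})
for i in range(10):
    df[f"feature_{i}"] = rng.normal(size=n_rows)

# Map the class labels to 0/1, as the changer() function in the question does.
labels = df["diagnosis"].map({"B": 0, "M": 1})

# Plain [] with an integer slice selects ROWS: rows 2..11, all 12 columns.
row_slice = df[2:12]

# .iloc[:, 2:12] selects COLUMNS by position: all rows, columns 2..11.
col_slice = df.iloc[:, 2:12]

print(row_slice.shape)  # (10, 12)
print(col_slice.shape)  # (569, 10)
print(len(labels))      # 569
```

Only col_slice has the same number of rows as labels, so it is the one that train_test_split will accept alongside the label column.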