Why does logistic regression throw a conversion error (ValueError)?

[Posted]: 2021-04-05 13:50:15

[Question]:

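I am using logistic regression for a company to find out, from its customer data, which specific variables lead to customer churn.

I applied the analysis method and the evaluation method. The comments show the data for both methods in the results.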

x = dF.drop("Churn", axis=1)
y = dF["Churn"]

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=1)

logmodel = LogisticRegression()
logmodel.fit(x_train, y_train)

Output:

    C:\Users\Rebecca\Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py:432: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.
      FutureWarning)

    ValueError                                Traceback (most recent call last)
    <ipython-input-268-0c050c82a577> in <module>
    ----> 1 logmodel.fit(x_train, y_train)

    ~\Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py in fit(self, X, y, sample_weight)
       1530 
       1531         X, y = check_X_y(X, y, accept_sparse='csr', dtype=_dtype, order="C",
    -> 1532                          accept_large_sparse=solver != 'liblinear')
       1533         check_classification_targets(y)
       1534         self.classes_ = np.unique(y)

    ~\Anaconda3\lib\site-packages\sklearn\utils\validation.py in check_X_y(X, y, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, multi_output, ensure_min_samples, ensure_min_features, y_numeric, warn_on_dtype, estimator)
    717                     ensure_min_features=ensure_min_features,
    718                     warn_on_dtype=warn_on_dtype,
    --> 719                     estimator=estimator)
    720     if multi_output:
    721         y = check_array(y, 'csr', force_all_finite=True, ensure_2d=False,

    ~\Anaconda3\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    494             try:
    495                 warnings.simplefilter('error', ComplexWarning)
    --> 496                 array = np.asarray(array, dtype=dtype, order=order)
    497             except ComplexWarning:
    498                 raise ValueError("Complex data not supported\n"

    ~\Anaconda3\lib\site-packages\numpy\core\numeric.py in asarray(a, dtype, order)
    536 
    537     """
    --> 538     return array(a, dtype, copy=False, order=order)
    539 
    540 

    ValueError: could not convert string to float: 'Bank transfer (automatic)'
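
The last line of the traceback shows that scikit-learn tried to cast the feature matrix to float and hit a string value. A minimal sketch for finding which columns hold non-numeric data, assuming a small hypothetical DataFrame that stands in for `dF` (the rows are illustrative, not the asker's data):

```python
import pandas as pd

# Hypothetical stand-in for dF; the values are illustrative only.
dF = pd.DataFrame({
    "tenure": [1, 34, 2],
    "PaymentMethod": ["Electronic check", "Mailed check", "Bank transfer (automatic)"],
    "Churn": [0, 0, 1],
})

# Columns whose dtype is not numeric cannot be cast to float as-is.
non_numeric_cols = dF.select_dtypes(exclude="number").columns.tolist()
print(non_numeric_cols)
```

Any column reported here has to be encoded as numbers (or dropped) before calling `fit`.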

[Comments]:

What are the contents of the DataFrame df?

The variable "Bank transfer (automatic)" is a string; you should apply label encoding.

@BhaskarDhariyal Can you tell me how to do that?

@Yatin customerID SeniorCitizen Partner Dependents tenure InternetService OnlineBackup Contract PaperlessBilling PaymentMethod ... TechSupport_No Internet service TechSupport_Yes StreamingTV_No StreamingTV_No Internet service StreamingTV_Yes StreamingMovies_No StreamingMovies_No Internet service StreamingMovies_Yes Churn_No Churn_Yes

Please edit that into your question...

[Answer 1]:

Maybe this will help:

from sklearn import preprocessing
le = preprocessing.LabelEncoder()
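# 'Bank transfer (automatic)' is a value, not a column name. Assuming it comes from
# the PaymentMethod column listed in the comments, encode that column instead.
dF['PaymentMethod'] = le.fit_transform(dF['PaymentMethod'])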
x = dF.drop("Churn", axis=1)
y = dF["Churn"]

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=1)

logmodel = LogisticRegression()
logmodel.fit(x_train, y_train)
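
`LabelEncoder` handles one column at a time and assigns arbitrary integer codes, which a linear model then treats as ordered. A common alternative is to one-hot encode every remaining string column in one pass. The following is a sketch under stated assumptions, not the asker's actual pipeline: the DataFrame is a hypothetical stand-in for `dF`, and `stratify=y` is added only so that this tiny sample keeps both classes in the training split.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for dF; the values are illustrative only.
dF = pd.DataFrame({
    "tenure": [1, 34, 2, 45, 8, 22, 10, 60, 5, 3],
    "PaymentMethod": [
        "Electronic check", "Mailed check", "Bank transfer (automatic)",
        "Mailed check", "Electronic check", "Bank transfer (automatic)",
        "Mailed check", "Bank transfer (automatic)", "Electronic check",
        "Electronic check",
    ],
    "Churn": [1, 0, 1, 0, 1, 0, 0, 0, 1, 1],
})

x = dF.drop("Churn", axis=1)
y = dF["Churn"]

# One-hot encode every non-numeric feature column; dtype=float keeps the
# resulting matrix purely numeric.
cat_cols = x.select_dtypes(exclude="number").columns.tolist()
x = pd.get_dummies(x, columns=cat_cols, dtype=float)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=1, stratify=y
)

# Naming the solver explicitly also silences the FutureWarning from the question.
logmodel = LogisticRegression(solver="lbfgs")
logmodel.fit(x_train, y_train)
predictions = logmodel.predict(x_test)
```

An identifier column such as `customerID` carries no predictive signal and is usually dropped rather than encoded.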

[Discussion]:
