SKlearn linear regression coefficients equal 0

Posted 2023-03-12


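【Title】: SKlearn linear regression coeffs equals 0 【Posted】: 2018-08-18 23:32:28 【Question】:

I have a problem with the simplest possible linear regression example. In the output, the coefficients are zero. What am I doing wrong? Thanks for your help.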

import sklearn.linear_model as lm
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

x = [25,50,75,100]
y = [10.5,17,23.25,29]
pred = [27,41,22,33]
df = pd.DataFrame({'x':x, 'y':y, 'pred':pred})
x = df['x'].values.reshape(1,-1)
y = df['y'].values.reshape(1,-1)
pred = df['pred'].values.reshape(1,-1)
plt.scatter(x,y,color='black')
clf = lm.LinearRegression(fit_intercept =True)
clf.fit(x,y)


m=clf.coef_[0]
b=clf.intercept_
print("slope=",m, "intercept=",b)

Output:

slope= [ 0.  0.  0.  0.] intercept= [ 10.5   17.    23.25  29.  ]

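【Comments】:

【Answer 1】:

Think it through. Getting several coefficients back means the model believes you have several features. Since this is a simple (single-variable) regression, the problem is the shape of the input data. Your original reshape made the class think you had 4 variables with only one observation each.

Try something like this: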

import sklearn.linear_model as lm
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

x = np.array([25,99,75,100, 3, 4, 6, 80])[..., np.newaxis]
y = np.array([10.5,17,23.25,29, 1, 2, 33, 4])[..., np.newaxis]

clf = lm.LinearRegression()
clf.fit(x,y)
clf.coef_

Output:

array([[ 0.09399429]])
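To make the shape argument concrete, here is a minimal sketch that reproduces the question's zero coefficients. The variable names `x_wrong`, `y_wrong` and `model` are illustrative and not from the original post. With a single sample there is nothing for the features to explain, so the fit leaves every coefficient at zero and the intercept simply returns the one observed row of targets.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.array([25, 50, 75, 100])
y = np.array([10.5, 17, 23.25, 29])

# reshape(1, -1) produces ONE row (sample) with FOUR columns (features)
x_wrong = x.reshape(1, -1)
y_wrong = y.reshape(1, -1)
print(x_wrong.shape)  # (1, 4)

model = LinearRegression().fit(x_wrong, y_wrong)

# One coefficient per (target, feature) pair: 4 targets x 4 features
print(model.coef_.shape)
print(model.coef_[0])    # all zeros, as in the question
print(model.intercept_)  # the single observed row of y
```

This matches the output shown in the question: four zero slopes and an intercept equal to the original y values.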

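【Comments】:

Nicely explained! +1 :)

【Answer 2】:

As @jrjames83 explained in their answer, after the reshape (.reshape(1,-1)) you are passing a data set that contains one sample (row) and four features (columns):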

In [103]: x.shape
Out[103]: (1, 4)

Most likely you wanted to reshape it this way:

In [104]: x = df['x'].values.reshape(-1, 1)

In [105]: x.shape
Out[105]: (4, 1)

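so that you have four samples and one feature...

Alternatively, you can pass the DataFrame columns straight to your model, as follows (no need to pollute your memory with extra variables):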
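Putting the corrected reshape back into the question's script might look like the sketch below. It assumes the `pred` column holds new x values to predict for, which the question never states. The names `X`, `X_new` and `predictions` are illustrative.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.DataFrame({'x': [25, 50, 75, 100],
                   'y': [10.5, 17, 23.25, 29],
                   'pred': [27, 41, 22, 33]})

# Features must be 2-D: (n_samples, n_features) = (4, 1)
X = df['x'].values.reshape(-1, 1)
# A 1-D target is fine for a single-output regression
y = df['y'].values
# Assumption: 'pred' contains new x values to score
X_new = df['pred'].values.reshape(-1, 1)

clf = LinearRegression(fit_intercept=True).fit(X, y)
slope = clf.coef_[0]
intercept = clf.intercept_
predictions = clf.predict(X_new)

print("slope=", slope, "intercept=", intercept)
print(predictions)
```

Keeping `y` one-dimensional makes `coef_` a flat array and `intercept_` a scalar, which is usually more convenient for a single-feature fit.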

In [98]: clf = lm.LinearRegression(fit_intercept =True)

In [99]: clf.fit(df[['x']],df['y'])
Out[99]: LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)

In [100]: clf.coef_
Out[100]: array([0.247])

In [101]: clf.intercept_
Out[101]: 4.5
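As a sanity check, the slope and intercept above can be recomputed by hand with the closed-form least-squares formulas for one feature: slope = sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2), and intercept = mean(y) - slope * mean(x). The sketch below uses plain NumPy. The names `x_c` and `y_c` are illustrative.

```python
import numpy as np

x = np.array([25, 50, 75, 100], dtype=float)
y = np.array([10.5, 17, 23.25, 29])

# Center both variables around their means
x_c = x - x.mean()
y_c = y - y.mean()

# Ordinary least squares for a single feature
slope = (x_c @ y_c) / (x_c @ x_c)
intercept = y.mean() - slope * x.mean()

print(slope, intercept)  # approximately 0.247 and 4.5
```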

【Comments】:
