How to plot the decision boundary for logistic regression in Python?
Posted: 2020-03-28 01:43:54

I am trying to plot the decision boundary for a logistic regression classifier, but I don't quite understand how it should be done.
Here is a dataset I generated, on which I applied logistic regression using numpy:
import numpy as np
import matplotlib.pyplot as plt
# class 0:
# covariance matrix and mean
cov0 = np.array([[5,-4],[-4,4]])
mean0 = np.array([2.,3])
# number of data points
m0 = 1000
# class 1
# covariance matrix
cov1 = np.array([[5,-3],[-3,3]])
mean1 = np.array([1.,1])
# number of data points
m1 = 1000
# generate m gaussian distributed data points with
# mean and cov.
r0 = np.random.multivariate_normal(mean0, cov0, m0)
r1 = np.random.multivariate_normal(mean1, cov1, m1)
X = np.concatenate((r0,r1))
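The fitting code itself is not shown here; for reference, below is a minimal sketch of one way to fit theta with plain numpy gradient descent, assuming labels 0 for r0 and 1 for r1 and a bias column prepended to X (the names y, Xb, theta, lr are illustrative only, not the asker's actual code):
# Sketch only: fit theta = [bias, w1, w2] by gradient descent on the log-loss.
y = np.concatenate((np.zeros(m0), np.ones(m1)))
Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # bias column -> theta[0]
theta = np.zeros(3)
lr = 0.1
for _ in range(5000):
    p = 1.0 / (1.0 + np.exp(-Xb @ theta))      # sigmoid predictions
    grad = Xb.T @ (p - y) / len(y)             # gradient of the log-loss
    theta -= lr * grad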
After applying logistic regression, I found the best thetas to be:
thetas = [1.2182441664666837, 1.3233825647558795, -0.6480886684022018]
I tried to plot the decision boundary with:
yy = -(thetas[0] + thetas[1]*X)/thetas[1][2]
plt.plot(X,yy)
However, the plotted line has the opposite slope from what I expected:
Thanks in advance.
Answer 1:

I think you made two mistakes:
yy = -(thetas[0] + thetas[1]*X)/thetas[1][2]
Why thetas[1][2] instead of thetas[2]?
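For context, a quick derivation: the model predicts class 1 when sigmoid(theta0 + theta1*x1 + theta2*x2) >= 0.5, i.e. when theta0 + theta1*x1 + theta2*x2 >= 0, so the boundary is the line x2 = -(theta0 + theta1*x1) / theta2. The denominator must therefore be thetas[2], the coefficient of the second feature.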
And why apply the transformation to your entire dataset X? You only need to apply it to the minimum and maximum x:
minx = np.min(X[:, 0])
maxx = np.max(X[:, 0])  # max over the same x-column (X[:, 0]), not X[:, 1]
## compute the boundary's y value at the two x extremes:
y1 = -(thetas[0] + thetas[1]*minx) / thetas[2]
y2 = -(thetas[0] + thetas[1]*maxx) / thetas[2]
## then plot the line segment [(minx, y1), (maxx, y2)]
plt.plot([minx, maxx], [y1, y2], c='black')
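Two points are enough here: the decision boundary of a linear model like logistic regression is a straight line, so evaluating it at the two ends of the x-range gives the whole segment to draw.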
Complete working code using sklearn's LogisticRegression:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
# Your code:
# =============
# class 0:
# covariance matrix and mean
cov0 = np.array([[5,-4],[-4,4]])
mean0 = np.array([2.,3])
# number of data points
m0 = 1000
# class 1
# covariance matrix
cov1 = np.array([[5,-3],[-3,3]])
mean1 = np.array([1.,1])
# number of data points
m1 = 1000
# generate m gaussian distributed data points with
# mean and cov.
r0 = np.random.multivariate_normal(mean0, cov0, m0)
r1 = np.random.multivariate_normal(mean1, cov1, m1)
X = np.concatenate((r0,r1))
## Added lines :
Y = np.concatenate((np.zeros(m0), np.ones(m1)))
model = LogisticRegression().fit(X,Y)
coefs = list(model.intercept_)         # coefs[0]: intercept
coefs.extend(model.coef_[0].tolist())  # coefs[1], coefs[2]: feature weights
xmin = np.min(X[:, 0])
xmax = np.max(X[:, 0])
def bound(x):
    return -(coefs[0] + coefs[1] * x) / coefs[2]
p1 = np.array([xmin, bound(xmin)])
p2 = np.array([xmax, bound(xmax)])
plt.plot(r0[:, 0], r0[:, 1], ls='', marker='.', c='red')
plt.plot(r1[:, 0], r1[:, 1], ls='', marker='.', c='blue')
plt.plot([p1[0], p2[0]], [p1[1], p2[1]], c='black')  # line through the two boundary points
plt.show()
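As a side note, a minimal sketch of an alternative that avoids solving for the line analytically, assuming the model, X, r0, r1 variables defined above: evaluate the fitted model on a grid and draw the 0.5-probability contour. This also generalizes to nonlinear boundaries.
# Sketch: draw the boundary as the 0.5-probability contour of the fitted model.
xx, yy = np.meshgrid(np.linspace(X[:, 0].min(), X[:, 0].max(), 200),
                     np.linspace(X[:, 1].min(), X[:, 1].max(), 200))
grid = np.c_[xx.ravel(), yy.ravel()]                       # each grid point as a row
probs = model.predict_proba(grid)[:, 1].reshape(xx.shape)  # P(class 1)
plt.plot(r0[:, 0], r0[:, 1], ls='', marker='.', c='red')
plt.plot(r1[:, 0], r1[:, 1], ls='', marker='.', c='blue')
plt.contour(xx, yy, probs, levels=[0.5], colors='black')   # decision boundary
plt.show()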
Update
final plot link
Comments:
Hey, thanks for your answer! The slope of the line is the same as in my plot above; shouldn't the slope be negative?

Hey, the slope is negative: treating beta0 as the intercept, beta1 and beta2 are both negative, so when you apply the transformation -(coefs[0] + coefs[1] * x) / coefs[2], the slope becomes -(coefs[1]/coefs[2]), which is negative.
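(A worked example with hypothetical values: if coefs = [0.5, -1.0, -0.5], the slope is -(-1.0 / -0.5) = -2.0, indeed negative.)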