Linear Regression Curve Fitting With Scipy - Not sure what is wrong
【Posted】: 2020-03-29 10:40:25
【Question】:
I have been trying to learn how to fit a curve using linear regression and scipy. Here is some code that I got from another kind user who was helping someone else.
My problem is this: I fitted some of my own data as xData and yData, and I get this wrongly fitted curve on my data. But if I flip xData and yData, then I get this better fitted curve.
How can I fix it so that the curve fits my data with xData and yData in their original positions?
import numpy, scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import differential_evolution
import warnings
import math
# det value in dbm
ytoBeConverted = numpy.array([7.76,5.00,1.70,-1.33,-4.77,-7.75,-10.78,-13.76,-16.70,-19.97,-23.04,-25.88,-28.92,-32.05,-34.67,-37.08,-39.33])
# power meter value: convert dBm to mW
lst = []
for y in ytoBeConverted:
    lst.append(math.pow(10, (y/10)))
############ These X and Y data points don't work, but if I flip them as X and Y, it works##########
yData = numpy.asarray(lst)
xData = numpy.array([0.8475,0.7108,0.3853,0.2108,0.1026,0.0537,0.0277,0.0147,0.0079,0.0043,0.0027,0.0019,0.0015,0.0013,0.0012,0.0011,0.0011])
def func(x, a, b, Offset): # Sigmoid A With Offset from zunzun.com
    return 1.0 / (1.0 + numpy.exp(-a * (x-b))) + Offset

# function for genetic algorithm to minimize (sum of squared error)
def sumOfSquaredError(parameterTuple):
    warnings.filterwarnings("ignore") # do not print warnings by genetic algorithm
    val = func(xData, *parameterTuple)
    return numpy.sum((yData - val) ** 2.0)

def generate_Initial_Parameters():
    # min and max used for bounds
    maxX = max(xData)
    minX = min(xData)
    maxY = max(yData)
    minY = min(yData)
    parameterBounds = []
    parameterBounds.append([minX, maxX]) # search bounds for a
    parameterBounds.append([minX, maxX]) # search bounds for b
    parameterBounds.append([0.0, maxY]) # search bounds for Offset
    # "seed" the numpy random number generator for repeatable results
    result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3)
    return result.x

# generate initial parameter values
geneticParameters = generate_Initial_Parameters()
# curve fit the test data
fittedParameters, pcov = curve_fit(func, xData, yData, geneticParameters)
print('Parameters', fittedParameters)
modelPredictions = func(xData, *fittedParameters)
absError = modelPredictions - yData
SE = numpy.square(absError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))
print('RMSE:', RMSE)
print('R-squared:', Rsquared)
print()
##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    axes = f.add_subplot(111)
    # first the raw data as a scatter plot
    axes.plot(xData, yData, 'D')
    # create data for the fitted equation plot
    xModel = numpy.linspace(min(xData), max(xData))
    yModel = func(xModel, *fittedParameters)
    # now the model as a line plot
    axes.plot(xModel, yModel)
    axes.set_xlabel('Power Meter Value (mW)') # X axis data label
    axes.set_ylabel('Detector Value') # Y axis data label
    plt.show()
    plt.close('all') # clean up after using pyplot

graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)
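One possible way to see why this model struggles with the data in its original orientation, offered as a rough check and not as part of the original post: the logistic term 1 / (1 + exp(-a * (x - b))) always lies between 0 and 1, so for any a, b and Offset the model output can vary by at most 1 unit, while the converted yData spans several mW. The helper name `sigmoid_output_span` and the trial parameter triples below are made up for this illustration.

```python
import numpy

# Same data as in the question: dBm values converted to mW, and the x values.
dbm = numpy.array([7.76, 5.00, 1.70, -1.33, -4.77, -7.75, -10.78, -13.76, -16.70,
                   -19.97, -23.04, -25.88, -28.92, -32.05, -34.67, -37.08, -39.33])
yData = 10.0 ** (dbm / 10.0)
xData = numpy.array([0.8475, 0.7108, 0.3853, 0.2108, 0.1026, 0.0537, 0.0277, 0.0147,
                     0.0079, 0.0043, 0.0027, 0.0019, 0.0015, 0.0013, 0.0012, 0.0011,
                     0.0011])

def func(x, a, b, Offset):  # same sigmoid as in the question
    return 1.0 / (1.0 + numpy.exp(-a * (x - b))) + Offset

def sigmoid_output_span(a, b, Offset):
    """Hypothetical helper: max minus min of the model evaluated on xData."""
    values = func(xData, a, b, Offset)
    return values.max() - values.min()

# How far the measured y values spread, versus how far the model can spread.
data_span = yData.max() - yData.min()
print('data span (mW):', data_span)

# Arbitrary trial parameters; none of them can exceed a span of 1.
trial_parameters = [(1.0, 0.5, 0.0), (50.0, 0.4, 2.0), (-500.0, 0.1, -1.0)]
model_spans = [sigmoid_output_span(a, b, Offset) for a, b, Offset in trial_parameters]
for params, span in zip(trial_parameters, model_spans):
    print('parameters', params, 'model span:', span)
```

This would also be consistent with the flipped arrangement fitting better, since there the dependent values all stay below 1.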
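【Comments】:
What is your dependent variable and what is your independent variable? What do you want to predict?
I'm not sure if this answers your question, but I want to predict y values based on the x values I supply. More specifically, I am trying to find the relationship between telescope data: the detector values (X) and the power meter values (Y). The detector values decrease with the power meter values, and I would like to be able to interpolate and predict values between my sample data points. More specifically, can I use my correctly fitted curve in the example to predict X from Y?
The confusion is that you first name the values [7.76, 5.00 ...] as x, then perform some operation on them and call them y. Then you again supply [0.845, 071...] as x. Second, in the second image, with x and y reversed, the y-axis label for [0.845, 071...] is "Detector Value". Are the values [0.845, 071...] x or y?
[0.845, 071...] are the x values. The "wrongly fitted" curve in my picture is the one I am hoping to fix.
@VivekKumar Please see my answer to this question.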
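The "operation" the comments refer to is the conversion of the dBm readings to milliwatts, P_mW = 10 ** (P_dBm / 10). As a minimal sketch, the loop in the question could also be written in vectorized form; the function name `dbm_to_mw` is an assumption for this example and does not appear in the original code, and only the first three readings are used to keep it short.

```python
import numpy

def dbm_to_mw(dbm):
    """Convert power in dBm to milliwatts: P_mW = 10 ** (P_dBm / 10)."""
    return 10.0 ** (numpy.asarray(dbm, dtype=float) / 10.0)

# First three readings from the question, in dBm.
ytoBeConverted = numpy.array([7.76, 5.00, 1.70])

# Element-wise conversion; replaces the explicit loop with math.pow.
yData = dbm_to_mw(ytoBeConverted)
print(yData)
```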
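【Answer 1】:
The data does not appear to be sigmoidal, which is why the equation in your code does not fit it well. I performed an equation search on your data, and a three-parameter hyperbolic-type equation gave a good fit. Here is your code using this equation, with the genetic algorithm search bounds updated for the new equation.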
import numpy, scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import differential_evolution
import warnings
import math
# det value in dbm
xtoBeConverted = numpy.array([7.76,5.00,1.70,-1.33,-4.77,-7.75,-10.78,-13.76,-16.70,-19.97,-23.04,-25.88,-28.92,-32.05,-34.67,-37.08,-39.33])
# power meter value: convert dBm to mW
lst = []
for x in xtoBeConverted:
    lst.append(math.pow(10, (x/10)))
############ These X and Y data points don't work, but if I flip them as X and Y, it works##########
yData = numpy.asarray(lst)
xData = numpy.array([0.8475,0.7108,0.3853,0.2108,0.1026,0.0537,0.0277,0.0147,0.0079,0.0043,0.0027,0.0019,0.0015,0.0013,0.0012,0.0011,0.0011])
def func(x, a, b, c): # Hyperbolic F from zunzun.com
    return a * x / (b + x) + c * x

# function for genetic algorithm to minimize (sum of squared error)
def sumOfSquaredError(parameterTuple):
    warnings.filterwarnings("ignore") # do not print warnings by genetic algorithm
    val = func(xData, *parameterTuple)
    return numpy.sum((yData - val) ** 2.0)

def generate_Initial_Parameters():
    # min and max used for bounds
    maxX = max(xData)
    minX = min(xData)
    maxY = max(yData)
    minY = min(yData)
    parameterBounds = []
    parameterBounds.append([-1.0, 0.0]) # search bounds for a
    parameterBounds.append([-1.0, 0.0]) # search bounds for b
    parameterBounds.append([minY, maxY]) # search bounds for c
    # "seed" the numpy random number generator for repeatable results
    result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3)
    return result.x

# generate initial parameter values
geneticParameters = generate_Initial_Parameters()
# curve fit the data
fittedParameters, pcov = curve_fit(func, xData, yData, geneticParameters)
print('Parameters', fittedParameters)
modelPredictions = func(xData, *fittedParameters)
absError = modelPredictions - yData
SE = numpy.square(absError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))
print('RMSE:', RMSE)
print('R-squared:', Rsquared)
print()
##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    axes = f.add_subplot(111)
    # first the raw data as a scatter plot
    axes.plot(xData, yData, 'D')
    # create data for the fitted equation plot
    xModel = numpy.linspace(min(xData), max(xData))
    yModel = func(xModel, *fittedParameters)
    # now the model as a line plot
    axes.plot(xModel, yModel)
    axes.set_xlabel('Power Meter Value (mW)') # X axis data label
    axes.set_ylabel('Detector Value') # Y axis data label
    plt.show()
    plt.close('all') # clean up after using pyplot

graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)
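The comments also ask whether a fitted curve can be used to predict values between the sample points, and to go from Y back to X. A minimal sketch of one way to do that follows, assuming the fitted curve is monotonic over the measured x range so that a bracketing root finder such as `scipy.optimize.brentq` applies. The names `predict_y`, `predict_x` and `exampleParameters` are hypothetical, and the parameter values are placeholders chosen for illustration, not the fitted results; in practice `fittedParameters` from `curve_fit` would be passed instead.

```python
import numpy
from scipy.optimize import brentq

def func(x, a, b, c):  # Hyperbolic F, same form as in the answer
    return a * x / (b + x) + c * x

# Placeholder parameters for illustration only; these are NOT the fitted values.
exampleParameters = (-1.0, -1.0, 3.0)

def predict_y(x, params):
    """Forward prediction: evaluate the model at x."""
    return func(x, *params)

def predict_x(y_target, params, x_low, x_high):
    """Inverse prediction: solve func(x) = y_target for x in [x_low, x_high].

    brentq needs func(x) - y_target to change sign across the bracket.
    """
    return brentq(lambda x: func(x, *params) - y_target, x_low, x_high)

# Bracket taken from the smallest and largest x values in the data.
x_low, x_high = 0.0011, 0.8475

y_mid = predict_y(0.3, exampleParameters)                    # y at an in-between x
x_back = predict_x(y_mid, exampleParameters, x_low, x_high)  # recover that x from y
print('y at x=0.3:', y_mid)
print('x recovered from y:', x_back)
```

If the target y lies outside the range the model covers on the bracket, `brentq` raises a `ValueError`, so the bracket may need adjusting for real fitted parameters.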