Linear Regression Curve Fitting With Scipy - Not sure what is wrong


Posted: 2020-03-29 10:40:25

Question:

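I have been trying to learn how to fit curves using linear regression and scipy.

Here is some code I got from another kind user who was helping someone else.

My problem is this: I fitted some of my own data as xData and yData, and I get this wrongly fitted curve on my data. But if I flip xData and yData, then I get this better fitted curve.

How can I fix this so that the curve fits my data with xData and yData in their original positions?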

import numpy, scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import differential_evolution
import warnings
import math
# det value in dbm
ytoBeConverted = numpy.array([7.76,5.00,1.70,-1.33,-4.77,-7.75,-10.78,-13.76,-16.70,-19.97,-23.04,-25.88,-28.92,-32.05,-34.67,-37.08,-39.33])
#power meter value
lst = []
for y in ytoBeConverted:
    lst.append(math.pow(10, (y/10)))

############ These X and Y data points don't work, but if I flip them as X and Y, it works##########
yData = numpy.asarray(lst)
xData = numpy.array([0.8475,0.7108,0.3853,0.2108,0.1026,0.0537,0.0277,0.0147,0.0079,0.0043,0.0027,0.0019,0.0015,0.0013,0.0012,0.0011,0.0011])


def func(x, a, b, Offset): # Sigmoid A With Offset from zunzun.com
    return  1.0 / (1.0 + numpy.exp(-a * (x-b))) + Offset


# function for genetic algorithm to minimize (sum of squared error)
def sumOfSquaredError(parameterTuple):
    warnings.filterwarnings("ignore") # do not print warnings by genetic algorithm
    val = func(xData, *parameterTuple)
    return numpy.sum((yData - val) ** 2.0)


def generate_Initial_Parameters():
    # min and max used for bounds
    maxX = max(xData)
    minX = min(xData)
    maxY = max(yData)
    minY = min(yData)

    parameterBounds = []
    parameterBounds.append([minX, maxX]) # search bounds for a
    parameterBounds.append([minX, maxX]) # search bounds for b
    parameterBounds.append([0.0, maxY]) # search bounds for Offset

    # "seed" the numpy random number generator for repeatable results
    result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3)
    return result.x

# generate initial parameter values
geneticParameters = generate_Initial_Parameters()

# curve fit the test data
fittedParameters, pcov = curve_fit(func, xData, yData, geneticParameters)

print('Parameters', fittedParameters)

modelPredictions = func(xData, *fittedParameters) 
absError = modelPredictions - yData

SE = numpy.square(absError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))
print('RMSE:', RMSE)
print('R-squared:', Rsquared)

print()


##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    axes = f.add_subplot(111)

    # first the raw data as a scatter plot
    axes.plot(xData, yData,  'D')

    # create data for the fitted equation plot
    xModel = numpy.linspace(min(xData), max(xData))
    yModel = func(xModel, *fittedParameters)

    # now the model as a line plot
    axes.plot(xModel, yModel)

    axes.set_xlabel('Power Meter Value (mW)') # X axis data label
    axes.set_ylabel('Detector Value') # Y axis data label

    plt.show()
    plt.close('all') # clean up after using pyplot

graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight) 
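
One possible reason for the asymmetry, offered as a quick check rather than a definitive diagnosis: the logistic term in func always lies strictly between 0 and 1, so the model's output can span at most one unit (from Offset to Offset + 1) whatever a and b are. The snippet below compares that limit with the span of each data array. The name data_span is a helper introduced only for this illustration.

```python
import numpy

# Same values as in the script above; the dBm -> mW conversion is vectorized.
dbm = numpy.array([7.76, 5.00, 1.70, -1.33, -4.77, -7.75, -10.78, -13.76,
                   -16.70, -19.97, -23.04, -25.88, -28.92, -32.05, -34.67,
                   -37.08, -39.33])
yData = 10.0 ** (dbm / 10.0)
xData = numpy.array([0.8475, 0.7108, 0.3853, 0.2108, 0.1026, 0.0537, 0.0277,
                     0.0147, 0.0079, 0.0043, 0.0027, 0.0019, 0.0015, 0.0013,
                     0.0012, 0.0011, 0.0011])


def data_span(values):
    # hypothetical helper: distance between the largest and smallest value
    return float(numpy.max(values) - numpy.min(values))


# The sigmoid-with-offset model can cover a span of at most 1.0.
print('span of yData:', data_span(yData), 'fits in 1.0?', data_span(yData) < 1.0)
print('span of xData:', data_span(xData), 'fits in 1.0?', data_span(xData) < 1.0)
```

If the first check reports False and the second True, the fixed-height sigmoid cannot follow yData but can follow the flipped arrangement. Adding an amplitude parameter or choosing a different equation would be two ways around that.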

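Comments:

What is your dependent variable and what is your independent variable? What are you trying to predict?

I'm not sure whether this answers your question, but I want to predict y values from the x values I supply. More specifically, I'm trying to find the relationship between telescope data: the detector value (X) and the power meter value (Y). The detector value decreases along with the power meter value, and I'd like to be able to interpolate and predict values between my sample data points. More specifically, can I use the correctly fitted curve in my example to predict X from Y?

The confusion is that you first name the values [7.76, 5.00 ...] x, then perform some operations on them and call them y. Then you supply [0.845, 071...] as x again. Second, in the second image, with x and y reversed, the y-axis label for [0.845, 071...] is "Detector Value". Are the values [0.845, 071...] x or y?

[0.845, 071...] are the x values. The "wrongly fitted" curve in my picture is the one I'm hoping to fix.

@VivekKumar Please see my answer to this question.

Answer 1: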

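The data does not appear to be sigmoidal, so the equation in your code cannot fit it well. I ran an equation search on your data, and a three-parameter hyperbolic-type equation gave a good fit. Here is your code using this equation, with the genetic algorithm search bounds updated for the new equation.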

import numpy, scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import differential_evolution
import warnings
import math
# det value in dbm
xtoBeConverted = numpy.array([7.76,5.00,1.70,-1.33,-4.77,-7.75,-10.78,-13.76,-16.70,-19.97,-23.04,-25.88,-28.92,-32.05,-34.67,-37.08,-39.33])
#power meter value
lst = []
for x in xtoBeConverted:
    lst.append(math.pow(10, (x/10)))

############ These X and Y data points don't work, but if I flip them as X and Y, it works##########
yData = numpy.asarray(lst)
xData = numpy.array([0.8475,0.7108,0.3853,0.2108,0.1026,0.0537,0.0277,0.0147,0.0079,0.0043,0.0027,0.0019,0.0015,0.0013,0.0012,0.0011,0.0011])


def func(x, a, b, c): # Hyperbolic F from zunzun.com
    return  a * x / (b + x) + c * x


# function for genetic algorithm to minimize (sum of squared error)
def sumOfSquaredError(parameterTuple):
    warnings.filterwarnings("ignore") # do not print warnings by genetic algorithm
    val = func(xData, *parameterTuple)
    return numpy.sum((yData - val) ** 2.0)


def generate_Initial_Parameters():
    # min and max used for bounds
    maxX = max(xData)
    minX = min(xData)
    maxY = max(yData)
    minY = min(yData)

    parameterBounds = []
    parameterBounds.append([-1.0, 0.0]) # search bounds for a
    parameterBounds.append([-1.0, 0.0]) # search bounds for b
    parameterBounds.append([minY, maxY]) # search bounds for c

    # "seed" the numpy random number generator for repeatable results
    result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3)
    return result.x


# generate initial parameter values
geneticParameters = generate_Initial_Parameters()

# curve fit the data
fittedParameters, pcov = curve_fit(func, xData, yData, geneticParameters)

print('Parameters', fittedParameters)

modelPredictions = func(xData, *fittedParameters) 
absError = modelPredictions - yData

SE = numpy.square(absError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))
print('RMSE:', RMSE)
print('R-squared:', Rsquared)

print()


##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    axes = f.add_subplot(111)

    # first the raw data as a scatter plot
    axes.plot(xData, yData,  'D')

    # create data for the fitted equation plot
    xModel = numpy.linspace(min(xData), max(xData))
    yModel = func(xModel, *fittedParameters)

    # now the model as a line plot
    axes.plot(xModel, yModel)

    axes.set_xlabel('Detector Value') # X axis data label (xData holds the detector values)
    axes.set_ylabel('Power Meter Value (mW)') # Y axis data label (yData holds the converted power)

    plt.show()
    plt.close('all') # clean up after using pyplot

graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)
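
On the follow-up question in the comments about predicting X from Y: once a fit is available, one option is to invert the fitted equation numerically. The sketch below does this with scipy.optimize.brentq for the hyperbolic equation above. The values in example_params are placeholders chosen only so the snippet runs, not the fitted result (substitute fittedParameters from the script), and predict_x is a hypothetical helper. It assumes the curve is monotonic over the search interval, so that the root is unique.

```python
from scipy.optimize import brentq


def func(x, a, b, c):  # Hyperbolic F, same form as in the answer
    return a * x / (b + x) + c * x


def predict_x(y_target, params, x_low, x_high):
    # hypothetical helper: solve func(x) == y_target for x in [x_low, x_high];
    # brentq needs func(x) - y_target to change sign across the interval
    return brentq(lambda x: func(x, *params) - y_target, x_low, x_high)


# PLACEHOLDER parameters for illustration only, not fitted values
example_params = (-0.5, -1.0, 3.0)

# search interval taken from the range of the measured xData
x_low, x_high = 0.0011, 0.8475

y_target = func(0.3, *example_params)  # a y value the curve passes through
x_found = predict_x(y_target, example_params, x_low, x_high)
print('y =', y_target, '-> x =', x_found)
```

brentq raises a ValueError when the target lies outside the values the curve takes on the interval, which also acts as a guard against extrapolating beyond the measured range.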
