Linear Regression Derivation
Posted by spxcds
Background on matrix differentiation
Since everything we need here is a gradient, this article uses the denominator layout for all matrix derivatives.
First, the function we want to fit is
\[
\begin{align*}
y = XW + b
\end{align*}
\]
where
\[
\begin{align*}
\mathbf{X}=\begin{bmatrix}x_{1}\\ x_{2}\\ \vdots\\ x_{m}\end{bmatrix},\quad \mathbf{W}=\begin{bmatrix}w_{1}\end{bmatrix},\quad \mathbf{b}=\begin{bmatrix}b_{1}\end{bmatrix}
\end{align*}
\]
For convenience, X can be augmented by one dimension, becoming
\[
\begin{align*}
\mathbf{X}=\begin{bmatrix}x_{1} & 1\\ x_{2} & 1\\ \vdots & \vdots\\ x_{m} & 1\end{bmatrix},\quad \mathbf{W}=\begin{bmatrix}w_{1} \\ b_{1}\end{bmatrix}
\end{align*}
\]
Then the fitting function becomes
\[
\begin{align*}
y = XW
\end{align*}
\]
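To make the augmentation concrete, here is a minimal NumPy sketch (the sample count, weights, and data below are made-up placeholders, not anything from the derivation):

```python
import numpy as np

# Append a column of ones to X so the bias b folds into W.
m = 5                                     # number of samples (arbitrary)
X = np.random.rand(m, 1)                  # one-feature inputs
X_aug = np.hstack([X, np.ones((m, 1))])   # each row becomes [x_i, 1]

W = np.array([[2.0],    # w_1
              [0.5]])   # b, absorbed as the last entry of W

y = X_aug @ W            # the fit reduces to y = XW
```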
Common matrix differentiation formulas
First, take an m×n matrix A and an n×1 column vector x:
\[
\begin{align*}
A=\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{m1} & a_{m2} & \cdots & a_{mn}\end{bmatrix}
\end{align*}
\]
\[
\begin{align*}
x=\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
\end{align*}
\]
\[
\begin{align*}
Ax = \begin{bmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n \end{bmatrix}
\end{align*}
\]
Taking the partial derivative of Ax with respect to x gives
\[
\begin{align*}
\frac{\partial Ax}{\partial x} = \begin{bmatrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{bmatrix} = A^T
\end{align*}
\]
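If you want to convince yourself of this result (and of the layout convention), a quick finite-difference check works; the sizes below are arbitrary:

```python
import numpy as np

# Numerical check: in denominator layout, the derivative of Ax
# with respect to x comes out as A^T.
m, n = 3, 4
A = np.random.rand(m, n)

def f(x):
    return A @ x

x0 = np.random.rand(n)
eps = 1e-6

# Row j of the result holds the sensitivities of f to x_j, which is
# exactly the denominator-layout arrangement.
grad = np.zeros((n, m))
for j in range(n):
    e = np.zeros(n)
    e[j] = eps
    grad[j] = (f(x0 + e) - f(x0 - e)) / (2 * eps)

print(np.allclose(grad, A.T))  # True
```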
Taking the partial derivative of Ax with respect to \(x^T\) gives
\[
\begin{align*}
\frac{\partial Ax}{\partial x^T} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} = A
\end{align*}
\]
\[
\begin{align*}
\frac{\partial x^T A}{\partial x} &= \left[ \left( \frac{\partial x^T A}{\partial x} \right)^T \right]^T \\
&= \left[ \frac{\partial (x^T A)^T}{\partial x^T} \right]^T \\
&= \left[ \frac{\partial A^T x}{\partial x^T} \right]^T \\
&= (A^T)^T \\
&= A
\end{align*}
\]
\[
\begin{align*}
x^Tx &= x_1^2 + x_2^2 + \cdots + x_n^2 \\
\frac{\partial x^Tx}{\partial x} &= 2x
\end{align*}
\]
The product rule for differentiating scalar functions of a vector is
\[
\begin{align*}
u &= u(x), \quad v = v(x) \\
\frac{\partial uv}{\partial x} &= \frac{\partial u}{\partial x}v + u\frac{\partial v}{\partial x} \\
\frac{\partial x^Tx}{\partial x} &= \frac{\partial x^T}{\partial x}x + x^T\frac{\partial x}{\partial x} = x + (x^T)^T = 2x
\end{align*}
\]
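This, too, is easy to sanity-check numerically; the vector below is a random placeholder:

```python
import numpy as np

# Finite-difference check that the gradient of x^T x is 2x.
n = 5
x0 = np.random.rand(n)
eps = 1e-6

num_grad = np.array([
    (np.dot(x0 + eps * e, x0 + eps * e)
     - np.dot(x0 - eps * e, x0 - eps * e)) / (2 * eps)
    for e in np.eye(n)
])

print(np.allclose(num_grad, 2 * x0))  # True
```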
Differentiating the loss function
The loss function is
\[
\begin{align*}
L(W) = \frac{1}{2m} (XW - y)^2 = \frac{1}{2m} (XW - y)^T(XW - y)
\end{align*}
\]
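In NumPy this loss looks like the following (X, W, and y here are random placeholders just to show the shapes):

```python
import numpy as np

# Both expressions below compute the same scalar L(W).
m = 5
X = np.hstack([np.random.rand(m, 1), np.ones((m, 1))])  # augmented X
W = np.random.rand(2, 1)
y = np.random.rand(m, 1)

residual = X @ W - y
loss = (residual.T @ residual).item() / (2 * m)
loss_alt = np.sum(residual ** 2) / (2 * m)  # same value
```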
Method 1
Let
\[
\begin{align*}
Z = f(W) = XW - y
\end{align*}
\]
Its derivative is (note that the order of the factors must not be swapped)
\[
\begin{align*}
\frac{\partial L(W)}{\partial W} = \frac{\partial f(W)}{\partial W} \frac{\partial L(f(W))}{\partial f(W)}
\end{align*}
\]
where
\[
\begin{align*}
\frac{\partial f(W)}{\partial W} &= \frac{\partial (XW - y)}{\partial W} \\
&= \frac{\partial XW}{\partial W} - \frac{\partial y}{\partial W} \\
&= X^T - 0 \\
&= X^T
\end{align*}
\]
\[
\begin{align*}
\frac{\partial L(f(W))}{\partial f(W)} &= \frac{1}{2m}\frac{\partial Z^TZ}{\partial Z} \\
&= \frac{1}{2m}2Z \\
&= \frac{1}{m} (XW - y)
\end{align*}
\]
Therefore
\[
\begin{align*}
\frac{\partial L(W)}{\partial W} &= \frac{\partial f(W)}{\partial W} \frac{\partial L(f(W))}{\partial f(W)} \\
&= \frac{1}{m}X^T(XW - y)
\end{align*}
\]
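This gradient is exactly what plain gradient descent needs. Here is a minimal sketch (the learning rate, iteration count, and synthetic data are arbitrary choices, not part of the derivation):

```python
import numpy as np

# Gradient descent using the result just derived:
# grad = (1/m) X^T (XW - y)
def gradient(X, W, y):
    m = X.shape[0]
    return X.T @ (X @ W - y) / m

m = 100
X = np.hstack([np.random.rand(m, 1), np.ones((m, 1))])  # augmented X
true_W = np.array([[2.0], [0.5]])                        # ground truth
y = X @ true_W                                           # noiseless targets

W = np.zeros((2, 1))
lr = 0.5
for _ in range(2000):
    W -= lr * gradient(X, W, y)

print(W.ravel())  # approaches [2.0, 0.5]
```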
Method 2
\[
\begin{align*}
\frac{\partial L(W)}{\partial W} &= \frac{1}{2m}\frac{\partial (XW - y)^T(XW - y)}{\partial W} \\
&= \frac{1}{2m}\left[\frac{\partial W^TX^TXW}{\partial W} - \frac{\partial y^TXW}{\partial W} - \frac{\partial W^TX^Ty}{\partial W} + \frac{\partial y^Ty}{\partial W}\right] \\
&= \frac{1}{2m}\left[\left(\frac{\partial W^TX^TX}{\partial W}W + W^TX^TX\frac{\partial W}{\partial W}\right) - (y^TX)^T - X^Ty + 0\right] \\
&= \frac{1}{2m}\left[X^TXW + (W^TX^TX)^T - 2X^Ty\right] \\
&= \frac{1}{2m}\left[2X^T(XW - y)\right] \\
&= \frac{1}{m}X^T(XW - y)
\end{align*}
\]
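Both methods give the same gradient, and a finite-difference comparison confirms it (all data below is made up):

```python
import numpy as np

# Compare the analytic gradient (1/m) X^T (XW - y) from both
# derivations against a numerical gradient of the loss.
m = 20
X = np.hstack([np.random.rand(m, 1), np.ones((m, 1))])
W = np.random.rand(2, 1)
y = np.random.rand(m, 1)

def loss(W):
    r = X @ W - y
    return (r.T @ r).item() / (2 * m)

analytic = X.T @ (X @ W - y) / m

eps = 1e-6
numeric = np.zeros_like(W)
for i in range(W.size):
    d = np.zeros_like(W)
    d.flat[i] = eps
    numeric.flat[i] = (loss(W + d) - loss(W - d)) / (2 * eps)

print(np.allclose(analytic, numeric))  # True
```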