Machine Learning | Regression
Posted by 奇葩星人
Regression differs from classification: classification maps a set of real-valued vectors to a binary (or multi-class) label set, while regression maps real-valued vectors to real values.
Loss Function
This chapter introduces the squared error (SE): $\mathcal{L}(g, a) = (g - a)^2$. For a given dataset $\{(x^{(i)}, y^{(i)})\}_{i=1}^{n}$, assume the hypothesis is linear, i.e. $h(x; \theta, \theta_0) = \theta^T x + \theta_0$. Treating this as an optimization problem, the goal is to minimize the mean squared error (MSE), with objective function:

$$J(\theta, \theta_0) = \frac{1}{n}\sum_{i=1}^{n}\left(\theta^T x^{(i)} + \theta_0 - y^{(i)}\right)^2$$

We seek:

$$\theta^*, \theta_0^* = \arg\min_{\theta, \theta_0} J(\theta, \theta_0)$$
Least Squares
To absorb the offset, append a constant feature: let $\tilde{X}$ be the $n \times (d+1)$ matrix whose $i$-th row is $(x^{(i)T}, 1)$, let $Y$ be the $n \times 1$ vector of target values, and let $\tilde\theta = \begin{bmatrix}\theta \\ \theta_0\end{bmatrix}$. Then:

$$J(\tilde\theta) = \frac{1}{n}\left(\tilde{X}\tilde\theta - Y\right)^T\left(\tilde{X}\tilde\theta - Y\right)$$

Setting the gradient $\nabla_{\tilde\theta} J = \frac{2}{n}\tilde{X}^T(\tilde{X}\tilde\theta - Y)$ to zero yields the closed-form least-squares solution:

$$\tilde\theta^* = \left(\tilde{X}^T\tilde{X}\right)^{-1}\tilde{X}^T Y$$

Following the earlier chapters, we add regularization, i.e. ridge regression:

$$J_{\text{ridge}}(\theta, \theta_0) = \frac{1}{n}\sum_{i=1}^{n}\left(\theta^T x^{(i)} + \theta_0 - y^{(i)}\right)^2 + \lambda\|\theta\|^2$$

Setting the gradient to zero in the same way gives:

$$\tilde\theta^* = \left(\tilde{X}^T\tilde{X} + n\lambda I\right)^{-1}\tilde{X}^T Y$$

(written here with every component penalized for compactness; the homework code below leaves $\theta_0$ unpenalized).
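As a sanity check on the derivation, here is a minimal NumPy sketch of the analytic solution. It is an illustration rather than the homework's own code: the name `ridge_analytic` is assumed, the data convention ($X$ is $d \times n$, $Y$ is $1 \times n$) matches the code below, and the offset is left unpenalized to agree with the gradient code later in this section.

```python
import numpy as np

# A minimal sketch of the closed-form ridge solution, assuming the same
# conventions as the homework code below: X is d x n, Y is 1 x n.
def ridge_analytic(X, Y, lam):
    d, n = X.shape
    X_aug = np.vstack([X, np.ones((1, n))])      # absorb theta_0 via a constant feature
    reg = n * lam * np.identity(d + 1)
    reg[-1, -1] = 0.0                            # do not penalize the offset theta_0
    # solve (X_aug X_aug^T + n*lam*R) theta_aug = X_aug Y^T
    theta_aug = np.linalg.solve(X_aug @ X_aug.T + reg, X_aug @ Y.T)
    return theta_aug[:-1, :], theta_aug[-1:, :]  # theta (d x 1), theta_0 (1 x 1)
```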
Of course, this optimization problem can also be solved with gradient descent; a stochastic variant is implemented below.
The coding part of this chapter's homework mainly computes the gradients:
```python
import numpy as np

# In all the following definitions:
# x is d by n : input data
# y is 1 by n : output regression values
# th is d by 1 : weights
# th0 is 1 by 1 or scalar

def lin_reg(x, th, th0):
    return np.dot(th.T, x) + th0

def square_loss(x, y, th, th0):
    return (y - lin_reg(x, th, th0))**2

def mean_square_loss(x, y, th, th0):
    # the axis=1 and keepdims=True are important when x is a full matrix
    return np.mean(square_loss(x, y, th, th0), axis=1, keepdims=True)

def ridge_obj(x, y, th, th0, lam):
    return np.mean(square_loss(x, y, th, th0), axis=1, keepdims=True) \
           + lam * np.linalg.norm(th)**2

def d_lin_reg_th(x, th, th0):
    return x

def d_square_loss_th(x, y, th, th0):
    return -2 * (y - lin_reg(x, th, th0)) * d_lin_reg_th(x, th, th0)

def d_mean_square_loss_th(x, y, th, th0):
    return np.mean(d_square_loss_th(x, y, th, th0), axis=1, keepdims=True)

def d_lin_reg_th0(x, th, th0):
    return np.ones((1, x.shape[1]))

def d_square_loss_th0(x, y, th, th0):
    return -2 * (y - lin_reg(x, th, th0)) * d_lin_reg_th0(x, th, th0)

def d_mean_square_loss_th0(x, y, th, th0):
    return np.mean(d_square_loss_th0(x, y, th, th0), axis=1, keepdims=True)

def d_ridge_obj_th(x, y, th, th0, lam):
    return d_mean_square_loss_th(x, y, th, th0) + 2 * lam * th

def d_ridge_obj_th0(x, y, th, th0, lam):
    # the regularizer does not involve th0, so its derivative is unchanged
    return d_mean_square_loss_th0(x, y, th, th0)

# Concatenates the gradients with respect to theta and theta_0
def ridge_obj_grad(x, y, th, th0, lam):
    grad_th = d_ridge_obj_th(x, y, th, th0, lam)
    grad_th0 = d_ridge_obj_th0(x, y, th, th0, lam)
    return np.vstack([grad_th, grad_th0])
```
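A quick way to sanity-check these derivatives is a finite-difference comparison. The following sketch is an addition, not part of the assignment; the shapes, seed, and tolerance are illustrative choices.

```python
# Finite-difference check of ridge_obj_grad against a numerical gradient.
def num_grad(f, w, eps=1e-6):
    g = np.zeros_like(w)
    for i in range(w.shape[0]):
        delta = np.zeros_like(w)
        delta[i, 0] = eps
        g[i, 0] = (f(w + delta) - f(w - delta)) / (2 * eps)
    return g

np.random.seed(0)
x = np.random.randn(3, 5)
y = np.random.randn(1, 5)
w = np.random.randn(4, 1)          # theta stacked on top of theta_0
f = lambda w: float(ridge_obj(x, y, w[:-1, :], w[-1:, :], 0.01))
assert np.allclose(ridge_obj_grad(x, y, w[:-1, :], w[-1:, :], 0.01),
                   num_grad(f, w), atol=1e-6)
```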
The homework also implements the stochastic gradient descent algorithm described earlier:
```python
def sgd(X, y, J, dJ, w0, step_size_fn, max_iter):
    """Implements stochastic gradient descent

    Inputs:
    X: a standard data array (d by n)
    y: a standard labels row vector (1 by n)
    J: a cost function whose input is a data point (a column vector),
       a label (1 by 1) and a weight vector w (a column vector) (in that
       order), and which returns a scalar.
    dJ: a cost function gradient (corresponding to J) whose input is a
       data point (a column vector), a label (1 by 1) and a weight vector
       w (a column vector) (also in that order), and which returns a
       column vector.
    w0: an initial value of the weight vector w, which is a column vector.
    step_size_fn: a function that is given the (zero-indexed)
       iteration index (an integer) and returns a step size.
    max_iter: the number of iterations to perform

    Returns: a tuple (like gd):
    w: the value of the weight vector at the final step
    fs: the list of values of J found during all the iterations
    ws: the list of values of w found during all the iterations
    """
    w = w0
    fs = []
    ws = []
    for i in range(max_iter):
        # sample one training example uniformly at random
        j = np.random.randint(0, X.shape[1])
        prev_f = J(X[:, j:j+1], y[:, j:j+1], w)
        prev_grad = dJ(X[:, j:j+1], y[:, j:j+1], w)
        fs.append(prev_f); ws.append(w)
        if i == max_iter - 1:
            return w, fs, ws
        step = step_size_fn(i)
        w = w - step * prev_grad
```
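To connect the pieces, here is a hypothetical usage sketch: wrappers `J` and `dJ` pack $\theta$ and $\theta_0$ into one weight vector, as the `sgd` interface expects. The synthetic data, `lam`, and the step-size schedule are illustrative choices, not values prescribed by the assignment.

```python
# Illustrative data: y is roughly a linear function of X plus noise.
np.random.seed(0)
d, n = 2, 100
X = np.random.randn(d, n)
y = np.dot(np.array([[1.0, -2.0]]), X) + 0.5 + 0.1 * np.random.randn(1, n)

lam = 0.01  # illustrative regularization strength

def J(Xj, yj, w):
    # w packs theta (first d rows) and theta_0 (last row)
    return float(ridge_obj(Xj, yj, w[:-1, :], w[-1:, :], lam))

def dJ(Xj, yj, w):
    return ridge_obj_grad(Xj, yj, w[:-1, :], w[-1:, :], lam)

w0 = np.zeros((d + 1, 1))
w, fs, ws = sgd(X, y, J, dJ, w0, lambda i: 0.01 / (i + 1) ** 0.5, 1000)
print(w[:-1, :].T, w[-1:, :])   # learned theta and theta_0
```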
Applying regression to the earlier car fuel-efficiency (auto MPG) case study, we use grid search to select the most suitable hyperparameters:
```python
for order in [1, 2, 3]:
    for feature_type in [0, 1]:
        if order != 3:
            # for lam in range(0, 220, 20):
            for lam in [0, 0.01, 0.02, 0.03, 0.04, 0.05,
                        0.06, 0.07, 0.08, 0.09, 0.1]:
                auto_data_poly = hw5.make_polynomial_feature_fun(order)(auto_data[feature_type])
                score = hw5.xval_learning_alg(auto_data_poly, auto_values, lam, 10)
                # print((feature_type, order, lam, score))
                file.writelines(str(feature_type) + ',' + str(order) + ',' +
                                str(lam) + ',' + str(score[0, 0]) + '\n')
        else:
            # for lam in [0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1]:
            for lam in range(0, 220, 20):
                auto_data_poly = hw5.make_polynomial_feature_fun(order)(auto_data[feature_type])
                score = hw5.xval_learning_alg(auto_data_poly, auto_values, lam, 10)
                file.writelines(str(feature_type) + ',' + str(order) + ',' +
                                str(lam) + ',' + str(score[0, 0]) + '\n')
```
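Once the grid search has written its results, the best setting is simply the row with the lowest cross-validation score. A hypothetical post-processing snippet, assuming the output file was named `results.csv`:

```python
import csv

# Hypothetical post-processing; the filename 'results.csv' is an assumption.
with open('results.csv') as f:
    rows = [(float(score), int(ft), int(order), float(lam))
            for ft, order, lam, score in csv.reader(f)]
best_score, best_ft, best_order, best_lam = min(rows)
print('best cross-validation score %.4f at feature_type=%d, order=%d, lam=%g'
      % (best_score, best_ft, best_order, best_lam))
```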
The homework briefly discusses the influence of these hyperparameters. Too low a polynomial order can cause systematic error (underfitting), while too high an order can overfit; the $\lambda$ term trades off between overfitting and reducing training error. Regularizing $\theta$ by default also reflects the fact that as the order grows, driving the training error down tends to make $\theta$ very large.