CS231n Notes 1: Softmax Loss and Multiclass SVM Loss

Posted 2022-12-10 LiemZuvon



Softmax Loss

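Given $(x_i, y_i)$, where $x_i$ is an image and $y_i$ is its class label (an integer), let $s = f(x_i, W)$ be the scores output by the network. The loss is then defined as:
$P(Y = k | X = x_i) = \dfrac{e^{s_k}}{\sum_j e^{s_j}}$
$L_i = -\log P(Y = y_i | X = x_i)$
For example, if $s = [3.2, 5.1, -1.7]$ and $y_i = 0$, then $p = [0.13, 0.87, 0.00]$ and $L_i = -\log(0.13) \approx 2.04$ with the natural logarithm (the value $0.89$ is obtained with a base-10 logarithm).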
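The worked example can be reproduced numerically. The following is a minimal sketch, assuming NumPy is available and that the correct class is index 0 (the class whose probability, 0.13, enters the loss); the variable names are illustrative only.

```python
import numpy as np

s = np.array([3.2, 5.1, -1.7])  # scores from the worked example
y_i = 0                         # assumed index of the correct class

# Softmax: subtract the max before exponentiating for numerical stability
p = np.exp(s - s.max())
p /= p.sum()

loss_ln = -np.log(p[y_i])        # natural log, as np.log computes
loss_log10 = -np.log10(p[y_i])   # base-10 log

print(np.round(p, 2))            # probabilities, rounded to two decimals
print(round(float(loss_ln), 2), round(float(loss_log10), 2))
```

Subtracting the maximum score leaves the probabilities unchanged, because the common factor cancels between numerator and denominator.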

Vectorized Python code

import numpy as np

def softmax_loss(x, y):
  """
  Computes the loss and gradient for softmax classification.

  Inputs:
  - x: Input data, of shape (N, C) where x[i, j] is the score for the jth class
    for the ith input.
  - y: Vector of labels, of shape (N,) where y[i] is the label for x[i] and
    0 <= y[i] < C

  Returns a tuple of:
  - loss: Scalar giving the loss
  - dx: Gradient of the loss with respect to x
  """
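  # Shift each row by its max for numerical stability, then normalize
  probs = np.exp(x - np.max(x, axis=1, keepdims=True))
  probs /= np.sum(probs, axis=1, keepdims=True)
  N = x.shape[0]
  # Average negative log-probability of the correct class
  loss = -np.sum(np.log(probs[np.arange(N), y])) / N
  # Gradient w.r.t. the scores: probabilities minus one-hot labels, averaged over N
  dx = probs.copy()
  dx[np.arange(N), y] -= 1
  dx /= N
  return loss, dx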
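
The gradient computed in the last lines of the function (copy the probabilities, subtract 1 at the correct class, divide by N) follows from differentiating $L_i$ with respect to the scores. A short derivation sketch, where $p_k$ is shorthand introduced here for $P(Y = k | X = x_i)$ and $\mathbb{1}[\cdot]$ is the indicator function:

```latex
L_i = -\log \frac{e^{s_{y_i}}}{\sum_j e^{s_j}} = -s_{y_i} + \log \sum_j e^{s_j}

\frac{\partial L_i}{\partial s_k}
  = -\mathbb{1}[k = y_i] + \frac{e^{s_k}}{\sum_j e^{s_j}}
  = p_k - \mathbb{1}[k = y_i]

L = \frac{1}{N} \sum_{i=1}^{N} L_i
  \quad\Longrightarrow\quad
  \frac{\partial L}{\partial s^{(i)}_k} = \frac{1}{N} \left( p^{(i)}_k - \mathbb{1}[k = y_i] \right)
```

The final line is why the code divides dx by N: the returned loss is the mean over the batch, so each row of the gradient carries a factor of 1/N.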

Multiclass SVM Loss

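Given $(x_i, y_i)$, where $x_i$ is an image and $y_i$ is its class label (an integer), let $s = f(x_i, W)$ be the scores output by the network. The loss is then defined as:
$L_i = \sum_{j \neq y_i} \max(0, s_j - s_{y_i} + 1)$
For example, if $s = [3, 2, 5]$ and $y_i = 0$, then $L_i = \max(0, 2-3+1) + \max(0, 5-3+1) = 0 + 3 = 3$.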
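
A vectorized version of this loss can be written in the same style as softmax_loss. The following is a sketch under that assumption, not code from the original post: the name svm_loss and its signature simply mirror softmax_loss, and the subgradient convention (gradient 0 where a margin is exactly 0) is a choice made here.

```python
import numpy as np

def svm_loss(x, y):
    """Multiclass SVM loss and gradient for scores x of shape (N, C), labels y of shape (N,)."""
    N = x.shape[0]
    # Score of the correct class for each row, shaped (N, 1) for broadcasting
    correct = x[np.arange(N), y][:, np.newaxis]
    # Hinge margins max(0, s_j - s_{y_i} + 1) for every class j
    margins = np.maximum(0, x - correct + 1.0)
    # Exclude j = y_i from the sum
    margins[np.arange(N), y] = 0
    loss = np.sum(margins) / N

    # Each positive margin contributes +1 to its own class score
    # and -1 to the correct class score
    dx = np.zeros_like(x, dtype=float)
    dx[margins > 0] = 1
    dx[np.arange(N), y] -= np.sum(margins > 0, axis=1)
    dx /= N
    return loss, dx

# The worked example: s = [3, 2, 5], y_i = 0
loss, dx = svm_loss(np.array([[3.0, 2.0, 5.0]]), np.array([0]))
print(loss)  # 3.0
```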

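Questions to consider:
Question 1: What happens if we allow $j = y_i$ in the sum? What if we use the mean instead of the sum?
Answer 1: Allowing $j = y_i$ adds the constant term $\max(0, s_{y_i} - s_{y_i} + 1) = 1$ to every $L_i$; using the mean multiplies the result by a positive constant. In both cases the value of the loss differs from the original, but the new loss is an increasing function of the original one, so the model we end up with is unaffected. We can exploit this property to simplify our code.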
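
The effect of both variants can be checked on the earlier example $s = [3, 2, 5]$, $y_i = 0$. A small sketch, assuming NumPy; the helper name svm_variants is hypothetical and exists only for this illustration.

```python
import numpy as np

def svm_variants(s, y_i):
    """Return (original loss, loss with j = y_i included, mean over j != y_i)."""
    margins = np.maximum(0, s - s[y_i] + 1)
    with_correct = margins.sum()            # sum over all j, including j = y_i
    original = with_correct - margins[y_i]  # margins[y_i] is always exactly 1
    mean_version = original / (len(s) - 1)  # average over the C - 1 other classes
    return original, with_correct, mean_version

original, with_correct, mean_version = svm_variants(np.array([3.0, 2.0, 5.0]), 0)
print(original, with_correct, mean_version)  # 3.0 4.0 1.5
```

Including the correct class shifts the loss by exactly 1, and averaging divides it by C - 1, so neither variant changes which scores give a lower loss.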

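Question 2: At the start of training we initialize the weights close to zero, so $s$ is also close to zero. What is the loss in that case?
Answer 2: Since $s$ is close to zero, $s_{y_i} \approx s_j$, so each term $\max(0, s_j - s_{y_i} + 1) \approx 1$ and $L_i \approx C - 1$, where $C$ is the number of classes.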
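
This makes a convenient sanity check at initialization: with near-zero scores, the multiclass SVM loss should come out close to the number of classes minus one. A minimal sketch, assuming NumPy; the batch size, class count, and score scale are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, C = 5, 10                                  # illustrative batch size and class count
scores = 1e-4 * rng.standard_normal((N, C))   # near-zero scores, as at initialization
y = rng.integers(0, C, size=N)                # random labels in [0, C)

# Multiclass SVM loss averaged over the batch
correct = scores[np.arange(N), y][:, np.newaxis]
margins = np.maximum(0, scores - correct + 1)
margins[np.arange(N), y] = 0                  # exclude j = y_i
loss = margins.sum() / N

print(loss)  # close to C - 1
```

A first-iteration loss far from this value usually points to a bug in the loss code or the initialization.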
