PyTorch Learning Notes 3: Deep Learning Basics

Posted by 贪钱算法还我头发


Compiled from 龙良曲's PyTorch course videos. Video links:
【计算机-AI】PyTorch学这个就够了!
(好课推荐)深度学习与PyTorch入门实战——主讲人龙良曲

13. Gradients

  • Derivative
  • Partial derivative
  • Gradient (a vector)

How to search for minima?

  • $\theta_{t+1} = \theta_t - \alpha_t \nabla f(\theta_t)$
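
As a quick illustration (not from the original notes), the update rule above can be written directly with autograd; the objective f(θ) = (θ − 3)² below is a made-up example:

import torch

# minimal sketch of theta_{t+1} = theta_t - alpha * grad f(theta_t),
# using a made-up objective f(theta) = (theta - 3)^2
theta = torch.tensor(0., requires_grad=True)
alpha = 0.1

for _ in range(100):
    f = (theta - 3) ** 2
    f.backward()                      # compute grad f(theta_t)
    with torch.no_grad():
        theta -= alpha * theta.grad   # gradient descent step
    theta.grad.zero_()                # clear the accumulated gradient

print(theta)    # approaches tensor(3., requires_grad=True)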

Optimizer performance (comparison figure in the original slides)

14. Activation Functions

  • Continuous but not everywhere differentiable
  • Sigmoid / Logistic: $\sigma' = \sigma(1-\sigma)$
    torch.sigmoid()
    F.sigmoid() (import torch.nn.functional as F)
  • Tanh
    torch.tanh()
  • ReLU
    torch.relu()
    F.relu() (import torch.nn.functional as F)
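
For reference (not from the original notes), a small sketch applying these activations to a sample tensor:

import torch
import torch.nn.functional as F

z = torch.linspace(-3., 3., steps=5)    # tensor([-3.0, -1.5, 0.0, 1.5, 3.0])
print(torch.sigmoid(z))   # values squashed into (0, 1)
print(torch.tanh(z))      # values squashed into (-1, 1)
print(torch.relu(z))      # negative values clamped to 0
print(F.relu(z))          # same result via torch.nn.functional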

Typical Loss

  • Mean Squared Error
    MSE: $loss = \sum [y-(xw+b)]^2$
    L2-norm: $\|y-(xw+b)\|_2$
  • Cross Entropy Loss
    binary
    multi-class
    +softmax
    Leave it to Logistic Regression Part
  • Softmax
    soft version of max
    $S(y_i) = \frac{e^{y_i}}{\sum_j e^{y_j}}$
    $\frac{\partial p_i}{\partial y_j} = \begin{cases} p_i(1-p_i) & i = j \\ -p_j \cdot p_i & i \neq j \end{cases}$, where $p_i = S(y_i)$
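
As an illustrative sketch (not from the original notes, which defer the details to the logistic regression part), cross entropy and softmax are available through torch.nn.functional; the batch size of 4 and the 10 classes below are arbitrary:

import torch
import torch.nn.functional as F

logits = torch.randn(4, 10)             # 4 samples, 10 classes (arbitrary sizes)
target = torch.tensor([1, 0, 3, 9])     # ground-truth class indices

# F.cross_entropy applies log-softmax internally, so it expects raw logits
loss = F.cross_entropy(logits, target)
print(loss)     # scalar loss tensor

# softmax turns logits into probabilities that sum to 1 along dim=1
p = F.softmax(logits, dim=1)
print(p.sum(dim=1))     # tensor([1., 1., 1., 1.])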

Gradient API

  • torch.autograd.grad(loss, [w1, w2,...])
  • loss.backward()
import torch
import torch.nn.functional as F

x = torch.ones(1)
w = torch.full([1], 2.)
mse = F.mse_loss(torch.ones(1), x*w)
print(mse)  # tensor(1.)  (no grad_fn yet, since w does not require grad at this point)

# torch.autograd.grad(mse, [w]) # RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

print(w.requires_grad_())   # tensor([2.], requires_grad=True)

# print(torch.autograd.grad(mse, [w]))    # still fails: mse was built before requires_grad_ was set, so RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

mse = F.mse_loss(torch.ones(1), x*w)
# print(torch.autograd.grad(mse, [w]))    # (tensor([2.]),

mse.backward()
print(w.grad)   # tensor([2.])


a = torch.rand(3, requires_grad=True)
print(a)    # tensor([0.0377, 0.4542, 0.1386], requires_grad=True)
p = F.softmax(a, dim=0)
# p.backward()    # raises RuntimeError: grad can be implicitly created only for scalar outputs
# retain_graph=True keeps the computation graph so it can be backpropagated through again
print(torch.autograd.grad(p[0], [a], retain_graph=True))    # (tensor([ 0.1998, -0.1156, -0.0843]),)
print(torch.autograd.grad(p[1], [a], retain_graph=True))    # (tensor([-0.1156,  0.2434, -0.1278]),)
print(torch.autograd.grad(p[2], [a], retain_graph=True))    # (tensor([-0.0843, -0.1278,  0.2121]),)

15. Perceptron

Gradient of a single-output perceptron

$\frac{\partial E}{\partial w_{j0}} = (O_0 - t)O_0(1-O_0)x_j^0$

Multi-output perceptron

$\frac{\partial E}{\partial w_{jk}} = (O_k - t_k)O_k(1-O_k)x_j^0$

import torch
import torch.nn.functional as F

x = torch.randn(1, 10)
# w = torch.randn(1, 10, requires_grad=True)  # single-output perceptron
w = torch.randn(2, 10, requires_grad=True)  # multi-output perceptron
o = torch.sigmoid(x@w.t())
print(o.shape)  # torch.Size([1, 2])

loss = F.mse_loss(torch.ones(1, 1), o)  # broadcasting
print(loss.shape)   # torch.Size([])
print(loss)   # tensor(0.2094, grad_fn=<MseLossBackward>)

loss.backward()
print(w.grad)
"""
tensor([[-2.0498e-01,  2.4619e-02, -8.0208e-04, -1.3723e-01, -1.3014e-01,
         -1.4648e-01, -7.5119e-02,  4.9381e-02,  2.7161e-01,  4.8075e-02],
        [-4.8705e-03,  5.8495e-04, -1.9058e-05, -3.2607e-03, -3.0922e-03,
         -3.4804e-03, -1.7849e-03,  1.1733e-03,  6.4536e-03,  1.1423e-03]])
"""

16. Chain Rule

import torch
import torch.nn.functional as F

x = torch.tensor(1.)
w1 = torch.tensor(2., requires_grad=True)
b1 = torch.tensor(1.)
w2 = torch.tensor(2., requires_grad=True)
b2 = torch.tensor(1.)
y1 = x * w1 + b1
y2 = y1 * w2 + b2

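# check the chain rule: dy2/dw1 should equal (dy2/dy1) * (dy1/dw1)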
dy2_dy1 = torch.autograd.grad(y2, [y1], retain_graph=True)[0]
dy1_dw1 = torch.autograd.grad(y1, [w1], retain_graph=True)[0]
dy2_dw1 = torch.autograd.grad(y2, [w1], retain_graph=True)[0]

print(dy2_dy1 * dy1_dw1)    # tensor(2.)
print(dy2_dw1)  # tensor(2.)

17. Backpropagation

For an output-layer node $k \in K$: $\frac{\partial E}{\partial W_{jk}} = O_j\delta_k$

where $\delta_k = O_k(1-O_k)(O_k-t_k)$
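
The formula above can be checked against autograd with a small sketch (not from the original notes); it assumes a sigmoid output layer and the error $E = \frac{1}{2}\sum_k (O_k - t_k)^2$, which is the form the $\delta_k$ expression implies:

import torch

torch.manual_seed(0)
x = torch.randn(10)                          # previous-layer outputs O_j
W = torch.randn(10, 2, requires_grad=True)   # output-layer weights W_jk
t = torch.tensor([0., 1.])                   # targets t_k

O = torch.sigmoid(x @ W)                     # output-layer activations O_k
E = 0.5 * ((O - t) ** 2).sum()               # E = 1/2 * sum_k (O_k - t_k)^2
E.backward()

delta = O * (1 - O) * (O - t)                # delta_k = O_k(1-O_k)(O_k - t_k)
manual = x.unsqueeze(1) * delta.unsqueeze(0) # dE/dW_jk = O_j * delta_k
print(torch.allclose(W.grad, manual))        # True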
