PyTorch: Model Training - Model Parameters

Posted 2022-08-13 -柚子皮-


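Different ways to access model parameters

1. model.named_parameters(): iterating over it yields a (name, param) pair for every parameter, so each iteration can print both the name and the parameter.

for name, param in model.named_parameters():
    print(name, param.requires_grad)
    param.requires_grad = False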

[named_parameters(prefix: str = '', recurse: bool = True) → Iterator[Tuple[str, torch.Tensor]]]
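As a small illustration of the prefix and recurse arguments in the signature above, the sketch below uses a one-layer nn.Sequential as a stand-in model (the model and the prefix "enc" are assumptions made for the example, not part of the original post).

```python
import torch.nn as nn

# A tiny stand-in model; any nn.Module behaves the same way.
model = nn.Sequential(nn.Linear(2, 2))

default_names = [name for name, _ in model.named_parameters()]
prefixed_names = [name for name, _ in model.named_parameters(prefix="enc")]
# recurse=False only yields parameters owned directly by `model`;
# nn.Sequential owns none itself, so the list is empty.
direct_names = [name for name, _ in model.named_parameters(recurse=False)]

print(default_names)   # ['0.weight', '0.bias']
print(prefixed_names)  # ['enc.0.weight', 'enc.0.bias']
print(direct_names)    # []
```

The names are the dotted paths of the submodules, which is why they can later be used to pick out individual layers.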
2. model.parameters(): iterating over it yields only the param, without the name; that is the only difference from named_parameters(). Either one can be used to change requires_grad.

for param in model.parameters():
    print(param.requires_grad)
    param.requires_grad = False

[parameters(recurse: bool = True) → Iterator[torch.nn.parameter.Parameter]]
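3. model.state_dict().items(): each iteration yields a name and a tensor for every entry, but the tensors returned here are detached from the model, so requires_grad is always False and setting it has no effect on the model's own parameters. To change requires_grad, use one of the two methods above.

for name, param in model.state_dict().items():
    print(name, param.requires_grad)  # always False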
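The behaviour described for state_dict() can be checked with a short sketch. It uses a single nn.Linear as a stand-in model (an assumption for the example) and also shows the keep_vars=True option of state_dict(), which the original text does not mention and which returns the parameter objects themselves rather than detached tensors.

```python
import torch.nn as nn

model = nn.Linear(4, 2)  # stand-in model for illustration

# Default: the values are detached tensors, so requires_grad is False.
detached = model.state_dict()
detached_flags = {k: v.requires_grad for k, v in detached.items()}

# keep_vars=True: the values are the nn.Parameter objects themselves.
kept = model.state_dict(keep_vars=True)
kept_flags = {k: v.requires_grad for k, v in kept.items()}

print(detached_flags)                  # {'weight': False, 'bias': False}
print(kept_flags)                      # {'weight': True, 'bias': True}
print(kept["weight"] is model.weight)  # True
```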

Counting parameters

def get_parameter_number(net):
    total_num = sum(p.numel() for p in net.parameters())
    trainable_num = sum(p.numel() for p in net.parameters() if p.requires_grad)
    return {'Total': total_num, 'Trainable': trainable_num}

def get_parameter_number_details(net):
    # number of elements of each trainable parameter, keyed by name
    trainable_num_details = {name: p.numel() for name, p in net.named_parameters() if p.requires_grad}
    return {'Trainable': trainable_num_details}

model = DCN(...)
print(get_parameter_number(model))
print(get_parameter_number_details(model))

Note: torch.numel(input) → int Returns the total number of elements in the input tensor.
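To make the counts concrete, here is a minimal worked example on a toy network (the network is hypothetical and stands in for the DCN model above, whose arguments are not given). It repeats get_parameter_number so that the snippet runs on its own.

```python
import torch.nn as nn

def get_parameter_number(net):
    total_num = sum(p.numel() for p in net.parameters())
    trainable_num = sum(p.numel() for p in net.parameters() if p.requires_grad)
    return {'Total': total_num, 'Trainable': trainable_num}

# Hypothetical toy model: (10*5 + 5) + (5*1 + 1) = 61 parameters.
toy = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 1))
before = get_parameter_number(toy)

# Freeze the first linear layer (55 parameters).
for p in toy[0].parameters():
    p.requires_grad = False
after = get_parameter_number(toy)

print(before)  # {'Total': 61, 'Trainable': 61}
print(after)   # {'Total': 61, 'Trainable': 6}
```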

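Model parameter initialization

Initializing a neural network is a basic but important step of the training pipeline; it has a significant effect on the model's performance, on whether training converges, and on how fast it converges.

Initialization methods

Gaussian initialization [torch.randn]

PyTorch's built-in torch.nn.init module already implements the common initializers, such as normal, uniform, Xavier, and Kaiming initialization, so they can be used directly. The in-place functions end with an underscore; the older name without the underscore (nn.init.xavier_uniform) is deprecated.
nn.init.xavier_uniform_(net1[0].weight)

Xavier initialization [nn.init.xavier_uniform_(tensor, gain=1.0)]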
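As a rough check of what xavier_uniform_ does, the sketch below initializes one linear layer and compares the largest weight against the documented bound gain * sqrt(6 / (fan_in + fan_out)). The layer sizes are arbitrary choices for the example.

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(100, 50)  # weight shape (50, 100): fan_in=100, fan_out=50

nn.init.xavier_uniform_(layer.weight, gain=1.0)

# Samples come from U(-bound, bound) with bound = gain * sqrt(6 / (fan_in + fan_out)).
bound = 1.0 * math.sqrt(6.0 / (100 + 50))  # 0.2
max_abs = layer.weight.abs().max().item()

# Small tolerance for float32 rounding.
within_bound = max_abs <= bound + 1e-6
print(within_bound)  # True
```

Other initializers in torch.nn.init, such as nn.init.normal_, nn.init.uniform_, and nn.init.kaiming_normal_, are called in the same in-place style.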

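Declaring and initializing parameters

Option 1: allocate the parameter, then initialize it in place

self.kernel = nn.Parameter(torch.Tensor(num_pairs, 1))
nn.init.xavier_uniform_(self.kernel)

Option 2: wrap a tensor that is already initialized

route_weights = nn.Parameter(torch.randn(num_capsules, num_route_nodes, in_channels, out_channels))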
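Both options only take effect once the nn.Parameter is assigned as an attribute of an nn.Module. The following is a minimal sketch with a hypothetical module (the class name PairwiseKernel and its shapes are made up for the example); it also shows that a plain tensor attribute is not registered as a parameter.

```python
import torch
import torch.nn as nn

class PairwiseKernel(nn.Module):  # hypothetical module, for illustration only
    def __init__(self, num_pairs):
        super().__init__()
        # Option 1: allocate uninitialized storage, then initialize in place.
        self.kernel = nn.Parameter(torch.empty(num_pairs, 1))
        nn.init.xavier_uniform_(self.kernel)
        # Option 2: wrap a tensor that is already initialized.
        self.route_weights = nn.Parameter(torch.randn(num_pairs, 3))
        # A plain tensor attribute is NOT registered as a parameter.
        self.scratch = torch.zeros(3)

module = PairwiseKernel(num_pairs=6)
names = [name for name, _ in module.named_parameters()]
print(names)  # ['kernel', 'route_weights']
```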

Two common ways to initialize

Using PyTorch's built-in torch.nn.init functions

Replacing nn.Linear's default uniform initialization with a normal distribution:

self.linear_layers = nn.ModuleList(
    [nn.Linear(hidden_units[i], hidden_units[i + 1]) for i in range(len(hidden_units) - 1)])
for name, tensor in self.linear_layers.named_parameters():
    if 'weight' in name:
        nn.init.normal_(tensor, mean=0, std=init_std)  # init_std=0.0001

[https://pytorch.org/docs/stable/_modules/torch/nn/init.html]
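A self-contained variant of the snippet above is sketched here. Instead of filtering parameter names, it uses nn.Module.apply with an isinstance check, which is a different but equivalent way to reach the same layers. The layer sizes are illustrative, and zeroing the biases is an extra step that the original snippet does not perform.

```python
import torch
import torch.nn as nn

hidden_units = [64, 32, 16]  # illustrative sizes, not from the original post
init_std = 0.0001

linear_layers = nn.ModuleList(
    [nn.Linear(hidden_units[i], hidden_units[i + 1]) for i in range(len(hidden_units) - 1)])

def init_linear(module):
    # apply() calls this on every submodule; only linear layers are touched.
    if isinstance(module, nn.Linear):
        nn.init.normal_(module.weight, mean=0.0, std=init_std)
        nn.init.zeros_(module.bias)  # assumption: biases start at zero

linear_layers.apply(init_linear)

max_weight = max(layer.weight.abs().max().item() for layer in linear_layers)
print(max_weight < 100 * init_std)  # True: weights are tightly centred on 0
```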

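More flexible initialization with NumPy

For custom initialization schemes, tensor operations are sometimes less flexible than NumPy, so you can implement the initializer in NumPy and then convert the result to a tensor.
import numpy as np
import torch

for layer in net1.modules():
    if isinstance(layer, nn.Linear):  # only handle linear layers
        param_shape = layer.weight.shape
        # normal distribution with mean 0 and standard deviation 0.5;
        # .float() converts NumPy's float64 to the float32 the layer expects
        layer.weight.data = torch.from_numpy(np.random.normal(0, 0.5, size=param_shape)).float()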
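One way to write the NumPy-based initialization without touching .data is to copy the values in under torch.no_grad(). This is a sketch under assumptions: the two-layer network is a hypothetical stand-in for net1, and copy_ is used because it casts NumPy's float64 values to the parameter's own dtype.

```python
import numpy as np
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))  # hypothetical stand-in
rng = np.random.default_rng(0)

for layer in net.modules():
    if isinstance(layer, nn.Linear):
        # mean 0, standard deviation 0.5, same shape as the weight
        values = rng.normal(loc=0.0, scale=0.5, size=tuple(layer.weight.shape))
        with torch.no_grad():
            # copy_ writes in place and casts float64 to the weight's dtype
            layer.weight.copy_(torch.from_numpy(values))

print(net[0].weight.dtype)          # torch.float32
print(net[0].weight.requires_grad)  # True
```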

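Freezing the parameters of some layers

When loading a pretrained model, you may want to freeze the first few layers so that their parameters do not change during training.
First, find out the name of each parameter:

net = Network()
for name, value in net.named_parameters():
    print('name: {0},\t grad: {1}'.format(name, value.requires_grad))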
name: cnn.VGG_16.convolution1_1.weight, grad: True
name: cnn.VGG_16.convolution1_1.bias, grad: True
name: cnn.VGG_16.convolution1_2.weight, grad: True
name: cnn.VGG_16.convolution1_2.bias, grad: True
name: cnn.VGG_16.convolution2_1.weight, grad: True
name: cnn.VGG_16.convolution2_1.bias, grad: True
name: cnn.VGG_16.convolution2_2.weight, grad: True
name: cnn.VGG_16.convolution2_2.bias, grad: True

The trailing True means that the parameter is trainable.

Define a list of the parameters to freeze:
no_grad = [
    'cnn.VGG_16.convolution1_1.weight',
    'cnn.VGG_16.convolution1_1.bias',
    'cnn.VGG_16.convolution1_2.weight',
    'cnn.VGG_16.convolution1_2.bias'
]

Freeze them as follows:
for name, value in net.named_parameters():
    if name in no_grad:
        value.requires_grad = False
    else:
        value.requires_grad = True

Printing the per-layer information again after freezing:
name: cnn.VGG_16.convolution1_1.weight, grad: False
name: cnn.VGG_16.convolution1_1.bias, grad: False
name: cnn.VGG_16.convolution1_2.weight, grad: False
name: cnn.VGG_16.convolution1_2.bias, grad: False
name: cnn.VGG_16.convolution2_1.weight, grad: True
name: cnn.VGG_16.convolution2_1.bias, grad: True
name: cnn.VGG_16.convolution2_2.weight, grad: True
name: cnn.VGG_16.convolution2_2.bias, grad: True

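The weight and bias of the first two layers now have requires_grad set to False, meaning they will not be trained.
Finally, when defining the optimizer, pass it only the parameters whose requires_grad is True.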

optimizer = optim.Adam(filter(lambda p: p.requires_grad, net.parameters()), lr=0.01)
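The effect of freezing can be verified by taking one optimizer step and comparing parameters before and after. The sketch below uses a small nn.Sequential as a hypothetical stand-in for Network, so the parameter names ('0.weight', '0.bias') differ from the VGG names above.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))  # hypothetical stand-in
no_grad = ['0.weight', '0.bias']  # parameters of the first layer

for name, value in net.named_parameters():
    value.requires_grad = name not in no_grad

optimizer = optim.Adam(filter(lambda p: p.requires_grad, net.parameters()), lr=0.01)

frozen_before = net[0].weight.detach().clone()
trainable_before = net[2].bias.detach().clone()

# One training step on random data with a dummy loss.
loss = net(torch.randn(16, 4)).sum()
loss.backward()
optimizer.step()

frozen_unchanged = torch.equal(net[0].weight, frozen_before)
trainable_changed = not torch.equal(net[2].bias, trainable_before)
print(frozen_unchanged, trainable_changed)  # True True
```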

[https://www.zhihu.com/question/311095447/answer/589307812]

[https://mp.weixin.qq.com/s/o-1XZ3OeyDwZp67EdPHu5g]

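Loading built-in pretrained models

The torchvision.models module provides the following model families: AlexNet, VGG, ResNet, SqueezeNet, and DenseNet.
They are imported as follows:
import torchvision.models as models
resnet18 = models.resnet18()
alexnet = models.alexnet()
vgg16 = models.vgg16()
An important argument is pretrained. It defaults to False, which means only the model architecture is created and its weights are randomly initialized.
If pretrained is True, the model is loaded with weights pretrained on the ImageNet dataset. In torchvision 0.13 and later, pretrained is deprecated in favor of the weights argument.
resnet18 = models.resnet18(pretrained=True)

More models are listed at: https://pytorch-cn.readthedocs.io/zh/latest/torchvision/torchvision-models/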

from: -柚子皮-
