How to get weights shape for each layer?

Posted: 2019-02-27 09:19:54

Question:

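There is a good question on how to get a model summary in PyTorch (Model summary in pytorch), but it does not output the weight shapes.

Is it also possible to output the weight shape of each layer?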

Answer 1:

It looks like this is possible. Here is an example:

import torch
from torchvision import models

m = models.resnet18()
print(m)
print('-' * 60)
# named_parameters() yields (name, parameter) pairs for every learnable tensor
for name, param in m.named_parameters():
    print(name, ':', tuple(param.shape))

Which outputs:

ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (1): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (layer2): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (downsample): Sequential(
        (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (layer3): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (downsample): Sequential(
        (0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (layer4): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (downsample): Sequential(
        (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (avgpool): AvgPool2d(kernel_size=7, stride=1, padding=0)
  (fc): Linear(in_features=512, out_features=1000, bias=True)
)
------------------------------------------------------------
conv1.weight : (64, 3, 7, 7)
bn1.weight : (64,)
bn1.bias : (64,)
layer1.0.conv1.weight : (64, 64, 3, 3)
layer1.0.bn1.weight : (64,)
layer1.0.bn1.bias : (64,)
layer1.0.conv2.weight : (64, 64, 3, 3)
layer1.0.bn2.weight : (64,)
layer1.0.bn2.bias : (64,)
layer1.1.conv1.weight : (64, 64, 3, 3)
layer1.1.bn1.weight : (64,)
layer1.1.bn1.bias : (64,)
layer1.1.conv2.weight : (64, 64, 3, 3)
layer1.1.bn2.weight : (64,)
layer1.1.bn2.bias : (64,)
layer2.0.conv1.weight : (128, 64, 3, 3)
layer2.0.bn1.weight : (128,)
layer2.0.bn1.bias : (128,)
layer2.0.conv2.weight : (128, 128, 3, 3)
layer2.0.bn2.weight : (128,)
layer2.0.bn2.bias : (128,)
layer2.0.downsample.0.weight : (128, 64, 1, 1)
layer2.0.downsample.1.weight : (128,)
layer2.0.downsample.1.bias : (128,)
layer2.1.conv1.weight : (128, 128, 3, 3)
layer2.1.bn1.weight : (128,)
layer2.1.bn1.bias : (128,)
layer2.1.conv2.weight : (128, 128, 3, 3)
layer2.1.bn2.weight : (128,)
layer2.1.bn2.bias : (128,)
layer3.0.conv1.weight : (256, 128, 3, 3)
layer3.0.bn1.weight : (256,)
layer3.0.bn1.bias : (256,)
layer3.0.conv2.weight : (256, 256, 3, 3)
layer3.0.bn2.weight : (256,)
layer3.0.bn2.bias : (256,)
layer3.0.downsample.0.weight : (256, 128, 1, 1)
layer3.0.downsample.1.weight : (256,)
layer3.0.downsample.1.bias : (256,)
layer3.1.conv1.weight : (256, 256, 3, 3)
layer3.1.bn1.weight : (256,)
layer3.1.bn1.bias : (256,)
layer3.1.conv2.weight : (256, 256, 3, 3)
layer3.1.bn2.weight : (256,)
layer3.1.bn2.bias : (256,)
layer4.0.conv1.weight : (512, 256, 3, 3)
layer4.0.bn1.weight : (512,)
layer4.0.bn1.bias : (512,)
layer4.0.conv2.weight : (512, 512, 3, 3)
layer4.0.bn2.weight : (512,)
layer4.0.bn2.bias : (512,)
layer4.0.downsample.0.weight : (512, 256, 1, 1)
layer4.0.downsample.1.weight : (512,)
layer4.0.downsample.1.bias : (512,)
layer4.1.conv1.weight : (512, 512, 3, 3)
layer4.1.bn1.weight : (512,)
layer4.1.bn1.bias : (512,)
layer4.1.conv2.weight : (512, 512, 3, 3)
layer4.1.bn2.weight : (512,)
layer4.1.bn2.bias : (512,)
fc.weight : (1000, 512)
fc.bias : (1000,)
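
Note that the listing above only covers learnable parameters, so BatchNorm buffers such as running_mean and running_var do not appear. If those are wanted too, one option is to read the shapes from state_dict() instead. The following is a minimal sketch, not part of the original answer: the helper name weight_shapes and the small nn.Sequential model are made up for illustration, so that the example runs without torchvision.

```python
import torch
from torch import nn


def weight_shapes(model, include_buffers=False):
    """Return {tensor name: shape tuple} for a module (hypothetical helper).

    With include_buffers=True the shapes come from state_dict(), which also
    contains persistent buffers such as BatchNorm running statistics.
    """
    if include_buffers:
        tensors = model.state_dict()
    else:
        tensors = dict(model.named_parameters())
    return {name: tuple(t.shape) for name, t in tensors.items()}


# Small stand-in model (an assumption for this sketch, not the ResNet above).
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, bias=False),
    nn.BatchNorm2d(8),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(8, 4),
)

for name, shape in weight_shapes(model, include_buffers=True).items():
    print(name, ':', shape)
```

Calling weight_shapes(m) on the resnet18 model from the answer should give the same names and shapes as the listing above, and passing include_buffers=True adds the buffer entries.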
