How to get the output dimensions of each layer of a neural network in PyTorch?

Posted 2023-03-12

【Title】: How to get an output dimension for each layer of the Neural Network in Pytorch?
【Posted】: 2019-09-16 10:05:31
【Question】:

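import torch
import torch.nn as nn

class Model(nn.Module):
  def __init__(self):
    super(Model, self).__init__()
    # kernel_size=3 and padding=1 are assumed values: the post omitted the
    # required kernel_size. With a 3x32x32 input they give 16*16*16 = 4096.
    self.net = nn.Sequential(
      nn.Conv2d(in_channels = 3, out_channels = 16, kernel_size = 3, padding = 1),
      nn.ReLU(),
      nn.MaxPool2d(2),
      nn.Conv2d(in_channels = 16, out_channels = 16, kernel_size = 3, padding = 1),
      nn.ReLU(),
      nn.Flatten(),  # the post used a custom Flatten() that was not shown
      nn.Linear(4096, 64),
      nn.ReLU(),
      nn.Linear(64, 10))

  def forward(self, x):
    return self.net(x)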

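I built this model without a solid understanding of neural networks; I just adjusted the parameters until training worked. I am not sure how to get the output dimensions of each layer (for example, the output dimensions after the first layer).

Is there a simple way to do this in PyTorch?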

【Comments】:

Does this answer your question? Model summary in pytorch

【Answer 1】:

A simple way is to:

Pass an input through the model.
Print the size of the output after each layer.

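import torch
import torch.nn as nn

class Model(nn.Module):
  def __init__(self):
    super(Model, self).__init__()
    # kernel_size=3 and padding=1 are assumed values: the post omitted the
    # required kernel_size. With a 3x32x32 input they give 16*16*16 = 4096.
    self.net = nn.Sequential(
      nn.Conv2d(in_channels = 3, out_channels = 16, kernel_size = 3, padding = 1),
      nn.ReLU(),
      nn.MaxPool2d(2),
      nn.Conv2d(in_channels = 16, out_channels = 16, kernel_size = 3, padding = 1),
      nn.ReLU(),
      nn.Flatten(),  # the post used a custom Flatten() that was not shown
      nn.Linear(4096, 64),
      nn.ReLU(),
      nn.Linear(64, 10))

  def forward(self, x):
    # Run the layers one at a time and print the shape after each of them
    for layer in self.net:
        x = layer(x)
        print(x.size())
    return x

model = Model()
# The answer used torch.randn(1, 3, 224, 224), which does not flatten to
# 4096 features; 32x32 does under the assumed conv settings above.
x = torch.randn(1, 3, 32, 32)

# Let's print it
model(x)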

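Be careful with the input size, though, because the network uses nn.Linear. If the flattened tensor that reaches the first nn.Linear does not have exactly 4096 features, the forward pass fails with a size mismatch.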
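The 4096 can be checked by hand with the output-size formula given in the nn.Conv2d and nn.MaxPool2d documentation. The sketch below is only an illustration: the 32x32 input and the 3x3 convolutions with padding 1 are assumptions (the question does not state them), picked because they happen to produce 4096.

```python
import math

def conv_out(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output size along one spatial dimension for Conv2d / MaxPool2d (floor mode)."""
    return math.floor((size + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1)

# Assumed settings, not given in the question: 32x32 input, 3x3 convs with padding 1.
size = 32
size = conv_out(size, kernel_size=3, padding=1)   # first Conv2d  -> 32
size = conv_out(size, kernel_size=2, stride=2)    # MaxPool2d(2)  -> 16
size = conv_out(size, kernel_size=3, padding=1)   # second Conv2d -> 16

flat_features = 16 * size * size                  # 16 output channels
print(flat_features)  # 4096
```

Any other input size or kernel setting changes `flat_features`, and the first nn.Linear has to change with it.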

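【Discussion】:

What is (1, 3, 244, 244)?

Just create a dummy example of the input: x = torch.randn(1, 3, 244, 244)

My model has multiple networks and multiple inputs, and some of the inputs look like torch.Size([20,16000])?

【Answer 2】:

You can use torchsummary, for example with an ImageNet-sized input (3x224x224):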

from torchvision import models
from torchsummary import summary

vgg = models.vgg16()
summary(vgg, (3, 224, 224))


----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1         [-1, 64, 224, 224]           1,792
              ReLU-2         [-1, 64, 224, 224]               0
            Conv2d-3         [-1, 64, 224, 224]          36,928
              ReLU-4         [-1, 64, 224, 224]               0
         MaxPool2d-5         [-1, 64, 112, 112]               0
            Conv2d-6        [-1, 128, 112, 112]          73,856
              ReLU-7        [-1, 128, 112, 112]               0
            Conv2d-8        [-1, 128, 112, 112]         147,584
              ReLU-9        [-1, 128, 112, 112]               0
        MaxPool2d-10          [-1, 128, 56, 56]               0
           Conv2d-11          [-1, 256, 56, 56]         295,168
             ReLU-12          [-1, 256, 56, 56]               0
           Conv2d-13          [-1, 256, 56, 56]         590,080
             ReLU-14          [-1, 256, 56, 56]               0
           Conv2d-15          [-1, 256, 56, 56]         590,080
             ReLU-16          [-1, 256, 56, 56]               0
        MaxPool2d-17          [-1, 256, 28, 28]               0
           Conv2d-18          [-1, 512, 28, 28]       1,180,160
             ReLU-19          [-1, 512, 28, 28]               0
           Conv2d-20          [-1, 512, 28, 28]       2,359,808
             ReLU-21          [-1, 512, 28, 28]               0
           Conv2d-22          [-1, 512, 28, 28]       2,359,808
             ReLU-23          [-1, 512, 28, 28]               0
        MaxPool2d-24          [-1, 512, 14, 14]               0
           Conv2d-25          [-1, 512, 14, 14]       2,359,808
             ReLU-26          [-1, 512, 14, 14]               0
           Conv2d-27          [-1, 512, 14, 14]       2,359,808
             ReLU-28          [-1, 512, 14, 14]               0
           Conv2d-29          [-1, 512, 14, 14]       2,359,808
             ReLU-30          [-1, 512, 14, 14]               0
        MaxPool2d-31            [-1, 512, 7, 7]               0
           Linear-32                 [-1, 4096]     102,764,544
             ReLU-33                 [-1, 4096]               0
          Dropout-34                 [-1, 4096]               0
           Linear-35                 [-1, 4096]      16,781,312
             ReLU-36                 [-1, 4096]               0
          Dropout-37                 [-1, 4096]               0
           Linear-38                 [-1, 1000]       4,097,000
================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 218.59
Params size (MB): 527.79
Estimated Total Size (MB): 746.96
----------------------------------------------------------------

Source: model-summary-in-pytorch

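【Discussion】:

What is (3, 244, 244)?

@Dawn17 It is the size of a single image (for MNIST it would be 1x28x28).

I get this error:

RuntimeError: Expected 4-dimensional input for 4-dimensional weight [64, 3, 3, 3], but got 5-dimensional input of size [2, 1, 3, 224, 224] instead

What size do I have to put in?

@Dawn17 I would need to see your code to help you, but my guess is that you are feeding MNIST, which is 1x28x28, into a network that expects the VGG input of 3x224x224. So first, try reshaping in the forward method with 'out.view(out.shape[0], -1)'; second, replace the VGG in my example with your own model.

【Answer 3】:

Similar to David Ng's answer, but slightly shorter: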

def get_output_shape(model, image_dim):
    return model(torch.rand(*(image_dim))).data.shape

In this example, I needed to find the input size of the last linear layer:

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.expected_input_shape = (1, 1, 192, 168)
        self.conv1 = nn.Conv2d(1, 32, 3, 1) 
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout2d(0.25)
        self.dropout2 = nn.Dropout2d(0.5)
        self.maxpool1 = nn.MaxPool2d(2)
        self.maxpool2 = nn.MaxPool2d(3)

        # Calculate the input of the Linear layer
        conv1_out = get_output_shape(self.maxpool1, get_output_shape(self.conv1, self.expected_input_shape))
        conv2_out = get_output_shape(self.maxpool2, get_output_shape(self.conv2, conv1_out))
        fc1_in = int(np.prod(list(conv2_out))) # Flatten; the batch dimension is 1

        self.fc1 = nn.Linear(fc1_in, 38)

    def forward(self, x):
        x = self.conv1(x) 
        x = F.relu(x)
        x = self.maxpool1(x) 
        x = self.conv2(x)
        x = F.relu(x)
        x = self.maxpool2(x) 
        x = self.dropout1(x) 
        x = torch.flatten(x, 1) # flatten to a single dimension
        x = self.fc1(x) 
        output = F.log_softmax(x, dim=1) 
        return output

This way, if I change any of the earlier layers, I don't have to recalculate the size by hand!
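A different way to avoid the manual calculation, which this answer does not use, is nn.LazyLinear: it leaves in_features unspecified and infers it on the first forward pass. The following is a minimal sketch, assuming PyTorch 1.8 or newer; `lazy_net` is a hypothetical, shortened stand-in with a single conv block rather than the full Net above.

```python
import torch
import torch.nn as nn

# Hypothetical, shortened stand-in for the network above.
lazy_net = nn.Sequential(
    nn.Conv2d(1, 32, 3, 1),   # 192x168 -> 190x166
    nn.ReLU(),
    nn.MaxPool2d(2),          # 190x166 -> 95x83
    nn.Flatten(),
    nn.LazyLinear(38),        # in_features is left unspecified
)

# The first forward pass materializes the weight of the lazy layer.
out = lazy_net(torch.rand(1, 1, 192, 168))
print(lazy_net[-1].in_features)  # 32 * 95 * 83 = 252320
print(out.shape)                 # torch.Size([1, 38])
```

The trade-off is that the model has to see one batch before its parameters exist, for example before they are handed to an optimizer.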

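My answer is based on this answer

【Discussion】:

【Answer 4】:

Another way to get the size after a particular layer in an nn.Sequential container is to add a custom Module that simply prints the size of its input.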

class PrintSize(nn.Module):
  def __init__(self):
    super(PrintSize, self).__init__()
    
  def forward(self, x):
    print(x.shape)
    return x

Now you can do this:

model = nn.Sequential(
    nn.Conv2d(3, 10, 5, 1),
    # lots of convolutions, pooling, etc.
    nn.Flatten(),
    PrintSize(),
    nn.Linear(1, 12),  # the input dim of 1 is just a placeholder
)

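Now you can call model(x), and it prints the shape of the tensor at the point where PrintSize sits (here, right after nn.Flatten). The call still fails at the placeholder nn.Linear, but the shape is printed first. This is useful if you have a lot of convolutions and want to work out the input size of the first fully connected layer. You don't need to rewrite your nn.Sequential as a Module, and the helper class drops in with a single line.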
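A complete run might look like the sketch below. The 3x32x32 input and the single convolution are assumptions made for illustration, and `features` and `head` are hypothetical names. The printed size is what replaces the placeholder in nn.Linear.

```python
import torch
import torch.nn as nn

class PrintSize(nn.Module):
    """Identity layer that prints the shape of whatever passes through it."""
    def forward(self, x):
        print(x.shape)
        return x

features = nn.Sequential(
    nn.Conv2d(3, 10, 5, 1),  # 32x32 -> 28x28
    nn.Flatten(),
    PrintSize(),             # prints torch.Size([1, 7840])
)

x = torch.randn(1, 3, 32, 32)  # assumed input size
flat = features(x)

# Use the observed size as in_features of the first fully connected layer.
head = nn.Linear(flat.shape[1], 12)
out = head(flat)
```

Once the size is known, the PrintSize line can be deleted without touching the rest of the model.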

【Discussion】:

If we don't want to change the existing model, layer.register_forward_hook is better.

【Answer 5】:

Here is a solution in the form of a helper function:

import torch

def get_tensor_dimensions_impl(model, layer, image_size, for_input=False):
    t_dims = None
    def _local_hook(_, _input, _output):
        nonlocal t_dims
        t_dims = _input[0].size() if for_input else _output.size()
        return _output    
    # Keep the handle so the hook can be removed after the forward pass
    handle = layer.register_forward_hook(_local_hook)
    dummy_var = torch.zeros(1, 3, image_size, image_size)
    model(dummy_var)
    handle.remove()
    return t_dims

Example:

from torchvision import models, transforms

a_model = models.squeezenet1_0(pretrained=True) 
get_tensor_dimensions_impl(a_model, a_model._modules['classifier'], 224)

The output is:

torch.Size([1, 1000, 1, 1])
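The same hook idea can be extended to report every layer at once, which is what the question asks for. The sketch below is one possible way to do it, not part of the answer above: `collect_output_shapes` and the small `demo` model are hypothetical names, and only leaf modules (those without children) are hooked.

```python
import torch
import torch.nn as nn

def collect_output_shapes(model, dummy_input):
    """Run one forward pass and record the output shape of every leaf module."""
    shapes = []
    handles = []

    def make_hook(name):
        def hook(module, inputs, output):
            # Only tensors have a shape; skip tuple outputs (e.g. from RNNs).
            if isinstance(output, torch.Tensor):
                shapes.append((name, type(module).__name__, tuple(output.shape)))
        return hook

    for name, module in model.named_modules():
        if len(list(module.children())) == 0:  # leaf modules only
            handles.append(module.register_forward_hook(make_hook(name)))

    try:
        with torch.no_grad():
            model(dummy_input)
    finally:
        for handle in handles:
            handle.remove()  # leave the model unmodified
    return shapes

# Small hypothetical model for demonstration.
demo = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(16 * 16 * 16, 10),
)

shapes = collect_output_shapes(demo, torch.randn(1, 3, 32, 32))
for name, kind, shape in shapes:
    print(name, kind, shape)
```

Because the hooks are removed in the `finally` block, the model behaves exactly as before once the function returns.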

【Discussion】:

【Answer 6】:

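Maybe you can try print(model.state_dict()['next_layer.weight'].shape). The input dimension of the next layer's weight gives you a hint about the output shape of the layer before it.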
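As a rough illustration of what that lookup returns: the key 'next_layer.weight' depends on the attribute names in your own model, and inside an nn.Sequential the keys are index-based instead. The `mlp` model below is a hypothetical example.

```python
import torch.nn as nn

mlp = nn.Sequential(nn.Linear(4096, 64), nn.ReLU(), nn.Linear(64, 10))

# nn.Linear stores its weight as (out_features, in_features).
for name, tensor in mlp.state_dict().items():
    if name.endswith("weight"):
        print(name, tuple(tensor.shape))
# 0.weight (64, 4096)
# 2.weight (10, 64)
```

This needs no forward pass, but it only shows parameter shapes. For convolutional layers the weight gives the channel counts and kernel size, not the spatial size of the output.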

【Discussion】:

【Answer 7】:

for layer in model.children():
    if hasattr(layer, 'out_features'):
        print(layer.out_features)
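One caveat with this loop is that children() only visits direct submodules, so layers wrapped in an nn.Sequential (as in the question) are not reached. A sketch of the difference, using a hypothetical `Wrapped` model:

```python
import torch.nn as nn

class Wrapped(nn.Module):
    """Hypothetical model that nests its layers inside nn.Sequential."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(4096, 64), nn.ReLU(), nn.Linear(64, 10))

    def forward(self, x):
        return self.net(x)

wrapped = Wrapped()

# children() only yields `net`, which has no out_features attribute.
direct = [m.out_features for m in wrapped.children() if hasattr(m, "out_features")]

# modules() walks the whole tree, so the nested Linear layers are found.
nested = [m.out_features for m in wrapped.modules() if hasattr(m, "out_features")]

print(direct)  # []
print(nested)  # [64, 10]
```

Note that out_features exists only on layers such as nn.Linear; convolutional layers expose out_channels instead, and neither attribute tells you the spatial dimensions.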

【Discussion】:
