Why/how does model.forward() succeed whether the input is a mini-batch or a single item?

Why, and how, does this work?

When I run the forward phase on an input that is

  • a mini-batch tensor
  • or, alternatively, a single input item,

model.__call__() (which AFAIK calls forward()) swallows it and spits out the appropriate output (i.e. a tensor of mini-batch estimates, or a single-item tensor of estimates).

The test code below, adapted from the PyTorch NN example, shows what I mean, but I don't understand it.

I would expect this to create problems and force me to transform the single-item input into a mini-batch of size 1 (reshape(1, xxx)) or the like, as I do in the code below.

(I ran variations of the test to make sure the behavior does not, e.g., depend on execution order.)

# -*- coding: utf-8 -*-
import torch

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
#N, D_in, H, D_out = 64, 1000, 100, 10
N, D_in, H, D_out = 64, 10, 4, 3

# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Use the nn package to define our model as a sequence of layers. nn.Sequential
# is a Module which contains other Modules, and applies them in sequence to
# produce its output. Each Linear Module computes output from input using a
# linear function, and holds internal Tensors for its weight and bias.
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

# The nn package also contains definitions of popular loss functions; in this
# case we will use Mean Squared Error (MSE) as our loss function.
loss_fn = torch.nn.MSELoss(reduction='sum')

learning_rate = 1e-4
for t in range(1):
    # Forward pass: compute predicted y by passing x to the model. Module objects
    # override the __call__ operator so you can call them like functions. When
    # doing so you pass a Tensor of input data to the Module and it produces
    # a Tensor of output data.
    model.eval()
    print ("###########")
    print ("x[0]",x[0])
    print ("x[0].size()", x[0].size())
    y_1pred = model(x[0])
    print ("y_1pred.size()", y_1pred.size())
    print (y_1pred)

    model.eval()
    print ("###########")
    print ("x.size()", x.size())
    y_pred = model(x)
    print ("y_pred.size()", y_pred.size())
    print ("y_pred[0]", y_pred[0])

    print ("###########")
    model.eval()
    input_item = x[0]
    batch_len1_shape = torch.Size([1,*(input_item.size())])
    batch_len1 = input_item.reshape(batch_len1_shape)
    y_pred_batch_len1 = model(batch_len1) 

    print ("input_item",input_item)
    print ("input_item.size()", input_item.size())
    print ("y_pred_batch_len1.size()", y_pred_batch_len1.size())
    print (y_1pred)

    raise Exception

This is the output it produces:

###########
x[0] tensor([-1.3901, -0.2659,  0.4352, -0.6890,  0.1098, -0.3124,  0.6419,  1.1004,
        -0.7910, -0.5389])
x[0].size() torch.Size([10])
y_1pred.size() torch.Size([3])
tensor([-0.5366, -0.4826,  0.0538], grad_fn=<AddBackward0>)
###########
x.size() torch.Size([64, 10])
y_pred.size() torch.Size([64, 3])
y_pred[0] tensor([-0.5366, -0.4826,  0.0538], grad_fn=<SelectBackward>)
###########
input_item tensor([-1.3901, -0.2659,  0.4352, -0.6890,  0.1098, -0.3124,  0.6419,  1.1004,
        -0.7910, -0.5389])
input_item.size() torch.Size([10])
y_pred_batch_len1.size() torch.Size([1, 3])
tensor([-0.5366, -0.4826,  0.0538], grad_fn=<AddBackward0>)
Answer

The documentation of nn.Linear states:

Input: (N, *, in_features) where * means any number of additional dimensions
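For instance (an illustrative check, not from the original answer; the names lin and x_extra are mine), the additional dimensions denoted by * pass straight through:

import torch

lin = torch.nn.Linear(10, 3)
x_extra = torch.randn(2, 5, 10)   # (N, *, in_features) with N=2 and *=5
print(lin(x_extra).size())        # torch.Size([2, 5, 3])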

So one would naturally expect that at least two dimensions are necessary. However, if we look under the hood, we will see that Linear is implemented in terms of nn.functional.linear, which dispatches to torch.addmm or torch.matmul (depending on whether bias == True), and both of those broadcast their arguments.
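A minimal sketch of that 1-D case (the variable names are mine, not from the post): torch.matmul treats a 1-D first operand as a vector, so the single item turns into a vector-matrix product and no batch dimension is required:

import torch

D_in, D_out = 10, 3
lin = torch.nn.Linear(D_in, D_out)
x_1d = torch.randn(D_in)                    # a single item, shape (10,)

# What nn.functional.linear effectively computes here: a vector-matrix
# product plus bias, yielding shape (3,) -- no batch dimension involved.
manual = torch.matmul(x_1d, lin.weight.t()) + lin.bias

print(torch.allclose(lin(x_1d), manual))    # True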

So this behavior is likely a bug (or an error in the documentation), and I would not depend on it continuing to work in the future, if I were you.
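If you prefer to stay within the documented (N, *, in_features) contract, a sketch of the explicit alternative, reusing model and x from the test code above: add the batch dimension with unsqueeze before the call and strip it afterwards:

# Make the batch dimension explicit instead of relying on broadcasting.
single = x[0]                                        # shape (10,)
y_single = model(single.unsqueeze(0)).squeeze(0)     # (1, 10) -> (1, 3) -> (3,)
print(y_single.size())                               # torch.Size([3])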
