Pytorch - element 0 of tensors does not require grad and does not have a grad_fn - Adding and Multiplying matrices as NN step parameters


【Posted】:2021-10-26 16:33:32

【Question】:

I am new to PyTorch.

I have implemented a custom model (based on a research paper), and I get the following error when trying to train it:

element 0 of tensors does not require grad and does not have a grad_fn

Here is my model code:

import torch
import torch.nn as tnn

class Classification(tnn.Module):
    def __init__(self, n_classes=7):
        super(Classification, self).__init__()
        self.feature_extractor1 = VGG16FeatureExtactor() # tnn.Module
        self.feature_extractor2 = VGG16FeatureExtactor() # tnn.Module
        self.conv = conv_layer_relu(chann_in=512, chann_out=512, k_size=1, p_size=0) # tnn.Sequential
        self.attn1 = AttentionBlock1() # tnn.Module
        self.attn2 = AttentionBlock2() # tnn.Module

        # FC layers
        self.linear1 = vggFCLayer(512, 256) # tnn.Sequential
        self.linear2 = vggFCLayer(256, 128) # tnn.Sequential

        # Final layer
        self.final = tnn.Linear(128, n_classes)

    def forward(self, x, x_lbp):
        features1 = self.feature_extractor1(x)
        features2 = self.feature_extractor2(x_lbp)
        f3 = torch.add(features1, features2)
        # Apply attention block1 to features1
        attn1 = self.attn1(features1)
        # Create mask using Attention Block 2
        attn2 = self.attn2(features2)
        mask = attn1 * attn2
        add = f3 + mask

        out = self.conv(add)
        out = out.view(out.size(0), -1)
        out = self.linear1(out)
        out = self.linear2(out)
        out = self.final(out)
        return out

Here is my training code:

import torch.nn as nn
import torch.optim as optim

criterion = nn.CrossEntropyLoss(weight=weights)
optimizer = optim.SGD(model.parameters(), lr=lr, momentum=0.9, nesterov=True)
scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, 'min')

    def train(self, epoch, trainloader):
        self.model.train()
        for batch_idx, (inputs, targets) in enumerate(trainloader):
            lbp = lbp_transform(inputs)
            self.optimizer.zero_grad()
            with torch.no_grad():
                outputs = self.model(inputs, lbp)
            loss = self.criterion(outputs, targets)
            loss.backward()
            self.scheduler.step(loss)
            utils.clip_gradient(self.optimizer, 0.1)
            self.optimizer.step()

EDIT:

Here is the full error stack trace:

RuntimeError                              Traceback (most recent call last)
<ipython-input-17-b4e6536e5301> in <module>()
      1 modelTraining = train_vggwithattention.ModelTraining(model = model , criterion=criterion,optimizer=optimizer, scheduler=scheduler, use_cuda=True)
      2 for epoch in range(start_epoch, num_epochs):
----> 3     modelTraining.train(epoch=epoch, trainloader=trainloader)
      4     # PublicTest(epoch)
      5     # writer.flush()

2 frames
/content/train_vggwithattention.py in train(self, epoch, trainloader)
     45             loss = self.criterion(outputs, targets)
     46             # loss.requires_grad = True
---> 47             loss.backward()
     48             for name, param in self.model.named_parameters():
     49                 print(name, param.grad)

/usr/local/lib/python3.7/dist-packages/torch/_tensor.py in backward(self, gradient, retain_graph, create_graph, inputs)
    253                 create_graph=create_graph,
    254                 inputs=inputs)
--> 255         torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
    256 
    257     def register_hook(self, hook):

/usr/local/lib/python3.7/dist-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
    147     Variable._execution_engine.run_backward(
    148         tensors, grad_tensors_, retain_graph, create_graph, inputs,
--> 149         allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag
    150 
    151 

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Any help is appreciated. Thanks!

【Comments】:

@Ivan, I've added the full error traceback.

【Answer 1】:

You are computing the outputs under the torch.no_grad() context manager, which means no layer activations are saved for the backward pass, so backpropagation cannot be performed.
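You can reproduce this in isolation (a minimal standalone demonstration, not tied to your model):

import torch

w = torch.randn(3, requires_grad=True)

# anything computed under no_grad is excluded from the autograd graph
with torch.no_grad():
    y = (w * 2).sum()

print(y.requires_grad, y.grad_fn)  # False None
y.backward()  # RuntimeError: element 0 of tensors does not require grad ...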

You therefore have to replace the following lines in your train function:

        with torch.no_grad():
            outputs = self.model(inputs, lbp)

with just:

        outputs = self.model(inputs, lbp)
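For reference, here is a minimal sketch of the corrected method (assuming the rest of your ModelTraining class stays as posted):

    def train(self, epoch, trainloader):
        self.model.train()
        for batch_idx, (inputs, targets) in enumerate(trainloader):
            lbp = lbp_transform(inputs)
            self.optimizer.zero_grad()
            outputs = self.model(inputs, lbp)  # forward pass now records the graph
            loss = self.criterion(outputs, targets)
            loss.backward()                    # works: loss has a grad_fn
            utils.clip_gradient(self.optimizer, 0.1)
            self.optimizer.step()
            self.scheduler.step(loss)          # step the scheduler after the optimizer

Note that I have also moved self.scheduler.step(loss) after self.optimizer.step(): recent PyTorch versions expect schedulers to be stepped after the optimizer update, and ReduceLROnPlateau is more commonly stepped once per epoch on a validation metric. torch.no_grad() itself is still the right tool for evaluation and inference, just not for the training forward pass.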
