PyTorch: "ValueError: can't optimize a non-leaf Tensor" after changing pretrained model from 3 RGB Channels to 4 Channels

Posted: 2021-04-24 07:34:41

Question:

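I have been trying to change the first convolutional layer of a pretrained PyTorch DenseNet from 3 input channels to 4, while keeping the pretrained weights for the original RGB channels. I wrote the code below, but the optimizer line throws this error: "ValueError: can't optimize a non-leaf Tensor"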

import torch
import torch.nn as nn
import torchvision.models as models
backbone = models.__dict__['densenet169'](pretrained=True)


weight1 = backbone.features.conv0.weight.data.clone()
new_first_layer  = nn.Conv2d(4, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
with torch.no_grad():
    new_first_layer.weight[:,:3] = weight1

backbone.features.conv0 = new_first_layer
optimizer = torch.optim.SGD(backbone.parameters(), 0.001,
                                 weight_decay=0.1)  # Changing this optimizer from SGD to ADAM

I also tried removing the with torch.no_grad(): statement, but the problem persists:

  ValueError                                Traceback (most recent call last)
<ipython-input-343-5fc87352da04> in <module>()
     11 backbone.features.conv0 = new_first_layer
     12 optimizer = torch.optim.SGD(res.parameters(), 0.001,
---> 13                                  weight_decay=0.1)  # Changing this optimizer from SGD to ADAM

~/anaconda3/envs/detectron2/lib/python3.6/site-packages/torch/optim/sgd.py in __init__(self, params, lr, momentum, dampening, weight_decay, nesterov)
     66         if nesterov and (momentum <= 0 or dampening != 0):
     67             raise ValueError("Nesterov momentum requires a momentum and zero dampening")
---> 68         super(SGD, self).__init__(params, defaults)
     69 
     70     def __setstate__(self, state):

~/anaconda3/envs/detectron2/lib/python3.6/site-packages/torch/optim/optimizer.py in __init__(self, params, defaults)
     50 
     51         for param_group in param_groups:
---> 52             self.add_param_group(param_group)
     53 
     54     def __getstate__(self):

~/anaconda3/envs/detectron2/lib/python3.6/site-packages/torch/optim/optimizer.py in add_param_group(self, param_group)
    231                                 "but one of the params is " + torch.typename(param))
    232             if not param.is_leaf:
--> 233                 raise ValueError("can't optimize a non-leaf Tensor")
    234 
    235         for name, default in self.defaults.items():

ValueError: can't optimize a non-leaf Tensor
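
The check that fails at the bottom of the traceback is param.is_leaf. As a minimal sketch (the tensor names and shapes here are illustrative and not taken from the question), the same error can be reproduced without any model: a tensor created directly is a leaf, while the result of an operation on it is not, and the optimizer rejects the latter.

```python
import torch

# A tensor created directly by the user is a leaf.
w = torch.randn(4, 3, requires_grad=True)
assert w.is_leaf

# The result of an operation on it is a non-leaf tensor (it has a grad_fn).
derived = w * 2
assert not derived.is_leaf

# The optimizer accepts the leaf...
torch.optim.SGD([w], lr=0.001)

# ...and rejects the non-leaf with the error seen in the traceback.
try:
    torch.optim.SGD([derived], lr=0.001)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Printing is_leaf for every entry of the iterable passed to the optimizer is a quick way to find which parameter triggers the error.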

My PyTorch version is 1.7.0.

Can you help? Thank you very much!

Regards.

Comments:

What is res in optimizer = torch.optim.SGD(res.parameters(), ...? You did not include it, and it is on the very line that raises the error.

See this answer: torch.optim returns "ValueError: can't optimize a non-leaf Tensor" for multidimensional tensor

My mistake, that was a typo. It should be optimizer = torch.optim.SGD(backbone.parameters(), 0.001, weight_decay=0.1), not res. @KlausJude

Answer 1:

I think I have solved the problem:

import torch
import torch.nn as nn
import torchvision.models as models

backbone = models.__dict__['densenet169'](pretrained=True)
# Detached copy of the pretrained RGB filters, shape (64, 3, 7, 7)
weight1 = backbone.features.conv0.weight.detach().clone()
new_first_layer = nn.Conv2d(4, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False).requires_grad_()
# Write through .data so autograd does not track the copy and the weight stays a leaf
new_first_layer.weight.data[:, :3] = weight1
backbone.features.conv0 = new_first_layer
optimizer = torch.optim.SGD(backbone.parameters(), 0.001,
                                 weight_decay=0.1)
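
For checking the idea without downloading the DenseNet weights, here is a small sketch that uses a freshly created 3-channel convolution as a stand-in for the pretrained layer. The name old_conv is hypothetical, and initialising the fourth channel with the mean of the RGB filters is an assumption made for illustration; the thread does not say how the extra channel should be initialised.

```python
import torch
import torch.nn as nn

# Stand-in for the pretrained 3-channel layer (hypothetical; in the thread
# this would be backbone.features.conv0 of the DenseNet).
old_conv = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)

new_conv = nn.Conv2d(4, 64, kernel_size=7, stride=2, padding=3, bias=False)
with torch.no_grad():
    # Reuse the existing filters for the first three input channels.
    new_conv.weight[:, :3].copy_(old_conv.weight)
    # Assumption: fill the extra channel with the mean of the RGB filters.
    new_conv.weight[:, 3:].copy_(old_conv.weight.mean(dim=1, keepdim=True))

# The weight is still a leaf Parameter, so the optimizer accepts it.
optimizer = torch.optim.SGD(new_conv.parameters(), lr=0.001, weight_decay=0.1)

# A 4-channel input now passes through the layer.
out = new_conv(torch.randn(1, 4, 32, 32))
print(new_conv.weight.is_leaf, tuple(out.shape))
```

Because the copy happens inside torch.no_grad(), autograd records nothing and the Parameter object itself is never replaced, which is what keeps it a leaf.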
