Why does new layer get ignored in modified pretrained pytorch model?

Posted 2023-03-13


Published: 2021-10-12 12:23:35

Question:

I am trying to add a classification layer to a pretrained Bert model. I have tried a few things that people have posted online, such as:

mod = list(model.children())
mod.pop()
mod.append(torch.nn.Linear(768, num_classes))
new_classifier = torch.nn.Sequential(*mod)
model.classifier = new_classifier

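When I print the model I can see the new layer in the summary, but when I try to train or predict, PyTorch simply ignores that new layer. Does anyone know what is going on? I am new to PyTorch.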
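To illustrate the symptom: a module's forward() only runs the submodules it explicitly calls, so assigning a new attribute registers the layer (it shows up in print(model)) without ever wiring it into the computation. The sketch below uses a hypothetical ToyEncoder as a stand-in for the pretrained model; it is not the actual Bert code.

```python
import torch
from torch import nn


class ToyEncoder(nn.Module):
    """Hypothetical stand-in for a pretrained model with a fixed forward()."""

    def __init__(self, hidden=768):
        super().__init__()
        self.embed = nn.Linear(16, hidden)
        self.pooler = nn.Linear(hidden, hidden)

    def forward(self, x):
        # Only the submodules named here ever run.
        return torch.tanh(self.pooler(self.embed(x)))


model = ToyEncoder()
# Registers a new child module, but forward() never calls it.
model.classifier = nn.Linear(768, 5)

x = torch.randn(2, 16)
out = model(x)

registered = [name for name, _ in model.named_children()]
print(registered)  # the list includes 'classifier'
print(out.shape)   # the last dimension is still 768, not 5
```

The layer is part of the model's state, yet the output keeps the 768-dimensional shape because forward() was written before the layer existed.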

I also tried this for the last two lines:

new_classifier = torch.nn.Sequential(*list(mod))
model = new_classifier

But this raises a "'NoneType' object is not subscriptable" error.
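One plausible cause, sketched with hypothetical toy classes rather than the real Bert code: nn.Sequential passes a single positional tensor from one child to the next, so any extra argument that the parent's forward() normally builds for a child (here an assumed per-layer head_mask) is left at its default of None. Indexing that None then fails.

```python
import torch
from torch import nn


class ToyEncoder(nn.Module):
    """Hypothetical encoder that indexes an optional per-layer mask."""

    def __init__(self, hidden=8, num_layers=2):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.Linear(hidden, hidden) for _ in range(num_layers)]
        )

    def forward(self, hidden_states, head_mask=None):
        for i, layer in enumerate(self.layers):
            _ = head_mask[i]  # raises TypeError when head_mask is None
            hidden_states = layer(hidden_states)
        return hidden_states


class ToyModel(nn.Module):
    """Hypothetical parent that prepares head_mask before calling the encoder."""

    def __init__(self, hidden=8, num_layers=2):
        super().__init__()
        self.embeddings = nn.Linear(4, hidden)
        self.encoder = ToyEncoder(hidden, num_layers)
        self.num_layers = num_layers

    def forward(self, x):
        head_mask = [None] * self.num_layers
        return self.encoder(self.embeddings(x), head_mask=head_mask)


model = ToyModel()
x = torch.randn(3, 4)
ok = model(x)  # works: the parent supplies head_mask

# Flattening the children drops the parent's forward() logic.
flattened = nn.Sequential(*model.children())
try:
    flattened(x)
    error = None
except TypeError as exc:
    error = str(exc)
print(error)
```

If this is what happens with the real model, rebuilding it from model.children() cannot work as-is, because the glue code in the original forward() is lost.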


Answer 1:

From your post, it looks like you are trying to replace the last layer of the model. Try this and see if it solves the problem:

new_classifier = torch.nn.Sequential(*(list(model.children())[:-1]), torch.nn.Linear(768, num_classes))
model = new_classifier

Discussion:

Thanks for the suggestion. Unfortunately, when I pass data to new_classifier I get an error saying "'NoneType' object is not subscriptable". Another suggestion I found was to simply add another layer with the line "model.classifier = torch.nn.Linear(768, num_classes)", but when I pass data through the model it ignores that line and only returns the 768-dim embeddings (with no error, though).

Could you please provide the full error and more details?

Here are more error details:

~/DNABERT/src/transformers/modeling_bert.py in forward(self, hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask)
    409
    410             layer_outputs = layer_module(
--> 411                 hidden_states, attention_mask, head_mask[i], encoder_hidden_states, encoder_attention_mask
    412             )
    413             hidden_states = layer_outputs[0]

TypeError: 'NoneType' object is not subscriptable

I think I figured it out. I need to do something like this:

from torch import nn

class PosModel(nn.Module):
    def __init__(self):
        super(PosModel, self).__init__()
        self.base_model = model  # the pretrained model from the question
        self.linear = nn.Linear(768, 1213)

    def forward(self, input_ids):
        outputs = self.base_model(input_ids)
        # write your new head here
        outputs = self.linear(outputs[1])
        return outputs

new_model = PosModel()
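The wrapper approach from the last comment can be tried end to end with a toy base model. In this minimal sketch, ToyBase and ClassifierWrapper are hypothetical names, and ToyBase is assumed to return a (sequence_output, pooled_output) tuple so that outputs[1] is a 768-dim pooled vector, mirroring what the comment's code expects from the real model.

```python
import torch
from torch import nn


class ToyBase(nn.Module):
    """Hypothetical stand-in returning (sequence_output, pooled_output)."""

    def __init__(self, vocab_size=100, hidden=768):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.pooler = nn.Linear(hidden, hidden)

    def forward(self, input_ids):
        sequence_output = self.embed(input_ids)
        # Pool by taking the first token, then a dense layer plus tanh.
        pooled_output = torch.tanh(self.pooler(sequence_output[:, 0]))
        return sequence_output, pooled_output


class ClassifierWrapper(nn.Module):
    """Wraps a base model and applies a new linear head in forward()."""

    def __init__(self, base_model, hidden=768, num_classes=1213):
        super().__init__()
        self.base_model = base_model
        self.linear = nn.Linear(hidden, num_classes)

    def forward(self, input_ids):
        outputs = self.base_model(input_ids)
        # The new head is called explicitly, so it cannot be skipped.
        return self.linear(outputs[1])


new_model = ClassifierWrapper(ToyBase())
input_ids = torch.randint(0, 100, (4, 12))  # batch of 4, sequence length 12
logits = new_model(input_ids)
head_params = [n for n, _ in new_model.named_parameters() if n.startswith("linear.")]
print(logits.shape)
print(head_params)
```

Because the head is a registered submodule that forward() actually calls, its parameters appear in named_parameters() and would be picked up by an optimizer built from new_model.parameters().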
