Sigmoid vs Binary Cross Entropy Loss
Posted: 2021-11-25 23:45:18

In my torch model, the last layer is a torch.nn.Sigmoid() and the loss is torch.nn.BCELoss.
During the training step, the following error is raised:
RuntimeError: torch.nn.functional.binary_cross_entropy and torch.nn.BCELoss are unsafe to autocast.
Many models use a sigmoid layer right before the binary cross entropy layer.
In this case, combine the two layers using torch.nn.functional.binary_cross_entropy_with_logits
or torch.nn.BCEWithLogitsLoss. binary_cross_entropy_with_logits and BCEWithLogits are
safe to autocast.
However, when I try to reproduce the error by computing the loss and backpropagating, everything works fine:
import torch
from torch import nn
# last layer
sigmoid = nn.Sigmoid()
# loss
bce_loss = nn.BCELoss()
# the true classes
true_cls = torch.tensor([
    [0.],
    [1.]])
# model prediction classes
pred_cls = sigmoid(
    torch.tensor([
        [0.4949],
        [0.4824]], requires_grad=True)
)
pred_cls
# tensor([[0.6213],
#         [0.6183]], grad_fn=<SigmoidBackward>)
out = bce_loss(pred_cls, true_cls)
out
# tensor(0.7258, grad_fn=<BinaryCrossEntropyBackward>)
out.backward()
What am I missing? Thanks for any help.
Answer 1:

You have to move it to cuda and enable autocast first; the error only shows up inside an autocast region, as below:
import torch
from torch import nn
from torch.cuda.amp import autocast
# last layer
sigmoid = nn.Sigmoid().cuda()
# loss
bce_loss = nn.BCELoss().cuda()
# the true classes
true_cls = torch.tensor([
    [0.],
    [1.]]).cuda()

with autocast():
    # model prediction classes
    pred_cls = sigmoid(
        torch.tensor([
            [0.4949],
            [0.4824]], requires_grad=True
        ).cuda()
    )
    pred_cls
    # tensor([[0.6213],
    #         [0.6183]], grad_fn=<SigmoidBackward>)
    out = bce_loss(pred_cls, true_cls)  # raises the RuntimeError below
    out
    # tensor(0.7258, grad_fn=<BinaryCrossEntropyBackward>)
    out.backward()
RuntimeError: torch.nn.functional.binary_cross_entropy and torch.nn.BCELoss are unsafe to autocast.
Many models use a sigmoid layer right before the binary cross entropy layer.
In this case, combine the two layers using torch.nn.functional.binary_cross_entropy_with_logits
or torch.nn.BCEWithLogitsLoss. binary_cross_entropy_with_logits and BCEWithLogits are
safe to autocast.
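
The actual fix is the one the error message suggests: drop the final Sigmoid, let the model output raw logits, and use torch.nn.BCEWithLogitsLoss, which fuses the sigmoid into the loss and is safe to autocast. A minimal sketch with the same toy values as above (the printed tensor details are indicative, not an exact transcript):

import torch
from torch import nn
from torch.cuda.amp import autocast

# BCEWithLogitsLoss applies the sigmoid internally, so the model's
# last layer emits raw logits and the Sigmoid layer is removed
bce_logits_loss = nn.BCEWithLogitsLoss().cuda()
# the true classes
true_cls = torch.tensor([
    [0.],
    [1.]]).cuda()
# raw model outputs (logits), no sigmoid applied
logits = torch.tensor([
    [0.4949],
    [0.4824]], requires_grad=True).cuda()

with autocast():
    out = bce_logits_loss(logits, true_cls)

out
# tensor(0.7258, ...)  <- same loss value as sigmoid + BCELoss above
out.backward()

If probabilities are needed at inference time, apply torch.sigmoid to the logits outside of the loss computation.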