Pytorch: Why are loss functions implemented in both nn.modules.loss and nn.functional?

Posted 2023-03-12


[Asked] 2019-07-06 20:40:26 [Question]:

Many loss functions in Pytorch are implemented in both nn.modules.loss and nn.functional.

For example, the two calls below return the same result.

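import torch
import torch.nn as nn
import torch.nn.functional as F  # the functional API lives in torch.nn.functional, not torch.functional

x = torch.randn(3, 4)  # example prediction
y = torch.randn(3, 4)  # example target
nn.L1Loss()(x, y)  # module form: instantiate, then call
F.l1_loss(x, y)    # functional form: call directly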

Why are there two implementations?

1. Consistency with other, parametric loss functions
2. Instantiating a loss function brings some benefit
3. Something else


[Answer 1]:

Here is the code of BCEWithLogitsLoss with the docstring removed:

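from typing import Optional

import torch.nn.functional as F
from torch import Tensor
from torch.nn.modules.loss import _Loss

class BCEWithLogitsLoss(_Loss):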
    def __init__(self, weight: Optional[Tensor] = None, size_average=None, reduce=None, reduction: str = 'mean',
                 pos_weight: Optional[Tensor] = None) -> None:
        super(BCEWithLogitsLoss, self).__init__(size_average, reduce, reduction)
        self.register_buffer('weight', weight)
        self.register_buffer('pos_weight', pos_weight)

    def forward(self, input: Tensor, target: Tensor) -> Tensor:
        return F.binary_cross_entropy_with_logits(input, target,
                                                  self.weight,
                                                  pos_weight=self.pos_weight,
                                                  reduction=self.reduction)
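To see this delegation in practice, here is a minimal sketch that compares the module and the functional call on random data. The tensor shapes and the `pos_weight` values are arbitrary choices made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)                      # raw model outputs (illustrative shape)
targets = torch.randint(0, 2, (4, 3)).float()   # binary labels stored as floats
pos_weight = torch.tensor([1.0, 2.0, 3.0])      # arbitrary per-class positive weight

# Module form: the configuration is stored on the object once.
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight, reduction="sum")
loss_module = criterion(logits, targets)

# Functional form: the configuration is passed on every call.
loss_functional = F.binary_cross_entropy_with_logits(
    logits, targets, pos_weight=pos_weight, reduction="sum"
)

# Both paths should agree, since forward() simply calls the function.
same = torch.allclose(loss_module, loss_functional)
```

If `same` is true, the only difference between the two forms is where the configuration lives.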

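Apart from how the arguments are passed, the class and the function implementations are identical. However, the class implementation can make your code more concise and readable. For example, using the function: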

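import torch.nn.functional as F

loss_func = F.binary_cross_entropy_with_logits
# every loss setting has to be threaded through the training function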
def train(model, dataloader, loss_fn, optimizer, weight, size_average, reduce, reduction, pos_weight):
    for x, y in dataloader:
        model.zero_grad()
        y_pred = model(x)
        loss = loss_fn(y_pred, y, weight, size_average, reduce, reduction, pos_weight)
        loss.backward()
        optimizer.step()

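Using the class:

from torch.nn import BCEWithLogitsLoss

# all loss settings are bound once, when the loss object is created
loss_func = BCEWithLogitsLoss(weight, size_average, reduce, reduction, pos_weight)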
def train(model, dataloader, loss_fn, optimizer):
    for x, y in dataloader:
        model.zero_grad()
        y_pred = model(x)
        loss = loss_fn(y_pred, y)
        loss.backward()
        optimizer.step()

If you have several parameters or several different loss functions, the class implementation is the better choice.
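As a sketch of this point, the same loop can be driven by differently configured loss objects without changing its signature. The tiny linear model, the random data, and the helper name `run_steps` are assumptions made for this example, not part of the original answer.

```python
import torch
import torch.nn as nn

def run_steps(model, batches, loss_fn, optimizer):
    """Run one pass over `batches` and return the per-batch loss values."""
    losses = []
    for x, y in batches:
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)   # every loss object shares this call shape
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return losses

torch.manual_seed(0)
# Three random batches: 8 samples, 5 features, one binary target each.
batches = [(torch.randn(8, 5), torch.randint(0, 2, (8, 1)).float()) for _ in range(3)]

# Differently configured losses; the loop does not need to know the details.
candidates = {
    "bce": nn.BCEWithLogitsLoss(pos_weight=torch.tensor([2.0])),
    "l1_sum": nn.L1Loss(reduction="sum"),
    "mse": nn.MSELoss(),
}

results = {}
for name, loss_fn in candidates.items():
    model = nn.Linear(5, 1)   # fresh model for each loss
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    results[name] = run_steps(model, batches, loss_fn, optimizer)
```

Each entry in `candidates` carries its own configuration, so `run_steps` never needs extra parameters such as `pos_weight` or `reduction`.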


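[Answer 2]:

I think of it as a case of partial application: it is useful to be able to "bundle" many of the configuration variables with the loss function object. In most cases, your loss function has to take prediction and ground_truth as its arguments. This makes the basic API of loss functions fairly uniform. However, they differ in the details. For instance, not every loss function has a reduction parameter. BCEWithLogitsLoss has weight and pos_weight parameters; PoissonNLLLoss has log_input and eps. It is handy to write a function like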

def one_epoch(model, dataset, loss_fn, optimizer):
    for x, y in dataset:
        model.zero_grad()
        y_pred = model(x)
        loss = loss_fn(y_pred, y)
        loss.backward()
        optimizer.step()

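It works equally well with an instantiated BCEWithLogitsLoss and with an instantiated PoissonNLLLoss. But it cannot work with their functional counterparts, because of the bookkeeping required: each function's extra configuration arguments would have to be supplied on every call. Instead, you would first have to create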

loss_fn_packed = functools.partial(F.binary_cross_entropy_with_logits, weight=my_weight, reduction='sum')

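Only then can you use it with the one_epoch defined above. But this packing is already provided by the object-oriented loss API, along with some bells and whistles: since losses subclass nn.Module, you can use forward and backward hooks, move things between CPU and GPU, and so on.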
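The equivalence between the hand-packed function and the loss object, plus one of the extras, can be sketched as follows. The weight values, the tensor shapes, and the `record_loss` hook are hypothetical names and values chosen for this example.

```python
import functools

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
pred = torch.randn(6, 2)                       # illustrative predictions (logits)
target = torch.randint(0, 2, (6, 2)).float()   # illustrative binary targets
my_weight = torch.tensor([0.5, 1.5])           # illustrative rescaling weight

# Functional loss with its configuration bound by hand.
loss_fn_packed = functools.partial(
    F.binary_cross_entropy_with_logits, weight=my_weight, reduction="sum"
)

# Module loss: the same configuration bundled into an object.
loss_fn_module = nn.BCEWithLogitsLoss(weight=my_weight, reduction="sum")

packed_value = loss_fn_packed(pred, target)
module_value = loss_fn_module(pred, target)

# Because the loss is an nn.Module, it accepts forward hooks.
seen = []

def record_loss(module, inputs, output):
    # Called after every forward pass of the loss module.
    seen.append(output.item())

handle = loss_fn_module.register_forward_hook(record_loss)
loss_fn_module(pred, target)
handle.remove()   # detach the hook once it is no longer needed

# The weight tensor is a registered buffer, so it follows the module
# when the module is moved with .to(device).
buffer_names = [name for name, _ in loss_fn_module.named_buffers()]
```

Both callables can be passed to a loop like `one_epoch`; the module version additionally carries its tensors as buffers and supports hooks without extra code.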
