Problem with my input data structure using Torchvision RetinaNet?
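Posted: 2021-12-15 16:22:55
Question: I believe my input data has the structure that Torchvision RetinaNet requires, but I get an error suggesting that it does not. I have included the traceback and a minimal example that reproduces the problem.
The error occurs while the loss is computed in the classification head. It does not occur while the loss is computed in the regression head.
Here is the traceback: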
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
/tmp/ipykernel_1483/2833406441.py in <module>
41 img_batch, targets_batch = retinanet_collate_fn(batch_size=2)
42
---> 43 outputs = model(img_batch, targets_batch)
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1100 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1101 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102 return forward_call(*input, **kwargs)
1103 # Do not call functions when jit is used
1104 full_backward_hooks, non_full_backward_hooks = [], []
/opt/conda/lib/python3.7/site-packages/torchvision/models/detection/retinanet.py in forward(self, images, targets)
530
531 # compute the losses
--> 532 losses = self.compute_loss(targets, head_outputs, anchors)
533 else:
534 # recover level sizes
/opt/conda/lib/python3.7/site-packages/torchvision/models/detection/retinanet.py in compute_loss(self, targets, head_outputs, anchors)
394 matched_idxs.append(self.proposal_matcher(match_quality_matrix))
395
--> 396 return self.head.compute_loss(targets, head_outputs, anchors, matched_idxs)
397
398 def postprocess_detections(self, head_outputs, anchors, image_shapes):
/opt/conda/lib/python3.7/site-packages/torchvision/models/detection/retinanet.py in compute_loss(self, targets, head_outputs, anchors, matched_idxs)
49 # type: (List[Dict[str, Tensor]], Dict[str, Tensor], List[Tensor], List[Tensor]) -> Dict[str, Tensor]
50 return {
---> 51 'classification': self.classification_head.compute_loss(targets, head_outputs, matched_idxs),
52 'bbox_regression': self.regression_head.compute_loss(targets, head_outputs, anchors, matched_idxs),
53 }
/opt/conda/lib/python3.7/site-packages/torchvision/models/detection/retinanet.py in compute_loss(self, targets, head_outputs, matched_idxs)
113 foreground_idxs_per_image,
114 targets_per_image['labels'][matched_idxs_per_image[foreground_idxs_per_image]]
--> 115 ] = 1.0
116
117 # find indices for which anchors should be ignored
IndexError: index 1 is out of bounds for dimension 1 with size 1
Minimal example:
# Adapted from the example in the PyTorch code
import numpy as np
import torch
import torchvision
from torchvision.models.detection.anchor_utils import AnchorGenerator
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone
from torchvision.models.detection import RetinaNet

# Backbone
backbone = resnet_fpn_backbone('resnet18', pretrained=False, trainable_layers=4)
backbone.out_channels = 256

# Anchor generator
anchor_sizes = ((32,), (64,), (128,), (256,), (512,))
aspect_ratios = ((0.5, 1.0, 2.0),) * len(anchor_sizes)
anchor_generator = AnchorGenerator(sizes=anchor_sizes, aspect_ratios=aspect_ratios)

# Model
model = RetinaNet(backbone,
                  num_classes=1,
                  anchor_generator=anchor_generator)

def __getitem__():
    # One random image with 20 identical boxes, every box labelled 1
    img = torch.rand(3, 256, 256)
    bboxes = [[15, 15, 20, 20]] * 20
    bboxes = torch.FloatTensor(bboxes)
    labels = torch.LongTensor(np.ones(len(bboxes), dtype=int))
    targets = {'boxes': bboxes, 'labels': labels}
    return img, targets

def retinanet_collate_fn(batch_size=2):
    # Build a list of images and a matching list of target dicts
    img_batch = []
    targets_batch = []
    for i in range(batch_size):
        img, targets = __getitem__()
        img_batch.append(img)
        targets_batch.append(targets)
    return img_batch, targets_batch

img_batch, targets_batch = retinanet_collate_fn(batch_size=2)
outputs = model(img_batch, targets_batch)
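Answer 1: Problem solved. I had set num_classes = 1, so the classifier expected the label 0 rather than 1. I had assumed the background would automatically be class 0 and my single class would be 1, but that is not the case.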
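The indexing failure can be reproduced without the model. The following is a minimal sketch, assuming the classification head builds its targets as a tensor with one column per class and writes 1.0 at the column given by each matched label, as the assignment at lines 113-115 of the traceback suggests. The helper name one_hot_targets and the toy matched_idxs values are hypothetical and are not part of torchvision.

```python
import torch

def one_hot_targets(num_anchors, num_classes, labels, foreground_idxs, matched_idxs):
    # Hypothetical helper mimicking the indexed assignment in the traceback:
    # one row per anchor, one column per class, no extra background column.
    gt = torch.zeros(num_anchors, num_classes)
    gt[foreground_idxs, labels[matched_idxs[foreground_idxs]]] = 1.0
    return gt

# Toy matching result (assumed values): -1 marks an unmatched anchor,
# any other value is the index of the matched ground-truth box.
matched_idxs = torch.tensor([-1, 0, 3, -1, 7])
foreground_idxs = torch.where(matched_idxs >= 0)[0]

labels_one = torch.ones(20, dtype=torch.int64)    # labels as in the question
labels_zero = torch.zeros(20, dtype=torch.int64)  # labels as in the answer

# num_classes=1 with label 1: column 1 does not exist, so indexing fails.
try:
    one_hot_targets(5, 1, labels_one, foreground_idxs, matched_idxs)
    raised = False
except IndexError:
    raised = True

# Option A: keep num_classes=1 and label the single class 0.
fixed_labels = one_hot_targets(5, 1, labels_zero, foreground_idxs, matched_idxs)

# Option B: keep label 1 and use num_classes=2, leaving column 0 unused.
fixed_classes = one_hot_targets(5, 2, labels_one, foreground_idxs, matched_idxs)
```

In the minimal example from the question, option A corresponds to building labels with zeros instead of ones, and option B corresponds to passing num_classes=2 to RetinaNet.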