Whence MaskRCNN's segm IoU metrics = 0?

Posted: 2021-12-12 02:44:45

Question:

When training MaskRCNN on my custom multi-class instance segmentation dataset, given an input formatted as:

image   -)  shape: torch.Size([3, 850, 600]),   dtype: torch.float32, min: tensor(0.0431),               max: tensor(0.9137)
boxes   -)  shape: torch.Size([4, 4]),          dtype: torch.float32, min: tensor(47.),                  max: tensor(807.)
masks   -)  shape: torch.Size([850, 600, 600]), dtype: torch.uint8,   min: tensor(0, dtype=torch.uint8), max: tensor(1, dtype=torch.uint8)
areas   -)  shape: torch.Size([4]),             dtype: torch.float32, min: tensor(1479.),                max: tensor(8014.)
labels  -)  shape: torch.Size([4]),             dtype: torch.int64,   min: tensor(1),                    max: tensor(1)
iscrowd -)  shape: torch.Size([4]),             dtype: torch.int64,   min: tensor(0),                    max: tensor(0)

I consistently get every segmentation IoU metric equal to 0, as shown below:

DONE (t=0.03s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.004
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.010
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.004
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.001
IoU metric: segm
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000

How should I think about, debug, and fix this?

Answer 1:

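Your input image has size (850, 600) (H, W), and this image contains 4 objects, not 850 objects with (600, 600) masks. Your mask tensor should therefore have dimensions (number of objects, 850, 600), so your input should be: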

image   -)  shape: torch.Size([3, 850, 600]),   dtype: torch.float32, min: tensor(0.0431),               max: tensor(0.9137)
boxes   -)  shape: torch.Size([4, 4]),          dtype: torch.float32, min: tensor(47.),                  max: tensor(807.)
masks   -)  shape: torch.Size([4, 850, 600]),   dtype: torch.uint8,   min: tensor(0, dtype=torch.uint8), max: tensor(1, dtype=torch.uint8)
areas   -)  shape: torch.Size([4]),             dtype: torch.float32, min: tensor(1479.),                max: tensor(8014.)
labels  -)  shape: torch.Size([4]),             dtype: torch.int64,   min: tensor(1),                    max: tensor(1)
iscrowd -)  shape: torch.Size([4]),             dtype: torch.int64,   min: tensor(0),                    max: tensor(0)
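To catch this kind of mismatch before training, a small shape check on each sample can help. The sketch below is not from the original post: `check_target` is a hypothetical helper, and it assumes a torchvision-style target dict with the keys `boxes`, `labels`, and `masks`. It only compares the object count and the spatial size of the masks against the image.

```python
import torch


def check_target(image, target):
    """Return a list of shape problems found in one (image, target) pair.

    Hypothetical helper; assumes target has "boxes", "labels" and "masks".
    """
    problems = []
    height, width = image.shape[-2:]
    num_objects = target["boxes"].shape[0]
    masks = target["masks"]

    if masks.ndim != 3:
        problems.append(f"masks must be 3-D (N, H, W), got {tuple(masks.shape)}")
    else:
        # One mask per object, in the same order as the boxes.
        if masks.shape[0] != num_objects:
            problems.append(
                f"masks has {masks.shape[0]} entries but boxes has {num_objects}"
            )
        # Each mask must cover the full image.
        if tuple(masks.shape[-2:]) != (height, width):
            problems.append(
                f"mask size {tuple(masks.shape[-2:])} != image size {(height, width)}"
            )

    if target["labels"].shape[0] != num_objects:
        problems.append("labels and boxes disagree on the number of objects")
    return problems


# Dummy tensors with the shapes reported in the question.
image = torch.rand(3, 850, 600)
bad_target = {
    "boxes": torch.zeros(4, 4),
    "labels": torch.ones(4, dtype=torch.int64),
    "masks": torch.zeros(850, 600, 600, dtype=torch.uint8),
}
good_target = dict(bad_target, masks=torch.zeros(4, 850, 600, dtype=torch.uint8))

for message in check_target(image, bad_target):
    print("bad target:", message)
print("good target problems:", check_target(image, good_target))
```

Running a check like this over a few dataset samples should flag the (850, 600, 600) mask tensor immediately, since both its leading dimension and its spatial size disagree with the image and the boxes.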

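How to fix it: since you are solving an instance segmentation problem, make sure that each of your (850, 600) masks is stacked along a new leading dimension, yielding a tensor of shape (number of masks, 850, 600).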
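As a minimal sketch of that stacking step, assuming each object's mask is already available as a separate (H, W) binary tensor: the rectangle coordinates below are made up for illustration and do not come from the asker's dataset.

```python
import torch

height, width = 850, 600

# Hypothetical per-object binary masks, one (H, W) tensor per instance.
per_object_masks = []
for top, left in [(50, 50), (200, 100), (400, 300), (700, 450)]:
    mask = torch.zeros(height, width, dtype=torch.uint8)
    mask[top:top + 60, left:left + 40] = 1  # 60 x 40 rectangle as a stand-in object
    per_object_masks.append(mask)

# Stack along a new leading dimension: (number of objects, H, W).
masks = torch.stack(per_object_masks, dim=0)
print(masks.shape)  # torch.Size([4, 850, 600])

# Per-object pixel counts, one value per mask.
areas = masks.flatten(1).sum(dim=1)
print(areas)
```

The leading dimension of `masks` then matches the number of rows in `boxes` and the length of `labels`, which is the layout described above.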
