Iterating through DataLoader (PyTorch): RuntimeError: Expected object of scalar type unsigned char but got scalar type float for sequence element 9
【Posted】: 2020-10-14 00:58:19
【Problem description】: I'm new to PyTorch and have run into the error shown in the title. The overall context is that I'm trying to build a building-segmentation model on SpaceNet imagery. I forked this repo from someone at Microsoft AI who built a segmentation model, and I'm simply trying to re-run her training script.
I've been able to download the data and preprocess it. My problem arises when I try to actually train the model: while iterating through my DataLoader I get the following error message:
RuntimeError: Expected object of scalar type unsigned char but got scalar type float for sequence element 9.
Useful code snippets:
I have a dataset.py that creates the SpaceNetDataset class, which looks like:
import os
# Ignore warnings
import warnings
import numpy as np
from PIL import Image
import torch
from torch.utils.data import Dataset
warnings.filterwarnings('ignore')
class SpaceNetDataset(Dataset):
    """Class representing a SpaceNet dataset, such as a training set."""

    def __init__(self, root_dir, splits=['trainval', 'test'], transform=None):
        """
        Args:
            root_dir (string): Directory containing folder annotations and .txt files with the
                train/val/test splits
            splits: ['trainval', 'test'] - the SpaceNet utilities code would create these two
                splits while converting the labels from polygons to mask annotations. The two
                splits are created after chipping larger images into the required input size with
                some overlaps. Thus to have splits that do not have overlapping areas, we manually
                split the images (not chips) into train/val/test using utils/split_train_val_test.py,
                followed by using the SpaceNet utilities to annotate each folder, and combine the
                trainval and test splits it creates inside each folder.
            transform (callable, optional): Optional transform to be applied
                on a sample.
        """
        self.root_dir = root_dir
        self.transform = transform
        self.image_list = []
        self.xml_list = []

        data_files = []
        for split in splits:
            with open(os.path.join(root_dir, split + '.txt')) as f:
                data_files.extend(f.read().splitlines())

        for line in data_files:
            line = line.split(' ')
            image_name = line[0].split('/')[-1]
            xml_name = line[1].split('/')[-1]
            self.image_list.append(image_name)
            self.xml_list.append(xml_name)

    def __len__(self):
        return len(self.image_list)

    def __getitem__(self, idx):
        img_path = os.path.join(self.root_dir, 'RGB-PanSharpen', self.image_list[idx])
        target_path = os.path.join(self.root_dir, 'annotations', self.image_list[idx].replace('.tif', 'segcls.tif'))

        image = np.array(Image.open(img_path))
        target = np.array(Image.open(target_path))
        target[target == 100] = 1  # building interior
        target[target == 255] = 2  # border

        sample = {'image': image, 'target': target, 'image_name': self.image_list[idx]}

        if self.transform:
            sample = self.transform(sample)
        return sample
To create the DataLoader, I have something like:
dset_train = SpaceNetDataset(data_path_train, split_tags, transform=T.Compose([ToTensor()]))
loader_train = DataLoader(dset_train, batch_size=train_batch_size, shuffle=True,
                          num_workers=num_workers)
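As a side note on what happens inside the loader: with dict samples like the ones returned by SpaceNetDataset, PyTorch's default collate function batches each key separately, so every 'image' in a batch gets stacked into a single tensor and therefore has to share one dtype. A minimal sketch of that behavior with made-up arrays (not code from the original post):

import numpy as np
from torch.utils.data.dataloader import default_collate

fake_batch = [{'image': np.zeros((3, 4, 4), dtype=np.uint8), 'image_name': 'a.tif'},
              {'image': np.zeros((3, 4, 4), dtype=np.uint8), 'image_name': 'b.tif'}]
collated = default_collate(fake_batch)
print(collated['image'].shape, collated['image'].dtype)  # torch.Size([2, 3, 4, 4]) torch.uint8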
I then iterate through the DataLoader by doing:
for batch in loader_train:
    image_tensors = batch['image']
    images = batch['image'].cpu().numpy()
    break  # take the first shuffled batch
and then I get the error:
Traceback (most recent call last):
  File "training/train_aml.py", line 137, in <module>
    sample_images_train, sample_images_train_tensors = get_sample_images(which_set='train')
  File "training/train_aml.py", line 123, in get_sample_images
    for i, batch in enumerate(loader):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.6/dist-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/collate.py", line 74, in default_collate
    return {key: default_collate([d[key] for d in batch]) for key in elem}
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/collate.py", line 74, in <dictcomp>
    return {key: default_collate([d[key] for d in batch]) for key in elem}
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/collate.py", line 55, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: Expected object of scalar type unsigned char but got scalar type float for sequence element 9.
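Since the failing call is torch.stack inside default_collate, every 'image' in a batch must share one dtype, so at least one file is apparently being loaded as float rather than uint8. A small diagnostic sketch (not part of the original post; the only assumption is constructing the SpaceNetDataset defined above without the ToTensor transform, so the images stay NumPy arrays) to find which files those are:

dset_check = SpaceNetDataset(data_path_train, split_tags, transform=None)
for i in range(len(dset_check)):
    sample = dset_check[i]
    if sample['image'].dtype != np.uint8:
        # Print the index, file name and dtype of any sample that is not uint8.
        print(i, sample['image_name'], sample['image'].dtype)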
The error seems very similar to this one, and I did try a similar fix by casting:
dtype = torch.cuda.CharTensor if torch.cuda.is_available() else torch.CharTensor
for batch in loader:
    batch['image'] = batch['image'].type(dtype)
    batch['target'] = batch['target'].type(dtype)
but I still end up with the same error.
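That cast cannot help here, because the traceback shows the exception being raised inside the DataLoader worker while the batch is being collated, i.e. before the loop body ever runs. One possible workaround, a sketch rather than the repo's actual approach, assuming the samples are dicts as returned by SpaceNetDataset above, is to force a single dtype before collation, either inside __getitem__ or via a custom collate_fn passed to the DataLoader:

import torch
from torch.utils.data.dataloader import default_collate

def cast_collate(batch):
    # Cast each sample to a fixed dtype before default_collate stacks the batch.
    for sample in batch:
        sample['image'] = torch.as_tensor(sample['image'], dtype=torch.float32)
        sample['target'] = torch.as_tensor(sample['target'], dtype=torch.long)
    return default_collate(batch)

loader_train = DataLoader(dset_train, batch_size=train_batch_size, shuffle=True,
                          num_workers=num_workers, collate_fn=cast_collate)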
A couple of other odd things:
- It seems non-deterministic. Most of the time I get this error, but sometimes the code just keeps running (no idea why).
- The "sequence element" number at the end of the error message keeps changing. Here it was "sequence element 9", other times it's "sequence element 2", and so on. No idea why.
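One plausible reading of both points, assuming the "sequence element" index refers to a sample's position within the batch being stacked: with shuffle=True the composition of each batch changes from run to run, so whether a float-typed sample ends up in the batch being collated, and at which position, also changes, which would account for both the intermittent failures and the shifting index.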
【Question comments】:
【Answer 1】: Ah, never mind.
It turns out unsigned char comes from C++, where it gives you 0 to 255, so that's what it was expecting for the image data.
So I actually fixed it by doing:
image = np.array(Image.open(img_path)).astype(np.int)
target = np.array(Image.open(target_path)).astype(np.int)
inside the SpaceNetDataset class, and it seems to work!
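One caveat worth noting: np.int is just an alias for the built-in int; it was deprecated in NumPy 1.20 and removed in 1.24, so on a newer NumPy the same cast needs an explicit dtype, for example:

image = np.array(Image.open(img_path)).astype(np.int64)
target = np.array(Image.open(target_path)).astype(np.int64)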
【Comments】: