PyTorch Study Notes 6 (PyTorch Deep Learning Practice): Loading Datasets (Dataset and DataLoader)

Posted 2023-03-06 ☆下山☆

1. Basic Concepts

2. Code and Explanation

import torch
import numpy as np
from torch.utils.data import Dataset
from torch.utils.data import DataLoader

# prepare dataset
class DiabetesDataset(Dataset):
    def __init__(self, filepath):
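        xy = np.loadtxt(filepath, delimiter=',', dtype=np.float32)  # shape (759, 9); the last column is the binary label (0 or 1)
        self.len = xy.shape[0]
        self.x_data = torch.from_numpy(xy[:, :-1])  # sample features: every column except the last
        self.y_data = torch.from_numpy(xy[:, [-1]])  # sample labels: indexing with [-1] keeps the result 2-D, shape (N, 1)

    def __getitem__(self, index):  # return the (features, label) pair at the given index
        return self.x_data[index], self.y_data[index]

    def __len__(self):  # total number of samples: 759
        return self.len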

dataset = DiabetesDataset('diabetes.csv')
train_loader = DataLoader(dataset=dataset, batch_size=64, shuffle=False, num_workers=4)  # num_workers=4: load batches in 4 worker processes

# design model using class
class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.linear1 = torch.nn.Linear(8, 6)
        self.linear2 = torch.nn.Linear(6, 4)
        self.linear3 = torch.nn.Linear(4, 1)
        self.sigmoid = torch.nn.Sigmoid()

    def forward(self, x):
        x = self.sigmoid(self.linear1(x))
        x = self.sigmoid(self.linear2(x))
        x = self.sigmoid(self.linear3(x))
        return x

model = Model()

# construct loss and optimizer
criterion = torch.nn.BCELoss(reduction='mean')
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# training cycle forward, backward, update
if __name__ == '__main__':
    for epoch in range(1):
        for i, data in enumerate(train_loader, 0):  # train_loader is iterable; each item is one mini-batch (inputs, labels)
            inputs, labels = data
            print(len(data[0]))
            y_pred = model(inputs)
            loss = criterion(y_pred, labels)
            print(epoch, i, loss.item())

            optimizer.zero_grad()
            loss.backward()

            optimizer.step()
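
The DataLoader only needs two things from a map-style dataset such as DiabetesDataset: `__len__` to know how many samples exist, and `__getitem__` to fetch one sample by index. The following is a minimal sketch of that contract, not the author's code. `ToyDataset` and its in-memory array are hypothetical stand-ins for DiabetesDataset and diabetes.csv, so the sketch runs without the CSV file.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class ToyDataset(Dataset):
    """Hypothetical stand-in for DiabetesDataset, built from an in-memory array."""

    def __init__(self, xy):
        self.x_data = torch.from_numpy(xy[:, :-1])   # all columns except the last
        self.y_data = torch.from_numpy(xy[:, [-1]])  # [-1] keeps labels 2-D: (N, 1)

    def __getitem__(self, index):
        return self.x_data[index], self.y_data[index]

    def __len__(self):
        return self.x_data.shape[0]


# Assumed toy data: 10 samples, 8 features plus 1 label column.
xy = np.arange(10 * 9, dtype=np.float32).reshape(10, 9)
toy = ToyDataset(xy)

x0, y0 = toy[0]  # indexing calls __getitem__(0)
print(len(toy), x0.shape, y0.shape)

# The DataLoader stacks individual samples into mini-batches.
loader = DataLoader(toy, batch_size=4, shuffle=False)
first_inputs, first_labels = next(iter(loader))
print(first_inputs.shape, first_labels.shape)
```

Keeping the labels as a column of shape (N, 1) means each batch of labels has the same shape as the model output, which is what BCELoss expects.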

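In the DataLoader call, shuffle=False means the samples are not shuffled, and I deliberately set the number of epochs to 1 (range(1)) to see exactly how the iteration proceeds:
Within each epoch, every step takes batch_size=64 samples, trains on them, and updates the parameters, until all samples have been used. The final batch is still trained on even if it contains fewer than 64 samples.
The printed output confirms this: with 759 samples and batch_size=64, there are 12 parameter updates in total. The first 11 use 64 samples each (704 samples), and the last one uses the remaining 55.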
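
The batch arithmetic can be checked without the CSV file. This is a rough sketch under stated assumptions: the zero tensors are a synthetic stand-in for diabetes.csv with the same 759 rows, and num_workers is left at 0 so the example stays single-process. The drop_last option shown at the end is not used in the article; it is included only to contrast with the default behavior of keeping the short final batch.

```python
import math
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for diabetes.csv (assumption): 759 rows, 8 features, 1 label.
features = torch.zeros(759, 8)
labels = torch.zeros(759, 1)
dataset = TensorDataset(features, labels)

# Same batch_size and shuffle settings as the article.
loader = DataLoader(dataset, batch_size=64, shuffle=False, num_workers=0)
batch_sizes = [inputs.shape[0] for inputs, _ in loader]

print(len(loader))             # number of batches per epoch
print(math.ceil(759 / 64))     # the same count, computed by hand
print(batch_sizes)             # size of each batch, in order

# drop_last=True discards the incomplete final batch instead of training on it.
loader_drop = DataLoader(dataset, batch_size=64, shuffle=False, drop_last=True)
print(len(loader_drop))
```

With the default drop_last=False, every sample is seen once per epoch, at the cost of one smaller batch at the end.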
