Learning PyTorch: Training a CIFAR10 Classifier

Posted chuyi88

TRAINING A CLASSIFIER

Reference: the PyTorch tutorial Deep Learning with PyTorch: A 60 Minute Blitz

By this point you have learned how to:

  1. Define a neural network
  2. Compute the loss function
  3. Update the weights

What about data

Generally, when you have to deal with image, text, audio or video data, you can use standard python packages that load data into a numpy array. Then you can convert this array into a torch.*Tensor.

For images, packages such as Pillow, OpenCV are useful
For audio, packages such as scipy and librosa
For text, either raw Python or Cython based loading, or NLTK and SpaCy are useful

Specifically for vision, we have created a package called torchvision, that has data loaders for common datasets such as Imagenet, CIFAR10, MNIST, etc. and data transformers for images, viz., torchvision.datasets and torch.utils.data.DataLoader.

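In short: when working with image, text, audio, or video data, you can load the data into a NumPy array with standard Python packages and then convert it to a torch.Tensor.

  • Images: Pillow and OpenCV are common choices
  • Audio: scipy and librosa
  • Text: raw Python or Cython-based loading, or NLTK and SpaCy

For computer vision, PyTorch provides the torchvision package, which includes data loaders for common datasets such as ImageNet, CIFAR10, and MNIST.

It also includes transforms, which can be used for data augmentation.

The relevant modules are torchvision.datasets and torch.utils.data.DataLoader.

This experiment uses the CIFAR10 dataset.

It contains the classes: 'airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck'.

Every image in CIFAR10 has size 3x32x32 (3 RGB channels, 32x32 pixels).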

Training an image classifier

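Steps:

  1. Load and normalize the training and test datasets using torchvision
  2. Define a convolutional neural network (convnet)
  3. Define the loss function
  4. Train the network on the training set
  5. Test the network's performance on the test set

Step 1: Load and normalize the training and test datasets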

import torch
import torchvision
import torchvision.transforms as transforms

The torchvision datasets yield PILImage images with values in [0, 1]. They need to be converted to Tensors and normalized to [-1, 1].

transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
# Compose chains several transforms together
# ./ is the current directory, ../ is the parent directory, / is the root directory
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)  # skips the download if the files are already present
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4, shuffle=True, num_workers=2)
testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4, shuffle=False, num_workers=2)
# num_workers is the number of worker processes used to load data
classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
Files already downloaded and verified
Files already downloaded and verified
print(trainset)
print("----"*10)
print(testset)
Dataset CIFAR10
    Number of datapoints: 50000
    Split: train
    Root Location: ./data
    Transforms (if any): Compose(
                             ToTensor()
                             Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
                         )
    Target Transforms (if any): None
----------------------------------------
Dataset CIFAR10
    Number of datapoints: 10000
    Split: test
    Root Location: ./data
    Transforms (if any): Compose(
                             ToTensor()
                             Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
                         )
    Target Transforms (if any): None
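# Show some images, for fun
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

def imshow(img):
    img = img / 2 + 0.5  # undo the normalization: [-1, 1] back to [0, 1]
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))  # back to the usual layout: CHW -> HWC

dataiter = iter(trainloader)  # iterator over batches
images, labels = next(dataiter)  # the built-in next() works across PyTorch versions; older tutorials call dataiter.next()
print(labels)
imshow(torchvision.utils.make_grid(images))

print(''.join('%5s' % classes[labels[j]] for j in range(4)))  # the batch size is 4, so each batch yields 4 images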
tensor([2, 8, 1, 5])
 bird ship  car  dog

labels
tensor([2, 8, 1, 5])
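
The normalization in Step 1 and the `img/2+0.5` line in imshow are inverses of each other. The sketch below works through that arithmetic in plain Python for a single channel, assuming mean = std = 0.5 as in the transform above; the helper names `normalize` and `unnormalize` are illustrative and are not part of torchvision.

```python
def normalize(x, mean=0.5, std=0.5):
    """Per-channel arithmetic applied by Normalize: (x - mean) / std."""
    return (x - mean) / std


def unnormalize(y, mean=0.5, std=0.5):
    """Inverse mapping; with mean = std = 0.5 this is y / 2 + 0.5, as in imshow."""
    return y * std + mean


pixels = [0.0, 0.25, 0.5, 1.0]  # example pixel values in [0, 1]
normalized = [normalize(p) for p in pixels]
restored = [unnormalize(n) for n in normalized]

print(normalized)  # values now lie in [-1, 1]
print(restored)    # back in [0, 1]
```

The endpoints 0 and 1 map to -1 and 1, which is why the normalized tensors are described as lying in [-1, 1].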

Step 2: Define a convolutional neural network

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    # __init__ only declares the layers that may be used; in forward, some layers may be used several times and others not at all
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)  # (input channels, output channels, kernel size)
        self.pool = nn.MaxPool2d(2, 2)  # one pooling layer, used twice in forward
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    # How the network is actually wired up is determined by forward
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)  # flatten each sample to a 400-element vector
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
print(net)
Net(
  (conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
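
The `16*5*5` input size of fc1 follows from the layer shapes printed above. As a rough check, the sketch below traces the spatial size of a 32x32 image through conv1, pool, conv2, and pool using the standard output-size formula for unpadded convolution and pooling; the helper names `conv_out` and `pool_out` are made up for this illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Output width/height of max pooling along one spatial dimension."""
    return (size - kernel) // stride + 1


size = 32                  # CIFAR10 images are 32x32
size = conv_out(size, 5)   # conv1, 5x5 kernel: 32 -> 28
size = pool_out(size)      # pool, 2x2 stride 2: 28 -> 14
size = conv_out(size, 5)   # conv2, 5x5 kernel: 14 -> 10
size = pool_out(size)      # pool, 2x2 stride 2: 10 -> 5

flat_features = 16 * size * size  # 16 output channels from conv2
print(size, flat_features)
```

The result, 400, matches the `in_features=400` shown for fc1 in the printed network.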

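Define the loss function and the optimizer (used to update the weights)

Note: the network outputs a 10-dimensional vector of scores per image, while each label is a single integer class index. That is correct as it stands: the loss is computed by using the label to index into the output, x[label], so the result is the same as with a one-hot target.

Elsewhere you may see labels handled as 10-dimensional one-hot vectors. The two are equivalent; the framework handles the conversion internally, so there is no need to get caught up in the details.

import torch.optim as optim
# CrossEntropyLoss already includes the softmax, so the network does not need a softmax layer of its own.
# The loss pushes the score of the correct class up and the scores of the wrong classes down.
criterion = nn.CrossEntropyLoss()  # cross-entropy; takes class indices directly rather than one-hot vectors
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)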
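
To make the note about integer labels and one-hot vectors concrete, here is a minimal plain-Python sketch of cross-entropy for a single sample, computed both ways. It is an illustration of the math, not PyTorch's implementation; the function names and the 3-class scores are hypothetical. By default nn.CrossEntropyLoss also averages this per-sample value over the batch.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of raw scores."""
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_sum for s in scores]


def cross_entropy_index(scores, label):
    """Loss when the target is an integer class index: pick out one entry."""
    return -log_softmax(scores)[label]


def cross_entropy_one_hot(scores, one_hot):
    """Loss when the target is a one-hot vector: weighted sum over all entries."""
    return -sum(t * lp for t, lp in zip(one_hot, log_softmax(scores)))


scores = [0.5, 2.0, -1.0]    # hypothetical raw scores for a 3-class problem
label = 1                    # target as a class index
one_hot = [0.0, 1.0, 0.0]    # the same target as a one-hot vector

loss_a = cross_entropy_index(scores, label)
loss_b = cross_entropy_one_hot(scores, one_hot)
print(loss_a, loss_b)
```

Because every one-hot entry except the target is zero, the weighted sum collapses to the single indexed term, so both forms give the same loss.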

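Train the network

for epoch in range(2):  # number of training epochs
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):  # the 0 is the starting index, which is also the default
        # get the inputs and labels
        inputs, labels = data
        # reset the gradients to zero
        optimizer.zero_grad()
        # forward pass
        outputs = net(inputs)
        # compute the loss
        loss = criterion(outputs, labels)
        # backward pass (compute the gradients)
        loss.backward()
        # update the weights
        optimizer.step()

        # print statistics
        running_loss += loss.item()  # accumulate the loss
        if i % 2000 == 1999:  # print once every 2000 batches
            # note: this is the loss summed over the last 2000 batches, not the mean
            print('[%d, %5d] loss: %.3f' % (epoch + 1, i + 1, running_loss))
            running_loss = 0.0  # reset after printing
print('Finished Training')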
[1,  2000] loss: 4505.347
[1,  4000] loss: 3816.202
[1,  6000] loss: 3448.905
[1,  8000] loss: 3221.118
[1, 10000] loss: 3091.055
[1, 12000] loss: 2993.834
[2,  2000] loss: 2793.536
[2,  4000] loss: 2777.763
[2,  6000] loss: 2710.222
[2,  8000] loss: 2668.854
[2, 10000] loss: 2622.627
[2, 12000] loss: 2571.615
Finished Training
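
The values logged above are sums over 2000 batches, which makes them hard to compare with typical cross-entropy values. As a small sketch, the snippet below divides the logged sums by 2000 to recover the mean loss per batch, and compares the first value against ln(10), the loss of a uniform guess over 10 classes. The numbers are copied from the log above; the variable names are illustrative.

```python
import math

BATCHES_PER_REPORT = 2000  # the training loop prints every 2000 batches

# Summed losses copied from the training log (epoch 1, then epoch 2)
logged_sums = [4505.347, 3816.202, 3448.905, 3221.118, 3091.055, 2993.834,
               2793.536, 2777.763, 2710.222, 2668.854, 2622.627, 2571.615]

mean_losses = [round(s / BATCHES_PER_REPORT, 3) for s in logged_sums]
chance_loss = math.log(10)  # cross-entropy of a uniform guess over 10 classes

print(mean_losses)
print(round(chance_loss, 3))
```

The first mean sits just below the uniform-guess level and the means fall steadily afterwards, consistent with the network learning over the two epochs.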

Test the network on the test data

We check the network by predicting class labels and comparing them with the ground truth.

# First, display some test images
dataiter = iter(testloader)
images, labels = next(dataiter)

imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join('%5s' % classes[labels[j]] for j in range(4)))

GroundTruth:    cat  ship  ship plane

outputs = net(images)  # run the images through the network to get class scores

_, predicted = torch.max(outputs, 1)  # max over dim 1 (the 10 class scores in each row): discard the max value, keep its index as the predicted class

print('Predicted: ', ' '.join('%5s' % classes[predicted[j]] for j in range(4)))
Predicted:   deer   cat  deer horse
print(outputs)
print(predicted)
tensor([[-3.4898, -3.6106,  1.2521,  3.3437,  3.3692,  3.2635,  2.6993,  2.0445,
         -4.8485, -3.5421],
        [-1.9592, -2.6239,  1.1073,  3.4853,  1.0128,  3.2079, -0.2431,  1.9412,
         -2.4887, -2.2249],
        [-0.2035,  1.3960,  0.6715, -0.1788,  3.5923, -1.4808,  0.4605, -0.0833,
         -2.6476, -1.5091],
        [-1.7742, -2.5306,  1.0426,  0.2753,  3.6487,  0.9355,  0.2774,  4.9753,
         -4.7646, -2.7965]], grad_fn=<ThAddmmBackward>)
tensor([4, 3, 4, 7])
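
To show what `torch.max(outputs, 1)` is doing, the following plain-Python sketch takes the four rows of scores printed above and finds the index of the largest value in each row. The `argmax` and `softmax` helpers are written here for illustration only; the softmax step is an optional extra that turns the raw scores into probabilities.

```python
import math

classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

# Raw scores copied from the printed outputs tensor (one row per image)
outputs = [
    [-3.4898, -3.6106, 1.2521, 3.3437, 3.3692, 3.2635, 2.6993, 2.0445, -4.8485, -3.5421],
    [-1.9592, -2.6239, 1.1073, 3.4853, 1.0128, 3.2079, -0.2431, 1.9412, -2.4887, -2.2249],
    [-0.2035, 1.3960, 0.6715, -0.1788, 3.5923, -1.4808, 0.4605, -0.0833, -2.6476, -1.5091],
    [-1.7742, -2.5306, 1.0426, 0.2753, 3.6487, 0.9355, 0.2774, 4.9753, -4.7646, -2.7965],
]


def argmax(row):
    """Index of the largest score in a row, i.e. the predicted class."""
    return max(range(len(row)), key=lambda k: row[k])


def softmax(row):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


predicted = [argmax(row) for row in outputs]
names = [classes[p] for p in predicted]
confidence = [max(softmax(row)) for row in outputs]

print(predicted)
print(names)
```

The indices agree with the `predicted` tensor above. In the first row several scores are nearly tied, so the top softmax probability is low, which is one sign the network is unsure about that image.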

Compute the overall accuracy

How the network performs on the whole test set:

correct = 0
total = 0
with torch.no_grad():  # no gradients are needed for evaluation, so disable autograd tracking
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print('Accuracy of the network on the 10000 test images:%d %%' % (100 * correct / total))
Accuracy of the network on the 10000 test images:54 %

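The network seems to have learned something, since random guessing over 10 classes would give about 10%. Let's see which classes it handles better.

class_correct = [0.0 for _ in range(10)]  # list of floats, one counter per class
class_total = [0.0 for _ in range(10)]
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()  # drop size-1 dimensions so c[i] indexes a single result
        for i in range(labels.size(0)):  # one entry per image in the batch (4 here)
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1
for i in range(10):
    print('Accuracy of %5s : %2d %%' % (classes[i], 100 * class_correct[i] / class_total[i]))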
Accuracy of plane : 57 %
Accuracy of   car : 80 %
Accuracy of  bird : 37 %
Accuracy of   cat : 45 %
Accuracy of  deer : 45 %
Accuracy of   dog : 43 %
Accuracy of  frog : 61 %
Accuracy of horse : 54 %
Accuracy of  ship : 64 %
Accuracy of truck : 54 %
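
As a quick consistency check on the numbers above, the sketch below ranks the classes and averages the per-class accuracies. It assumes the CIFAR10 test set is balanced (1000 images per class), in which case the mean of the per-class accuracies should match the overall accuracy. The printed percentages were truncated by `%d`, so the agreement is only approximate.

```python
# Per-class accuracies (in percent) copied from the output above
per_class = {'plane': 57, 'car': 80, 'bird': 37, 'cat': 45, 'deer': 45,
             'dog': 43, 'frog': 61, 'horse': 54, 'ship': 64, 'truck': 54}

# With equally sized classes, the overall accuracy is the plain mean
mean_accuracy = sum(per_class.values()) / len(per_class)

# Rank classes from best to worst
ranked = sorted(per_class, key=per_class.get, reverse=True)
best, worst = ranked[0], ranked[-1]

print(mean_accuracy)
print(best, worst)
```

The mean agrees with the 54% reported for the whole test set, with cars recognized best and birds worst.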

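How do we run this on a GPU?

Just as you transfer a tensor to the GPU, you transfer the whole neural net to the GPU.
First define device as the first visible CUDA device if one is available; otherwise the code falls back to the CPU.

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
# On a machine with CUDA, this prints the CUDA device
print(device)
cpu
net.to(device)
# Remember to move the inputs and targets to the device at every step as well
inputs, labels = inputs.to(device), labels.to(device)

Why is there no significant speedup? Because the network is too small for the difference to show.

How do we use all GPUs when there are several? See Data Parallelism.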
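
A minimal sketch of one device-aware training step, with optional multi-GPU support through nn.DataParallel, might look like the following. The small `model` is a hypothetical stand-in for the Net class above and the inputs are random tensors, so that the snippet runs on its own; on a machine without CUDA it simply runs on the CPU.

```python
import torch
import torch.nn as nn

# Pick the first CUDA device if available, otherwise fall back to the CPU
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

# Hypothetical stand-in for Net: flattens a 3x32x32 image and maps it to 10 scores
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
if torch.cuda.device_count() > 1:
    # Wrap the model so each batch is split across the visible GPUs
    model = nn.DataParallel(model)
model = model.to(device)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# Random stand-in batch: 4 images with 4 integer labels, moved to the same device
inputs = torch.randn(4, 3, 32, 32).to(device)
labels = torch.randint(0, 10, (4,)).to(device)

# One training step, the same sequence as in the training loop
optimizer.zero_grad()
outputs = model(inputs)
loss = criterion(outputs, labels)
loss.backward()
optimizer.step()

print(outputs.shape, loss.item())
```

The model and every batch must live on the same device. With a batch size of 4 there is little work to split, so a larger batch is usually needed before several GPUs pay off.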

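Useful functions

  • torch.from_numpy(): converts a NumPy array directly to a tensor, leaving the dimensions unchanged
  • transforms.ToTensor(): converts a NumPy array to a tensor and moves the third dimension to the front (HWC to CHW), shifting the other two back
  • x.numpy(): converts back to NumPy format, where x is a tensor
  • x.transpose((2,0,1)): for a NumPy array x whose dimensions are in the wrong order; moves the last dimension to the front, while (0,1,2) leaves the order unchanged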
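
The dimension reordering in the last bullet can be checked with NumPy alone. The sketch below builds a dummy image in HWC layout (the array contents are arbitrary), moves the channel axis to the front, and then moves it back with the (1, 2, 0) order used in imshow.

```python
import numpy as np

# Dummy 32x32 RGB image in HWC layout (height, width, channel)
hwc = np.arange(32 * 32 * 3, dtype=np.float32).reshape(32, 32, 3)

chw = hwc.transpose((2, 0, 1))   # last axis to the front: HWC -> CHW
back = chw.transpose((1, 2, 0))  # inverse order, as in imshow: CHW -> HWC
same = hwc.transpose((0, 1, 2))  # identity order leaves the layout unchanged

print(hwc.shape, chw.shape, back.shape, same.shape)
```

Only the axis order changes: the value at row 5, column 7, channel 1 of the original sits at `chw[1, 5, 7]` after the transpose.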
