"Deep Learning in One Pass" Required Lesson 25: Image Production with DCGAN
Posted 闭关修炼——暂退
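This column records my notes on deep learning. Besides making my own review and lookup easier, I hope it helps you solve some deep learning problems and offers a few modest ideas for designing artificial neural network models.
Column: "Deep Learning in One Pass", Required Lessons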
Contents
Project GitHub address
Project notes
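- 2016, DCGAN: DCGAN (Deep Convolutional Generative Adversarial Networks) consists of a discriminator and a generator. For this project I built the DCGAN network myself (both generator and discriminator) and applied it to the Image Production project. The generator has 5 transposed-convolution layers, 4 BN layers, 4 ReLU layers, and 1 Tanh output activation; the discriminator has 5 convolution layers, 3 BN layers, 4 LeakyReLU layers, and 1 Sigmoid output activation. GAN training is known to be unstable, mainly through hyperparameter sensitivity and mode collapse. In this project I felt the hyperparameter sensitivity keenly: in the first run, both the generator and discriminator optimizers started at a learning rate of 0.0001, and inference with the 30-epoch generator gave very poor results. Raising the initial learning rate to 0.0003 improved things considerably. To improve further, I added code in train.py that multiplies the learning rate by 0.1 every 10 epochs; with this change, models saved after 50 epochs can already generate decent "fake" images. Next I plan to tune the hyperparameters further so that decent "fake" images appear after fewer epochs.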
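The schedule described above (start at 0.0003, multiply by 0.1 every 10 epochs) can be sketched in plain Python. This is a minimal sketch that assumes the closed-form step-decay rule; `step_decay_lr` is a hypothetical helper introduced only for illustration, while the project itself relies on `torch.optim.lr_scheduler.StepLR` in train.py.

```python
def step_decay_lr(base_lr, epoch, step_size=10, gamma=0.1):
    """Learning rate in effect during `epoch` (0-based) under step decay.

    Assumes the rate is multiplied by `gamma` once every `step_size` epochs.
    """
    return base_lr * gamma ** (epoch // step_size)


# Print the rate at a few epochs for the settings mentioned in the notes.
for epoch in (0, 9, 10, 20, 50):
    print(f"epoch {epoch:3d}: lr = {step_decay_lr(0.0003, epoch):.1e}")
```

Under this rule the rate has already shrunk by five orders of magnitude at epoch 50, so most of the learning likely happens in the first few tens of epochs.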
Project code
net.py
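#!/usr/bin/python
# -*- coding:utf-8 -*-
# ------------------------------------------------- #
# Author: 赵泽荣
# Date: September 15, 2021 (9th day of the 8th lunar month)
# Personal sites: 1. https://zhao302014.github.io/
#                 2. https://blog.csdn.net/IT_charge/
# GitHub: https://github.com/zhao302014
# ------------------------------------------------- #
import torch.nn as nn

# --------------------------------------------------------------------------------- #
# A hand-built DCGAN model
#   · DCGAN was proposed in 2016
#   · Both the discriminator and the generator use convolutional networks (CNNs)
#     in place of the multilayer perceptrons of the original GAN
#   · DCGAN drops the pooling layers of a typical CNN and uses no fully connected hidden layers
#   · Compared with the original GAN or a plain CNN, DCGAN makes these changes:
#       1. Strided convolutions and transposed convolutions replace pooling layers
#       2. Batch normalization is added in both the generator and the discriminator
#       3. Fully connected layers are removed; in this implementation the final
#          4x4 convolution reduces the feature map to 1x1 instead
#       4. The generator output layer uses Tanh; its other layers use ReLU
#       5. The discriminator's hidden layers all use LeakyReLU
# --------------------------------------------------------------------------------- #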
# Generator model
class MyGenerator(nn.Module):
    def __init__(self):
        super(MyGenerator, self).__init__()
        # 5 transposed-convolution layers, 4 BN layers, 4 ReLU layers, 1 Tanh output activation
        self.ReLU = nn.ReLU()
        self.deconv1 = nn.ConvTranspose2d(in_channels=100, out_channels=1024, kernel_size=4, stride=1)
        self.bn1 = nn.BatchNorm2d(1024)
        self.deconv2 = nn.ConvTranspose2d(in_channels=1024, out_channels=512, kernel_size=4, stride=2, padding=1)
        self.bn2 = nn.BatchNorm2d(512)
        self.deconv3 = nn.ConvTranspose2d(in_channels=512, out_channels=256, kernel_size=4, stride=2, padding=1)
        self.bn3 = nn.BatchNorm2d(256)
        self.deconv4 = nn.ConvTranspose2d(in_channels=256, out_channels=128, kernel_size=4, stride=2, padding=1)
        self.bn4 = nn.BatchNorm2d(128)
        self.deconv5 = nn.ConvTranspose2d(in_channels=128, out_channels=3, kernel_size=4, stride=2, padding=1)
        self.Tanh = nn.Tanh()

    def forward(self, x):     # input shape: torch.Size([1, 100, 1, 1])
        x = self.deconv1(x)   # shape: torch.Size([1, 1024, 4, 4])
        x = self.bn1(x)       # shape: torch.Size([1, 1024, 4, 4])
        x = self.ReLU(x)      # shape: torch.Size([1, 1024, 4, 4])
        x = self.deconv2(x)   # shape: torch.Size([1, 512, 8, 8])
        x = self.bn2(x)       # shape: torch.Size([1, 512, 8, 8])
        x = self.ReLU(x)      # shape: torch.Size([1, 512, 8, 8])
        x = self.deconv3(x)   # shape: torch.Size([1, 256, 16, 16])
        x = self.bn3(x)       # shape: torch.Size([1, 256, 16, 16])
        x = self.ReLU(x)      # shape: torch.Size([1, 256, 16, 16])
        x = self.deconv4(x)   # shape: torch.Size([1, 128, 32, 32])
        x = self.bn4(x)       # shape: torch.Size([1, 128, 32, 32])
        x = self.ReLU(x)      # shape: torch.Size([1, 128, 32, 32])
        x = self.deconv5(x)   # shape: torch.Size([1, 3, 64, 64])
        x = self.Tanh(x)      # shape: torch.Size([1, 3, 64, 64])
        return x

# Discriminator model
class MyDiscriminator(nn.Module):
    def __init__(self):
        super(MyDiscriminator, self).__init__()
        # 5 convolution layers, 3 BN layers, 4 LeakyReLU layers, 1 Sigmoid output activation
        self.LeakyReLU = nn.LeakyReLU(0.2, inplace=True)
        self.Sigmoid = nn.Sigmoid()
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=4, stride=2, padding=1)
        self.conv2 = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=4, stride=2, padding=1)
        self.bn2 = nn.BatchNorm2d(128)
        self.conv3 = nn.Conv2d(in_channels=128, out_channels=256, kernel_size=4, stride=2, padding=1)
        self.bn3 = nn.BatchNorm2d(256)
        self.conv4 = nn.Conv2d(in_channels=256, out_channels=512, kernel_size=4, stride=2, padding=1)
        self.bn4 = nn.BatchNorm2d(512)
        self.conv5 = nn.Conv2d(in_channels=512, out_channels=1, kernel_size=4, stride=1)

    def forward(self, x):       # input shape: torch.Size([1, 3, 64, 64])
        x = self.conv1(x)       # shape: torch.Size([1, 64, 32, 32])
        x = self.LeakyReLU(x)   # shape: torch.Size([1, 64, 32, 32])
        x = self.conv2(x)       # shape: torch.Size([1, 128, 16, 16])
        x = self.bn2(x)         # shape: torch.Size([1, 128, 16, 16])
        x = self.LeakyReLU(x)   # shape: torch.Size([1, 128, 16, 16])
        x = self.conv3(x)       # shape: torch.Size([1, 256, 8, 8])
        x = self.bn3(x)         # shape: torch.Size([1, 256, 8, 8])
        x = self.LeakyReLU(x)   # shape: torch.Size([1, 256, 8, 8])
        x = self.conv4(x)       # shape: torch.Size([1, 512, 4, 4])
        x = self.bn4(x)         # shape: torch.Size([1, 512, 4, 4])
        x = self.LeakyReLU(x)   # shape: torch.Size([1, 512, 4, 4])
        x = self.conv5(x)       # shape: torch.Size([1, 1, 1, 1])
        x = self.Sigmoid(x)     # shape: torch.Size([1, 1, 1, 1])
        return x
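The shape comments in net.py can be checked by hand with the standard output-size formulas for convolution and transposed convolution. The following is a minimal sketch that assumes dilation 1 and no output padding (the defaults used by the layers above); `conv_out`, `deconv_out`, and `trace` are hypothetical helper names, not part of the project.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution (dilation 1 assumed)."""
    return (size + 2 * padding - kernel) // stride + 1


def deconv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a transposed convolution (no output padding assumed)."""
    return (size - 1) * stride - 2 * padding + kernel


# (kernel_size, stride, padding) of each layer, copied from net.py
GEN_LAYERS = [(4, 1, 0), (4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 2, 1)]
DISC_LAYERS = [(4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 1, 0)]


def trace(size, layers, size_fn):
    """Return the spatial size before the first layer and after every layer."""
    sizes = [size]
    for kernel, stride, padding in layers:
        size = size_fn(size, kernel, stride, padding)
        sizes.append(size)
    return sizes


print(trace(1, GEN_LAYERS, deconv_out))   # [1, 4, 8, 16, 32, 64]
print(trace(64, DISC_LAYERS, conv_out))   # [64, 32, 16, 8, 4, 1]
```

The two traces mirror each other: the generator doubles the resolution from 4x4 up to 64x64, and the discriminator halves it back down before the last 4x4 convolution collapses it to a single score.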
train.py
#!/usr/bin/python
# -*- coding:utf-8 -*-
# ------------------------------------------------- #
# Author: 赵泽荣
# Date: September 15, 2021 (9th day of the 8th lunar month)
# Personal sites: 1. https://zhao302014.github.io/
#                 2. https://blog.csdn.net/IT_charge/
# GitHub: https://github.com/zhao302014
# ------------------------------------------------- #
import os

import torch
import torch.nn as nn
import torch.optim as optim
import torch.utils.data
from torch.optim import lr_scheduler
import torchvision.datasets as dset
import torchvision.transforms as transforms
from net import MyGenerator, MyDiscriminator

data_transform = transforms.Compose([
    transforms.Resize(64),
    transforms.CenterCrop(64),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])
# Load the training dataset
train_data_path = "./data/"
train_dataset = dset.ImageFolder(root=train_data_path, transform=data_transform)
train_dataloader = torch.utils.data.DataLoader(train_dataset, batch_size=64, shuffle=True)
# Train on the GPU if one is available
device = "cuda" if torch.cuda.is_available() else 'cpu'
# Instantiate the models defined in net.py and move them to the GPU if available
modelGenerator = MyGenerator().to(device)
modelDiscriminator = MyDiscriminator().to(device)
# Loss function
loss = nn.BCELoss()
# Optimizers
optimizerGenerator = optim.Adam(modelGenerator.parameters(), lr=0.0003, betas=(0.5, 0.999))
optimizerDiscriminator = optim.Adam(modelDiscriminator.parameters(), lr=0.0003, betas=(0.5, 0.999))
# Multiply the learning rate by 0.1 every 10 epochs
lr_scheduler_Generator = lr_scheduler.StepLR(optimizerGenerator, step_size=10, gamma=0.1)
lr_scheduler_Discriminator = lr_scheduler.StepLR(optimizerDiscriminator, step_size=10, gamma=0.1)

# Training function (runs one epoch)
def train(train_dataloader, modelGenerator, modelDiscriminator, loss, optimizerGenerator, optimizerDiscriminator):
    for i, data in enumerate(train_dataloader, 0):
        # ----------------------------------------- #
        # Update the discriminator: maximize log(D(x)) + log(1 - D(G(z)))
        # ----------------------------------------- #
        # Train on a batch of real data
        modelDiscriminator.zero_grad()       # clear gradients
        real_img = data[0].to(device)        # data[0].shape: torch.Size([batch_size, c, h, w])
        batch_size = real_img.size(0)        # actual batch size (the last batch may be smaller)
        real_img_label = torch.full((batch_size,), 1.0, device=device)   # 1.0: "real" label
        # Discriminator forward pass
        real_img_output = modelDiscriminator(real_img).view(-1)          # shape: torch.Size([batch_size])
        # Loss on the real batch
        real_img_loss = loss(real_img_output, real_img_label)
        real_img_loss.backward()             # backpropagate
        # Generate fake data and train on it
        noise = torch.randn(batch_size, 100, 1, 1, device=device)
        # Generate fake images with the generator
        fake_img = modelGenerator(noise)
        fake_img_label = torch.full((batch_size,), 0.0, device=device)   # 0.0: "fake" label
        # detach() keeps this backward pass from reaching the generator
        fake_img_output = modelDiscriminator(fake_img.detach()).view(-1)
        # Discriminator loss on the fake batch
        fake_img_loss = loss(fake_img_output, fake_img_label)
        fake_img_loss.backward()             # backpropagate
        discriminator_loss = real_img_loss + fake_img_loss
        optimizerDiscriminator.step()        # update parameters
        # ----------------------------------------- #
        # Update the generator: maximize log(D(G(z)))
        # ----------------------------------------- #
        modelGenerator.zero_grad()
        # The generator's samples are all labeled 1 ("real")
        img_label = torch.full((batch_size,), 1.0, device=device)
        img_output = modelDiscriminator(fake_img).view(-1)
        # Generator loss
        img_loss = loss(img_output, img_label)
        img_loss.backward()                  # backpropagate
        optimizerGenerator.step()            # update parameters
        generator_loss = img_loss            # fixed: the original assigned real_img_loss here
    # Report the losses of the last batch of the epoch
    print('Discriminator loss:', discriminator_loss.item())
    print('Generator loss:', generator_loss.item())

# Start training
os.makedirs("save_model", exist_ok=True)     # torch.save fails if the directory is missing
epoch = 100
for t in range(epoch):
    print(f"Epoch {t + 1}\n--------------------------------")
    train(train_dataloader, modelGenerator, modelDiscriminator, loss, optimizerGenerator, optimizerDiscriminator)
    # Step the schedulers after the epoch's optimizer updates (the order PyTorch expects since 1.1)
    lr_scheduler_Generator.step()
    lr_scheduler_Discriminator.step()
    torch.save(modelGenerator.state_dict(), "save_model/{}model.pt".format(t))   # save the model, e.g. 99model.pt
print("Done!")
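To see why train.py labels the generator's fake images as "real" (1.0), it helps to write binary cross-entropy out by hand. The sketch below is plain Python for a single sample; `bce` is a hypothetical helper, and the two probabilities are made-up discriminator outputs chosen only for illustration. `nn.BCELoss` computes the same quantity averaged over the batch by default.

```python
import math


def bce(prob, label):
    """Binary cross-entropy for one prediction in (0, 1) and a 0/1 label."""
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))


# Hypothetical discriminator outputs, for illustration only
d_real = 0.9   # D(x): score given to a real image
d_fake = 0.2   # D(G(z)): score given to a generated image

# Discriminator objective: real images labeled 1, fake images labeled 0.
discriminator_loss = bce(d_real, 1.0) + bce(d_fake, 0.0)

# Generator objective: the same fake score, but labeled 1.
# With label 1 the loss reduces to -log(D(G(z))), so minimizing it
# is the same as maximizing log(D(G(z))).
generator_loss = bce(d_fake, 1.0)

print("discriminator loss:", discriminator_loss)
print("generator loss:", generator_loss)
```

The generator loss shrinks as the discriminator's score for fake images approaches 1, which is the behaviour the "maximize log(D(G(z)))" comment in train.py describes.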
test.py
#!/usr/bin/python
# -*- coding:utf-8 -*-
# ------------------------------------------------- #
# Author: 赵泽荣
# Date: September 15, 2021 (9th day of the 8th lunar month)
# Personal sites: 1. https://zhao302014.github.io/
#                 2. https://blog.csdn.net/IT_charge/
# GitHub: https://github.com/zhao302014
# ------------------------------------------------- #
import numpy as np
import torch
import torch.utils.data
import torchvision.utils as vutils
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import torchvision.datasets as dset
from net import MyGenerator

data_transform = transforms.Compose([
    transforms.Resize(64),
    transforms.CenterCrop(64),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])
# Load the training dataset (used here only to show real images for comparison)
train_data_path = "./data/"
train_dataset = dset.ImageFolder(root=train_data_path, transform=data_transform)
train_dataloader = torch.utils.data.DataLoader(train_dataset, batch_size=64, shuffle=True)
# Run on the GPU if one is available
device = "cuda" if torch.cuda.is_available() else 'cpu'
# Instantiate the model defined in net.py and move it to the GPU if available
modelGenerator = MyGenerator().to(device)
# Load the model trained by train.py; map_location lets a GPU-trained checkpoint load on a CPU-only machine
# Note: in general, only models from after epoch 50 generate good "fake" images
modelGenerator.load_state_dict(torch.load("./save_model/99model.pt", map_location=device))
# Switch to evaluation mode
modelGenerator.eval()
img_list = []
iters = 0
# Start validation; no_grad() avoids building a graph, so the outputs can be converted to NumPy for plotting
with torch.no_grad():
    for i, data in enumerate(train_dataloader):
        noise = torch.randn(64, 100, 1, 1, device=device)
        if iters % 20 == 0:
            fake_img = modelGenerator(noise).cpu()
            img_list.append(vutils.make_grid(fake_img, normalize=True))
        iters += 1
# Compare real images with generated images
real_batch = next(iter(train_dataloader))
real_img = vutils.make_grid(real_batch[0], normalize=True).cpu()
# Start plotting
plt.figure(figsize=(12, 12))    # figure width and height, in inches
# One row, two columns; real images on the left
plt.subplot(1, 2, 1)
plt.axis("off")                 # hide the axes
plt.title("Real Images")        # set the title
plt.imshow(np.transpose(real_img, (1, 2, 0)))
# One row, two columns; generated images on the right
plt.subplot(1, 2, 2)
plt.axis("off")                 # hide the axes
plt.title("Fake Images")        # set the title
plt.imshow(np.transpose(img_list[-1], (1, 2, 0)))
plt.show()                      # display the figure
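Both scripts normalize images with mean 0.5 and standard deviation 0.5, which maps pixel values from [0, 1] to [-1, 1], the same range the generator's Tanh produces. Before display the values have to be brought back to [0, 1]. Here is a minimal sketch of that arithmetic on plain floats; the helper names are hypothetical, and `min_max_rescale` only approximates what `make_grid(normalize=True)` does by default (rescaling by the minimum and maximum of the whole tensor).

```python
def normalize(x, mean=0.5, std=0.5):
    """Map a pixel value from [0, 1] to [-1, 1], as Normalize(0.5, 0.5) does per channel."""
    return (x - mean) / std


def denormalize(y, mean=0.5, std=0.5):
    """Inverse of normalize: map a value in [-1, 1] back to [0, 1]."""
    return y * std + mean


def min_max_rescale(values):
    """Rescale values to [0, 1] using their own min and max (guarded against a zero span)."""
    lo, hi = min(values), max(values)
    span = max(hi - lo, 1e-5)
    return [(v - lo) / span for v in values]


print(normalize(0.0), normalize(1.0))        # -1.0 1.0
print(denormalize(normalize(0.25)))          # 0.25
print(min_max_rescale([-1.0, 0.0, 1.0]))     # [0.0, 0.5, 1.0]
```

Note that min-max rescaling depends on the batch content, so a batch of generated images that does not span the full [-1, 1] range will be stretched for display, whereas `denormalize` always applies the same fixed mapping.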
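Comments and discussion are welcome; let's learn together.
I hope this post helps you solve the problems you run into in this area.
Thanks for reading.
END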