[BPnet Recognizing MNIST 02] Downloading the MNIST Character Set and Reading It with Python

Posted by AIplusX


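Preface

For my first hands-on machine learning project I chose the relatively simple task of recognizing MNIST handwritten digits. The MNIST dataset is freely available online, but it has to be converted into a format the model can use, so in this post I share how I downloaded the MNIST character set and assembled the data.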

Main content

The main content is on my Guyuehome (古月居) blog:
[BPnet Recognizing MNIST 02] Downloading the MNIST Character Set and Reading It with Python

Installing TensorFlow with conda

In PyCharm, TensorFlow can be installed from the command line inside the IDE.

Running conda install tensorflow there may produce the following error:

Collecting package metadata (current_repodata.json): failed

ProxyError: Conda cannot proceed due to an error in your proxy configuration.
Check for typos and other configuration errors in any '.netrc' file in your home directory,
any environment variables ending in '_PROXY', and any other system-wide proxy
configuration settings.

Closing the VPN software that is currently running resolves this error.
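
The error message also points at environment variables ending in '_PROXY'. As a minimal sketch (find_proxy_settings is a hypothetical helper, not part of conda or of the original post), the following lists any such variables so they can be inspected before retrying the install:

```python
import os

def find_proxy_settings(environ=None):
    """Return the environment variables whose names end in '_PROXY' (any letter case)."""
    # Default to the real process environment; a plain dict can be passed for testing.
    environ = os.environ if environ is None else environ
    return {name: value for name, value in environ.items()
            if name.upper().endswith('_PROXY')}

if __name__ == '__main__':
    for name, value in find_proxy_settings().items():
        print(name, '=', value)
```

If this prints a proxy address that is no longer reachable, that setting is a likely candidate for the ProxyError.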

Downloading MNIST
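
The MNIST files are distributed as gzip archives. The sketch below assumes the four .gz archives have already been downloaded into ./dataSet and unpacks them to the dotted file names used in the full source listing. The helper name unpack_mnist and the name mapping are assumptions made for illustration.

```python
import gzip
import shutil
from pathlib import Path

# Archive names as distributed -> file names opened by readFile() (assumed mapping)
ARCHIVES = {
    'train-images-idx3-ubyte.gz': 'train-images.idx3-ubyte',
    'train-labels-idx1-ubyte.gz': 'train-labels.idx1-ubyte',
    't10k-images-idx3-ubyte.gz': 't10k-images.idx3-ubyte',
    't10k-labels-idx1-ubyte.gz': 't10k-labels.idx1-ubyte',
}

def unpack_mnist(data_dir='./dataSet'):
    """Decompress every MNIST archive found in data_dir and return the files written."""
    data_dir = Path(data_dir)
    written = []
    for archive, target in ARCHIVES.items():
        src = data_dir / archive
        if not src.exists():
            continue  # skip archives that have not been downloaded
        dst = data_dir / target
        with gzip.open(src, 'rb') as f_in, open(dst, 'wb') as f_out:
            shutil.copyfileobj(f_in, f_out)
        written.append(dst)
    return written
```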

Reading the MNIST data

import struct
import numpy as np

def readFile(type=0):
    """Return the raw (images, labels) bytes; type 0 is training data, 1 is test data."""
    if type == 0:
        with open('./dataSet/train-images.idx3-ubyte', 'rb') as ti:
            train_image = ti.read()
        with open('./dataSet/train-labels.idx1-ubyte', 'rb') as tl:
            train_labels = tl.read()
        return train_image, train_labels
    elif type == 1:
        with open('./dataSet/t10k-images.idx3-ubyte', 'rb') as t_i:
            test_image = t_i.read()
        with open('./dataSet/t10k-labels.idx1-ubyte', 'rb') as t_l:
            test_labels = t_l.read()
        return test_image, test_labels
    raise ValueError('type must be 0 (train) or 1 (test)')

image, label = readFile()

# Size in bytes of one 28x28 image (784 unsigned bytes) and of one label (1 unsigned byte)
img_size_bit = struct.calcsize('>784B')
lab_size_bit = struct.calcsize('>1B')

def getImages(image, n, startidx=0):
    """Return n images as 28x28 arrays, starting at image number startidx."""
    img = []
    # Skip the 16-byte header (magic number, image count, rows, columns)
    index = struct.calcsize('>IIII') + img_size_bit * startidx
    for i in range(n):
        temp = struct.unpack_from('>784B', image, index)
        img.append(np.reshape(temp, (28, 28)))
        index += img_size_bit
    return img

def getLabels(label, n, startidx=0):
    """Return n labels as integers, starting at label number startidx."""
    lab = []
    # Skip the 8-byte header (magic number, label count)
    index = struct.calcsize('>II') + lab_size_bit * startidx
    for i in range(n):
        temp = struct.unpack_from('>1B', label, index)
        lab.append(temp[0])
        index += lab_size_bit
    return lab
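
The fixed offsets in getImages and getLabels come from the IDX file header: a 4-byte big-endian magic number whose last byte gives the number of dimensions, followed by one 4-byte size per dimension. As a minimal sketch (read_idx_header is a hypothetical helper, not part of the original code), the header can be parsed to check those offsets:

```python
import struct

def read_idx_header(buf):
    """Parse a big-endian IDX header and return (magic, dims, data_offset)."""
    # The magic number is 4 bytes; its lowest byte is the number of dimensions.
    magic, = struct.unpack_from('>I', buf, 0)
    ndim = magic & 0xFF
    # One unsigned 32-bit size follows for each dimension.
    dims = struct.unpack_from('>' + 'I' * ndim, buf, 4)
    # The data starts right after the magic number and the dimension sizes.
    return magic, dims, 4 + 4 * ndim
```

For the MNIST image files this should yield a magic number of 2051 and an offset of 16 bytes, matching struct.calcsize('>IIII'); for the label files, 2049 and 8 bytes, matching struct.calcsize('>II').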

Full source:

import struct
import numpy as np
# import tensorflow as tf
import matplotlib.pyplot as plt

# Size in bytes of one 28x28 image (784 unsigned bytes) and of one label (1 unsigned byte)
img_size_bit = struct.calcsize('>784B')
lab_size_bit = struct.calcsize('>1B')

def readFile(type=0):
    """Return the raw (images, labels) bytes; type 0 is training data, 1 is test data."""
    if type == 0:
        with open('./dataSet/train-images.idx3-ubyte', 'rb') as ti:
            train_image = ti.read()
        with open('./dataSet/train-labels.idx1-ubyte', 'rb') as tl:
            train_labels = tl.read()
        return train_image, train_labels
    elif type == 1:
        with open('./dataSet/t10k-images.idx3-ubyte', 'rb') as t_i:
            test_image = t_i.read()
        with open('./dataSet/t10k-labels.idx1-ubyte', 'rb') as t_l:
            test_labels = t_l.read()
        return test_image, test_labels
    raise ValueError('type must be 0 (train) or 1 (test)')

def getImages(image, n, startidx=0):
    """Return n images as 28x28 arrays, starting at image number startidx."""
    img = []
    # Skip the 16-byte header (magic number, image count, rows, columns)
    index = struct.calcsize('>IIII') + img_size_bit * startidx
    for i in range(n):
        temp = struct.unpack_from('>784B', image, index)
        img.append(np.reshape(temp, (28, 28)))
        index += img_size_bit
    return img

def getLabels(label, n, startidx=0):
    """Return n labels as integers, starting at label number startidx."""
    lab = []
    # Skip the 8-byte header (magic number, label count)
    index = struct.calcsize('>II') + lab_size_bit * startidx
    for i in range(n):
        temp = struct.unpack_from('>1B', label, index)
        lab.append(temp[0])
        index += lab_size_bit
    return lab


if __name__ == '__main__':
    n = 9  # number of samples to show; a perfect square fills the grid
    image, label = readFile(1)  # 1 selects the test set
    test_img = getImages(image, n)
    test_lab = getLabels(label, n)

    side = int(np.sqrt(n))
    for i in range(n):
        plt.subplot(side, side, 1 + i)
        plt.title("label:" + str(test_lab[i]))
        plt.axis('off')
        plt.imshow(test_img[i], cmap='gray')
    plt.subplots_adjust(hspace=0.4)
    plt.show()
    # print(type(label))
    # print(len(label))
    # print(lab_size_bit)
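
As an alternative to unpacking one image at a time, the whole file can be converted in a single step with np.frombuffer. The sketch below is an assumption about how one might do this, not the approach used above; load_images and load_labels are hypothetical names, and 2051 and 2049 are the magic numbers of the MNIST image and label files.

```python
import struct
import numpy as np

def load_images(buf):
    """Convert the raw bytes of an IDX image file into a (count, rows, cols) uint8 array."""
    magic, count, rows, cols = struct.unpack_from('>IIII', buf, 0)
    if magic != 2051:
        raise ValueError('not an IDX image file')
    # The pixel data starts after the 16-byte header, one unsigned byte per pixel.
    pixels = np.frombuffer(buf, dtype=np.uint8, count=count * rows * cols, offset=16)
    return pixels.reshape(count, rows, cols)

def load_labels(buf):
    """Convert the raw bytes of an IDX label file into a 1-D uint8 array."""
    magic, count = struct.unpack_from('>II', buf, 0)
    if magic != 2049:
        raise ValueError('not an IDX label file')
    # The labels start after the 8-byte header, one unsigned byte per label.
    return np.frombuffer(buf, dtype=np.uint8, count=count, offset=8)
```

Both functions take the bytes returned by readFile, for example load_images(image) and load_labels(label). The arrays returned by np.frombuffer are read-only views of the input bytes, so call .copy() before modifying them.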
