TensorFlow 2 Quick Start: The MNIST Dataset Explained

Posted 2021-06-30 空中旋转篮球


1. Software and hardware environment

Python 3.7, PyCharm Community 2021.1.1, TensorFlow 2.5

2. The MNIST dataset

Download and description page: the MNIST dataset download page

Training set images: train-images-idx3-ubyte.gz (9.9 MB, 47 MB uncompressed, 60,000 samples)
Training set labels: train-labels-idx1-ubyte.gz (29 KB, 60 KB uncompressed, 60,000 labels)
Test set images: t10k-images-idx3-ubyte.gz (1.6 MB, 7.8 MB uncompressed, 10,000 samples)
Test set labels: t10k-labels-idx1-ubyte.gz (5 KB, 10 KB uncompressed, 10,000 labels)
Note that the format listed here differs from the file that is actually downloaded later on.
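
The four files listed above use the IDX binary layout: a big-endian header (two zero bytes, a type code, the number of dimensions, then one 4-byte size per dimension) followed by the raw bytes. As a rough sketch of that layout, the following parses it with the standard library only. The helper name `read_idx` and the tiny in-memory sample are illustrative assumptions and are not part of the original post.

```python
import gzip
import struct


def read_idx(raw: bytes):
    """Parse IDX bytes into (dims, payload). Hypothetical helper for illustration."""
    # Header: two zero bytes, a type code (0x08 = unsigned byte), the number of dimensions
    zero, dtype_code, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0 or dtype_code != 0x08:
        raise ValueError("not an unsigned-byte IDX file")
    # Each dimension size is a 4-byte big-endian unsigned integer
    dims = struct.unpack(">" + "I" * ndim, raw[4:4 + 4 * ndim])
    payload = raw[4 + 4 * ndim:]
    return dims, payload


# Synthetic stand-in for a gzipped idx3 image file: 2 images of 28x28 zero pixels
sample = gzip.compress(struct.pack(">HBBIII", 0, 0x08, 3, 2, 28, 28) + bytes(2 * 28 * 28))
dims, pixels = read_idx(gzip.decompress(sample))
print(dims, len(pixels))  # (2, 28, 28) 1568
```

For the real training images the dimensions would be (60000, 28, 28), that is 60,000 x 784 = 47,040,000 pixel bytes plus a 16-byte header, which agrees with the 47 MB uncompressed size quoted above.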

3. Test code

Code that downloads the dataset and runs a quick test:

import tensorflow as tf

# Load MNIST (downloaded on first use)
mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
# Convert the samples from integers to floating-point numbers
x_train, x_test = x_train / 255.0, x_test / 255.0

# Stack the layers to build a tf.keras.Sequential model,
# then choose an optimizer and a loss function for training
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train and evaluate the model
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test,  y_test, verbose=2)

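Output of the run. TensorFlow first looks for a GPU; because the CUDA libraries cannot be loaded, it skips GPU registration and runs on the CPU: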

2021-06-28 08:27:45.633464: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2021-06-28 08:27:45.633864: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2021-06-28 08:28:03.197821: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library nvcuda.dll
2021-06-28 08:28:03.296392: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GT 730 computeCapability: 3.5
coreClock: 0.9015GHz coreCount: 2 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 37.33GiB/s
2021-06-28 08:28:03.318004: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2021-06-28 08:28:03.321171: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cublas64_11.dll'; dlerror: cublas64_11.dll not found
2021-06-28 08:28:03.324283: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cublasLt64_11.dll'; dlerror: cublasLt64_11.dll not found
2021-06-28 08:28:03.327513: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cufft64_10.dll'; dlerror: cufft64_10.dll not found
2021-06-28 08:28:03.330824: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'curand64_10.dll'; dlerror: curand64_10.dll not found
2021-06-28 08:28:03.334013: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusolver64_11.dll'; dlerror: cusolver64_11.dll not found
2021-06-28 08:28:03.337124: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusparse64_11.dll'; dlerror: cusparse64_11.dll not found
2021-06-28 08:28:03.340231: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudnn64_8.dll'; dlerror: cudnn64_8.dll not found
2021-06-28 08:28:03.340436: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1766] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2021-06-28 08:28:03.361671: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-06-28 08:28:03.376499: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-06-28 08:28:03.377334: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      
2021-06-28 08:28:05.616614: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
Epoch 1/5
1875/1875 [==============================] - 3s 696us/step - loss: 0.2919 - accuracy: 0.9150
Epoch 2/5
1875/1875 [==============================] - 1s 668us/step - loss: 0.1421 - accuracy: 0.9582
Epoch 3/5
1875/1875 [==============================] - 1s 675us/step - loss: 0.1088 - accuracy: 0.9666
Epoch 4/5
1875/1875 [==============================] - 1s 660us/step - loss: 0.0873 - accuracy: 0.9734
Epoch 5/5
1875/1875 [==============================] - 1s 680us/step - loss: 0.0744 - accuracy: 0.9763
313/313 - 0s - loss: 0.0725 - accuracy: 0.9776

Process finished with exit code 0

After this test it is still not clear what was actually done.
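
One detail of the log can be decoded right away: the 1875 and 313 in the progress bars are batch counts, not sample counts. A small sketch of the arithmetic, assuming the Keras default batch size of 32 (the script never sets `batch_size`) and the 60,000 / 10,000 split listed in section 2:

```python
import math

BATCH_SIZE = 32          # Keras default for fit()/evaluate(); assumed unchanged here
TRAIN_SAMPLES = 60_000   # training images
TEST_SAMPLES = 10_000    # test images

# The last batch may be partial, so the step count is rounded up
train_steps = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
test_steps = math.ceil(TEST_SAMPLES / BATCH_SIZE)

print(train_steps, test_steps)  # 1875 313
```

So each epoch makes one pass over all 60,000 training images, and the final line of the log reports loss and accuracy over the 10,000 test images.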

4. Locating the test data

The dataset is stored under the following path:

C:\Users\RS001\.keras\datasets (RS001 is the administrator account)
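
On another machine the same folder can be located programmatically. A minimal sketch, assuming the default Keras cache location of `.keras/datasets` under the user's home directory, as in the path above; the helper name is made up for illustration:

```python
from pathlib import Path


def keras_dataset_dir() -> Path:
    """Return the default Keras dataset cache folder (hypothetical helper)."""
    return Path.home() / ".keras" / "datasets"


cache_dir = keras_dataset_dir()
mnist_file = cache_dir / "mnist.npz"
print(cache_dir)
print("mnist.npz cached:", mnist_file.exists())
```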

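I had downloaded the dataset before, so the first run showed no download message. If I delete the file and run the script again, the data is downloaded again.

The run looks like this: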

2021-06-28 08:32:38.806450: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2021-06-28 08:32:38.806683: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
Traceback (most recent call last):
  File "C:\Users\RS001\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 1319, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "C:\Users\RS001\AppData\Local\Programs\Python\Python37\lib\http\client.py", line 1252, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "C:\Users\RS001\AppData\Local\Programs\Python\Python37\lib\http\client.py", line 1298, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "C:\Users\RS001\AppData\Local\Programs\Python\Python37\lib\http\client.py", line 1247, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "C:\Users\RS001\AppData\Local\Programs\Python\Python37\lib\http\client.py", line 1026, in _send_output
    self.send(msg)
  File "C:\Users\RS001\AppData\Local\Programs\Python\Python37\lib\http\client.py", line 966, in send
    self.connect()
  File "C:\Users\RS001\AppData\Local\Programs\Python\Python37\lib\http\client.py", line 1414, in connect
    super().connect()
  File "C:\Users\RS001\AppData\Local\Programs\Python\Python37\lib\http\client.py", line 938, in connect
    (self.host,self.port), self.timeout, self.source_address)
  File "C:\Users\RS001\AppData\Local\Programs\Python\Python37\lib\socket.py", line 728, in create_connection
    raise err
  File "C:\Users\RS001\AppData\Local\Programs\Python\Python37\lib\socket.py", line 716, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [WinError 10061] 由于目标计算机积极拒绝，无法连接。

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\pythonProject\venv\lib\site-packages\tensorflow\python\keras\utils\data_utils.py", line 258, in get_file
    urlretrieve(origin, fpath, dl_progress)
  File "C:\Users\RS001\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 247, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "C:\Users\RS001\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Users\RS001\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 525, in open
    response = self._open(req, data)
  File "C:\Users\RS001\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 543, in _open
    '_open', req)
  File "C:\Users\RS001\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 503, in _call_chain
    result = func(*args)
  File "C:\Users\RS001\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 1362, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "C:\Users\RS001\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 1321, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [WinError 10061] 由于目标计算机积极拒绝，无法连接。>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:/pythonProject/test0000.py", line 6, in <module>
    (x_train, y_train), (x_test, y_test) = mnist.load_data()
  File "D:\pythonProject\venv\lib\site-packages\tensorflow\python\keras\datasets\mnist.py", line 75, in load_data
    '731c5ac602752760c8e48fbffcf8c3b850d9dc2a2aedcf2cc48468fc17b673d1')
  File "D:\pythonProject\venv\lib\site-packages\tensorflow\python\keras\utils\data_utils.py", line 262, in get_file
    raise Exception(error_msg.format(origin, e.errno, e.reason))
Exception: URL fetch failure on https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz: None -- [WinError 10061] 由于目标计算机积极拒绝，无法连接。

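The log shows the whole download attempt and ends with a fetch failure (WinError 10061: the target machine actively refused the connection). The problem was caused by my network environment, as other blog posts also describe, and switching networks fixed it. I was on the education network; connecting through a phone hotspot on China Mobile's network solved it.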
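
If the download keeps failing, another option is to fetch mnist.npz by some other means (a browser, another machine) and read it directly. The sketch below returns the same tuple layout that `mnist.load_data()` is unpacked into in the test code. The helper `load_mnist_npz` is a hypothetical name, and the small random file it is exercised on is only a stand-in for the real archive.

```python
import os
import tempfile

import numpy as np


def load_mnist_npz(path):
    """Read an MNIST-style .npz into ((x_train, y_train), (x_test, y_test))."""
    with np.load(path) as data:
        x_train, y_train = data["x_train"], data["y_train"]
        x_test, y_test = data["x_test"], data["y_test"]
    return (x_train, y_train), (x_test, y_test)


# Stand-in archive with the same keys and dtype, but only a few samples
rng = np.random.default_rng(0)
demo_path = os.path.join(tempfile.mkdtemp(), "mnist_demo.npz")
np.savez(
    demo_path,
    x_train=rng.integers(0, 256, size=(6, 28, 28), dtype=np.uint8),
    y_train=rng.integers(0, 10, size=6, dtype=np.uint8),
    x_test=rng.integers(0, 256, size=(2, 28, 28), dtype=np.uint8),
    y_test=rng.integers(0, 10, size=2, dtype=np.uint8),
)

(x_train, y_train), (x_test, y_test) = load_mnist_npz(demo_path)
print(x_train.shape, y_train.shape, x_test.shape, y_test.shape)
```

With the real file, pass its actual location instead of `demo_path`. Copying the file into the `.keras\datasets` folder from section 4 should also let `mnist.load_data()` find it without a download, since that is where the cached copy normally lives.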

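5. Reading the test data

The mnist.npz file can be read with NumPy. As a quick check, change the extension to .zip and unpack it; the archive contains four files: x_test.npy, x_train.npy, y_train.npy and y_test.npy.

Now read the file with NumPy: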

import numpy as np
mnist_data = np.load('mnist.npz')

print(mnist_data.files)

Output:

['x_test', 'x_train', 'y_train', 'y_test']
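
Renaming the file is not strictly necessary: an .npz file is an ordinary zip archive with one .npy member per array, so the standard `zipfile` module can list its contents as is. A small sketch on a stand-in archive (the file name `demo.npz` and its tiny all-zero arrays are invented for the example):

```python
import os
import tempfile
import zipfile

import numpy as np

# Stand-in for mnist.npz with the same four keys, but tiny arrays
demo_path = os.path.join(tempfile.mkdtemp(), "demo.npz")
np.savez(
    demo_path,
    x_test=np.zeros((2, 28, 28), dtype=np.uint8),
    x_train=np.zeros((3, 28, 28), dtype=np.uint8),
    y_train=np.zeros(3, dtype=np.uint8),
    y_test=np.zeros(2, dtype=np.uint8),
)

# The archive members are the .npy files seen after unpacking
with zipfile.ZipFile(demo_path) as archive:
    members = archive.namelist()
print(members)

# np.load exposes the same arrays by key, without unpacking anything
with np.load(demo_path) as data:
    shapes = {name: data[name].shape for name in data.files}
print(shapes)
```

Applied to the real mnist.npz, the shapes would be expected to match the sample counts from section 2, with 28 x 28 pixels per image.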

Next, inspect the unpacked .npy files:

import numpy as np

mnist_x_test = np.load('x_test.npy')
mnist_y_test = np.load('y_test.npy')
mnist_x_train = np.load('x_train.npy')
mnist_y_train = np.load('y_train.npy')
print("mnist_x_test:",mnist_x_test)
print("mnist_y_test:",mnist_y_test)
print("mnist_x_train:",mnist_x_train)
print("mnist_y_train:",mnist_y_train)

Output:

mnist_x_test: [[[0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  ...
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]]

 [[0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  ...
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]]

 [[0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  ...
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]]

 ...

 [[0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  ...
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]]

 [[0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  ...
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]]

 [[0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  ...
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]]]
mnist_y_test: [7 2 1 ... 4 5 6]
mnist_x_train: [[[0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  ...
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]]

 [[0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  ...
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]]

 [[0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  ...
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]]

 ...

 [[0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  ...
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]]

 [[0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  ...
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]]

 [[0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  ...
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]]]
mnist_y_train: [5 0 4 ... 5 6 8]

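Both the training set and the test set are stored as plain arrays.

Why are there so many zeros? Most likely the printout is simply incomplete: NumPy abbreviates large arrays and shows only their edges, and the edges of an MNIST image are black background pixels. The test code then converts the values to floating point.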
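
The zeros can be checked without printing the whole array. By default NumPy abbreviates arrays with more than 1000 elements and shows only the first and last three entries along each axis. A sketch with a synthetic stack of three images (a bright square on a black background, standing in for real digits):

```python
import numpy as np

# Synthetic stand-in for a few MNIST images: black background, bright centre
images = np.zeros((3, 28, 28), dtype=np.uint8)
images[:, 10:18, 10:18] = 255

# The abbreviated printout only shows the edges, which are all zero
summary = np.array2string(images)
print("255 visible in printout:", "255" in summary)

# Simple statistics reveal the non-zero pixels
print("max value:", images.max())
print("non-zero pixels in first image:", np.count_nonzero(images[0]))
```

Printing a single image, for example `mnist_x_train[0]`, stays below the 1000-element threshold (28 x 28 = 784), so NumPy shows it in full and the digit's non-zero pixels become visible.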

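That conversion is what this part of the test code does:

# Load MNIST (downloaded on first use)
mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
# Convert the samples from integers to floating-point numbers
x_train, x_test = x_train / 255.0, x_test / 255.0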
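
To see what the division does to the data type and the value range, here is a small check on a hand-made uint8 array (the values are illustrative and not taken from MNIST):

```python
import numpy as np

# Pixel values as stored in the .npy files: unsigned 8-bit integers in [0, 255]
pixels = np.array([[0, 51, 255]], dtype=np.uint8)

# Dividing by the Python float 255.0 promotes the result to float64
scaled = pixels / 255.0

print(pixels.dtype, "->", scaled.dtype)
print(scaled.min(), scaled.max())
```

The integers 0 to 255 end up as floats between 0.0 and 1.0. If memory matters, `x_train.astype("float32") / 255.0` gives the same range in half the space.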

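6. Displaying the dataset as images

Other bloggers have already worked through this. The following uses OpenCV to show the first training image and the first test image: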

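import numpy as np
import cv2

# Arrays unpacked from mnist.npz
mnist_x_test = np.load('x_test.npy')
mnist_x_train = np.load('x_train.npy')

# Number of test images and training images
print(len(mnist_x_test))
print(len(mnist_x_train))
# Show the first training image, then the first test image; press any key to continue
cv2.imshow("mnist", mnist_x_train[0])
cv2.waitKey(0)
cv2.imshow("mnist", mnist_x_test[0])
cv2.waitKey(0)
cv2.destroyAllWindows()

Output:

10000
60000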
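
cv2.imshow needs a desktop session. Where none is available, a rough alternative is to print the image as text. The sketch below is not from the original post: `to_ascii` is a made-up helper, the threshold of 128 is an arbitrary choice, and the image is synthetic rather than a real MNIST digit.

```python
import numpy as np


def to_ascii(image, threshold=128):
    """Render a 2-D uint8 image as text: '#' for bright pixels, '.' for dark ones."""
    return "\n".join(
        "".join("#" if value >= threshold else "." for value in row)
        for row in image
    )


# Synthetic 28x28 image with a vertical bar, standing in for mnist_x_train[0]
image = np.zeros((28, 28), dtype=np.uint8)
image[4:24, 13:15] = 255

art = to_ascii(image)
print(art)
```

Passing `mnist_x_train[0]` instead of the synthetic image would print the first training digit in the same way.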
