TensorFlow 2 Quick Start: The MNIST Dataset Explained
Posted by 空中旋转篮球
1. Software and hardware environment
Python 3.7, pycharm-community-2021.1.1, TensorFlow 2.5
2. The MNIST dataset
Download and description page: the MNIST dataset download page
- Training set images: train-images-idx3-ubyte.gz (9.9 MB, 47 MB unpacked, 60,000 samples)
- Training set labels: train-labels-idx1-ubyte.gz (29 KB, 60 KB unpacked, 60,000 labels)
- Test set images: t10k-images-idx3-ubyte.gz (1.6 MB, 7.8 MB unpacked, 10,000 samples)
- Test set labels: t10k-labels-idx1-ubyte.gz (5 KB, 10 KB unpacked, 10,000 labels)
- Note that the format listed here differs from the file that is actually downloaded later.
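The four files listed above use the IDX binary format: a 4-byte header (two zero bytes, a type code, the number of dimensions), then one big-endian 32-bit size per dimension, then the raw data. As a rough illustration, here is a minimal sketch of a reader for the unsigned-byte case. The function name `parse_idx` and the tiny synthetic input are assumptions made for this example; they are not part of the original post or of TensorFlow.

```python
import struct
import numpy as np

def parse_idx(raw: bytes) -> np.ndarray:
    """Decode an uncompressed IDX byte string into a numpy array (hypothetical helper)."""
    # Header: two zero bytes, a type code, and the number of dimensions.
    zero, type_code, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0 or type_code != 0x08:
        raise ValueError("this sketch only handles unsigned-byte IDX data")
    # Each dimension size is a big-endian 32-bit unsigned integer.
    shape = struct.unpack(">" + "I" * ndim, raw[4:4 + 4 * ndim])
    data = np.frombuffer(raw, dtype=np.uint8, offset=4 + 4 * ndim)
    return data.reshape(shape)

# Tiny synthetic "image file": 2 images of 2x3 pixels (not real MNIST data).
fake = struct.pack(">HBBIII", 0, 0x08, 3, 2, 2, 3) + bytes(range(12))
images = parse_idx(fake)
print(images.shape)  # (2, 2, 3)
```

For the real .gz files, the bytes would first have to be decompressed, for example with `gzip.open(path, "rb").read()`.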
3. Test code
Code that downloads the dataset and runs a test:
import tensorflow as tf

# Load MNIST and scale the samples from integers (0-255) to floats (0.0-1.0)
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Stack the layers to build a tf.keras.Sequential model
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])

# Choose an optimizer and a loss function for training
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train and evaluate the model
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test, verbose=2)
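To make the `softmax` activation and the `sparse_categorical_crossentropy` loss a little less opaque, here is a minimal numpy sketch of what they compute for a single sample. This is an illustration only, not the Keras implementation; the function names and the made-up scores are assumptions for this example.

```python
import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()

def sparse_categorical_crossentropy(label, probs):
    # "Sparse" means the label is a plain integer class index, not a one-hot vector.
    return -np.log(probs[label])

logits = np.zeros(10)            # made-up, equal scores for the 10 digit classes
probs = softmax(logits)          # every class gets probability 0.1
loss = sparse_categorical_crossentropy(3, probs)
print(round(float(loss), 4))     # 2.3026, i.e. ln(10)
```

An untrained model that guesses uniformly would therefore start near a loss of 2.3, which gives some scale for the loss values in the training log.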
Output: TensorFlow looks for a GPU first; the CUDA libraries cannot be loaded, so it skips the GPU and runs on the CPU.
2021-06-28 08:27:45.633464: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2021-06-28 08:27:45.633864: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2021-06-28 08:28:03.197821: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library nvcuda.dll
2021-06-28 08:28:03.296392: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GT 730 computeCapability: 3.5
coreClock: 0.9015GHz coreCount: 2 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 37.33GiB/s
2021-06-28 08:28:03.318004: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2021-06-28 08:28:03.321171: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cublas64_11.dll'; dlerror: cublas64_11.dll not found
2021-06-28 08:28:03.324283: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cublasLt64_11.dll'; dlerror: cublasLt64_11.dll not found
2021-06-28 08:28:03.327513: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cufft64_10.dll'; dlerror: cufft64_10.dll not found
2021-06-28 08:28:03.330824: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'curand64_10.dll'; dlerror: curand64_10.dll not found
2021-06-28 08:28:03.334013: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusolver64_11.dll'; dlerror: cusolver64_11.dll not found
2021-06-28 08:28:03.337124: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusparse64_11.dll'; dlerror: cusparse64_11.dll not found
2021-06-28 08:28:03.340231: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudnn64_8.dll'; dlerror: cudnn64_8.dll not found
2021-06-28 08:28:03.340436: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1766] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2021-06-28 08:28:03.361671: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-06-28 08:28:03.376499: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-06-28 08:28:03.377334: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]
2021-06-28 08:28:05.616614: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
Epoch 1/5
1875/1875 [==============================] - 3s 696us/step - loss: 0.2919 - accuracy: 0.9150
Epoch 2/5
1875/1875 [==============================] - 1s 668us/step - loss: 0.1421 - accuracy: 0.9582
Epoch 3/5
1875/1875 [==============================] - 1s 675us/step - loss: 0.1088 - accuracy: 0.9666
Epoch 4/5
1875/1875 [==============================] - 1s 660us/step - loss: 0.0873 - accuracy: 0.9734
Epoch 5/5
1875/1875 [==============================] - 1s 680us/step - loss: 0.0744 - accuracy: 0.9763
313/313 - 0s - loss: 0.0725 - accuracy: 0.9776
Process finished with exit code 0
After this test run it is still not clear what the code actually did.
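Some of the numbers in the log can be reconstructed by hand. The sketch below assumes the Keras default batch size of 32 (the test code does not set `batch_size`) and counts weights plus biases for the two Dense layers; the variable names are made up for this example.

```python
import math

BATCH_SIZE = 32  # assumed: Keras default when batch_size is not passed

# Steps shown in the progress bars: samples divided by batch size, rounded up.
train_steps = math.ceil(60_000 / BATCH_SIZE)
test_steps = math.ceil(10_000 / BATCH_SIZE)
print(train_steps, test_steps)  # 1875 313

# Trainable parameters: Flatten turns 28x28 into 784 inputs.
dense1 = 784 * 128 + 128   # weights + biases of Dense(128)
dense2 = 128 * 10 + 10     # weights + biases of Dense(10)
print(dense1, dense2, dense1 + dense2)  # 100480 1290 101770
```

So "1875/1875" means 1875 batches of 32 training images per epoch, and "313/313" is the number of batches used to evaluate the 10,000 test images.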
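4. Inspecting the test data
The dataset is stored under the following path:
C:\Users\RS001(administrator)\.keras\datasets
I had downloaded it before, so no download was reported. If I delete the file and run the script again, the data is downloaded again:
The run goes as follows: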
2021-06-28 08:32:38.806450: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2021-06-28 08:32:38.806683: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
Traceback (most recent call last):
File "C:\\Users\\RS001\\AppData\\Local\\Programs\\Python\\Python37\\lib\\urllib\\request.py", line 1319, in do_open
encode_chunked=req.has_header('Transfer-encoding'))
File "C:\\Users\\RS001\\AppData\\Local\\Programs\\Python\\Python37\\lib\\http\\client.py", line 1252, in request
self._send_request(method, url, body, headers, encode_chunked)
File "C:\\Users\\RS001\\AppData\\Local\\Programs\\Python\\Python37\\lib\\http\\client.py", line 1298, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "C:\\Users\\RS001\\AppData\\Local\\Programs\\Python\\Python37\\lib\\http\\client.py", line 1247, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "C:\\Users\\RS001\\AppData\\Local\\Programs\\Python\\Python37\\lib\\http\\client.py", line 1026, in _send_output
self.send(msg)
File "C:\\Users\\RS001\\AppData\\Local\\Programs\\Python\\Python37\\lib\\http\\client.py", line 966, in send
self.connect()
File "C:\\Users\\RS001\\AppData\\Local\\Programs\\Python\\Python37\\lib\\http\\client.py", line 1414, in connect
super().connect()
File "C:\\Users\\RS001\\AppData\\Local\\Programs\\Python\\Python37\\lib\\http\\client.py", line 938, in connect
(self.host,self.port), self.timeout, self.source_address)
File "C:\\Users\\RS001\\AppData\\Local\\Programs\\Python\\Python37\\lib\\socket.py", line 728, in create_connection
raise err
File "C:\\Users\\RS001\\AppData\\Local\\Programs\\Python\\Python37\\lib\\socket.py", line 716, in create_connection
sock.connect(sa)
ConnectionRefusedError: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\\pythonProject\\venv\\lib\\site-packages\\tensorflow\\python\\keras\\utils\\data_utils.py", line 258, in get_file
urlretrieve(origin, fpath, dl_progress)
File "C:\\Users\\RS001\\AppData\\Local\\Programs\\Python\\Python37\\lib\\urllib\\request.py", line 247, in urlretrieve
with contextlib.closing(urlopen(url, data)) as fp:
File "C:\\Users\\RS001\\AppData\\Local\\Programs\\Python\\Python37\\lib\\urllib\\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "C:\\Users\\RS001\\AppData\\Local\\Programs\\Python\\Python37\\lib\\urllib\\request.py", line 525, in open
response = self._open(req, data)
File "C:\\Users\\RS001\\AppData\\Local\\Programs\\Python\\Python37\\lib\\urllib\\request.py", line 543, in _open
'_open', req)
File "C:\\Users\\RS001\\AppData\\Local\\Programs\\Python\\Python37\\lib\\urllib\\request.py", line 503, in _call_chain
result = func(*args)
File "C:\\Users\\RS001\\AppData\\Local\\Programs\\Python\\Python37\\lib\\urllib\\request.py", line 1362, in https_open
context=self._context, check_hostname=self._check_hostname)
File "C:\\Users\\RS001\\AppData\\Local\\Programs\\Python\\Python37\\lib\\urllib\\request.py", line 1321, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [WinError 10061] 由于目标计算机积极拒绝,无法连接。>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:/pythonProject/test0000.py", line 6, in <module>
(x_train, y_train), (x_test, y_test) = mnist.load_data()
File "D:\\pythonProject\\venv\\lib\\site-packages\\tensorflow\\python\\keras\\datasets\\mnist.py", line 75, in load_data
'731c5ac602752760c8e48fbffcf8c3b850d9dc2a2aedcf2cc48468fc17b673d1')
File "D:\\pythonProject\\venv\\lib\\site-packages\\tensorflow\\python\\keras\\utils\\data_utils.py", line 262, in get_file
raise Exception(error_msg.format(origin, e.errno, e.reason))
Exception: URL fetch failure on https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz: None -- [WinError 10061] 由于目标计算机积极拒绝,无法连接。
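The output shows the whole download attempt and reports that the download failed. The problem was caused by my network environment, as other blog posts describe, and switching networks fixed it. I was on the education network; connecting through a China Mobile hotspot shared from my phone solved it.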
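If the download keeps failing, one workaround is to obtain mnist.npz some other way and read it directly with numpy. The following is a minimal sketch under that assumption; `load_mnist_npz` is a hypothetical helper written for this example, and the stand-in archive only mimics the four keys of the real file so that the code runs without network access.

```python
import os
import tempfile
import numpy as np

def load_mnist_npz(path):
    """Read the four arrays from an mnist.npz-style archive (hypothetical helper)."""
    with np.load(path) as f:
        return (f["x_train"], f["y_train"]), (f["x_test"], f["y_test"])

# Stand-in archive with the same keys but tiny all-zero contents. Not real MNIST data.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "mnist.npz")
np.savez(demo_path,
         x_train=np.zeros((6, 28, 28), dtype=np.uint8),
         y_train=np.arange(6, dtype=np.uint8),
         x_test=np.zeros((2, 28, 28), dtype=np.uint8),
         y_test=np.arange(2, dtype=np.uint8))

(x_train, y_train), (x_test, y_test) = load_mnist_npz(demo_path)
print(x_train.shape, x_test.shape)  # (6, 28, 28) (2, 28, 28)
```

With a real copy of the file, the same function could be pointed at its location. Placing the file in the `.keras\datasets` directory from section 4 should also let `mnist.load_data()` find it without downloading, since that is where the cached copy lives.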
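5. Reading the test data
The mnist.npz file can be read with numpy. As a quick check, rename the extension to .zip and unpack it: the archive contains four .npy files (x_test.npy, x_train.npy, y_train.npy, y_test.npy).
Now read the archive with numpy: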
import numpy as np
mnist_data = np.load('mnist.npz')
print(mnist_data.files)
Output:
['x_test', 'x_train', 'y_train', 'y_test']
Next, look at the unpacked .npy files:
import numpy as np
mnist_x_test = np.load('x_test.npy')
mnist_y_test = np.load('y_test.npy')
mnist_x_train = np.load('x_train.npy')
mnist_y_train = np.load('y_train.npy')
print("mnist_x_test:",mnist_x_test)
print("mnist_y_test:",mnist_y_test)
print("mnist_x_train:",mnist_x_train)
print("mnist_y_train:",mnist_y_train)
mnist_x_test: [[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]]
[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]]
[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]]
...
[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]]
[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]]
[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]]]
mnist_y_test: [7 2 1 ... 4 5 6]
mnist_x_train: [[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]]
[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]]
[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]]
...
[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]]
[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]]
[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]]]
mnist_y_train: [5 0 4 ... 5 6 8]
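Both the test set and the training set are in fact stored as arrays.
The many zeros appear because the printout is truncated: numpy shows only the first and last few entries along each axis, and the edge pixels of these images are 0. The test code then converts the values to floats.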
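A quick way to confirm that the arrays are not really all zeros is to look at shape, dtype, and value range instead of the raw printout. The sketch below uses a small synthetic batch (a black border with a bright block in the middle) as a stand-in for the real images; the array `batch` and its contents are assumptions for this example.

```python
import numpy as np

# Synthetic stand-in for a few MNIST images: black border, bright block in the middle.
batch = np.zeros((3, 28, 28), dtype=np.uint8)
batch[:, 8:20, 10:18] = 255

print(batch)                     # summarized printout: only edge values, all 0
print(batch.shape, batch.dtype)  # (3, 28, 28) uint8
print(batch.min(), batch.max())  # 0 255
print(np.count_nonzero(batch))   # 288
```

Running the same checks on `mnist_x_train` should show a uint8 array whose maximum is well above 0. If the full contents are really wanted, `np.set_printoptions(threshold=sys.maxsize)` disables the summarizing.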
This is what the following part of the test code does:
# Scale the samples from integers (0-255) to floats (0.0-1.0)
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
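The effect of dividing by 255.0 can be seen on a handful of values. This is a small sketch with made-up pixel values rather than the real dataset:

```python
import numpy as np

pixels = np.array([[0, 128, 255]], dtype=np.uint8)  # made-up pixel values
scaled = pixels / 255.0

print(pixels.dtype, scaled.dtype)  # uint8 float64
print(scaled.min(), scaled.max())  # 0.0 1.0
```

The division both changes the dtype from integer to float and rescales the range to [0, 1], which is usually a friendlier input range for training.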
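6. Displaying the dataset as images
Other bloggers have worked through this. The following code prints the size of each set and shows the first training and test images with OpenCV: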
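import numpy as np
import cv2

mnist_x_test = np.load('x_test.npy')
mnist_x_train = np.load('x_train.npy')

# Print the number of images in each set
print(len(mnist_x_test))
print(len(mnist_x_train))

# Show the first training image, then the first test image; press any key to continue
cv2.imshow("mnist", mnist_x_train[0])
cv2.waitKey(0)
cv2.imshow("mnist", mnist_x_test[0])
cv2.waitKey(0)
cv2.destroyAllWindows()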
10000
60000
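On a machine without a display (or without OpenCV installed), a rough text rendering can stand in for `cv2.imshow`. The following is a minimal sketch, not the approach used in the post; `to_ascii` and its `threshold` parameter are hypothetical names, and the 5x5 array is a toy stand-in for a digit image.

```python
import numpy as np

def to_ascii(img, threshold=128):
    """Render a 2-D uint8 image as text: '#' for bright pixels, '.' for dark ones."""
    return "\n".join(
        "".join("#" if v >= threshold else "." for v in row)
        for row in img
    )

# Toy image: a vertical bright stroke on a black background.
demo = np.zeros((5, 5), dtype=np.uint8)
demo[:, 2] = 255
print(to_ascii(demo))
```

Calling `to_ascii(mnist_x_train[0])` on the real data should print a 28-line outline of the first training digit.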