Classifying non-MNIST image after learning MNIST
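【Posted】: 2017-12-02 15:13:29 【Question】: My machine learning algorithm has learned the 70,000 images in the MNIST database. I want to test it on an image that is not part of the MNIST dataset. However, my predict function cannot read the array representation of my test image.
How can I test my algorithm on an external image? Why does my code fail?
P.S. I am using Python 3.
Error received: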
Traceback (most recent call last):
  File "hello_world2.py", line 28, in <module>
    print(sgd_clf.predict(arr))
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/sklearn/linear_model/base.py", line 336, in predict
    scores = self.decision_function(X)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/sklearn/linear_model/base.py", line 317, in decision_function
    % (X.shape[1], n_features))
ValueError: X has 15 features per sample; expecting 784
Code:
# Common Imports
import numpy as np
from sklearn.datasets import fetch_mldata
from sklearn.linear_model import SGDClassifier
from PIL import Image
from resizeimage import resizeimage
# loading and learning MNIST data
mnist = fetch_mldata('MNIST original')
x, y = mnist["data"], mnist["target"]
sgd_clf = SGDClassifier(random_state=42)
sgd_clf.fit(x, y)
# loading and converting to array a non-MNIST image of a "5", which is in the same folder
img = Image.open("5.png")
arr = np.array(img)
# trying to predict that the image is a "5"
img = Image.open("5.png")
img = img.convert('L') #makes it greyscale
img = resizeimage.resize_thumbnail(img, [28,28])
arr = np.array(img)
print(sgd_clf.predict(arr)) # ERROR... why????????? How do you fix it?????
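【Comments】:
The image has to be resized. MNIST images are 28x28.
Also, your image appears to have 3 channels. You have to convert it to greyscale.
How do I resize the MNIST image? (Note: see the original code for edits. Thanks.)
This example may help: github.com/niektemme/tensorflow-mnist-predict/blob/master/…
【Answer 1】: Resizing alone is not enough. The digit also has to be centered, drawn white on a black background, and so on. I have been working on a function that does this. Here is the current version, which uses OpenCV. It could be improved further, for example by using PIL instead of OpenCV, but it should give an idea of the steps required.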
import cv2
import numpy as np


def open_as_mnist(image_path):
    """
    Assume this is a color or greyscale image of a digit that has not been preprocessed yet.

    Steps:
        Open as greyscale
        Crop to the digit
        Resize to 20 x 20 (digit in center ideally)
        Blur, then threshold to a white digit on a black background
        Add a black border to make it 28 x 28
        Scale to [0, 1] and flatten
    """
    # open as greyscale
    image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    # crop to contour with largest area (do_cropping is a helper not shown in this answer)
    cropped = do_cropping(image)
    # resizing the image to 20 x 20
    resized20 = cv2.resize(cropped, (20, 20), interpolation=cv2.INTER_CUBIC)
    # debug output of the intermediate image
    cv2.imwrite('1_resized.jpg', resized20)
    # gaussian filtering
    blurred = cv2.GaussianBlur(resized20, (3, 3), 0)
    # white digit on black background
    ret, thresh = cv2.threshold(blurred, 127, 255, cv2.THRESH_BINARY_INV)
    # to20by20 is a helper not shown in this answer
    padded = to20by20(thresh)
    # pad with black up to 28 x 28
    resized28 = padded_image(padded, 28)
    # normalize the image values to the range [0, 1]: 0 is black background, 1 is white digit.
    # If the classifier was trained on raw 0-255 pixel values, as in the question, skip the division.
    norm_image = np.asarray(resized28, dtype=np.float32) / 255.
    # cv2.imshow('image', norm_image)
    # cv2.waitKey(0)
    # Flatten to a single row (1 sample, 784 features) and return.
    # THRESH_BINARY_INV already inverted the image, so it is not inverted a second time here.
    return norm_image.reshape(1, 28 * 28)


def padded_image(image, tosize):
    """
    Pad the image to a tosize x tosize array without changing the aspect ratio.
    Assumes the desired image is square and the input is no larger than tosize.

    :param image: the input image as a 2-D numpy array
    :param tosize: the final width and height
    """
    # image dimensions
    image_height, image_width = image.shape
    # if not already square then pad the shorter side to make it square
    if image_height != image_width:
        # find which of height and width is the larger one
        max_index = np.argmax([image_height, image_width])
        # if height is larger, pad the width until it equals the height
        if max_index == 0:
            # copyMakeBorder takes padding in the order: top, bottom, left, right
            left = (image_height - image_width) // 2
            right = image_height - image_width - left
            padded_img = cv2.copyMakeBorder(image, 0, 0, left, right,
                                            cv2.BORDER_CONSTANT, value=0)
        # else width is larger, so pad the height until it equals the width
        else:
            top = (image_width - image_height) // 2
            bottom = image_width - image_height - top
            padded_img = cv2.copyMakeBorder(image, top, bottom, 0, 0,
                                            cv2.BORDER_CONSTANT, value=0)
    else:
        padded_img = image
    # now that it's a square, add any additional padding required
    image_height, image_width = padded_img.shape
    padding = tosize - image_height
    # when padding is not divisible by 2, the extra pixel goes to the right and bottom
    left = top = padding // 2
    right = bottom = padding - left
    # pad the squared image (padded_img), not the original input
    resized = cv2.copyMakeBorder(padded_img, top, bottom, left, right,
                                 cv2.BORDER_CONSTANT, value=0)
    return resized
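
The answer calls do_cropping and to20by20 without showing them, so their exact behavior is unknown. As a rough illustration of the "crop to the digit, then center it" idea only, here is a NumPy-only sketch. The name center_on_canvas and its parameters are hypothetical, it uses a plain bounding box rather than the OpenCV contour search the answer mentions, and it assumes the input is already a white digit on a black background that fits inside the canvas.

```python
import numpy as np


def center_on_canvas(digit, size=28, threshold=0):
    """Crop a white-on-black digit to its bounding box and center it on a
    size x size black canvas. Hypothetical helper, not from the answer."""
    digit = np.asarray(digit)
    # Rows and columns that contain at least one "ink" pixel.
    rows = np.any(digit > threshold, axis=1)
    cols = np.any(digit > threshold, axis=0)
    if not rows.any():
        # Blank input: nothing to crop, return an empty canvas.
        return np.zeros((size, size), dtype=digit.dtype)
    top, bottom = np.where(rows)[0][[0, -1]]
    left, right = np.where(cols)[0][[0, -1]]
    cropped = digit[top:bottom + 1, left:right + 1]
    height, width = cropped.shape
    if height > size or width > size:
        raise ValueError("cropped digit is larger than the canvas; resize it first")
    # Split the padding as evenly as possible; any odd pixel goes bottom/right.
    pad_top = (size - height) // 2
    pad_left = (size - width) // 2
    canvas = np.zeros((size, size), dtype=digit.dtype)
    canvas[pad_top:pad_top + height, pad_left:pad_left + width] = cropped
    return canvas


# A 3 x 2 blob of ink in the top-left corner of a 10 x 10 image.
sample = np.zeros((10, 10), dtype=np.uint8)
sample[0:3, 0:2] = 255
centered = center_on_canvas(sample)
print(centered.shape)
```

Centering by bounding box is the simplest choice. Centering by center of mass may match MNIST more closely, but that is not something the answer states.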
【Discussion】:
【Answer 2】: If you want to read an image and then resize it, try:
In [1]: import PIL.Image as Image
In [2]: img = Image.open('2.jpg', mode='r')
In [3]: img.mode
Out[3]: 'RGB'
In [4]: img.size
Out[4]: (2880, 1800)
In [5]: img_new = img.resize((4000, 4000), Image.LANCZOS)  # Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter
In [6]: img_new2 = img.resize((32, 32), Image.LANCZOS)
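See the PIL documentation for Image.resize for details.
Note that 2.jpg here is not a digit; it is just a picture from the web (sorry, I forgot the source).
If the image mode is 'RGBA', I suggest converting it to 'RGB' mode: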
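def rgba_to_rgb(img):  # function name chosen here for illustration
    # Image.new fills with black by default; pass color=(255, 255, 255) for a white background
    newimg = Image.new('RGB', img.size)
    newimg.paste(img, mask=img.split()[3])  # use the alpha channel as the paste mask
    return newimg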
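
To make the effect of that paste step concrete, the sketch below does the same kind of alpha compositing directly in NumPy and then reduces the result to greyscale. It is an illustration under assumptions, not part of the answer: the function name rgba_to_grey_on_white is hypothetical, the background is chosen to be white (the PIL snippet above composites onto black unless a color is passed), and the greyscale weights are the ITU-R 601 luma coefficients that PIL documents for convert('L').

```python
import numpy as np


def rgba_to_grey_on_white(rgba):
    """Composite an (H, W, 4) uint8 RGBA array over a white background and
    return an (H, W) uint8 greyscale array. Hypothetical helper."""
    rgba = np.asarray(rgba, dtype=np.float64)
    rgb = rgba[..., :3]
    alpha = rgba[..., 3:4] / 255.0  # keep the last axis so it broadcasts over RGB
    # Standard "over" compositing against a constant white background.
    composited = rgb * alpha + 255.0 * (1.0 - alpha)
    # ITU-R 601 luma weights for R, G, B.
    grey = composited @ np.array([0.299, 0.587, 0.114])
    return np.rint(grey).astype(np.uint8)


# Three pixels: opaque black, fully transparent, opaque white.
pixels = np.array([[[0, 0, 0, 255], [0, 0, 0, 0], [255, 255, 255, 255]]], dtype=np.uint8)
grey = rgba_to_grey_on_white(pixels)
print(grey)  # black stays 0; transparent and white both become 255
```

With a transparent PNG of a dark digit, this yields a dark digit on a white background, which still has to be inverted to match MNIST's white-on-black convention.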
【Discussion】:
【Answer 3】: Please try this:
img = Image.open("5.png")
img = img.resize((28,28))
img = img.convert('L') #makes it greyscale
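【Discussion】:
Tried it. Made progress. But I still need to resize the image. I am confused about how to resize the MNIST image. (Note: see the code above for recent improvements. Thanks.)
I updated my answer. It may be because you resized the image after converting it to grey, which added 3 layers to the image again. Also, you don't need another library to resize.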
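
The ValueError in the question comes down to array shape. scikit-learn reads a 2-D array as (n_samples, n_features), so a 28 x 15 thumbnail is taken as 28 samples with 15 features each, while the classifier was fitted on rows of 784 features. The sketch below uses a zero-filled stand-in array instead of a real image (an assumption made so that it runs without any image file) and shows one way to pad it to 28 x 28 and flatten it to a single row.

```python
import numpy as np

n_features = 28 * 28  # 784, the number of pixels in one MNIST image

# Stand-in for np.array(img) after a thumbnail resize that kept the aspect
# ratio: 28 rows by 15 columns (hypothetical pixel values).
thumbnail = np.zeros((28, 15), dtype=np.uint8)
print(thumbnail.shape)  # (28, 15): read as 28 samples with 15 features each

# Pad the width symmetrically with black (0) columns to reach 28 x 28.
missing = 28 - thumbnail.shape[1]
left = missing // 2
right = missing - left
square = np.pad(thumbnail, ((0, 0), (left, right)), mode="constant", constant_values=0)

# One sample, 784 features: the shape predict() expects.
sample = square.reshape(1, -1)
print(sample.shape)  # (1, 784)
```

An array shaped like sample can then be passed to sgd_clf.predict. Whether the prediction is correct still depends on the centering, polarity and pixel scale points raised in the answers above.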