Digit recognition in Python
Posted michellel.top
First: install the PIL dependency (provided by the Pillow package)
pip install pillow
pip3 install pillow
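Next: install Tesseract. Tesseract is a standalone OCR engine rather than a Python library, so install it with the system package manager instead of pip, for example on Debian/Ubuntu:
sudo apt install tesseract-ocr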
Then: install pytesseract, the Python wrapper that calls Tesseract
pip install pytesseract
pip3 install pytesseract
Finally: install the Tesseract language data (tesseract-data)
Configuration: environment variables
vim ~/.bash_profile
Add the following environment variables, adjusting the value of TESSDATA_PREFIX to match where Tesseract is installed
export TESSDATA_PREFIX="/usr/share/tessdata"
export PATH=$PATH:$TESSDATA_PREFIX
Make the environment variables take effect
source ~/.bash_profile
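Before running any OCR it can help to confirm that the settings above are visible to Python. The following is a minimal sketch, not part of the original setup: the function name check_tesseract_setup is hypothetical, and it only checks that a tesseract executable is on PATH and that TESSDATA_PREFIX points at an existing directory.

```python
import os
import shutil


def check_tesseract_setup(env=None):
    """Report where the tesseract binary and the tessdata directory are found.

    `env` is a mapping of environment variables; it defaults to os.environ.
    (Hypothetical helper, for illustration only.)
    """
    env = os.environ if env is None else env
    prefix = env.get("TESSDATA_PREFIX")
    return {
        # Full path of the tesseract executable, or None if it is not on PATH
        "tesseract_cmd": shutil.which("tesseract", path=env.get("PATH")),
        "tessdata_prefix": prefix,
        # True only when TESSDATA_PREFIX is set and names a real directory
        "tessdata_dir_exists": bool(prefix) and os.path.isdir(prefix),
    }


if __name__ == "__main__":
    for key, value in check_tesseract_setup().items():
        print(f"{key}: {value}")
```

If tesseract_cmd prints as None, the executable path set in the example script will probably need adjusting as well.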
Example:
import pytesseract
from PIL import Image

# Path to the tesseract executable; adjust it to your installation
pytesseract.pytesseract.tesseract_cmd = '/usr/bin/tesseract'

# Open the image and let Tesseract return the recognized text
image = Image.open("/home/nication/python/code.jpg")
code = pytesseract.image_to_string(image)
print(code)
python testp.py
6067
The result is correct.
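When the image is known to contain only digits, Tesseract can be asked to restrict what it returns. The sketch below is one way to do that, under these assumptions: the installed Tesseract honors the --psm option and the tessedit_char_whitelist variable (some 4.0 builds using the LSTM engine are reported to ignore the whitelist), and the image holds a single line of text. The helper names build_digit_config, keep_digits and read_digits are hypothetical.

```python
import re


def build_digit_config(psm=7):
    """Build a Tesseract config string that limits output to the digits 0-9.

    --psm 7 asks Tesseract to treat the image as a single line of text.
    """
    return f"--psm {psm} -c tessedit_char_whitelist=0123456789"


def keep_digits(text):
    """Drop every non-digit character (spaces, newlines, stray symbols)."""
    return re.sub(r"\D", "", text)


def read_digits(path, psm=7):
    """Run OCR on the image at `path` and return only the digits found."""
    # Imported here so the two helpers above work without OCR installed
    import pytesseract
    from PIL import Image

    with Image.open(path) as image:
        raw = pytesseract.image_to_string(image, config=build_digit_config(psm))
    return keep_digits(raw)


if __name__ == "__main__":
    # Post-processing alone, on a made-up OCR result
    print(keep_digits("60 67\n"))
```

Filtering the output with keep_digits is a fallback for versions where the whitelist has no effect; it cannot repair a digit that was misread as a letter.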
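Implementing KNN in Python to recognize handwritten digits
I wrote a KNN algorithm that recognizes handwritten digits; the code is shown below. Reference: http://blog.csdn.net/april_newnew/article/details/44176059.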
# -*- coding: utf-8 -*-
import os

import numpy as np
import pandas as pd


def readtxt(filename):
    # Read one sample file; each line of the file is converted to one float
    with open(filename, 'r', encoding='utf-8') as f:
        text = f.readlines()
    txt = np.array(text, dtype='float')
    return txt.tolist()


def readdata(rootfile):
    # Load every file under rootfile; the label is the file name part before '_'
    data = []
    label = []
    for root, dirs, files in os.walk(rootfile):
        for name in files:
            filename = os.path.join(root, name)
            txt = readtxt(filename)
            data.append(txt)
            label1 = name.split('_')[0]
            label.append(label1)
    data = pd.DataFrame(data)
    return data, label


def KNN(traindata, trainlabel, testdatai, K):
    # Repeat the test sample so it can be subtracted from every training row
    length = len(traindata)
    newtest = np.tile(testdatai, (length, 1))
    newtest = pd.DataFrame(newtest)
    # Euclidean distance from the test sample to each training sample
    diff = newtest - traindata
    diff = diff**2
    cha = diff.sum(axis=1)
    cha = cha**0.5
    result = pd.DataFrame({'label': trainlabel, 'cha': cha})
    # Keep the K nearest neighbours and return the most frequent label
    labels = result.sort_values(by='cha')[:K]
    frequent = labels.groupby(labels['label']).size()
    # idxmax returns the label itself; argmax returns a position in recent pandas
    labely = frequent.idxmax()
    return labely


def test(trainfile, testfile, K):
    result = []
    traindata, trainlabel = readdata(trainfile)
    testdata, testlabel = readdata(testfile)
    for i in range(len(testdata)):
        labely = KNN(traindata, trainlabel, testdata.loc[i, :], K)
        result.append(labely)
    # Compare predictions with the true labels to get the accuracy
    tongji = pd.DataFrame({'result': result, 'testlabel': testlabel})
    accuracy = len(tongji[tongji['result'] == tongji['testlabel']]) / len(result)
    return result, accuracy


trainfile = r'E:\trainingDigits'
testfile = r'E:\testDigits'
K = 3
result, accuracy = test(trainfile, testfile, K)
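Note: the training set has 2,210 records and the test set has 670. The accuracy is not high, only 0.45. I do not yet know why; I will keep studying and try to improve the code.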
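One possible cause of the low accuracy, offered as an assumption rather than a confirmed diagnosis: if each sample file stores the digit as a grid of 0/1 characters (for instance 32 rows of 32 characters), then readtxt turns every row into a single large number instead of one value per pixel, and the distances are dominated by the leftmost set pixel of each row. The sketch below shows the difference on a made-up 2x4 grid; parse_digit_grid is a hypothetical helper, not part of the original code.

```python
import numpy as np


def parse_digit_grid(lines):
    """Turn lines of '0'/'1' characters into one flat vector of pixel values."""
    pixels = []
    for line in lines:
        row = line.strip()
        if row:  # skip blank lines
            pixels.extend(int(ch) for ch in row)
    return np.array(pixels, dtype=float)


# A made-up 2x4 "image" standing in for a real sample file
sample = ["0110\n", "1001\n"]

# Per-character parsing: one feature per pixel (8 features here)
per_pixel = parse_digit_grid(sample)
print(per_pixel)

# Per-line parsing, as readtxt does: one feature per row (2 features here),
# where "0110" becomes the number 110 and "1001" becomes 1001
per_row = np.array([line.strip() for line in sample], dtype=float)
print(per_row)
```

If the files do have this layout, replacing the float conversion in readtxt with per-character parsing would give KNN one feature per pixel; whether that explains the 0.45 accuracy would need to be checked on the actual data.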