Processing Scanner Output: imagemagick_resize.py

Posted 2020-09-28

Series of articles on advanced parameter configuration for scanning books: http://www.cnblogs.com/whycnblogs/category/1036599.html

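Purpose:

Batch-resize scanned images to a specified width and height in pixels (pads where the image is too small, crops where it is too large, centers the result automatically, and keeps the original aspect ratio so nothing is distorted)
Strip the images' EXIF data (including resolution, DPI, and so on) to prevent problems during later ABBYY recognition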
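
The resize-and-center step can be illustrated with a little arithmetic. The sketch below is not part of the original script; `fit_and_center` is a hypothetical helper that approximates what a fit-inside resize followed by centering on a fixed canvas produces. ImageMagick's own rounding may differ by a pixel.

```python
def fit_and_center(src_w, src_h, box_w, box_h):
    """Scale (src_w, src_h) to fit inside the box, then center it.

    Returns (new_w, new_h, left, top): the scaled size and the offset of
    the scaled image on the box_w x box_h canvas.
    """
    scale = min(box_w / src_w, box_h / src_h)  # one factor for both axes keeps the aspect ratio
    new_w = max(1, round(src_w * scale))
    new_h = max(1, round(src_h * scale))
    left = (box_w - new_w) // 2  # padding added on the left edge
    top = (box_h - new_h) // 2   # padding added on the top edge
    return new_w, new_h, left, top


# A square 3000x3000 scan placed on a 4961x7016 canvas:
# it is scaled up to the canvas width and padded above and below.
print(fit_and_center(3000, 3000, 4961, 7016))
```

Because a single scale factor is used for both axes, the image is never stretched; any leftover space becomes padding split evenly between opposite edges.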

Requirements:

python3
The python3 pip module Pillow (the maintained fork of PIL, imported as PIL)
imagemagick
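
Before running the script it may help to confirm that these dependencies are actually reachable. The following is a minimal sketch and not part of the original script; `check_environment` is a hypothetical helper, and it assumes ImageMagick is exposed on the PATH as `convert`, which is the command the script calls.

```python
import importlib.util
import shutil


def check_environment():
    """Report which of the script's dependencies can be found."""
    return {
        # Pillow installs itself under the import name "PIL"
        'Pillow': importlib.util.find_spec('PIL') is not None,
        # the script shells out to ImageMagick's `convert` command
        'convert': shutil.which('convert') is not None,
    }


for name, found in check_environment().items():
    print('%-8s %s' % (name, 'found' if found else 'MISSING'))
```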

# -*- coding: utf-8 -*-
# version: python 3
# ==========
# brew reinstall imagemagick
# pip3 install Pillow
# ==========
import sys, os
from PIL import Image

path = '/Users/osx/Desktop/test'  # directory to process [edit this]
suffix = 'jpg'  # extension of the images to process in that directory [edit this]

out_path = os.path.join(path, 'out')  # output directory
# see: http://www.a4papersize.org/a4-paper-size-in-pixels.php (A4 Dimensions @ 600 DPI)
width = 4961  # output width (pixels)
height = 7016  # output height (pixels)
brightness = -10  # brightness (0 means leave unchanged)
contrast = 20  # contrast (0 means leave unchanged)

if os.path.exists(out_path):
    print('The output directory already exists; move it away and run the program again!')
    sys.exit()

# the directory cannot exist at this point, so create it
os.makedirs(out_path)


def get_file_list():
    # system files that should never be treated as images
    exclude = ['.DS_Store', '.localized', 'Thumbs.db', 'desktop.ini']
    result_list = []
    if os.path.isfile(path):
        result_list.append(path)
    else:
        for dir_path, dir_names, file_names in os.walk(path):
            if os.path.abspath(dir_path) != os.path.abspath(path):  # top level only, skip subdirectories
                continue
            for name in file_names:
                if os.path.basename(name) not in exclude:
                    result_list.append(os.path.join(dir_path, name))
    return result_list


def parse_image(in_image_file, out_image_file):
    # ----- strip EXIF data -----
    # copy only the pixel data into a fresh image so no metadata is carried over
    with Image.open(in_image_file) as image:
        data = list(image.getdata())
        image_without_exif = Image.new(image.mode, image.size)
        image_without_exif.putdata(data)
        image_without_exif.save(out_image_file)

    # ----- process the image on the command line -----
    # the file names are double-quoted so that paths containing spaces still work
    shell = 'convert -resize %sx%s -gravity center -extent %sx%s -brightness-contrast %sx%s "%s" "%s"' \
            % (width, height, width, height, brightness, contrast, out_image_file, out_image_file)  # overwrites the image in place

    # print(shell)
    os.system(shell)


count = 0
file_list = get_file_list()
for tar in file_list:
    tar = os.path.abspath(tar)
    if os.path.splitext(tar)[1][1:] == suffix:  # only handle files with the configured extension
        count += 1
        tar_name = os.path.basename(tar)
        tar_out = os.path.join(out_path, tar_name)
        print('%s  %s' % (count, tar_name))
        parse_image(tar, tar_out)  # process the image

print('----------')
print('Total processed: %s' % (count))
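
The `width` and `height` values in the script correspond to A4 paper (210 mm x 297 mm) at 600 DPI. For a different paper size or resolution, the pixel dimensions can be derived as sketched below. `paper_size_px` is a hypothetical helper and not part of the original script.

```python
MM_PER_INCH = 25.4


def paper_size_px(width_mm, height_mm, dpi):
    """Convert a paper size in millimetres to whole pixels at the given DPI."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px, height_px


print(paper_size_px(210, 297, 600))  # A4 at 600 DPI -> (4961, 7016)
print(paper_size_px(210, 297, 300))  # A4 at 300 DPI -> (2480, 3508)
```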
