How to check dimensions of all images in a directory using python?
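【Title】: How to check dimensions of all images in a directory using python?
【Posted】: 2010-12-03 04:28:56
【Question】: I need to check the dimensions of the images in a directory. It currently holds about 700 images. I only need to check the dimensions, and any image whose dimensions do not match a given size should be moved to a different folder. How do I get started?
【Comments on the question】: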
【Answer 1】: If you don't need the rest of PIL and only want the image dimensions of PNG, JPEG and GIF files, this small function (BSD licensed) does the job nicely:
http://code.google.com/p/bfg-pages/source/browse/trunk/pages/getimageinfo.py
import StringIO
import struct

def getImageInfo(data):
    data = str(data)
    size = len(data)
    height = -1
    width = -1
    content_type = ''

    # handle GIFs
    if (size >= 10) and data[:6] in ('GIF87a', 'GIF89a'):
        # Check to see if content_type is correct
        content_type = 'image/gif'
        w, h = struct.unpack("<HH", data[6:10])
        width = int(w)
        height = int(h)

    # See PNG 2. Edition spec (http://www.w3.org/TR/PNG/)
    # Bytes 0-7 are below, 4-byte chunk length, then 'IHDR'
    # and finally the 4-byte width, height
    elif ((size >= 24) and data.startswith('\211PNG\r\n\032\n')
          and (data[12:16] == 'IHDR')):
        content_type = 'image/png'
        w, h = struct.unpack(">LL", data[16:24])
        width = int(w)
        height = int(h)

    # Maybe this is for an older PNG version.
    elif (size >= 16) and data.startswith('\211PNG\r\n\032\n'):
        # Check to see if we have the right content type
        content_type = 'image/png'
        w, h = struct.unpack(">LL", data[8:16])
        width = int(w)
        height = int(h)

    # handle JPEGs
    elif (size >= 2) and data.startswith('\377\330'):
        content_type = 'image/jpeg'
        jpeg = StringIO.StringIO(data)
        jpeg.read(2)
        b = jpeg.read(1)
        try:
            while (b and ord(b) != 0xDA):
                while (ord(b) != 0xFF): b = jpeg.read(1)
                while (ord(b) == 0xFF): b = jpeg.read(1)
                if (ord(b) >= 0xC0 and ord(b) <= 0xC3):
                    jpeg.read(3)
                    h, w = struct.unpack(">HH", jpeg.read(4))
                    break
                else:
                    jpeg.read(int(struct.unpack(">H", jpeg.read(2))[0])-2)
                b = jpeg.read(1)
            width = int(w)
            height = int(h)
        except struct.error:
            pass
        except ValueError:
            pass

    return content_type, width, height
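【Comments】:
This worked like a charm for me; +1 for a solution with no third-party libraries.
What do you call this function with? What is your data?
【Answer 2】: A common approach is to use PIL, the Python Imaging Library, to get the dimensions: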
from PIL import Image
import os.path
filename = os.path.join('path', 'to', 'image', 'file')
img = Image.open(filename)
print(img.size)  # (width, height)
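You then need to loop over the files in the directory, check each image's dimensions against the ones you require, and move the files that do not match.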
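The loop described above could be sketched roughly as follows. This is only an illustration under some assumptions: the names `pil_size` and `move_mismatched` are hypothetical, the size reader is passed in as a parameter so it can be swapped out, and files that cannot be opened as images are simply skipped.

```python
import os
import shutil


def pil_size(path):
    """Return (width, height) via Pillow (assumed to be installed)."""
    from PIL import Image  # imported lazily so the rest works without Pillow
    with Image.open(path) as img:
        return img.size


def move_mismatched(src_dir, dst_dir, expected, get_size=pil_size):
    """Move every file in src_dir whose size differs from `expected` to dst_dir."""
    os.makedirs(dst_dir, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(src_dir)):
        path = os.path.join(src_dir, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories
        try:
            size = tuple(get_size(path))
        except OSError:
            continue  # unreadable or not an image: leave it in place
        if size != tuple(expected):
            shutil.move(path, os.path.join(dst_dir, name))
            moved.append(name)
    return moved
```

A call such as `move_mismatched("photos", "wrong_size", (1920, 1080))` would then leave only the 1920 x 1080 images in `photos`; both folder names are placeholders.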
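【Comments】:
【Answer 3】: You can use the Python Imaging Library (aka PIL) to read the image headers and query the dimensions.
One way to approach it is to write a function that takes a filename and returns the dimensions (using PIL). Then use the os.path.walk function to traverse all the files in the directory, applying this function to each. By collecting the results you can build a dictionary mapping filename -> dimensions, then use a list comprehension (see itertools) to filter out the entries that do not match the required size.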
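A rough sketch of that walk-collect-filter idea is shown here. It uses `os.walk`, because `os.path.walk` was removed in Python 3. The function names `collect_dimensions` and `mismatched` are hypothetical, and `get_size` stands in for whatever function returns `(width, height)` for a path.

```python
import os


def collect_dimensions(root, get_size):
    """Walk `root` recursively and map each file path to its (width, height)."""
    dims = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                dims[path] = tuple(get_size(path))
            except OSError:
                pass  # unreadable or not an image: skip it
    return dims


def mismatched(dims, expected):
    """List the paths whose dimensions differ from `expected`."""
    return [path for path, size in dims.items() if size != expected]
```

Any callable that maps a path to a `(width, height)` pair fits `get_size`, for instance one built on `Image.open(path).size` from PIL.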
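【Comments】:
I did this, but used os.listdir instead; it works fine with ~700 images. Is os.path.walk better?
If os.listdir does what you need, that's fine. The main difference is that os.walk recurses into subdirectories.
【Answer 4】:
Here is a script that does what you need: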
#!/usr/bin/env python

"""
Get information about images in a folder.
"""

from os import listdir
from os.path import isfile, join

from PIL import Image


def print_data(data):
    """
    Parameters
    ----------
    data : dict
    """
    for k, v in data.items():
        print("%s:\t%s" % (k, v))
    print("Min width: %i" % data["min_width"])
    print("Max width: %i" % data["max_width"])
    print("Min height: %i" % data["min_height"])
    print("Max height: %i" % data["max_height"])


def main(path):
    """
    Parameters
    ----------
    path : str
        Path where to look for image files.
    """
    onlyfiles = [f for f in listdir(path) if isfile(join(path, f))]

    # Filter files by extension
    onlyfiles = [f for f in onlyfiles if f.endswith(".jpg")]

    data = {}
    data["images_count"] = len(onlyfiles)
    data["min_width"] = 10 ** 100  # No image will be bigger than that
    data["max_width"] = 0
    data["min_height"] = 10 ** 100  # No image will be bigger than that
    data["max_height"] = 0

    for filename in onlyfiles:
        # join with path so the script also works outside the current directory
        im = Image.open(join(path, filename))
        width, height = im.size
        data["min_width"] = min(width, data["min_width"])
        data["max_width"] = max(width, data["max_width"])
        data["min_height"] = min(height, data["min_height"])
        data["max_height"] = max(height, data["max_height"])

    print_data(data)


if __name__ == "__main__":
    main(path=".")
【Comments】:
【Answer 5】:
import os
from PIL import Image

folder_images = "/tmp/photos"
size_images = dict()

for dirpath, _, filenames in os.walk(folder_images):
    for path_image in filenames:
        image = os.path.abspath(os.path.join(dirpath, path_image))
        with Image.open(image) as img:
            width, height = img.size
            size_images[path_image] = {'width': width, 'height': height}

print(size_images)

folder_images is the directory that contains your images.
size_images is the variable that holds the image sizes, in the following format.
Example:
{'image_name.jpg': {'width': 100, 'height': 100}}
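【Comments】:
While the idea behind your code is good, it lacks explanation. I would also point out that all-caps variable names are usually reserved for constants, so I would not recommend using one for a dictionary, as you do with SIZE_IMAGES.
Sure, I can revise my answer.
【Answer 6】:
You can also use the cv2 library to check the dimensions of an image.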
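import cv2

# read image
img = cv2.imread('boarding_pass.png', cv2.IMREAD_UNCHANGED)
if img is None:
    # cv2.imread returns None instead of raising when the file cannot be read
    raise IOError('could not read boarding_pass.png')

# get dimensions of image
dimensions = img.shape

# height, width, number of channels in image
height = img.shape[0]
width = img.shape[1]
# a grayscale image has a 2-D shape, so there is no third entry
channels = img.shape[2] if img.ndim == 3 else 1

print('Image Dimension    : ', dimensions)
print('Image Height       : ', height)
print('Image Width        : ', width)
print('Number of Channels : ', channels)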
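【Comments】:
【Answer 7】: I am very happy with the answers provided above, as they helped me write another simple answer to this question. Since those answers contain only scripts, readers have to run them to check whether they work. So I decided to solve the problem interactively, in the Python shell. I think this will make it clear. I am using Python 2.7.12 and have installed the Pillow library so that I can use PIL to access the images. My current directory contains many jpg images and 1 png image. Now let's move on to the Python shell.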
>>> #Date of creation : 3 March 2017
>>> #Python version : 2.7.12
>>>
>>> import os #Importing os module
>>> import glob #Importing glob module to list the same type of image files like jpg/png(here)
>>>
>>> for extension in ["jpg", 'png']:
...     print "List of all " + extension + " files in current directory:-"
...     i = 1
...     for imgfile in glob.glob("*."+extension):
...         print i,") ",imgfile
...         i += 1
...     print "\n"
...
List of all jpg files in current directory:-
1 ) 002-tower-babel.jpg
2 ) 1454906.jpg
3 ) 69151278-great-hd-wallpapers.jpg
4 ) amazing-ancient-wallpaper.jpg
5 ) Ancient-Rome.jpg
6 ) babel_full.jpg
7 ) Cuba-is-wonderfull.jpg
8 ) Cute-Polar-Bear-Images-07775.jpg
9 ) Cute-Polar-Bear-Widescreen-Wallpapers-07781.jpg
10 ) Hard-work-without-a-lh.jpg
11 ) jpeg422jfif.jpg
12 ) moscow-park.jpg
13 ) moscow_city_night_winter_58404_1920x1080.jpg
14 ) Photo1569.jpg
15 ) Pineapple-HD-Photos-03691.jpg
16 ) Roman_forum_cropped.jpg
17 ) socrates.jpg
18 ) socrates_statement1.jpg
19 ) steve-jobs.jpg
20 ) The_Great_Wall_of_China_at_Jinshanling-edit.jpg
21 ) torenvanbabel_grt.jpg
22 ) tower_of_babel4.jpg
23 ) valckenborch_babel_1595_grt.jpg
24 ) Wall-of-China-17.jpg
List of all png files in current directory:-
1 ) gergo-hungary.png
>>> #So let's display all the resolutions with the filename
... from PIL import Image #Importing Python Imaging library(PIL)
>>> for extension in ["jpg", 'png']:
...     i = 1
...     for imgfile in glob.glob("*." + extension):
...         img = Image.open(imgfile)
...         print i,") ",imgfile,", resolution: ",img.size[0],"x",img.size[1]
...         i += 1
...     print "\n"
...
1 ) 002-tower-babel.jpg , resolution: 1024 x 768
2 ) 1454906.jpg , resolution: 1920 x 1080
3 ) 69151278-great-hd-wallpapers.jpg , resolution: 5120 x 2880
4 ) amazing-ancient-wallpaper.jpg , resolution: 1920 x 1080
5 ) Ancient-Rome.jpg , resolution: 1000 x 667
6 ) babel_full.jpg , resolution: 1464 x 1142
7 ) Cuba-is-wonderfull.jpg , resolution: 1366 x 768
8 ) Cute-Polar-Bear-Images-07775.jpg , resolution: 1600 x 1067
9 ) Cute-Polar-Bear-Widescreen-Wallpapers-07781.jpg , resolution: 2300 x 1610
10 ) Hard-work-without-a-lh.jpg , resolution: 650 x 346
11 ) jpeg422jfif.jpg , resolution: 2048 x 1536
12 ) moscow-park.jpg , resolution: 1920 x 1200
13 ) moscow_city_night_winter_58404_1920x1080.jpg , resolution: 1920 x 1080
14 ) Photo1569.jpg , resolution: 480 x 640
15 ) Pineapple-HD-Photos-03691.jpg , resolution: 2365 x 1774
16 ) Roman_forum_cropped.jpg , resolution: 4420 x 1572
17 ) socrates.jpg , resolution: 852 x 480
18 ) socrates_statement1.jpg , resolution: 1280 x 720
19 ) steve-jobs.jpg , resolution: 1920 x 1080
20 ) The_Great_Wall_of_China_at_Jinshanling-edit.jpg , resolution: 4288 x 2848
21 ) torenvanbabel_grt.jpg , resolution: 1100 x 805
22 ) tower_of_babel4.jpg , resolution: 1707 x 956
23 ) valckenborch_babel_1595_grt.jpg , resolution: 1100 x 748
24 ) Wall-of-China-17.jpg , resolution: 1920 x 1200
1 ) gergo-hungary.png , resolution: 1236 x 928
>>>
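【Comments】:
【Answer 8】: This function works like a charm if you are using an ipython / jupyter notebook. It relies on the file command from the Linux terminal. What is the advantage, you ask? Here it is: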
def get_image_size_faster(file_dir, ext='png'):
    """
    Function to retrieve image size without loading the image at all
    params:
        file_dir = path of the folder containing image files
        dim_index = index of image dimensions in the `file $file_path` call output
            For PNG      : -3  # Downloads/test.png: PNG image data, 4032 x 3024, 8-bit/color RGB, non-interlaced
            For JPEG/JPG : -2  # Downloads/test.jpg: JPEG image data,..., baseline, precision 8, 2252x1400, components 3
            For GIF      : -1  # Downloads/test.gif: GIF image data, version 89a, 498 x 373
    """
    dim_index_map = {
        'png' : -3,
        'jpg' : -2,
        'jpeg': -2,
        'gif' : -1
    }
    dim_index = dim_index_map[ext]
    files_regex = "{file_dir}/*.{ext}".format(file_dir=file_dir, ext=ext)
    # The next line is an IPython/Jupyter shell escape; it is not valid in plain Python.
    outputs = !file $files_regex
    dims = [tuple(map(int, x.split(',')[dim_index].strip().split('x'))) for x in outputs]
    return dims
A plain Python script alternative to this function can be written with the subprocess package, and it produces the same result.
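One possible shape for such a subprocess version is sketched below. It assumes the `file` utility is on the PATH and prints descriptions like the samples in the docstring above. The names `parse_file_output` and `size_via_file` are hypothetical. Instead of a fixed field index, the sketch looks for the comma-separated field that consists only of `W x H`, which skips fields such as `density 72x72`.

```python
import re
import subprocess

# A field that is nothing but "<digits> x <digits>", with optional spaces
_DIM = re.compile(r"(\d+)\s*x\s*(\d+)")


def parse_file_output(description):
    """Extract (width, height) from one line of `file` output, or None."""
    for field in description.split(","):
        match = _DIM.fullmatch(field.strip())
        if match:
            return int(match.group(1)), int(match.group(2))
    return None


def size_via_file(path):
    """Run `file -b <path>` (brief mode, no filename prefix) and parse the result."""
    result = subprocess.run(
        ["file", "-b", path], capture_output=True, text=True, check=True
    )
    return parse_file_output(result.stdout)
```

Passing the arguments as a list avoids going through a shell, so paths containing spaces need no quoting.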
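【Comments】:
【Answer 9】: I tried to use @JohnTESlade's answer, but I ran into bytes/string conversion problems, so I fixed it, followed some PEPs, and added support for the EMF type, which I needed.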
import struct
from io import BytesIO
from typing import Tuple


def get_image_info(data: bytes) -> Tuple[str, int, int]:
    size = len(data)
    height = -1
    width = -1
    content_type = ''

    # handle GIFs
    if (size >= 10) and data[:6] in (b'GIF87a', b'GIF89a'):
        # Check to see if content_type is correct
        content_type = 'image/gif'
        w, h = struct.unpack("<HH", data[6:10])
        width = int(w)
        height = int(h)

    # See PNG 2. Edition spec (http://www.w3.org/TR/PNG/)
    # Bytes 0-7 are below, 4-byte chunk length, then 'IHDR'
    # and finally the 4-byte width, height
    elif ((size >= 24) and data[0:8] == b'\211PNG\r\n\032\n'
          and (data[12:16] == b'IHDR')):
        content_type = 'image/png'
        w, h = struct.unpack(">LL", data[16:24])
        width = int(w)
        height = int(h)

    # Maybe this is for an older PNG version.
    elif (size >= 16) and data[0:8] == b'\211PNG\r\n\032\n':
        # Check to see if we have the right content type
        content_type = 'image/png'
        w, h = struct.unpack(">LL", data[8:16])
        width = int(w)
        height = int(h)

    # handle JPEGs
    elif (size >= 2) and data[0:2] == b'\377\330':
        content_type = 'image/jpeg'
        jpeg = BytesIO(data)
        jpeg.read(2)
        b = jpeg.read(1)
        w, h = -1, -1
        try:
            while b and ord(b) != 0xDA:
                while ord(b) != 0xFF:
                    b = jpeg.read(1)
                while ord(b) == 0xFF:
                    b = jpeg.read(1)
                if 0xC0 <= ord(b) <= 0xC3:
                    jpeg.read(3)
                    h, w = struct.unpack(">HH", jpeg.read(4))
                    break
                else:
                    jpeg.read(int(struct.unpack(">H", jpeg.read(2))[0]) - 2)
                b = jpeg.read(1)
            width = int(w)
            height = int(h)
        except (struct.error, ValueError, TypeError):
            # Truncated data: ord(b'') raises TypeError once the stream runs out.
            pass

    # Maybe this will work for most EMF types.
    elif (size >= 40) and data[0:4] == b'\001\000\000\000':
        # Check to see if we have the right content type
        content_type = 'image/x-emf'
        x, y, r, b = struct.unpack("<LLLL", data[24:40])
        width = int(r - x)
        height = int(b - y)

    return content_type, width, height
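A function like this takes the raw bytes of an image file as `data`. As a minimal, self-contained sketch of that calling pattern, the block below handles only PNG and GIF, where the first 24 bytes of the file are enough. The names `header_dimensions` and `dimensions_of_file` are hypothetical and not part of the answer above. For JPEG the size marker can sit much further into the file, so there one would typically read the whole file with `fh.read()`.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def header_dimensions(data):
    """Return (width, height) from PNG or GIF header bytes, or None."""
    # GIF: 6-byte signature, then little-endian 16-bit width and height
    if len(data) >= 10 and data[:6] in (b"GIF87a", b"GIF89a"):
        return struct.unpack("<HH", data[6:10])
    # PNG: 8-byte signature, 4-byte chunk length, b"IHDR", then
    # big-endian 32-bit width and height
    if len(data) >= 24 and data[:8] == PNG_SIGNATURE and data[12:16] == b"IHDR":
        return struct.unpack(">LL", data[16:24])
    return None


def dimensions_of_file(path):
    """Open the file in binary mode and inspect only its first 24 bytes."""
    with open(path, "rb") as fh:
        return header_dimensions(fh.read(24))
```

Opening the file with mode `"rb"` is the important part: it yields `bytes`, which is what the slicing and `struct.unpack` calls expect.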
【Comments】: