Matching image to images collection

Posted 2023-02-25

Original title: Matching image to images collection
Originally posted: 2014-10-01 15:24:14

[Question]:

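I have a large collection of card images, and a photo of one particular card. What tools can I use to find which image in the collection is most similar to my photo?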

Here is a sample of the collection:

Abundance, Aggressive Urge, Demystify

And this is what I am trying to find:

Card Photo

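[Comments]:

Hmm... how does your one image differ? Is it brighter/darker, rotated/distorted/shifted, a different size, a different format (JPEG/PNG), or has a smaller element moved within the image while the rest is pixel-for-pixel identical, or...?

Assume it was printed out and photographed from above by a fixed camera on a white background. It is usually brighter, and may be slightly distorted/rotated.

It is hard to make suggestions based on the information you have provided. Could you post 2-3 images from the large collection, plus the odd image you are trying to match against your collection?

@MarkSetchell I have updated the question with image samples.

For anyone interested, I think you can use the images available here as reference images. More test images would certainly help in evaluating an approach, though. With the reference images from the link above and the one available test image, I used Euclidean distance to find the best match, as outlined in the edit section of my answer, and it gives good results for this particular test image.

[Answer 1]: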

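Thank you for posting some photos.

I have coded up an algorithm called Perceptual Hashing, which I found described by Dr Neal Krawetz. Comparing your images with the card, I get the following percentage measures of similarity: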

Card vs. Abundance 79%
Card vs. Aggressive 83%
Card vs. Demystify 85%

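So, it is not an ideal discriminator for your type of image, but it does work to some extent. You may wish to play around with it to tailor it to your use case.

I would calculate a hash for each of the images in your collection, one at a time, and store the hash for each image just once. Then, when you get a new card, calculate its hash and compare it to the stored ones.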

#!/bin/bash
################################################################################
# Similarity
# Mark Setchell
#
# Calculate percentage similarity of two images using Perceptual Hashing
# See article by Dr Neal Krawetz entitled "Looks Like It" - www.hackerfactor.com
#
# Method:
# 1) Resize image to black and white 8x8 pixel square regardless
# 2) Calculate mean brightness of those 64 pixels
# 3) For each pixel, store "1" if pixel>mean else store "0" if less than mean
# 4) Convert resulting 64bit string of 1's and 0's, 16 hex digit "Perceptual Hash"
#
# If finding difference between Perceptual Hashes, simply total up number of bits
# that differ between the two strings - this is the Hamming distance.
#
# Requires ImageMagick - www.imagemagick.org
#
# Usage:
#
# Similarity image|imageHash [image|imageHash]
# If you pass one image filename, it will tell you the Perceptual hash as a 16
# character hex string that you may want to store in an alternate stream or as
# an attribute or tag in filesystems that support such things. Do this in order
# to just calculate the hash once for each image.
#
# If you pass in two images, or two hashes, or an image and a hash, it will try
# to compare them and give a percentage similarity between them.
################################################################################
function PerceptualHash() {

   TEMP="tmp$$.png"

   # Force image to 8x8 pixels and greyscale
   convert "$1" -colorspace gray -quality 80 -resize 8x8! PNG8:"$TEMP"

   # Calculate mean brightness and correct to range 0..255
   MEAN=$(convert "$TEMP" -format "%[fx:int(mean*255)]" info:)

   # Now extract all 64 pixels and build string containing "1" where pixel > mean else "0"
   hash=""
   for i in {0..7}; do
      for j in {0..7}; do
         # Use [0-9] rather than \d, which grep -E does not support everywhere
         pixel=$(convert "$TEMP"[1x1+$i+$j] -colorspace gray text: | grep -Eo "\([0-9]+," | tr -d '(,' )
         bit="0"
         [ "$pixel" -gt "$MEAN" ] && bit="1"
         hash="$hash$bit"
      done
   done
   hex=$(echo "obase=16;ibase=2;$hash" | bc)
   # Left pad to 16 hex digits: pad with spaces, then turn the spaces into zeros
   # (zero-padding a string with "%016s" is not portable)
   printf "%16s\n" "$hex" | tr ' ' '0'
   rm -f "$TEMP"
}

function HammingDistance() {
   # Convert input hex strings to upper case like bc requires
   STR1=$(tr '[a-z]' '[A-Z]' <<< $1)
   STR2=$(tr '[a-z]' '[A-Z]' <<< $2)

   # Convert hex to binary and zero left pad to 64 binary digits
   # (pad with spaces, then turn the spaces into zeros)
   STR1=$(printf "%64s" "$(echo "obase=2;ibase=16;$STR1" | bc)" | tr ' ' '0')
   STR2=$(printf "%64s" "$(echo "obase=2;ibase=16;$STR2" | bc)" | tr ' ' '0')

   # Calculate Hamming distance between two strings, each differing bit adds 1
   hamming=0
   for i in {0..63}; do
      a=${STR1:i:1}
      b=${STR2:i:1}
      [ "$a" != "$b" ] && ((hamming++))
   done

   # Hamming distance is in range 0..64 and small means more similar
   # We want percentage similarity, so we do a little maths
   similarity=$((100-(hamming*100/64)))
   echo $similarity
}

function Usage() {
   echo "Usage: Similarity image|imageHash [image|imageHash]" >&2
   exit 1
}

################################################################################
# Main
################################################################################
if [ $# -eq 1 ]; then
   # Expecting a single image file for which to generate hash
   if [ ! -f "$1" ]; then
      echo "ERROR: File $1 does not exist" >&2
      exit 1
   fi
   PerceptualHash "$1" 
   exit 0
fi

if [ $# -eq 2 ]; then
   # Expecting 2 things, i.e. 2 image files, 2 hashes or one of each
   if [ -f "$1" ]; then
      hash1=$(PerceptualHash "$1")
   else
      hash1=$1
   fi
   if [ -f "$2" ]; then
      hash2=$(PerceptualHash "$2")
   else
      hash2=$2
   fi
   HammingDistance $hash1 $hash2
   exit 0
fi

Usage
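
For readers who want to experiment without ImageMagick, here is a minimal Python sketch of the same idea: an average hash plus a Hamming-distance similarity. It assumes the image has already been reduced to an 8x8 grid of grey levels (0..255) by some other tool, and the names `average_hash`, `similarity` and `best_match` are hypothetical, not part of the script above. It compares against the exact mean rather than the truncated integer mean the script uses, so a hash may differ by a bit or two from the script's output.

```python
def average_hash(pixels):
    """Return a 64-bit integer hash for an 8x8 grid of grey levels (0..255)."""
    flat = [value for row in pixels for value in row]
    if len(flat) != 64:
        raise ValueError("expected an 8x8 grid of grey levels")
    mean = sum(flat) / 64
    bits = 0
    for value in flat:
        # Append a 1 bit where the pixel is brighter than the mean, else a 0 bit
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def similarity(hash_a, hash_b):
    """Percentage similarity derived from the Hamming distance of two hashes."""
    hamming = bin(hash_a ^ hash_b).count("1")  # number of differing bits, 0..64
    return 100 - hamming * 100 // 64


def best_match(stored_hashes, query_hash):
    """Return the name in stored_hashes whose hash is most similar to query_hash."""
    return max(stored_hashes, key=lambda name: similarity(stored_hashes[name], query_hash))
```

The hashes for the collection would be computed once and kept in a dictionary such as `{"demystify.jpg": average_hash(grid), ...}`; each new photo then needs only one hash computation followed by a call to `best_match`.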

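[Comments]:

phash.org is a command-line tool for computing various perceptual hashes. It is available for both Linux and Windows.

I get ./file.sh: line 47: [: -gt: unary operator expected

It would certainly do better if, instead of discarding colour, you computed a hash for each of R, G and B separately, each compared with its own mean over the image. A reddish photo would then hash very differently from a greenish one.

[Answer 2]:

I also tried a normalized cross-correlation of each of your images with the card, like this: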

#!/bin/bash
size="300x400!"
convert card.png -colorspace RGB -normalize -resize $size card.jpg
for i in *.jpg
do 
   cc=$(convert $i -colorspace RGB -normalize -resize $size JPG:- | \
   compare - card.jpg -metric NCC null: 2>&1)
   echo "$cc:$i"
done | sort -n

I got this output (sorted by match quality):

0.453999:abundance.jpg
0.550696:aggressive.jpg
0.629794:demystify.jpg

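This shows that the card correlates best with demystify.jpg.

Note that I resized all the images to the same size and normalized their contrast so that they can be compared readily and the effects of contrast differences are minimized. Making them smaller also reduces the time needed for the correlation.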
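
To make the metric itself concrete, here is a small Python sketch of a zero-mean normalized cross-correlation between two equally sized images, each given as a flat list of grey levels. This is one common definition of NCC; I have not checked it against ImageMagick's implementation, so treat the function name `ncc` and the exact values as assumptions rather than a reproduction of `compare -metric NCC`.

```python
import math


def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-length sequences.

    Returns a value in [-1, 1]; 1 means the two images differ only by a
    brightness offset and a positive contrast scale.
    """
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    dev_a = [x - mean_a for x in a]
    dev_b = [y - mean_b for y in b]
    numerator = sum(x * y for x, y in zip(dev_a, dev_b))
    denominator = math.sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))
    if denominator == 0:
        return 0.0  # a completely flat image carries no structure to correlate
    return numerator / denominator
```

Because the means are subtracted and the result is divided by both norms, a uniformly brighter or higher-contrast photo of the same card still scores close to 1, which is the property the resizing and `-normalize` steps above are also aiming for.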

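[Comments]:

[Answer 3]:

New approach!

It seems that the following ImageMagick command, or perhaps a variation of it after looking at a larger selection of your images, will extract the wording at the top of your cards: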

convert aggressiveurge.jpg -crop 80%x10%+10%+10% crop.png

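It takes the top 10% of the image's height and 80% of its width (starting 10% in from the top-left corner) and stores it in crop.png, which looks like this:

If you then run that through tesseract OCR as follows:

tesseract crop.png agg

you get a file called agg.txt containing: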

E‘ Aggressive Urge \L® E

which you can clean up with grep, looking only for runs of adjacent upper- and lowercase letters:

grep -Eo "\<[A-Za-z]+\>" agg.txt

to get

Aggressive Urge

:-)
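
The clean-up step can also be sketched in Python. The version below keeps only runs of letters, drops single-letter fragments (an assumption on my part, since OCR noise such as a stray "E" would otherwise survive), and then picks the closest known card title with the standard library's `difflib`. The function names and the list of titles are hypothetical.

```python
import difflib
import re


def clean_ocr_title(raw_text, min_letters=2):
    """Keep only runs of ASCII letters that are at least min_letters long."""
    words = re.findall(r"[A-Za-z]+", raw_text)
    return " ".join(word for word in words if len(word) >= min_letters)


def closest_title(raw_text, known_titles):
    """Return the known title closest to the cleaned OCR text, or None."""
    cleaned = clean_ocr_title(raw_text)
    matches = difflib.get_close_matches(cleaned, known_titles, n=1)
    return matches[0] if matches else None
```

For example, `closest_title(text_from_agg_txt, ["Abundance", "Aggressive Urge", "Demystify"])` would map a slightly misread title onto the nearest entry in the collection, which may help when the OCR output is not letter-perfect.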

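[Comments]:

It would be funny if this turned out to be the best approach :) I am currently trying them all out.

Problems arise if the photographed card is tilted slightly to one side. Or maybe I am not doing it right :/

[Answer 4]:

I tried this by arranging the image data as a vector and taking the inner product between each collection image vector and the searched image vector. The most similar vectors give the highest inner product. I resize all the images to the same size to get vectors of equal length, so that I can take the inner product. This resizing also reduces the cost of computing the inner product and gives a coarse approximation of the actual image.

You can quickly check this with Matlab or Octave. Below is the Matlab/Octave script; I have added comments to it. I tried varying the variable mult from 1 to 8 (you can try any integer value), and in all of those cases the image Demystify gave the highest inner product with the card image. For mult = 8, I get the following ip vector in Matlab: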

ip =

   683007892
   558305537
   604013365

As you can see, it gives the highest inner product, 683007892, for the image Demystify.

% load images
imCardPhoto = imread('0.png');
imDemystify = imread('1.jpg');
imAggressiveUrge = imread('2.jpg');
imAbundance = imread('3.jpg');

% you can experiment with the size by varying mult
mult = 8;
sz = [17 12]*mult;   % named sz so that it does not shadow the built-in size()

% resize to a common size (imresize uses bicubic interpolation by default)
smallCardPhoto = imresize(imCardPhoto, sz);
smallDemystify = imresize(imDemystify, sz);
smallAggressiveUrge = imresize(imAggressiveUrge, sz);
smallAbundance = imresize(imAbundance, sz);

% image collection: each image is vectorized. if we have n images, this
% will be a (size_rows*size_columns*channels) x n matrix
collection = [double(smallDemystify(:)) ...
    double(smallAggressiveUrge(:)) ...
    double(smallAbundance(:))];

% vectorize searched image. this will be a (size_rows*size_columns*channels) x 1
% vector
x = double(smallCardPhoto(:));

% take the inner product of x and each image vector in collection. this
% will result in a n x 1 vector. the higher the inner product is, more similar the
% image and searched image(that is x)
ip = collection' * x;
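
The same ranking can be sketched in plain Python, with each image given as a flat list of pixel values of equal length. The names below are hypothetical. The `cosine_similarity` helper is my own addition, not part of the answer: a raw inner product grows with overall brightness, so dividing by both vector lengths is one possible way to reduce that bias.

```python
import math


def inner_product(a, b):
    """Plain dot product of two equal-length pixel vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def rank_by_inner_product(collection, query):
    """Return (name, score) pairs, highest inner product first."""
    scores = [(name, inner_product(vector, query)) for name, vector in collection.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


def cosine_similarity(a, b):
    """Inner product divided by both vector norms (not used in the answer above)."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return inner_product(a, b) / (norm_a * norm_b)
```

Here `collection` plays the role of the columns of the `collection` matrix in the script, and `query` the role of `x`.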

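Edit

I tried another approach, basically taking the Euclidean distance (l2 norm) between the reference images and the card image, and it gave me very good results for your test card image with a large collection of reference images (383 images) that I found at this link.

Here, instead of taking the whole image, I extracted the upper part that contains the picture and used that for the comparison.

In the following steps, all training images and the test image are resized to a predefined size before any processing.

1) Extract the picture region from the training images
2) Perform morphological closing on these images to get a coarse approximation (this step may not be necessary)
3) Vectorize these images and store them in a training set (I call it a training set even though there is no training in this approach)
4) Load the test card image, extract the picture region of interest (ROI), apply closing, then vectorize
5) Calculate the Euclidean distance between each reference image vector and the test image vector
6) Choose the item with the minimum distance (or the first k items)

I did this in C++ using OpenCV. I have also included some test results using different scales.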

#include <opencv2/opencv.hpp>
#include <iostream>
#include <algorithm>
#include <string>
#include <vector>
#include <windows.h>

using namespace cv;
using namespace std;

#define INPUT_FOLDER_PATH       string("Your test image folder path")
#define TRAIN_IMG_FOLDER_PATH   string("Your training image folder path")

void search()
{
    WIN32_FIND_DATA ffd;
    HANDLE hFind = INVALID_HANDLE_VALUE;

    vector<Mat> images;
    vector<string> labelNames;
    int label = 0;
    double scale = .2;  // you can experiment with scale
    Size imgSize(200*scale, 285*scale); // training sample images are all 200 x 285 (width x height)
    Mat kernel = getStructuringElement(MORPH_ELLIPSE, Size(3, 3));

    // get all training samples in the directory
    hFind = FindFirstFile((TRAIN_IMG_FOLDER_PATH + string("*")).c_str(), &ffd);
    if (INVALID_HANDLE_VALUE == hFind)
    {
        cout << "INVALID_HANDLE_VALUE: " << GetLastError() << endl;
        return;
    }
    do
    {
        if (!(ffd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY))
        {
            Mat im = imread(TRAIN_IMG_FOLDER_PATH+string(ffd.cFileName));
            if (im.empty()) continue;   // skip files that are not readable images
            Mat re;
            resize(im, re, imgSize, 0, 0);  // resize the image

            // extract only the upper part that contains the image
            Mat roi = re(Rect(re.cols*.1, re.rows*35/285.0, re.cols*.8, re.rows*125/285.0));
            // get a coarse approximation
            morphologyEx(roi, roi, MORPH_CLOSE, kernel);

            images.push_back(roi.reshape(1)); // vectorize the roi
            labelNames.push_back(string(ffd.cFileName));
        }
    }
    while (FindNextFile(hFind, &ffd) != 0);
    FindClose(hFind);   // release the search handle

    // load the test image, apply the same preprocessing done for training images
    Mat test = imread(INPUT_FOLDER_PATH+string("0.png"));
    Mat re;
    resize(test, re, imgSize, 0, 0);
    Mat roi = re(Rect(re.cols*.1, re.rows*35/285.0, re.cols*.8, re.rows*125/285.0));
    morphologyEx(roi, roi, MORPH_CLOSE, kernel);
    Mat testre = roi.reshape(1);

    struct imgnorm2_t
    {
        string name;
        double norm2;
    };
    vector<imgnorm2_t> imgnorm;
    for (size_t i = 0; i < images.size(); i++)
    {
        imgnorm2_t data = {labelNames[i],
            norm(images[i], testre) /* take the l2-norm (euclidean distance) */};
        imgnorm.push_back(data); // store data
    }

    // sort stored data based on euclidean-distance in the ascending order
    sort(imgnorm.begin(), imgnorm.end(),
        [] (const imgnorm2_t& first, const imgnorm2_t& second) { return (first.norm2 < second.norm2); });
    for (size_t i = 0; i < imgnorm.size(); i++)
    {
        cout << imgnorm[i].name << " : " << imgnorm[i].norm2 << endl;
    }
}
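
The distance-and-sort step at the end of this approach can be illustrated without OpenCV. The Python sketch below assumes each reference image and the test image have already been cropped, resized and flattened into lists of numbers of equal length; the function names are hypothetical and the preprocessing (resize, ROI extraction, morphological closing) is deliberately left out.

```python
import math


def euclidean_distance(a, b):
    """l2 norm of the difference between two equal-length pixel vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(references, query, k=1):
    """Return the k (name, distance) pairs with the smallest distance to query."""
    distances = [(name, euclidean_distance(vector, query)) for name, vector in references.items()]
    distances.sort(key=lambda item: item[1])  # ascending: best match first
    return distances[:k]
```

Calling `nearest(references, query, k=3)` gives the three closest reference images, in the same ascending order as the per-scale listings in the results.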

Results:

scale = 1.0;

demystify.jpg : 10989.6, sylvan_basilisk.jpg : 11990.7, scathe_zombies.jpg : 12307.6

scale = .8;

demystify.jpg : 8572.84, sylvan_basilisk.jpg : 9440.18, steel_golem.jpg : 9445.36

scale = .6;

demystify.jpg : 6226.6, steel_golem.jpg : 6887.96, sylvan_basilisk.jpg : 7013.05

scale = .4;

demystify.jpg : 4185.68, steel_golem.jpg : 4544.64, sylvan_basilisk.jpg : 4699.67

scale = .2;

demystify.jpg : 1903.05, steel_golem.jpg : 2154.64, sylvan_basilisk.jpg : 2277.42

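[Comments]:

[Answer 5]:

If I understand you correctly, you need to compare them as pictures. There is a very simple but effective solution for this: it is called Sikuli.

"What tools can I use to find which image in the collection is most similar to my photo?"

This tool works very well for image processing. It can not only find out whether your card (image) is similar to something you have already defined as a pattern, but can also search for partial image content (so-called rectangles).

By default you can extend its functionality via Python. Any ImageObject can be set to accept a similarity_pattern as a percentage, so that you can find exactly what you are looking for.

Another big advantage of this tool is that you can learn the basics in a day.

Hope this helps.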

[Comments]:
