Automatic contrast and brightness adjustment of a color photo of a sheet of paper with OpenCV

Posted 2023-02-16

【Asked】: 2019-11-16 05:42:07 【Question】:

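When photographing a sheet of paper (e.g. with a phone camera), I get the following result (left image) (jpg download here). The desired result (processed manually with image editing software) is on the right:

I would like to process the original image with OpenCV to get a better brightness/contrast automatically (so that the background is whiter).

Assumption: the image has an A4 portrait format (we don't need to perspective-warp it in this topic), and the sheet of paper is white, possibly with text/images in black or in color.

What I've tried so far:

Various adaptive thresholding methods such as Gaussian and OTSU (see the OpenCV doc Image Thresholding). It usually works well with OTSU: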

ret, gray = cv2.threshold(img, 0, 255, cv2.THRESH_OTSU + cv2.THRESH_BINARY)

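But it only works for grayscale images, not directly for color images. Moreover, the output is binary (white or black), which I don't want: I prefer to keep a color, non-binary image as output.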

Histogram equalization

as suggested by this answer (Histogram equalization not working on color image - OpenCV) or this one (OpenCV Python equalizeHist colored image):

img3 = cv2.imread(f)
img_transf = cv2.cvtColor(img3, cv2.COLOR_BGR2YUV)
img_transf[:,:,0] = cv2.equalizeHist(img_transf[:,:,0])
img4 = cv2.cvtColor(img_transf, cv2.COLOR_YUV2BGR)
cv2.imwrite('test.jpg', img4)

or with HSV:

img_transf = cv2.cvtColor(img3, cv2.COLOR_BGR2HSV)
img_transf[:,:,2] = cv2.equalizeHist(img_transf[:,:,2])
img4 = cv2.cvtColor(img_transf, cv2.COLOR_HSV2BGR)

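Unfortunately, the result is quite bad, since it creates awful micro-contrasts locally (?):

I also tried YCbCr instead, and the result was similar.

I also tried CLAHE (Contrast Limited Adaptive Histogram Equalization) with various tileGridSize values from 1 to 1000: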

img3 = cv2.imread(f)
img_transf = cv2.cvtColor(img3, cv2.COLOR_BGR2HSV)
clahe = cv2.createCLAHE(tileGridSize=(100,100))
img_transf[:,:,2] = clahe.apply(img_transf[:,:,2])
img4 = cv2.cvtColor(img_transf, cv2.COLOR_HSV2BGR)
cv2.imwrite('test.jpg', img4)

But the result was equally awful.

Applying this CLAHE method in the LAB color space, as suggested in the question How to apply CLAHE on RGB color images:

import cv2, numpy as np
bgr = cv2.imread('_example.jpg')
lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
lab_planes = list(cv2.split(lab))  # list, so the L plane can be reassigned (split may return a tuple)
clahe = cv2.createCLAHE(clipLimit=2.0,tileGridSize=(100,100))
lab_planes[0] = clahe.apply(lab_planes[0])
lab = cv2.merge(lab_planes)
bgr = cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)
cv2.imwrite('_example111.jpg', bgr)

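also gave a bad result. Output image:

Applying adaptive thresholding or histogram equalization separately on each channel (R, G, B) is not an option, since it would upset the color balance, as explained here.

The "contrast stretching" method from the scikit-image tutorial on Histogram Equalization:

the image is rescaled to include all intensities that fall within the 2nd and 98th percentiles

is a little better, but still far from the desired result (see the image at the top of this question).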
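
For reference, the percentile-based contrast stretching quoted above can be sketched with NumPy alone. This is a minimal illustration under stated assumptions, not the scikit-image implementation: the function name `stretch_contrast` and the synthetic test image are invented for the example.

```python
import numpy as np

def stretch_contrast(img, low_pct=2, high_pct=98):
    """Linearly rescale intensities so the low/high percentiles map to 0/255.

    `img` is assumed to be a uint8 array (grayscale or color). The same
    percentiles are used for all channels, so the color balance is kept.
    """
    lo, hi = np.percentile(img, (low_pct, high_pct))
    if hi <= lo:  # flat image: nothing to stretch
        return img.copy()
    scaled = (img.astype(np.float64) - lo) / (hi - lo)
    return (np.clip(scaled, 0.0, 1.0) * 255).astype(np.uint8)

# Synthetic low-contrast "photo": values squeezed into [90, 170]
rng = np.random.default_rng(0)
dull = rng.integers(90, 171, size=(64, 64, 3), dtype=np.uint8)
stretched = stretch_contrast(dull)
print(dull.min(), dull.max(), "->", stretched.min(), stretched.max())
```

Because the mapping is a single global linear stretch, a background that is merely gray rather than an outlier stays gray, which is consistent with the "a little better" result reported here.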

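TL;DR: how can I get an automatic brightness/contrast optimization of a color photo of a sheet of paper with OpenCV/Python? What kind of thresholding/histogram equalization/other technique could be used?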

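【Comments】:

How about combining thresholding and rescaling? I mean, use thresholding too, but with 8 (or 16) levels (not 2 as in binary thresholding), then rescale back to 256 brightness levels. Because it is a color image, you could try this on each color channel.

Thanks @Tiendung for the idea. How can the best 8 or 16 levels be found automatically (without having to set a parameter manually for each image), similarly to OTSU? Isn't this more or less similar to histogram equalization? Could you post sample Python code so that we can try your suggestion?

It looks like JPEG compression artifacts are giving you trouble. Don't you have better-quality scans?

@CrisLuengo No, this is not related to JPEG compression artifacts (according to my tests).

@Basj Check out the script I shared; the output of the automatic method seems better than the manually adjusted image you shared.

【Answer 1】: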

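Contrast and brightness can be adjusted using alpha (α) and beta (β), respectively. These variables are often called the gain and bias parameters. The expression can be written as

g(i,j) = α * f(i,j) + β

where f(i,j) is the input pixel and g(i,j) is the output pixel.

OpenCV already implements this as cv2.convertScaleAbs(), so we can just use this function with user-defined alpha and beta values.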

import cv2

image = cv2.imread('1.jpg')

alpha = 1.95 # Contrast control (1.0-3.0)
beta = 0 # Brightness control (0-100)

manual_result = cv2.convertScaleAbs(image, alpha=alpha, beta=beta)

cv2.imshow('original', image)
cv2.imshow('manual_result', manual_result)
cv2.waitKey()

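But the question was

How to get an automatic brightness/contrast optimization of a color photo?

Essentially, the question is how to calculate alpha and beta automatically. To do this, we can look at the histogram of the image. Automatic brightness and contrast optimization calculates alpha and beta so that the output range is [0...255]. We calculate the cumulative distribution to determine where the color frequency is less than some threshold value (say 1%) and cut the right and left sides of the histogram. This gives us our minimum and maximum range. Here is a visualization of the histogram before (blue) and after clipping (orange). Notice how the more "interesting" parts of the image are more pronounced after clipping.

To calculate alpha, we take the minimum and maximum grayscale values after clipping and divide our desired output range of 255 by their difference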

α = 255 / (maximum_gray - minimum_gray)

To calculate beta, we plug g(i, j)=0 and f(i, j)=minimum_gray into the formula

g(i,j) = α * f(i,j) + β

which, after solving, gives

β = -minimum_gray * α

For your image we get

Alpha: 3.75

Beta: -311.25
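
As a sanity check on these formulas, the reported values can be inverted to recover the clipped gray range. The numbers 83 and 151 below are inferred from the reported alpha and beta; they are not stated in the answer itself.

```python
# Invert alpha = 255 / (maximum_gray - minimum_gray) and beta = -minimum_gray * alpha
alpha, beta = 3.75, -311.25

gray_span = 255 / alpha            # maximum_gray - minimum_gray
minimum_gray = -beta / alpha
maximum_gray = minimum_gray + gray_span
print(minimum_gray, maximum_gray)  # 83.0 151.0

# Forward check: the clipped range maps onto [0, 255]
print(alpha * minimum_gray + beta)  # 0.0
print(alpha * maximum_gray + beta)  # 255.0
```

So, under these formulas, only a 68-level slice of the gray histogram is spread over the full output range, and everything outside it saturates.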

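You may have to adjust the clipping threshold to refine the results. Here are some example results using a 1% threshold with other images: Before -> After

Automatic brightness and contrast code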

import cv2
import numpy as np
from matplotlib import pyplot as plt

# Automatic brightness and contrast optimization with optional histogram clipping
def automatic_brightness_and_contrast(image, clip_hist_percent=1):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    
    # Calculate grayscale histogram
    hist = cv2.calcHist([gray],[0],None,[256],[0,256])
    hist_size = len(hist)
    
    # Calculate cumulative distribution from the histogram
    # (cumsum avoids calling float() on one-element arrays, which newer NumPy deprecates)
    accumulator = np.cumsum(hist.ravel(), dtype=np.float64)
    
    # Locate points to clip
    maximum = accumulator[-1]
    clip_hist_percent *= (maximum/100.0)
    clip_hist_percent /= 2.0
    
    # Locate left cut
    minimum_gray = 0
    while accumulator[minimum_gray] < clip_hist_percent:
        minimum_gray += 1
    
    # Locate right cut
    maximum_gray = hist_size -1
    while accumulator[maximum_gray] >= (maximum - clip_hist_percent):
        maximum_gray -= 1
    
    # Calculate alpha and beta values
    alpha = 255 / (maximum_gray - minimum_gray)
    beta = -minimum_gray * alpha
    
    '''
    # Calculate new histogram with desired range and show histogram 
    new_hist = cv2.calcHist([gray],[0],None,[256],[minimum_gray,maximum_gray])
    plt.plot(hist)
    plt.plot(new_hist)
    plt.xlim([0,256])
    plt.show()
    '''

    auto_result = cv2.convertScaleAbs(image, alpha=alpha, beta=beta)
    return (auto_result, alpha, beta)

image = cv2.imread('1.jpg')
auto_result, alpha, beta = automatic_brightness_and_contrast(image)
print('alpha', alpha)
print('beta', beta)
cv2.imshow('auto_result', auto_result)
cv2.waitKey()

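Result image with this code:

Results with other images using a 1% threshold

An alternative version adds gain and bias to the image using saturation arithmetic instead of OpenCV's cv2.convertScaleAbs(). The built-in method takes an absolute value, which leads to nonsensical results (e.g., a pixel at 44 with alpha = 3 and beta = -210 becomes 78 with OpenCV, when in fact it should become 0).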

import cv2
import numpy as np
# from matplotlib import pyplot as plt

def convertScale(img, alpha, beta):
    """Add bias and gain to an image with saturation arithmetics. Unlike
    cv2.convertScaleAbs, it does not take an absolute value, which would lead to
    nonsensical results (e.g., a pixel at 44 with alpha = 3 and beta = -210
    becomes 78 with OpenCV, when in fact it should become 0).
    """

    new_img = img * alpha + beta
    new_img[new_img < 0] = 0
    new_img[new_img > 255] = 255
    return new_img.astype(np.uint8)

# Automatic brightness and contrast optimization with optional histogram clipping
def automatic_brightness_and_contrast(image, clip_hist_percent=25):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Calculate grayscale histogram
    hist = cv2.calcHist([gray],[0],None,[256],[0,256])
    hist_size = len(hist)

    # Calculate cumulative distribution from the histogram
    # (cumsum avoids calling float() on one-element arrays, which newer NumPy deprecates)
    accumulator = np.cumsum(hist.ravel(), dtype=np.float64)

    # Locate points to clip
    maximum = accumulator[-1]
    clip_hist_percent *= (maximum/100.0)
    clip_hist_percent /= 2.0

    # Locate left cut
    minimum_gray = 0
    while accumulator[minimum_gray] < clip_hist_percent:
        minimum_gray += 1

    # Locate right cut
    maximum_gray = hist_size -1
    while accumulator[maximum_gray] >= (maximum - clip_hist_percent):
        maximum_gray -= 1

    # Calculate alpha and beta values
    alpha = 255 / (maximum_gray - minimum_gray)
    beta = -minimum_gray * alpha

    '''
    # Calculate new histogram with desired range and show histogram 
    new_hist = cv2.calcHist([gray],[0],None,[256],[minimum_gray,maximum_gray])
    plt.plot(hist)
    plt.plot(new_hist)
    plt.xlim([0,256])
    plt.show()
    '''

    auto_result = convertScale(image, alpha=alpha, beta=beta)
    return (auto_result, alpha, beta)

image = cv2.imread('1.jpg')
auto_result, alpha, beta = automatic_brightness_and_contrast(image)
print('alpha', alpha)
print('beta', beta)
cv2.imshow('auto_result', auto_result)
cv2.imwrite('auto_result.png', auto_result)
cv2.imshow('image', image)
cv2.waitKey()
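
The difference between the two behaviours can be checked on the single pixel from the example above. This sketch emulates both in NumPy rather than calling OpenCV, so the helper names `scale_abs` and `scale_saturate` are illustrative only, and rounding to the nearest integer is an assumption of the sketch.

```python
import numpy as np

def scale_abs(img, alpha, beta):
    """Emulates the convertScaleAbs behaviour: scale, take |x|, saturate to uint8."""
    out = np.abs(img.astype(np.float64) * alpha + beta)
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

def scale_saturate(img, alpha, beta):
    """Scale, then clamp to [0, 255] without taking the absolute value."""
    out = img.astype(np.float64) * alpha + beta
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

pixel = np.array([[44]], dtype=np.uint8)
print(scale_abs(pixel, 3, -210))       # [[78]]
print(scale_saturate(pixel, 3, -210))  # [[0]]
```

With a strongly negative beta, the absolute value folds dark pixels back into the visible range, which is why the saturating variant is preferable here.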

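【Comments】:

Thanks for your answer (already helpful, please keep it). The question was how to find alpha / beta automatically (I'd like processing without manual parameter tweaking) to get a good image optimization (rather standard: we want the background close to white and not gray, text or images well contrasted, etc.). Would you have an idea for an algorithm that finds good alpha and beta values for any photo of a sheet?

One potential approach is to use the histogram of the image to find alpha and beta values automatically. Check the updated code.

Thanks for your updated answer! It improves the result a bit, but on my sample image, for example, the background is still quite dark (I edited your answer to add the result image obtained when using your code with the sample image; it helps for future reference).

The current histogram clipping technique removes the most outlying sections and is generally used to increase contrast/brightness, but since you're trying to obtain a completely white background, it's difficult to determine alpha/beta automatically. Normally the mean is used, but to obtain a completely white background you need some metric to skew the values relative to the mean. Maybe adding a constant could work. Anyway, this is an interesting problem. Good luck!

@mLstudent33, that's a great question. I've never tried it on energy graphs. I believe it does the enhancement based on a threshold relative to all the pixels in the image, so my guess is that it should still work, but the effect will not be as pronounced.

【Answer 2】: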

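I think the way to do this is:

1) Extract the chroma (saturation) channel from the HCL colorspace (HCL works better than HSL or HSV). Only colors should have non-zero saturation, so bright and gray shades will be dark.
2) Threshold that result using Otsu thresholding, to use as a mask.
3) Convert your input to grayscale and apply local-area (i.e. adaptive) thresholding.
4) Put the mask into the alpha channel of the original, then composite the local-area thresholded result with the original, so that it keeps the colored areas from the original and uses the local-area thresholded result everywhere else.

Sorry, I do not know OpenCV that well, but here are the steps using ImageMagick.

Note that channels are numbered starting with 0. (H=0 or red, C=1 or green, L=2 or blue)

Input: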

magick image.jpg -colorspace HCL -channel 1 -separate +channel tmp1.png

magick tmp1.png -auto-threshold otsu tmp2.png

magick image.jpg -colorspace gray -negate -lat 20x20+10% -negate tmp3.png

magick tmp3.png \( image.jpg tmp2.png -alpha off -compose copy_opacity -composite \) -compose over -composite result.png

ADDITION:

Here is Python Wand code that produces the same output. It needs ImageMagick 7 and Wand 0.5.5.

#!/bin/python3.7

from wand.image import Image
from wand.display import display
from wand.version import QUANTUM_RANGE

with Image(filename='text.jpg') as img:
    with img.clone() as copied:
        with img.clone() as hcl:
            hcl.transform_colorspace('hcl')
            with hcl.channel_images['green'] as mask:
                mask.auto_threshold(method='otsu')
                copied.composite(mask, left=0, top=0, operator='copy_alpha')
                img.transform_colorspace('gray')
                img.negate()
                img.adaptive_threshold(width=20, height=20, offset=0.1*QUANTUM_RANGE)
                img.negate()
                img.composite(copied, left=0, top=0, operator='over')
                img.save(filename='text_process.jpg')

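【Comments】:

Wow, this is quite a neat solution. I wish I had known these techniques before, so that I didn't have to implement similar things myself from boilerplate OpenCV.

It's also possible to do this in Python Wand, since it is based on ImageMagick.

I've added Python Wand code for the solution in the ADDITION to my answer.

【Answer 3】: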

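This method should work well for your application. First you find a threshold value in the intensity histogram that separates the distribution modes well, then you rescale the intensity using that value.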

from skimage.filters import threshold_yen
from skimage.exposure import rescale_intensity
from skimage.io import imread, imsave

img = imread('mY7ep.jpg')

yen_threshold = threshold_yen(img)
bright = rescale_intensity(img, (0, yen_threshold), (0, 255))

imsave('out.jpg', bright)

I'm using Yen's method here; you can learn more about it on this page.
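
What `rescale_intensity(img, (0, yen_threshold), (0, 255))` does to the pixel values can be sketched in plain NumPy. The threshold below is a made-up number standing in for the value returned by `threshold_yen`, and `rescale_to_white` is a hypothetical helper name used only for this illustration.

```python
import numpy as np

def rescale_to_white(img, threshold):
    """Map [0, threshold] linearly onto [0, 255]; brighter pixels saturate to white."""
    scaled = img.astype(np.float64) / float(threshold)
    return (np.clip(scaled, 0.0, 1.0) * 255).astype(np.uint8)

# Toy row of gray levels: dark ink, mid-gray, grayish paper, bright paper
row = np.array([[20, 100, 160, 200]], dtype=np.uint8)
assumed_threshold = 160  # hypothetical stand-in for threshold_yen(img)
result = rescale_to_white(row, assumed_threshold)
print(result)
```

Everything at or above the threshold becomes pure white, while darker values keep their relative order. Since a single threshold is used for the whole image, this sketch shares the limitation of any global method under uneven lighting.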

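【Comments】:

Interesting, thanks for sharing! Will this method work when the lighting conditions vary a lot across the image?

@FalconUA I guess it does not work like that. I tested it on an RGB image for my case and it produced a blank document image. The reason is that the brightness rescaling is not done per region: the threshold computed by threshold_yen applies to the whole image. Have you found a workable solution?

【Answer 4】: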

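Robust Locally-Adaptive Soft Binarization! That's what I call it.

I've done similar stuff before, for a slightly different purpose, so this may not fit your needs perfectly, but I hope it helps (also, I wrote this code at night for personal use, so it's ugly). In a sense, this code was intended to solve a more general case than yours, where there can be a lot of structured noise in the background (see the demo below).

What does this code do? Given a photo of a sheet of paper, it whitens it so that it is perfectly printable. See the example images below.

Teaser: this is how your pages will look after this algorithm (before and after). Notice that even the color marker annotations are gone, so I don't know whether this fits your use case, but the code might be useful:

To get perfectly clean results, you might need to toy around with the filtering parameters a bit, but as you can see, it works quite well even with the default parameters.

Step 0: Cut the images to fit closely to the page

Let's suppose you somehow did this step (it seems so in the examples you provided). If you need a manual annotate-and-rewarp tool, just PM me! ^^ The results of this step are below (the examples I use here are arguably harder than the one you provided, though they may not exactly match your case):

From this we can immediately see the following problems:

The lighting is not even. This means that all the simple binarization methods won't work. I tried a lot of the solutions available in OpenCV, as well as combinations of them, and none of them worked!

A lot of background noise. In my case, I needed to remove the grid of the paper, as well as the ink from the other side of the paper.

Step 1: Gamma correction

The reason for this step is to balance out the contrast of the whole image (since your image can be slightly overexposed or underexposed depending on the lighting).

This may seem an unnecessary step at first, but its importance should not be underestimated: in a sense, it normalizes the images to similar exposure distributions, so that you can choose meaningful hyperparameters later (e.g. the DELTA parameter in the next section, the noise filtering parameters, the morphology parameters, etc.).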

import cv2
import numpy as np

# Somehow I found the value of `gamma=1.2` to be the best in my case
def adjust_gamma(image, gamma=1.2):
    # build a lookup table mapping the pixel values [0, 255] to
    # their adjusted gamma values
    invGamma = 1.0 / gamma
    table = np.array([((i / 255.0) ** invGamma) * 255
        for i in np.arange(0, 256)]).astype("uint8")

    # apply gamma correction using the lookup table
    return cv2.LUT(image, table)

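Here are the results of the gamma adjustment:

You can see that it is a bit more... "balanced" now. Without this step, all the parameters that you choose by hand in the later steps become less robust!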
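
To make the effect of the lookup table concrete, here is a small NumPy-only sketch of the same mapping, out = 255 * (in / 255) ** (1 / gamma). It indexes the table directly instead of calling `cv2.LUT`, and the four sample pixel values are arbitrary choices for the illustration.

```python
import numpy as np

def gamma_table(gamma=1.2):
    """256-entry lookup table for out = 255 * (in / 255) ** (1 / gamma)."""
    levels = np.arange(256) / 255.0
    return (levels ** (1.0 / gamma) * 255).astype(np.uint8)

table = gamma_table(1.2)

# Indexing the table with a uint8 image applies the mapping per pixel.
image = np.array([[0, 64, 128, 255]], dtype=np.uint8)
adjusted = table[image]
print(adjusted)
```

With gamma above 1, black and white stay fixed while the mid-tones are lifted, which is the "balancing" effect described above.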

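Step 2: Adaptive binarization to detect the text blobs

In this step, we adaptively binarize out the text blobs. I will add more comments later, but the idea is basically the following:

We divide the image into blocks of size BLOCK_SIZE. The trick is to choose a size large enough that each block still contains a large chunk of both text and background (i.e. larger than any symbol you have), but small enough not to suffer from variations in lighting (i.e. "large, but still local").

Inside each block, we do locally-adaptive binarization: we look at the median value and hypothesize that it is the background (because we chose BLOCK_SIZE large enough for most of the block to be background). Then we define DELTA, which is basically just a threshold for "how far from the median do we still consider a pixel to be background?".

So, the function process_image gets the job done. Moreover, you can modify the preprocess and postprocess functions to fit your needs (however, as you can see from the example above, the algorithm is pretty robust, i.e. it works quite well out of the box without much parameter tuning).

The code in this part assumes that the foreground is darker than the background (i.e. ink on paper). But you can easily change that by tweaking the preprocess function: instead of 255 - image, just return image.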
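
The median-plus-DELTA rule can be tried on a toy block first. This is only an illustration in NumPy: the 4x4 block values are made up, and the blurring and morphological clean-up of the full pipeline are left out.

```python
import numpy as np

DELTA = 25  # same soft margin as in the answer

# Toy 4x4 grayscale block: bright paper (~200) with three dark ink pixels
block = np.array([[200, 198, 203, 201],
                  [199,  60, 202, 200],
                  [201,  55,  58, 199],
                  [200, 202, 201, 198]], dtype=np.uint8)

inverted = 255 - block       # ink becomes bright, as in preprocess()
med = np.median(inverted)    # assumed to be the background level
is_background = (inverted - med) < DELTA

print(med)
print(is_background.astype(np.uint8))
```

Because most of the block is paper, the median lands on the paper level, and only the three ink pixels exceed it by more than DELTA.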

# These are probably the only important parameters in the
# whole pipeline (steps 0 through 3).
BLOCK_SIZE = 40
DELTA = 25

# Do the necessary noise cleaning and other stuffs.
# I just do a simple blurring here but you can optionally
# add more stuffs.
def preprocess(image):
    image = cv2.medianBlur(image, 3)
    return 255 - image

# Again, this step is fully optional and you can even keep
# the body empty. I just did some opening. The algorithm is
# pretty robust, so this stuff won't affect much.
def postprocess(image):
    kernel = np.ones((3,3), np.uint8)
    image = cv2.morphologyEx(image, cv2.MORPH_OPEN, kernel)
    return image

# Just a helper function that generates box coordinates
def get_block_index(image_shape, yx, block_size): 
    y = np.arange(max(0, yx[0]-block_size), min(image_shape[0], yx[0]+block_size))
    x = np.arange(max(0, yx[1]-block_size), min(image_shape[1], yx[1]+block_size))
    return tuple(np.meshgrid(y, x))  # tuple, so NumPy treats it as a (row, col) index

# Here is where the trick begins. We perform binarization from the 
# median value locally (the img_in is actually a slice of the image). 
# Here, following assumptions are held:
#   1.  The majority of pixels in the slice is background
#   2.  The median value of the intensity histogram probably
#       belongs to the background. We allow a soft margin DELTA
#       to account for any irregularities.
#   3.  We need to keep everything other than the background.
#
# We also do simple morphological operations here. It was just
# something that I empirically found to be "useful", but I assume
# this is pretty robust across different datasets.
def adaptive_median_threshold(img_in):
    med = np.median(img_in)
    img_out = np.zeros_like(img_in)
    img_out[img_in - med < DELTA] = 255
    kernel = np.ones((3,3),np.uint8)
    img_out = 255 - cv2.dilate(255 - img_out,kernel,iterations = 2)
    return img_out

# This function just divides the image into local regions (blocks),
# and applies the `adaptive_median_threshold(...)` function to each
# of the regions.
def block_image_process(image, block_size):
    out_image = np.zeros_like(image)
    for row in range(0, image.shape[0], block_size):
        for col in range(0, image.shape[1], block_size):
            idx = (row, col)
            block_idx = get_block_index(image.shape, idx, block_size)
            out_image[block_idx] = adaptive_median_threshold(image[block_idx])
    return out_image

# This function invokes the whole pipeline of Step 2.
def process_image(img):
    image_in = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    image_in = preprocess(image_in)
    image_out = block_image_process(image_in, BLOCK_SIZE)
    image_out = postprocess(image_out)
    return image_out

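The results are nice blobs like this, closely following the ink trace:

Step 3: The "soft" part of binarization

Having blobs that cover the symbols and a little bit more, we can finally do the whitening.

If we look closely at photos of paper sheets with text (especially handwritten ones), the transition from "background" (white paper) to "foreground" (dark ink) is not sharp but very gradual. The other binarization-based answers in this thread propose simple thresholding (even if it is locally adaptive, it is still a threshold), which works fine for printed text but gives not-so-pretty results for handwriting.

So the motivation of this section is to preserve that gradual transition from black to white, just as in natural photos of paper sheets with natural ink. The final purpose is to make the result printable.

The main idea is simple: the more a pixel value (after the thresholding above) differs from the local minimum, the more likely it belongs to the background. We can express this with a family of sigmoid functions, rescaled to the range of the local block (so that the function is scaled adaptively across the image).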

# This is the function used for composing
def sigmoid(x, orig, rad):
    k = np.exp((x - orig) * 5 / rad)
    return k / (k + 1.)

# Here, we combine the local blocks. A bit lengthy, so please
# follow the local comments.
def combine_block(img_in, mask):
    # First, we pre-fill the masked region of img_out to white
    # (i.e. background). The mask is retrieved from previous section.
    img_out = np.zeros_like(img_in)
    img_out[mask == 255] = 255
    fimg_in = img_in.astype(np.float32)

    # Then, we store the foreground (letters written with ink)
    # in the `idx` array. If there are none (i.e. just background),
    # we move on to the next block.
    idx = np.where(mask == 0)
    if idx[0].shape[0] == 0:
        img_out[idx] = img_in[idx]
        return img_out

    # We find the intensity range of our pixels in this local part
    # and clip the image block to that range, locally.
    lo = fimg_in[idx].min()
    hi = fimg_in[idx].max()
    v = fimg_in[idx] - lo
    r = hi - lo

    # Now we use good old OTSU binarization to get a rough estimation
    # of foreground and background regions.
    img_in_idx = img_in[idx]
    ret3,th3 = cv2.threshold(img_in[idx],0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)

    # Then we normalize the stuffs and apply sigmoid to gradually
    # combine the stuffs.
    bound_value = np.min(img_in_idx[th3[:, 0] == 255])
    bound_value = (bound_value - lo) / (r + 1e-5)
    f = (v / (r + 1e-5))
    f = sigmoid(f, bound_value + 0.05, 0.2)

    # Finally, we re-normalize the result to the range [0..255]
    img_out[idx] = (255. * f).astype(np.uint8)
    return img_out

# We do the combination routine on local blocks, so that the scaling
# parameters of Sigmoid function can be adjusted to local setting
def combine_block_image_process(image, mask, block_size):
    out_image = np.zeros_like(image)
    for row in range(0, image.shape[0], block_size):
        for col in range(0, image.shape[1], block_size):
            idx = (row, col)
            block_idx = get_block_index(image.shape, idx, block_size)
            out_image[block_idx] = combine_block(
                image[block_idx], mask[block_idx])
    return out_image

# Postprocessing (should be robust even without it, but I recommend
# you to play around a bit and find what works best for your data.
# I just left it blank.
def combine_postprocess(image):
    return image

# The main function of this section. Executes the whole pipeline.
def combine_process(img, mask):
    image_in = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    image_out = combine_block_image_process(image_in, mask, 20)
    image_out = combine_postprocess(image_out)
    return image_out

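Some stuff is commented out since it is optional. The combine_process function takes the mask from the previous step and executes the whole composition pipeline. You can toy with these functions for your specific data (images). The results are neat:

I'll probably add more comments and explanations to the code in this answer, and upload the whole thing (together with the cropping and warping code) to GitHub.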
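
To see why the sigmoid gives a "soft" result, the sketch below evaluates the same functional form on normalized intensities and compares it with a plain threshold. The centre value 0.5 is a hypothetical boundary chosen for the illustration; in the pipeline above it is derived per block from the Otsu result.

```python
import numpy as np

def sigmoid(x, orig, rad):
    # Same functional form as in the answer: a logistic curve centred at
    # `orig`, whose transition width is controlled by `rad`.
    k = np.exp((x - orig) * 5 / rad)
    return k / (k + 1.0)

# Normalized intensities in a block: 0 = darkest ink, 1 = brightest pixel
f = np.linspace(0.0, 1.0, 11)
center = 0.5  # hypothetical boundary between ink and paper
soft = sigmoid(f, center, 0.2)
hard = (f > center).astype(np.float64)  # plain threshold, for comparison

print(np.round(soft, 3))
print(hard)
```

The hard threshold jumps from 0 to 1 at the boundary, whereas the sigmoid keeps intermediate gray levels around it, which preserves the gradual edge of ink strokes.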

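【Comments】:

Your combine procedure is simple but really smart. It removes a lot of inconvenience when dealing with natural text images. Thanks for sharing this nice method!

However, it is a binarization, so the output won't keep color gradients (e.g. imagine there is a photograph on the scanned sheet of paper!), so it's not really what is asked in this topic. But once again, it's interesting in its own right, so thanks for sharing!

Or, @FalconUA, would you have a modified version of your algorithm that still keeps the colors (and just finds the best brightness/contrast balance, see more details in my question)?

Put the processed picture and the original picture together, and restore the color of the pixels that are black.

Do you have a way to cut the image to fit closely to the page, as you described in step 0?

【Answer 5】: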

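First we separate the text from the color markings. This can be done in a color space with a color saturation channel. I used instead a very simple method inspired by this paper: the ratio min(R,G,B)/max(R,G,B) is near 1 for (light) gray areas and noticeably below 1 for colored areas.

The grayscale text image is thresholded locally to produce a black and white image. You can choose your favorite technique from this comparison or that survey. I chose the NICK technique, which copes well with low contrast and is rather robust, i.e. a choice of the parameter k between about -0.3 and -0.1 works well for a very wide range of conditions, which is good for automatic processing. For the sample document provided, the chosen technique does not play a big role, since the document is relatively uniformly lit, but to cope with non-uniformly lit images it should be a local thresholding technique.

In the final step, the color areas are added back to the binarized text image.

So this solution is very similar to @fmw42's solution (all credit for the idea goes to him), except for the different color detection and binarization methods.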

import cv2  # cv2.ximgproc requires the opencv-contrib build
import numpy as np

image = cv2.imread('mY7ep.jpg')

# make mask and inverted mask for colored areas
b,g,r = cv2.split(cv2.blur(image,(5,5)))
np.seterr(divide='ignore', invalid='ignore') # 0/0 --> 0
m = (np.fmin(np.fmin(b, g), r) / np.fmax(np.fmax(b, g), r)) * 255
_,mask_inv = cv2.threshold(np.uint8(m), 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
mask = cv2.bitwise_not(mask_inv)

# local thresholding of grayscale image
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
text = cv2.ximgproc.niBlackThreshold(gray, 255, cv2.THRESH_BINARY, 41, -0.1, binarizationMethod=cv2.ximgproc.BINARIZATION_NICK)

# create background (text) and foreground (color markings)
bg = cv2.bitwise_and(text, text, mask = mask_inv)
fg = cv2.bitwise_and(image, image, mask = mask)

out = cv2.add(cv2.cvtColor(bg, cv2.COLOR_GRAY2BGR), fg)
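
The min/max ratio used for the color mask can be checked on a few hand-picked pixels. This is a NumPy-only sketch; the three BGR values are invented, and 0/0 is explicitly defined as 0 here instead of relying on the behaviour of the uint8 cast.

```python
import numpy as np

def grayness_ratio(bgr):
    """min(B, G, R) / max(B, G, R) per pixel, with 0/0 defined as 0."""
    lo = bgr.min(axis=-1).astype(np.float64)
    hi = bgr.max(axis=-1).astype(np.float64)
    return np.divide(lo, hi, out=np.zeros_like(lo), where=hi > 0)

# Three sample pixels in BGR order: light gray paper, red marker, pure black
pixels = np.array([[[200, 205, 210], [40, 40, 220], [0, 0, 0]]], dtype=np.uint8)
ratio = grayness_ratio(pixels)
print(np.round(ratio, 2))
```

The gray paper pixel gets a ratio close to 1 and the saturated marker pixel a ratio well below 1, so a threshold on this ratio separates colored markings from paper and gray text.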

If you don't need the color markings, you can simply binarize the grayscale image:

image = cv2.imread('mY7ep.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
at_bs = 41  # block size; undefined in the original snippet, 41 is assumed from the example above
text = cv2.ximgproc.niBlackThreshold(gray, 255, cv2.THRESH_BINARY, at_bs, -0.3, binarizationMethod=cv2.ximgproc.BINARIZATION_NICK)
