Rotate image and crop out black borders
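[Title]: Rotate image and crop out black borders [Posted]: 2013-05-18 03:58:54 [Question]: My application: I am trying to rotate an image (using OpenCV and Python).
At the moment I have developed the code below, which rotates an input image and pads it with black borders, giving me A. What I want is B: the largest possible crop window within the rotated image. I call this the axis-aligned boundED box.
This is essentially the same as Rotate and crop, but I cannot get the answer to that question to work. Additionally, that answer is apparently only valid for square images. My images are rectangular.
Code to give A: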
import cv2
import numpy as np


def getTranslationMatrix2d(dx, dy):
    """
    Returns a numpy affine transformation matrix for a 2D translation of
    (dx, dy)
    """
    return np.matrix([[1, 0, dx], [0, 1, dy], [0, 0, 1]])


def rotateImage(image, angle):
    """
    Rotates the given image about its centre (angle in degrees)
    """
    # NumPy shape is (rows, cols), so width is shape[1] and height is shape[0]
    image_size = (image.shape[1], image.shape[0])
    # Plain Python floats keep cv2.getRotationMatrix2D happy across versions
    image_center = (image_size[0] / 2.0, image_size[1] / 2.0)

    rot_mat = np.vstack([cv2.getRotationMatrix2D(image_center, angle, 1.0), [0, 0, 1]])
    trans_mat = np.identity(3)

    w2 = image_size[0] * 0.5
    h2 = image_size[1] * 0.5

    rot_mat_notranslate = np.matrix(rot_mat[0:2, 0:2])

    # Rotated positions of the four image corners, relative to the centre
    tl = (np.array([-w2, h2]) * rot_mat_notranslate).A[0]
    tr = (np.array([w2, h2]) * rot_mat_notranslate).A[0]
    bl = (np.array([-w2, -h2]) * rot_mat_notranslate).A[0]
    br = (np.array([w2, -h2]) * rot_mat_notranslate).A[0]

    x_coords = [pt[0] for pt in [tl, tr, bl, br]]
    x_pos = [x for x in x_coords if x > 0]
    x_neg = [x for x in x_coords if x < 0]

    y_coords = [pt[1] for pt in [tl, tr, bl, br]]
    y_pos = [y for y in y_coords if y > 0]
    y_neg = [y for y in y_coords if y < 0]

    right_bound = max(x_pos)
    left_bound = min(x_neg)
    top_bound = max(y_pos)
    bot_bound = min(y_neg)

    # Size of the bounding box that holds the whole rotated image
    new_w = int(abs(right_bound - left_bound))
    new_h = int(abs(top_bound - bot_bound))
    new_image_size = (new_w, new_h)

    new_midx = new_w * 0.5
    new_midy = new_h * 0.5

    dx = int(new_midx - w2)
    dy = int(new_midy - h2)

    # Translate so that the rotated image stays centred in the new canvas
    trans_mat = getTranslationMatrix2d(dx, dy)
    affine_mat = (np.matrix(trans_mat) * np.matrix(rot_mat))[0:2, :]
    result = cv2.warpAffine(image, affine_mat, new_image_size, flags=cv2.INTER_LINEAR)

    return result
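[Comments on the question]:
As far as I can tell, this is essentially a nonlinear optimization problem (search over all AABB rectangles contained in the rotated image and find the one with the largest area). I can't seem to figure out the logic required to solve it.
Here is a link to an algorithm by someone who solved the same problem: roffle-largest-rectangle.blogspot.com/2011/09/… It is Java code and I have not checked its logic, but it may help you get started.
There is also this: ***.com/questions/5789239/…
Hey! Absolutely brilliant - both of those links look perfect.
As a side note: step A can be done with less code, see pyimagesearch.com/2017/01/02/… (implemented in the function rotate_bound() of imutils).
[Answer 1]:
The math behind this solution/implementation is equivalent to this solution of an analogous question, but the formulas are simplified and avoid singularities. This is Python code with the same interface as largest_rotated_rect from the other solution, but it gives a bigger area in almost all cases (always the proven optimum):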
import math


def rotatedRectWithMaxArea(w, h, angle):
    """
    Given a rectangle of size wxh that has been rotated by 'angle' (in
    radians), computes the width and height of the largest possible
    axis-aligned rectangle (maximal area) within the rotated rectangle.
    """
    if w <= 0 or h <= 0:
        return 0, 0

    width_is_longer = w >= h
    side_long, side_short = (w, h) if width_is_longer else (h, w)

    # since the solutions for angle, -angle and 180-angle are all the same,
    # it suffices to look at the first quadrant and the absolute values of sin,cos:
    sin_a, cos_a = abs(math.sin(angle)), abs(math.cos(angle))
    if side_short <= 2.*sin_a*cos_a*side_long or abs(sin_a-cos_a) < 1e-10:
        # half constrained case: two crop corners touch the longer side,
        # the other two corners are on the mid-line parallel to the longer line
        x = 0.5*side_short
        wr, hr = (x/sin_a, x/cos_a) if width_is_longer else (x/cos_a, x/sin_a)
    else:
        # fully constrained case: crop touches all 4 sides
        cos_2a = cos_a*cos_a - sin_a*sin_a
        wr, hr = (w*cos_a - h*sin_a)/cos_2a, (h*cos_a - w*sin_a)/cos_2a

    return wr, hr
Here is a comparison of the function with the other solution:
>>> wl,hl = largest_rotated_rect(1500,500,math.radians(20))
>>> print (wl,hl),', area=',wl*hl
(828.2888697391496, 230.61639227890998) , area= 191016.990904
>>> wm,hm = rotatedRectWithMaxArea(1500,500,math.radians(20))
>>> print (wm,hm),', area=',wm*hm
(730.9511000407718, 266.044443118978) , area= 194465.478358
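The bounding box of an image (width w, height h) rotated by an angle in [0, pi/2) has the following dimensions:
width: w_bb = w*cos_a + h*sin_a
height: h_bb = w*sin_a + h*cos_a
If w_r, h_r are the computed optimal width and height of the cropped image, then the insets from the bounding box are:
horizontal direction: (w_bb-w_r)/2
vertical direction: (h_bb-h_r)/2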
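To turn these insets into crop coordinates, something along the following lines may work. This is only a sketch: the helper name crop_box is hypothetical, the angle is assumed to be in radians, and w_r, h_r are assumed to come from a function such as rotatedRectWithMaxArea.

```python
import math


def crop_box(w, h, angle, w_r, h_r):
    """Return (x0, y0, x1, y1) of the crop inside the rotated bounding box.

    w, h:      size of the original (unrotated) image
    angle:     rotation angle in radians
    w_r, h_r:  size of the crop rectangle
    """
    # Absolute values reduce any angle to the first quadrant.
    sin_a, cos_a = abs(math.sin(angle)), abs(math.cos(angle))
    # Size of the bounding box of the rotated image.
    w_bb = w * cos_a + h * sin_a
    h_bb = w * sin_a + h * cos_a
    # The crop is centred, so the inset is half of the leftover on each axis.
    dx = (w_bb - w_r) / 2
    dy = (h_bb - h_r) / 2
    return dx, dy, dx + w_r, dy + h_r
```

The values are floats; they would need to be rounded (inwards, to be safe) before being used as array indices.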
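Proof:
Finding the axis-aligned rectangle with maximal area between two parallel lines is an optimization problem with one parameter, e.g. x as in the following figure:
[figure]
Let s denote the distance between the two parallel lines (it will turn out to be the shorter side of the rotated rectangle). Then the sides a, b of the sought rectangle have a constant ratio with x and s-x respectively, namely x = a*sin α and (s-x) = b*cos α:
[figure]
So maximizing the area a*b means maximizing x*(s-x). By the "theorem of height" for right-angled triangles we know x*(s-x) = p*q = h*h. Hence the maximal area is reached at x = s-x = s/2, i.e. the two corners E, G between the parallel lines lie on the mid-line:
[figure]
This solution is only valid if this maximal rectangle fits into the rotated rectangle. Therefore the diagonal EG must not be longer than the other side l of the rotated rectangle. Since
EG = AF + DH = s/2*(cot α + tan α) = s/(2*sin α*cos α) = s/sin 2α
we have the condition s ≤ l*sin 2α, where s and l are the shorter and the longer side of the rotated rectangle.
In the case s > l*sin 2α, the parameter x must be smaller (than s/2) and such that every corner of the sought rectangle lies on a side of the rotated rectangle. This leads to the equation
x*cot α + (s-x)*tan α = l
which gives x = sin α*(l*cos α - s*sin α)/cos 2α. From a = x/sin α and b = (s-x)/cos α we obtain the formulas used above.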
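The two cases of the proof can also be written down directly in terms of x. The following is a minimal sketch, not the author's code: the function name max_area_sides is made up for illustration, α is assumed to lie strictly between 0 and pi/2, s ≤ l is assumed, and there is no guard against the singularity of the second case near α = pi/4.

```python
import math


def max_area_sides(s, l, alpha):
    """Sides (a, b) of the largest axis-aligned rectangle, following the proof.

    s, l:   short and long side of the rotated rectangle
    alpha:  rotation angle in radians, 0 < alpha < pi/2
    a is the side tied to x (x = a*sin(alpha)), b the side tied to s - x.
    """
    sin_a, cos_a = math.sin(alpha), math.cos(alpha)
    if s <= l * math.sin(2 * alpha):
        # The unconstrained optimum x = s/2 fits into the rotated rectangle.
        x = s / 2
    else:
        # All four corners touch a side: solve x*cot(a) + (s-x)*tan(a) = l for x.
        x = sin_a * (l * cos_a - s * sin_a) / math.cos(2 * alpha)
    return x / sin_a, (s - x) / cos_a


a, b = max_area_sides(500, 1500, math.radians(20))
print(a, b, a * b)
```

For s = 500, l = 1500 and 20 degrees the first case applies, and the result agrees with the dimensions reported for rotatedRectWithMaxArea(1500, 500, math.radians(20)) in the comparison above.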
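[Discussion]:
+1 - I have tested your solution against an optimization routine (maximizing the area); yours is consistently faster and more precise, and so far it has given the same results...
@Saullo Math does magic sometimes! :-) Thanks for sharing your findings. I have now added a (mostly graphical) derivation of the formulas.
Thanks! Your answer is the best and helped me a lot!
So how do I get the coordinates of this box?
@TokeFaurby Good question! I have added the distances of the crop area's borders from the bounding box (just before the Proof section).
[Answer 2]:
So, after investigating many claimed solutions, I have finally found a method that works: the answer by Andri and Magnus Hoff on Calculate largest rectangle in a rotated rectangle.
The Python code below contains the method of interest, largest_rotated_rect, and a short demo.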
import math
import cv2
import numpy as np


def rotate_image(image, angle):
    """
    Rotates an OpenCV 2 / NumPy image about its centre by the given angle
    (in degrees). The returned image will be large enough to hold the entire
    new image, with a black background
    """

    # Get the image size
    # No that's not an error - NumPy stores image matrices backwards
    image_size = (image.shape[1], image.shape[0])
    # Plain Python floats keep cv2.getRotationMatrix2D happy across versions
    image_center = (image_size[0] / 2.0, image_size[1] / 2.0)

    # Convert the OpenCV 2x3 rotation matrix to 3x3
    rot_mat = np.vstack(
        [cv2.getRotationMatrix2D(image_center, angle, 1.0), [0, 0, 1]]
    )

    rot_mat_notranslate = np.matrix(rot_mat[0:2, 0:2])

    # Shorthand for below calcs
    image_w2 = image_size[0] * 0.5
    image_h2 = image_size[1] * 0.5

    # Obtain the rotated coordinates of the image corners
    rotated_coords = [
        (np.array([-image_w2,  image_h2]) * rot_mat_notranslate).A[0],
        (np.array([ image_w2,  image_h2]) * rot_mat_notranslate).A[0],
        (np.array([-image_w2, -image_h2]) * rot_mat_notranslate).A[0],
        (np.array([ image_w2, -image_h2]) * rot_mat_notranslate).A[0]
    ]

    # Find the size of the new image
    x_coords = [pt[0] for pt in rotated_coords]
    x_pos = [x for x in x_coords if x > 0]
    x_neg = [x for x in x_coords if x < 0]

    y_coords = [pt[1] for pt in rotated_coords]
    y_pos = [y for y in y_coords if y > 0]
    y_neg = [y for y in y_coords if y < 0]

    right_bound = max(x_pos)
    left_bound = min(x_neg)
    top_bound = max(y_pos)
    bot_bound = min(y_neg)

    new_w = int(abs(right_bound - left_bound))
    new_h = int(abs(top_bound - bot_bound))

    # We require a translation matrix to keep the image centred
    trans_mat = np.matrix([
        [1, 0, int(new_w * 0.5 - image_w2)],
        [0, 1, int(new_h * 0.5 - image_h2)],
        [0, 0, 1]
    ])

    # Compute the transform for the combined rotation and translation
    affine_mat = (np.matrix(trans_mat) * np.matrix(rot_mat))[0:2, :]

    # Apply the transform
    result = cv2.warpAffine(
        image,
        affine_mat,
        (new_w, new_h),
        flags=cv2.INTER_LINEAR
    )

    return result


def largest_rotated_rect(w, h, angle):
    """
    Given a rectangle of size wxh that has been rotated by 'angle' (in
    radians), computes the width and height of the largest possible
    axis-aligned rectangle within the rotated rectangle.

    Original JS code by 'Andri' and Magnus Hoff from Stack Overflow

    Converted to Python by Aaron Snoswell
    """

    quadrant = int(math.floor(angle / (math.pi / 2))) & 3
    sign_alpha = angle if ((quadrant & 1) == 0) else math.pi - angle
    alpha = (sign_alpha % math.pi + math.pi) % math.pi

    bb_w = w * math.cos(alpha) + h * math.sin(alpha)
    bb_h = w * math.sin(alpha) + h * math.cos(alpha)

    # NOTE: both branches are identical as posted, so gamma is always pi/4
    # for positive bb_w (this is the typo pointed out in the discussion)
    gamma = math.atan2(bb_w, bb_w) if (w < h) else math.atan2(bb_w, bb_w)

    delta = math.pi - alpha - gamma

    length = h if (w < h) else w

    d = length * math.cos(alpha)
    a = d * math.sin(alpha) / math.sin(delta)

    y = a * math.cos(gamma)
    x = y * math.tan(gamma)

    return (
        bb_w - 2 * x,
        bb_h - 2 * y
    )


def crop_around_center(image, width, height):
    """
    Given a NumPy / OpenCV 2 image, crops it to the given width and height,
    around its centre point
    """

    image_size = (image.shape[1], image.shape[0])
    image_center = (int(image_size[0] * 0.5), int(image_size[1] * 0.5))

    if width > image_size[0]:
        width = image_size[0]

    if height > image_size[1]:
        height = image_size[1]

    x1 = int(image_center[0] - width * 0.5)
    x2 = int(image_center[0] + width * 0.5)
    y1 = int(image_center[1] - height * 0.5)
    y2 = int(image_center[1] + height * 0.5)

    return image[y1:y2, x1:x2]


def demo():
    """
    Demos the largest_rotated_rect function
    """

    image = cv2.imread("lenna_rectangle.png")
    image_height, image_width = image.shape[0:2]

    cv2.imshow("Original Image", image)

    print("Press [enter] to begin the demo")
    print("Press [q] or Escape to quit")

    key = cv2.waitKey(0)
    if key == ord("q") or key == 27:
        exit()

    for i in np.arange(0, 360, 0.5):
        image_orig = np.copy(image)
        image_rotated = rotate_image(image, i)
        image_rotated_cropped = crop_around_center(
            image_rotated,
            *largest_rotated_rect(
                image_width,
                image_height,
                math.radians(i)
            )
        )

        key = cv2.waitKey(2)
        if key == ord("q") or key == 27:
            exit()

        cv2.imshow("Original Image", image_orig)
        cv2.imshow("Rotated Image", image_rotated)
        cv2.imshow("Cropped Image", image_rotated_cropped)

    print("Done")


if __name__ == "__main__":
    demo()
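Simply place this image (cropped to demonstrate that it works with non-square images) in the same directory as the file above, then run it.
[Discussion]:
The function largest_rotated_rect gives the dimensions of a rectangle that cannot be extended, i.e. no axis-parallel rectangle that is larger in both dimensions fits into the rotated rectangle. But except for a few special cases, it does not return the dimensions of the largest (maximal-area) rectangle that fits. See my solution for the true optimum.
You can use cv::RotatedRect(center,ImageSize,angle).boundingRect() to find the size of the image after rotation.
rotate_image actually takes the angle in degrees, not radians, because cv2.getRotationMatrix2D takes the angle in degrees, not radians: docs.opencv.org/2.4/modules/imgproc/doc/…
I think you have a typo. The result for gamma in your largest_rotated_rect method will always be the same.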
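The point about gamma can be checked directly. A minimal sketch, assuming bb_w is positive (as it is for any non-degenerate image); the helper gamma_as_written is hypothetical and only copies the expression from largest_rotated_rect:

```python
import math


def gamma_as_written(w, h, bb_w):
    # Same expression as in largest_rotated_rect: the two branches are
    # identical, and atan2(v, v) is pi/4 for every positive v.
    return math.atan2(bb_w, bb_w) if (w < h) else math.atan2(bb_w, bb_w)


for w, h, bb_w in [(1500, 500, 1580.55), (500, 1500, 982.88), (10, 10, 1.0)]:
    gamma = gamma_as_written(w, h, bb_w)
    print(w, h, gamma, math.isclose(gamma, math.pi / 4))
```

So, as posted, the function behaves as if gamma were the constant pi/4, independent of the image size and the rotation angle.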
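Hi. Can you tell me how to get the rotated image with a white background instead of black?
[Answer 3]:
Congratulations on the great work! I wanted to use the code in OpenCV with the C++ library, so I made the conversion below. Maybe this approach will be helpful to others.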
#include <algorithm>
#include <cmath>
#include <cstdlib>
#include <iostream>
#include <vector>
#include <opencv2/opencv.hpp>

#define PI 3.14159265359

using namespace std;

double degree_to_radian(double angle)
{
    return angle * PI / 180;
}

cv::Mat rotate_image(cv::Mat image, double angle)
{
    // Rotates an OpenCV 2 image about its centre by the given angle
    // (in degrees, as cv::getRotationMatrix2D expects). The returned image
    // will be large enough to hold the entire new image, with a black background

    // cv::Size is (width, height), i.e. (cols, rows)
    cv::Size image_size = cv::Size(image.cols, image.rows);
    cv::Point2f image_center = cv::Point2f(image_size.width / 2.0f, image_size.height / 2.0f);

    // Convert the OpenCV 2x3 rotation matrix to 3x3
    cv::Mat rot_mat = cv::getRotationMatrix2D(image_center, angle, 1.0);
    double row[3] = {0.0, 0.0, 1.0};
    cv::Mat new_row = cv::Mat(1, 3, rot_mat.type(), row);
    rot_mat.push_back(new_row);

    // Rotation part only (upper-left 2x2 block)
    double slice_mat[2][2] = {
        {rot_mat.at<double>(0, 0), rot_mat.at<double>(0, 1)},
        {rot_mat.at<double>(1, 0), rot_mat.at<double>(1, 1)}
    };
    cv::Mat rot_mat_nontranslate = cv::Mat(2, 2, rot_mat.type(), slice_mat);

    double image_w2 = image_size.width * 0.5;
    double image_h2 = image_size.height * 0.5;

    // Obtain the rotated coordinates of the image corners
    std::vector<cv::Mat> rotated_coords;

    double image_dim_d_1[2] = {-image_w2, image_h2};
    cv::Mat image_dim = cv::Mat(1, 2, rot_mat.type(), image_dim_d_1);
    rotated_coords.push_back(cv::Mat(image_dim * rot_mat_nontranslate));

    double image_dim_d_2[2] = {image_w2, image_h2};
    image_dim = cv::Mat(1, 2, rot_mat.type(), image_dim_d_2);
    rotated_coords.push_back(cv::Mat(image_dim * rot_mat_nontranslate));

    double image_dim_d_3[2] = {-image_w2, -image_h2};
    image_dim = cv::Mat(1, 2, rot_mat.type(), image_dim_d_3);
    rotated_coords.push_back(cv::Mat(image_dim * rot_mat_nontranslate));

    double image_dim_d_4[2] = {image_w2, -image_h2};
    image_dim = cv::Mat(1, 2, rot_mat.type(), image_dim_d_4);
    rotated_coords.push_back(cv::Mat(image_dim * rot_mat_nontranslate));

    // Find the size of the new image
    vector<double> x_coords, x_pos, x_neg;
    for (size_t i = 0; i < rotated_coords.size(); i++)
    {
        double pt = rotated_coords[i].col(0).at<double>(0);
        x_coords.push_back(pt);
        if (pt > 0)
            x_pos.push_back(pt);
        else
            x_neg.push_back(pt);
    }

    vector<double> y_coords, y_pos, y_neg;
    for (size_t i = 0; i < rotated_coords.size(); i++)
    {
        double pt = rotated_coords[i].col(1).at<double>(0);
        y_coords.push_back(pt);
        if (pt > 0)
            y_pos.push_back(pt);
        else
            y_neg.push_back(pt);
    }

    double right_bound = *max_element(x_pos.begin(), x_pos.end());
    double left_bound = *min_element(x_neg.begin(), x_neg.end());
    double top_bound = *max_element(y_pos.begin(), y_pos.end());
    double bottom_bound = *min_element(y_neg.begin(), y_neg.end());

    int new_w = int(std::fabs(right_bound - left_bound));
    int new_h = int(std::fabs(top_bound - bottom_bound));

    // We require a translation matrix to keep the image centred
    double trans_mat[3][3] = {
        {1, 0, double(int(new_w * 0.5 - image_w2))},
        {0, 1, double(int(new_h * 0.5 - image_h2))},
        {0, 0, 1}
    };

    // Compute the transform for the combined rotation and translation
    cv::Mat aux_affine_mat = (cv::Mat(3, 3, rot_mat.type(), trans_mat) * rot_mat);
    // warpAffine expects a 2x3 matrix: keep the first two rows
    cv::Mat affine_mat = aux_affine_mat.rowRange(0, 2);

    // Apply the transform
    cv::Mat output;
    cv::warpAffine(image, output, affine_mat, cv::Size(new_w, new_h), cv::INTER_LINEAR);

    return output;
}

cv::Size largest_rotated_rect(int h, int w, double angle)
{
    // Given a rectangle of size wxh that has been rotated by 'angle' (in
    // radians), computes the width and height of the largest possible
    // axis-aligned rectangle within the rotated rectangle.

    // Original JS code by 'Andri' and Magnus Hoff from Stack Overflow

    // Converted to Python by Aaron Snoswell (https://***.com/questions/16702966/rotate-image-and-crop-out-black-borders)
    // Converted to C++ by Eliezer Bernart

    int quadrant = int(floor(angle/(PI/2))) & 3;
    double sign_alpha = ((quadrant & 1) == 0) ? angle : PI - angle;
    double alpha = fmod((fmod(sign_alpha, PI) + PI), PI);

    double bb_w = w * cos(alpha) + h * sin(alpha);
    double bb_h = w * sin(alpha) + h * cos(alpha);

    // NOTE: atan2(v, v) is pi/4 for any positive v, so both branches give pi/4
    double gamma = w < h ? atan2(bb_w, bb_w) : atan2(bb_h, bb_h);

    double delta = PI - alpha - gamma;

    int length = w < h ? h : w;

    double d = length * cos(alpha);
    double a = d * sin(alpha) / sin(delta);
    double y = a * cos(gamma);
    double x = y * tan(gamma);

    return cv::Size(int(bb_w - 2 * x), int(bb_h - 2 * y));
}

// for those interested in the actual optimum - contributed by coproc
cv::Size really_largest_rotated_rect(int h, int w, double angle)
{
    // Given a rectangle of size wxh that has been rotated by 'angle' (in
    // radians), computes the width and height of the largest possible
    // axis-aligned rectangle within the rotated rectangle.
    if (w <= 0 || h <= 0)
        return cv::Size(0, 0);

    bool width_is_longer = w >= h;
    int side_long = w, side_short = h;
    if (!width_is_longer)
        std::swap(side_long, side_short);

    // since the solutions for angle, -angle and pi-angle are all the same,
    // it suffices to look at the first quadrant and the absolute values of sin,cos:
    double sin_a = fabs(sin(angle)), cos_a = fabs(cos(angle));
    double wr, hr;
    // the second test keeps the else branch away from its singularity at pi/4
    if (side_short <= 2.*sin_a*cos_a*side_long || fabs(sin_a - cos_a) < 1e-10)
    {
        // half constrained case: two crop corners touch the longer side,
        // the other two corners are on the mid-line parallel to the longer line
        double x = 0.5*side_short;
        wr = x/sin_a;
        hr = x/cos_a;
        if (!width_is_longer)
            std::swap(wr, hr);
    }
    else
    {
        // fully constrained case: crop touches all 4 sides
        double cos_2a = cos_a*cos_a - sin_a*sin_a;
        wr = (w*cos_a - h*sin_a)/cos_2a;
        hr = (h*cos_a - w*sin_a)/cos_2a;
    }

    return cv::Size(int(wr), int(hr));
}

cv::Mat crop_around_center(cv::Mat image, int height, int width)
{
    // Given an OpenCV 2 image, crops it to the given width and height,
    // around its centre point

    // cv::Size is (width, height), i.e. (cols, rows)
    cv::Size image_size = cv::Size(image.cols, image.rows);
    cv::Point image_center = cv::Point(int(image_size.width * 0.5), int(image_size.height * 0.5));

    if (width > image_size.width)
        width = image_size.width;

    if (height > image_size.height)
        height = image_size.height;

    int x1 = int(image_center.x - width * 0.5);
    int x2 = int(image_center.x + width * 0.5);
    int y1 = int(image_center.y - height * 0.5);
    int y2 = int(image_center.y + height * 0.5);

    return image(cv::Rect(cv::Point(x1, y1), cv::Point(x2, y2)));
}

void demo(cv::Mat image)
{
    // Demos the largest_rotated_rect function
    int image_height = image.rows;
    int image_width = image.cols;

    for (float i = 0.0; i < 360.0; i += 0.5)
    {
        cv::Mat image_orig = image.clone();
        cv::Mat image_rotated = rotate_image(image, i);

        cv::Size largest_rect = largest_rotated_rect(image_height, image_width, degree_to_radian(i));
        // for those who trust math (added by coproc):
        cv::Size largest_rect2 = really_largest_rotated_rect(image_height, image_width, degree_to_radian(i));
        cout << "area1 = " << largest_rect.height * largest_rect.width << endl;
        cout << "area2 = " << largest_rect2.height * largest_rect2.width << endl;

        cv::Mat image_rotated_cropped = crop_around_center(
            image_rotated,
            largest_rect.height,
            largest_rect.width
        );

        cv::imshow("Original Image", image_orig);
        cv::imshow("Rotated Image", image_rotated);
        cv::imshow("Cropped image", image_rotated_cropped);

        if (char(cv::waitKey(15)) == 'q')
            break;
    }
}

int main(int argc, char* argv[])
{
    if (argc < 2)
    {
        cout << "> Usage: pass the path of the input image as first argument." << endl;
        exit(EXIT_FAILURE);
    }

    cv::Mat image = cv::imread(argv[1]);

    if (image.empty())
    {
        cout << "> The input image was not found." << endl;
        exit(EXIT_FAILURE);
    }

    cout << "Press [s] to begin or restart the demo" << endl;
    cout << "Press [q] to quit" << endl;

    while (true)
    {
        cv::imshow("Original Image", image);
        char opt = char(cv::waitKey(0));
        switch (opt)
        {
        case 's':
            demo(image);
            break;
        case 'q':
            return EXIT_SUCCESS;
        default:
            break;
        }
    }

    return EXIT_SUCCESS;
}
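[Discussion]:
I will give you a 50 rep bounty at the end of the week. Thanks so much for translating the code, mate. Awesome!
[Answer 4]: Rotate and crop in TensorFlow
I personally needed this function in TensorFlow, and thanks to Aaron Snoswell I was able to implement it.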
import math

import tensorflow as tf  # TensorFlow 1.x API (tf.contrib, tf.image.resize_images)


def _rotate_and_crop(image, output_height, output_width, rotation_degree, do_crop):
    """Rotate the given image with the given rotation degree and crop for the black edges if necessary
    Args:
        image: A `Tensor` representing an image of arbitrary size.
        output_height: The height of the image after preprocessing.
        output_width: The width of the image after preprocessing.
        rotation_degree: The degree of rotation on the image.
        do_crop: Do cropping if it is True.
    Returns:
        A rotated image.
    """

    # Rotate the given image with the given rotation degree
    if rotation_degree != 0:
        image = tf.contrib.image.rotate(image, math.radians(rotation_degree), interpolation='BILINEAR')

        # Center crop to omit black noise on the edges
        if do_crop:
            # _largest_rotated_rect expects (width, height, angle in radians)
            lrr_width, lrr_height = _largest_rotated_rect(output_width, output_height, math.radians(rotation_degree))
            resized_image = tf.image.central_crop(image, float(lrr_height)/output_height)
            image = tf.image.resize_images(resized_image, [output_height, output_width], method=tf.image.ResizeMethod.BILINEAR, align_corners=False)

    return image


def _largest_rotated_rect(w, h, angle):
    """
    Given a rectangle of size wxh that has been rotated by 'angle' (in
    radians), computes the width and height of the largest possible
    axis-aligned rectangle within the rotated rectangle.
    Original JS code by 'Andri' and Magnus Hoff from Stack Overflow
    Converted to Python by Aaron Snoswell
    Source: http://***.com/questions/16702966/rotate-image-and-crop-out-black-borders
    """

    quadrant = int(math.floor(angle / (math.pi / 2))) & 3
    sign_alpha = angle if ((quadrant & 1) == 0) else math.pi - angle
    alpha = (sign_alpha % math.pi + math.pi) % math.pi

    bb_w = w * math.cos(alpha) + h * math.sin(alpha)
    bb_h = w * math.sin(alpha) + h * math.cos(alpha)

    # NOTE: both branches are identical as posted, so gamma is always pi/4
    gamma = math.atan2(bb_w, bb_w) if (w < h) else math.atan2(bb_w, bb_w)

    delta = math.pi - alpha - gamma

    length = h if (w < h) else w

    d = length * math.cos(alpha)
    a = d * math.sin(alpha) / math.sin(delta)

    y = a * math.cos(gamma)
    x = y * math.tan(gamma)

    return (
        bb_w - 2 * x,
        bb_h - 2 * y
    )
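If you need further implementation examples and visualization in TensorFlow, you can use this repository. I hope this helps others.
[Discussion]:
This is gold! I can't believe there is actually a TensorFlow port now :P Thanks for sharing @ByungSoo-Ko!
[Answer 5]:
A small update for brevity that makes use of the excellent imutils library.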
import math

import cv2
import imutils
import numpy as np


def rotated_rect(w, h, angle):
    """
    Given a rectangle of size wxh that has been rotated by 'angle' (in
    degrees), computes the width and height of the largest possible
    axis-aligned rectangle within the rotated rectangle.

    Original JS code by 'Andri' and Magnus Hoff from Stack Overflow

    Converted to Python by Aaron Snoswell
    """
    angle = math.radians(angle)
    quadrant = int(math.floor(angle / (math.pi / 2))) & 3
    sign_alpha = angle if ((quadrant & 1) == 0) else math.pi - angle
    alpha = (sign_alpha % math.pi + math.pi) % math.pi

    bb_w = w * math.cos(alpha) + h * math.sin(alpha)
    bb_h = w * math.sin(alpha) + h * math.cos(alpha)

    # NOTE: both branches are identical as posted, so gamma is always pi/4
    gamma = math.atan2(bb_w, bb_w) if (w < h) else math.atan2(bb_w, bb_w)

    delta = math.pi - alpha - gamma

    length = h if (w < h) else w

    d = length * math.cos(alpha)
    a = d * math.sin(alpha) / math.sin(delta)

    y = a * math.cos(gamma)
    x = y * math.tan(gamma)

    return (bb_w - 2 * x, bb_h - 2 * y)


def crop(img, w, h):
    # centre crop of size w x h
    x, y = int(img.shape[1] * .5), int(img.shape[0] * .5)

    return img[
        int(np.ceil(y - h * .5)) : int(np.floor(y + h * .5)),
        # the horizontal range uses the width w on both ends
        int(np.ceil(x - w * .5)) : int(np.floor(x + w * .5))
    ]


def rotate(img, angle):
    # rotate, crop and return original size
    (h, w) = img.shape[:2]
    img = imutils.rotate_bound(img, angle)
    img = crop(img, *rotated_rect(w, h, angle))
    img = cv2.resize(img, (w, h), interpolation=cv2.INTER_AREA)
    return img
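[Discussion]:
Nice piece of code. I wanted to create a new solution using your code as a baseline (since it is the most recent version of the algorithm), but I am unable to find where exactly it lives. Could you point out which file of your GitHub project contains this function? Thanks in advance.
[Answer 6]:
Inspired by Coprox's amazing work, I wrote a function that, together with Coprox's code, forms a complete solution (so it can be used easily by copying and pasting). The rotate_max_area function below simply returns a rotated image without black borders.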
import math

import cv2
import numpy as np

# rotatedRectWithMaxArea is the function from Answer 1 and must be defined as well


def rotate_bound(image, angle):
    # CREDIT: https://www.pyimagesearch.com/2017/01/02/rotate-images-correctly-with-opencv-and-python/
    (h, w) = image.shape[:2]
    (cX, cY) = (w // 2, h // 2)
    M = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
    cos = np.abs(M[0, 0])
    sin = np.abs(M[0, 1])
    # size of the bounding box of the rotated image
    nW = int((h * sin) + (w * cos))
    nH = int((h * cos) + (w * sin))
    # shift the rotation centre to the centre of the new canvas
    M[0, 2] += (nW / 2) - cX
    M[1, 2] += (nH / 2) - cY
    return cv2.warpAffine(image, M, (nW, nH))


def rotate_max_area(image, angle):
    """ image: cv2 image matrix object
        angle: in degree
    """
    wr, hr = rotatedRectWithMaxArea(image.shape[1], image.shape[0],
                                    math.radians(angle))
    rotated = rotate_bound(image, angle)
    # shape[:2] also works for single-channel images
    h, w = rotated.shape[:2]
    y1 = h//2 - int(hr/2)
    y2 = y1 + int(hr)
    x1 = w//2 - int(wr/2)
    x2 = x1 + int(wr)
    return rotated[y1:y2, x1:x2]
[Discussion]:
[Answer 7]: Swift solution
Thanks to coproc for the great solution. Here is the code in Swift:
import Foundation
import CoreGraphics

// Given a rectangle of size.width x size.height that has been rotated by 'angle' (in
// radians), computes the width and height of the largest possible
// axis-aligned rectangle (maximal area) within the rotated rectangle.
func rotatedRectWithMaxArea(size: CGSize, angle: CGFloat) -> CGSize {
    let w = size.width
    let h = size.height

    if w <= 0 || h <= 0 {
        return CGSize.zero
    }

    let widthIsLonger = w >= h
    let (sideLong, sideShort) = widthIsLonger ? (w, h) : (h, w)

    // since the solutions for angle, -angle and 180-angle are all the same,
    // it suffices to look at the first quadrant and the absolute values of sin,cos:
    let (sinA, cosA) = (abs(sin(angle)), abs(cos(angle)))
    if sideShort <= 2*sinA*cosA*sideLong || abs(sinA-cosA) < 1e-10 {
        // half constrained case: two crop corners touch the longer side,
        // the other two corners are on the mid-line parallel to the longer line
        let x = 0.5*sideShort
        let (wr, hr) = widthIsLonger ? (x/sinA, x/cosA) : (x/cosA, x/sinA)
        return CGSize(width: wr, height: hr)
    } else {
        // fully constrained case: crop touches all 4 sides
        let cos2A = cosA*cosA - sinA*sinA
        let (wr, hr) = ((w*cosA - h*sinA)/cos2A, (h*cosA - w*sinA)/cos2A)
        return CGSize(width: wr, height: hr)
    }
}
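[Discussion]:
[Answer 8]:
Correction to the most favoured solution above, given by Coprox on 27 May 2013: when cosa = cosb, the last two lines produce infinity. This is solved by adding "or cosa equals cosb" to the preceding if selector.
Addition: if you do not know the original unrotated nx and ny but only have the rotated frame (or image), then find the box that just contains it (I do this by removing the blank, i.e. monochrome, borders) and first run the program in reverse to find nx and ny. If the image was rotated into a frame that is too small, so that it is cut off along the sides (into an octagonal shape), I first find the x and y extents of the full containing frame. However, this also does not work for angles around 45 degrees, where the result becomes square instead of keeping the unrotated aspect ratio. For me, this routine only works properly up to 30 degrees.
Still a great routine! It solved my nagging problem in astronomical image alignment.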
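One way the "reverse" step might look, assuming the frame is exactly the bounding box of the rotated image and using the bounding-box formulas w_bb = w*cos_a + h*sin_a and h_bb = w*sin_a + h*cos_a from Answer 1. This is a sketch rather than the author's routine, and the function name original_size_from_bbox is made up for illustration:

```python
import math


def original_size_from_bbox(w_bb, h_bb, angle):
    """Recover (w, h) of the unrotated image from its bounding box.

    Solves  w_bb = w*cos + h*sin,  h_bb = w*sin + h*cos  for w and h
    (angle in radians).
    """
    sin_a, cos_a = abs(math.sin(angle)), abs(math.cos(angle))
    # Determinant of the 2x2 system; it vanishes at 45 degrees.
    cos_2a = cos_a * cos_a - sin_a * sin_a
    if abs(cos_2a) < 1e-10:
        raise ValueError("w and h cannot be separated at 45 degrees")
    w = (w_bb * cos_a - h_bb * sin_a) / cos_2a
    h = (h_bb * cos_a - w_bb * sin_a) / cos_2a
    return w, h


a = math.radians(20)
w_bb = 1500 * math.cos(a) + 500 * math.sin(a)
h_bb = 1500 * math.sin(a) + 500 * math.cos(a)
print(original_size_from_bbox(w_bb, h_bb, a))
```

At exactly 45 degrees the bounding box is a square whatever the aspect ratio of the image, so the system has no unique solution there, and close to 45 degrees small measurement errors in the frame size are amplified by the division by cos(2a). That may be related to the behaviour near 45 degrees described above.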
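[Discussion]:
Do you mean the case sin(a) = cos(a)? Then indeed cos(2a) will be zero (since a = pi/4), which is a singularity of the else branch. With exact computation we would never enter the else branch, because 2*sin(a)*cos(a) equals 1 for a = pi/4 and side_short <= side_long holds by definition. But due to rounding errors, the if condition could still be false for side_short ~= side_long and a ~= pi/4. So I extended the condition with or abs(sin_a - cos_a) < 1e-10 to stay away from that singularity. Thanks for the hint!
[Answer 9]:
Perhaps a simpler solution is: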
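import numpy as np


def crop_image(image, angle):
    # angle is in degrees; only the first two dimensions are used so that
    # colour images (h, w, channels) work as well
    h, w = image.shape[:2]
    tan_a = abs(np.tan(angle * np.pi / 180))
    b = int(tan_a / (1 - tan_a ** 2) * (h - w * tan_a))
    d = int(tan_a / (1 - tan_a ** 2) * (w - h * tan_a))
    return image[d:h - d, b:w - b]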
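There is no need to calculate the height and width of the rotated rectangle as many have done; it is enough to find the height of the black triangles that form when the image is rotated.
[Discussion]:
[Answer 10]: Rotate the image into the correct orientation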
import cv2
import pytesseract
import urllib
import numpy as np
import re
import imutils #added
import PIL
image = cv2.imread('my_pdf_madan_m/page_1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.bitwise_not(gray)
rot_data = pytesseract.image_to_osd(image)
print("[OSD] " + rot_data)
# raw string so that \d is not treated as a string escape
rot = re.search(r'(?<=Rotate: )\d+',
                rot_data).group(0)
angle = float(rot)
# rotate the image to deskew it
rotated = imutils.rotate_bound(image, angle) #added
# TODO: Rotated image can be saved here
print(pytesseract.image_to_osd(rotated));
# Run tesseract OCR on image
text = pytesseract.image_to_string(rotated,
lang='eng', config="--psm 6")
print(text)
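[Discussion]:
[Answer 11]:
I recently implemented a solution for PyTorch. It might come in handy. It can also be used with a "random rotation transform": just read the specific angle the transform used, then use it together with the PyTorch transforms. The function simply takes a batch of images and applies a random rotation with cropping.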
import torchvision.transforms as transforms
import math


def _largest_rotated_rect(w, h, angle):
    """
    Given a rectangle of size wxh that has been rotated by 'angle' (in
    radians), computes the width and height of the largest possible
    axis-aligned rectangle within the rotated rectangle.
    Original JS code by 'Andri' and Magnus Hoff from Stack Overflow
    Converted to Python by Aaron Snoswell
    Source: http://***.com/questions/16702966/rotate-image-and-crop-out-black-borders
    """

    quadrant = int(math.floor(angle / (math.pi / 2))) & 3
    sign_alpha = angle if ((quadrant & 1) == 0) else math.pi - angle
    alpha = (sign_alpha % math.pi + math.pi) % math.pi

    bb_w = w * math.cos(alpha) + h * math.sin(alpha)
    bb_h = w * math.sin(alpha) + h * math.cos(alpha)

    # NOTE: both branches are identical as posted, so gamma is always pi/4
    gamma = math.atan2(bb_w, bb_w) if (w < h) else math.atan2(bb_w, bb_w)

    delta = math.pi - alpha - gamma

    length = h if (w < h) else w

    d = length * math.cos(alpha)
    a = d * math.sin(alpha) / math.sin(delta)

    y = a * math.cos(gamma)
    x = y * math.tan(gamma)

    return (
        bb_w - 2 * x,
        bb_h - 2 * y
    )


def _rotate_and_crop(image, output_height=32, output_width=32):
    """Rotate the given image with the given rotation degree and crop for the black edges if necessary. For my case, image sizes are 32x32.
    Args:
        image: A Batch of Tensors- normally from a dataloader.
        output_height: The height of the image after preprocessing.
        output_width: The width of the image after preprocessing.
    Returns:
        A rotated image.
    """

    # Rotate the given image with the given rotation degree
    rotation_transform = transforms.RandomRotation((0, 360))
    # Per the author, the sampled angle (here called angle_rot) has to be
    # read out of the PyTorch library code
    angle_rot = rotation_transform.angle_rot
    # _largest_rotated_rect expects (width, height, angle in radians)
    lrr_width, lrr_height = _largest_rotated_rect(output_width, output_height, math.radians(angle_rot))
    # CenterCrop takes integer sizes as (height, width)
    croped_image = transforms.CenterCrop((int(lrr_height), int(lrr_width)))
    resize_transform = transforms.Resize(size=(output_height, output_width))

    transform = transforms.Compose([rotation_transform, croped_image, resize_transform, transforms.RandomHorizontalFlip(),
                                    transforms.ToTensor(),
                                    ])
    image = transform(image)
    return image
[Discussion]: