More efficient way to get diagonal length per coordinate

Posted 2023-03-12


【Title】: More efficient way to get diagonal length per coordinate 【Posted】: 2021-11-25 13:37:08 【Question】:

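I have an array of x and y values (coordinates) that represent matches, and for each of these x,y pairs I want to know the length of the diagonal it belongs to. For example, take these coordinates: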

Data description

coords = np.asarray([[0,0], [0,7], [1,1], [1,6], [2,2], [2,5], [3,3],[3,4], [4,4]])
# [[0 0]
#  [0 7]
#  [1 1]
#  [1 6]
#  [2 2]
#  [2 5]
#  [3 3]
#  [3 4]
#  [4 4]]

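We could convert this to a matrix, but in my case that is too inefficient (for example, scipy's todia() raises an efficiency warning; see below). Still, let's build the matrix to make the problem clearer: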

[[1 0 0 0 0 0 0 1]
 [0 1 0 0 0 0 1 0]
 [0 0 1 0 0 1 0 0]
 [0 0 0 1 1 0 0 0]
 [0 0 0 0 1 0 0 0]]
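For illustration only, the dense matrix above can be reproduced from coords with plain NumPy indexing. This is a minimal sketch; the name `mat` is an assumption and does not appear in the original post.

```python
import numpy as np

coords = np.asarray([[0,0], [0,7], [1,1], [1,6], [2,2], [2,5], [3,3], [3,4], [4,4]])

# Allocate a zero matrix just large enough to hold every coordinate
shape = tuple(coords.max(axis=0) + 1)
mat = np.zeros(shape, dtype=int)

# Integer-array indexing marks all matched cells in one step
mat[coords[:, 0], coords[:, 1]] = 1
print(mat)
```

This is only practical for small inputs, since the matrix grows with the coordinate range rather than with the number of points.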

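Goal

Looking at the matrix above, we see two diagonals (or rather, one diagonal and one anti-diagonal). For each position on a diagonal I want to know the length of the diagonal it sits on, i.e. a table like this: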

# x, y, diag length
[[0 0 5]
 [1 1 5]
 [2 2 5]
 [3 3 5]
 [4 4 5]
 [3 4 4]
 [2 5 4]
 [1 6 4]
 [0 7 4]]

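Inefficient solution

I thought I could represent this data in a sparse scipy matrix. That does give the desired result, but converting the sparse matrix to a diagonal (DIA) matrix is already inefficient for 100 diagonals, let alone the thousands I have.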

import numpy as np
from scipy.sparse import dia_matrix, coo_matrix
coords = np.asarray([[0,0], [0,7], [1,1], [1,6], [2,2], [2,5], [3,3],[3,4], [4,4]])

# Create the scipy coord matrix
x = coords[:,0]
y = coords[:,1]
tot_elem = coords.shape[0]*2
data = np.repeat(1, len(x))
co_mat = coo_matrix( (data, (x, y)), shape=(max(x)+1, max(y)+1))

# Get the diagonal matrix
dia_mat = dia_matrix(co_mat).tocoo()
diag_coords = np.column_stack((dia_mat.row, dia_mat.col))

# Get the consecutive values to put them to lengths
difs = np.diff(diag_coords[:, 1])
cuts = [0] + list(np.where(difs != 1)[0] + 1) + [diag_coords.shape[0]]
sizes = np.diff(cuts)
sizes = np.repeat(sizes, sizes)

# Combine with the original coords
dia_sizes = np.column_stack((dia_mat.row, dia_mat.col, sizes))
print(dia_sizes)

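*I just realised that a coordinate can be part of both a diagonal and an anti-diagonal. In that case I could report both lengths, or only the length of the longest one. My solution does not take this into account :(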

Edit: a more efficient solution

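Looking at the todia() source code, I noticed that it uses a clever trick to decide whether points are on the same diagonal: x - y is identical for all points on the same diagonal. That does not hold for anti-diagonals, however. So I assumed that, conversely, x + y identifies the points that share an anti-diagonal. Using this I came up with code that is already much faster than going through scipy.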

import numpy as np

coords = np.asarray([[0,0], [0,7], [1,1], [1,6], [2,2], [2,5], [3,3],[3,4], [4,4]])
x = coords[:,0]
y = coords[:,1]

# Get the diagonal (inspired by scipy todia code)
ks1 = y - x

# Unlike scipy, I think we can do the same by summing to get the anti-diagonal
ks2 = y + x

# Sort these to get the groups in the same diagonal
idx = np.argsort(ks1)
anti_idx = np.argsort(ks2)

def get_dia_len(arr,ori):
    sizes = np.diff([0] + list(np.where(np.diff(arr)!= ori)[0] + 1) + [arr.shape[0]])
    size_arr = np.repeat(sizes, sizes)
    return size_arr

# Get the diagonal lengths, i.e. cut at changing values and get the gaps between them
norm_sizes = get_dia_len(x[idx],1)
anti_sizes = get_dia_len(y[anti_idx],-1)

# Gather this in a table
norm = np.column_stack([x[idx], y[idx], norm_sizes])
anti = np.column_stack([x[anti_idx], y[anti_idx], anti_sizes])
dia_coord = np.concatenate((norm, anti))

# We only have a diagonal when we have >1 value
dia_coord = dia_coord[dia_coord[:, -1] > 1]
print(dia_coord)
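One possible caveat with the code above: a cut is made only where consecutive x values stop increasing by 1, so two neighbouring diagonals could be merged if the next diagonal happens to start at the following x value, and the result also depends on the order argsort returns within a diagonal. The sketch below is one way to guard against that, assuming no duplicate coordinates. The helper name `run_lengths` is hypothetical. It sorts by the diagonal key and then by x, starts a new run whenever the key changes or x is not consecutive, and returns the lengths in the original point order. Taking the element-wise maximum reports the longest line through each point, which is one of the two options mentioned in the note above.

```python
import numpy as np

def run_lengths(x, key):
    """Per point: length of the run of consecutive x values sharing `key`."""
    # lexsort treats the LAST array as the primary key: sort by key, then by x
    order = np.lexsort((x, key))
    xs, ks = x[order], key[order]
    # A new run starts where the key changes or x does not increase by exactly 1
    new_run = np.r_[True, (np.diff(ks) != 0) | (np.diff(xs) != 1)]
    run_id = np.cumsum(new_run) - 1
    sizes = np.bincount(run_id)[run_id]
    # Scatter the sizes back to the original point order
    out = np.empty_like(sizes)
    out[order] = sizes
    return out

coords = np.asarray([[0,0], [0,7], [1,1], [1,6], [2,2], [2,5], [3,3], [3,4], [4,4]])
x, y = coords[:, 0], coords[:, 1]

diag_len = run_lengths(x, y - x)   # y - x is constant along a diagonal
anti_len = run_lengths(x, y + x)   # y + x is constant along an anti-diagonal
longest = np.maximum(diag_len, anti_len)
print(np.column_stack((x, y, longest)))
```

Because the result is aligned with coords, it can be stacked next to the input directly, and rows with a length of 1 can be filtered out afterwards if isolated points should be dropped.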

I have been banging my head against this for a while and would love to know whether anyone has a clever way to solve it :)

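【Comments】:

If you have the points [0, 0], [1, 0], [2, 0], [3, 0], would they form a "diagonal"?

@Mortz Thanks for asking. No, that would be a straight line, not a diagonal, i.e. 45°.

Make it clear(er) that you are using scipy.sparse.

@hpaulj Thanks for the heads-up, I also forgot to mention the imports; edited.

How long is the coordinate list? What are the minimum and maximum values of x and y?

【Answer 1】: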

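One approach is to iterate over the coordinates, construct the 45-degree lines through each point (assuming that is what "diagonal" means), and then remove from the coords list any points that lie on those lines.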

This function computes the points on the 45-degree lines through a fixed point and returns only those that are in the coords list

coords = [[0,0], [0,7], [1,1], [1,6], [2,2], [2,5], [3,3],[3,4], [4,4]]
coords = [tuple(_) for _ in coords]

def get_y(x, fixed_point, allowed_slopes=(1, -1), coords=coords.copy()):
    coords = [tuple(_) for _ in coords]
    x_fixed, y_fixed = fixed_point
    possible_y = [y_fixed + slope*(x - x_fixed) for slope in allowed_slopes]
    possible_coords = [(x, y) for y in possible_y]
    available_coords = list(set(possible_coords) & set(coords))
    return available_coords
print(get_y(1, (0,0)))
#[(1, 1)]
print(get_y(6, (0,0)))
#[] because (6, 6) is not on coords

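We can then loop over coords, removing all points on the same line as we go. Using list.pop ensures that we do not needlessly compute the diagonal more than once for the same group of points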

idx = 0
grouped_points = list()
while coords:
    group = list()
    fixed_point = coords.pop()
    print(f'fixed_point is now {fixed_point}')
    group.append(fixed_point)
    print(f'group is now {group}')
    available_x = set([x for (x, y) in coords])
    print(f'available_x is now {available_x}')
    for x in available_x:
        # get_y may return an empty list, so do not unpack it unconditionally
        pts = get_y(x, fixed_point)
        pt = pts[0] if pts else None
        print(f'pt is now {pt}')
        if pt and pt in coords:
            group.append(pt)
            coords.remove(pt)
        print(f'coords is now {coords}')
        print(f'group is now {group}')
    print(idx, group, sep='\t')
    grouped_points.append(group)
    idx += 1

Then attach the lengths to the output to get the desired result

grouped_points = [(*pt, len(group)) for group in grouped_points for pt in group]
print(*grouped_points, sep='\n')
#(4, 4, 5)
#(0, 0, 5)
#(1, 1, 5)
#(2, 2, 5)
#(3, 3, 5)
#(3, 4, 4)
#(0, 7, 4)
#(1, 6, 4)
#(2, 5, 4)
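As a point of comparison, the same grouping can be sketched in a single pass with a dictionary keyed on y - x (slope +1) and y + x (slope -1), which keeps the two directions separate. This is not the answer's code. The helper name `group_by_line` is hypothetical, and the sketch counts all points sharing a line without checking that they are contiguous.

```python
from collections import defaultdict

coords = [(0,0), (0,7), (1,1), (1,6), (2,2), (2,5), (3,3), (3,4), (4,4)]

def group_by_line(points, key):
    """Map each line key to the sorted points lying on that line."""
    lines = defaultdict(list)
    for x, y in points:
        lines[key(x, y)].append((x, y))
    return {k: sorted(pts) for k, pts in lines.items()}

diagonals = group_by_line(coords, lambda x, y: y - x)       # slope +1
anti_diagonals = group_by_line(coords, lambda x, y: y + x)  # slope -1

# Keep only lines with more than one point and attach the line length
rows = [(x, y, len(pts))
        for lines in (diagonals, anti_diagonals)
        for pts in lines.values() if len(pts) > 1
        for x, y in pts]
print(*rows, sep='\n')
```

A point that lies on both a diagonal and an anti-diagonal appears once per direction in `rows`.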

Timing with timeit shows that, for this set of coords, this solution is roughly 10 times faster
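The timing setup itself is not shown, so the following is only a sketch of how such a measurement could be made with timeit. The function `key_based` is a stand-in workload chosen for illustration (it counts points per y - x key with np.unique). It is not either of the solutions above, and it does not reproduce the 10x figure.

```python
import timeit
import numpy as np

coords = np.asarray([[0,0], [0,7], [1,1], [1,6], [2,2], [2,5], [3,3], [3,4], [4,4]])

def key_based(coords):
    # Stand-in workload (assumption): number of points sharing each y - x key
    x, y = coords[:, 0], coords[:, 1]
    _, inverse, counts = np.unique(y - x, return_inverse=True, return_counts=True)
    return counts[inverse]

# Run the function many times and report the mean time per call
n = 1000
total = timeit.timeit(lambda: key_based(coords), number=n)
print(f"{total / n * 1e6:.1f} microseconds per call")
```

Swapping `key_based` for each candidate solution and using the same `coords` and `number` gives comparable figures. On such a small input the numbers say little about behaviour with thousands of diagonals.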
