How to calculate the sum of "submatrix" entries with numpy/pandas?
【Posted】: 2019-01-20 09:34:10
【Question】: I have the following 8x8 matrix in Python, which I represent either as an 8×8 numpy array or as a pandas DataFrame:
import numpy as np
import pandas as pd
x = range(64)
x = np.reshape(x,(8,8))
print(x)
# [[ 0 1 2 3 4 5 6 7]
# [ 8 9 10 11 12 13 14 15]
# [16 17 18 19 20 21 22 23]
# [24 25 26 27 28 29 30 31]
# [32 33 34 35 36 37 38 39]
# [40 41 42 43 44 45 46 47]
# [48 49 50 51 52 53 54 55]
# [56 57 58 59 60 61 62 63]]
df = pd.DataFrame(x)
print(df)
# 0 1 2 3 4 5 6 7
# 0 0 1 2 3 4 5 6 7
# 1 8 9 10 11 12 13 14 15
# 2 16 17 18 19 20 21 22 23
# 3 24 25 26 27 28 29 30 31
# 4 32 33 34 35 36 37 38 39
# 5 40 41 42 43 44 45 46 47
# 6 48 49 50 51 52 53 54 55
# 7 56 57 58 59 60 61 62 63
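I am trying to treat the array as a 2×2 grid of 4×4 submatrices, compute the sum of the values in each submatrix, and replace every value above with the sum of its submatrix. My end result would be: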
# 0 1 2 3 4 5 6 7
# 0 216 216 216 216 280 280 280 280
# 1 216 216 216 216 280 280 280 280
# 2 216 216 216 216 280 280 280 280
# 3 216 216 216 216 280 280 280 280
# 4 728 728 728 728 792 792 792 792
# 5 728 728 728 728 792 792 792 792
# 6 728 728 728 728 792 792 792 792
# 7 728 728 728 728 792 792 792 792
So the top-left submatrix has the value 216, because
0+1+2+3+8+9+10+11+16+17+18+19+24+25+26+27=216
Similarly,
32+33+34+35+40+41+42+43+48+49+50+51+56+57+58+59=728
4+5+6+7+12+13+14+15+20+21+22+23+28+29+30+31=280
36+37+38+39+44+45+46+47+52+53+54+55+60+61+62+63=792
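The four sums above can be checked directly with slicing. This is only a sanity check of the arithmetic for the 8x8 example, not a general solution; the variable names are illustrative.

```python
import numpy as np

x = np.arange(64).reshape(8, 8)

# Each 4x4 block is a plain slice of the 8x8 array.
top_left = x[:4, :4].sum()
top_right = x[:4, 4:].sum()
bottom_left = x[4:, :4].sum()
bottom_right = x[4:, 4:].sum()

print(top_left, top_right, bottom_left, bottom_right)
# 216 280 728 792
```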
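Is there numpy/pandas functionality that makes this calculation easier? Especially for larger matrices, setting the coordinates of the "sum matrix" by hand can be very cumbersome.
【Answer 1】: One way to do it with NumPy: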
import numpy as np

def as_submatrices(x, rows, cols=None, writeable=False):
    from numpy.lib.stride_tricks import as_strided
    if cols is None: cols = rows
    x = np.asarray(x)
    x_rows, x_cols = x.shape
    s1, s2 = x.strides
    if x_rows % rows != 0 or x_cols % cols != 0:
        raise ValueError('Invalid dimensions.')
    # View x as a grid of blocks: (block row, block column, row in block, column in block).
    out_shape = (x_rows // rows, x_cols // cols, rows, cols)
    out_strides = (s1 * rows, s2 * cols, s1, s2)
    return as_strided(x, out_shape, out_strides, writeable=writeable)

def sum_submatrices(x, rows, cols=None):
    if cols is None: cols = rows
    x = np.asarray(x)
    x_sub = as_submatrices(x, rows, cols)
    # Sum inside each block, then repeat each sum back to the original shape.
    x_sum = np.sum(x_sub, axis=(2, 3))
    return np.repeat(np.repeat(x_sum, rows, axis=0), cols, axis=1)

x = np.arange(64).reshape((8, 8))
print(sum_submatrices(x, 4))
# [[216 216 216 216 280 280 280 280]
# [216 216 216 216 280 280 280 280]
# [216 216 216 216 280 280 280 280]
# [216 216 216 216 280 280 280 280]
# [728 728 728 728 792 792 792 792]
# [728 728 728 728 792 792 792 792]
# [728 728 728 728 792 792 792 792]
# [728 728 728 728 792 792 792 792]]
print(sum_submatrices(x, 2))
# [[ 18 18 26 26 34 34 42 42]
# [ 18 18 26 26 34 34 42 42]
# [ 82 82 90 90 98 98 106 106]
# [ 82 82 90 90 98 98 106 106]
# [146 146 154 154 162 162 170 170]
# [146 146 154 154 162 162 170 170]
# [210 210 218 218 226 226 234 234]
# [210 210 218 218 226 226 234 234]]
print(sum_submatrices(x, 2, 8))
# [[120 120 120 120 120 120 120 120]
# [120 120 120 120 120 120 120 120]
# [376 376 376 376 376 376 376 376]
# [376 376 376 376 376 376 376 376]
# [632 632 632 632 632 632 632 632]
# [632 632 632 632 632 632 632 632]
# [888 888 888 888 888 888 888 888]
# [888 888 888 888 888 888 888 888]]
Edit: As pointed out by Divakar, np.broadcast_to is faster than np.repeat, so an improved version of the function above would be:
def sum_submatrices(x, rows, cols=None):
    if cols is None: cols = rows
    x = np.asarray(x)
    x_sub = as_submatrices(x, rows, cols)
    x_sum = np.sum(x_sub, axis=(2, 3), keepdims=True)
    x_sum = np.broadcast_to(x_sum, x_sub.shape)
    return x_sum.transpose((0, 2, 1, 3)).reshape(x.shape)
This is basically the same as Divakar's answer, except that one is better because it uses neither stride tricks nor a transpose.
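To make the strided view more concrete, here is a small sketch for sixteen 2x2 blocks on the 8x8 example. It calls as_strided directly with the same shape and strides that as_submatrices computes, and compares the result with a reshape-based view. The names blocks and same are illustrative, and the reshape comparison assumes a C-contiguous input.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

x = np.arange(64).reshape(8, 8)
rows, cols = 2, 2          # block size; must divide the array shape evenly
s1, s2 = x.strides         # bytes to step one row / one column

# View of shape (4, 4, 2, 2): blocks[i, j] is the 2x2 block
# in block-row i and block-column j.
blocks = as_strided(
    x,
    shape=(x.shape[0] // rows, x.shape[1] // cols, rows, cols),
    strides=(s1 * rows, s2 * cols, s1, s2),
    writeable=False,
)

print(blocks.shape)      # (4, 4, 2, 2)
print(blocks[0, 1])      # the block covering rows 0-1, columns 2-3
# [[ 2  3]
#  [10 11]]

# The same blocks can be obtained without stride tricks:
same = x.reshape(4, rows, 4, cols).swapaxes(1, 2)
print(np.array_equal(blocks, same))   # True
```

Changing rows and cols to 4 gives the four 4x4 blocks from the question; only the block size changes, the stride pattern stays the same.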
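【Discussion】:
Thanks for your help. I am a bit confused by the use of as_strided() above. How would one set up sixteen 2x2 matrices, as opposed to the four 4x4 matrices above?
@ShanZhengYang I have rewritten the answer to be more general and added another example. You can choose any submatrix size as long as it divides the array shape evenly; the submatrices do not even need to be square.
【Answer 2】: Here is a general solution using einsum and np.repeat (the one condition is that n divides both array dimensions evenly):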
def sum_chunks(n, x):
    """
    Tiles an array into NxN chunks, based on the sum of the chunk
    :param n: dimension of sub-matrices
    :param x: input array
    :return: Tiled array
    """
    h, w = x.shape
    out = x.reshape(h//n, n, -1, n).swapaxes(1,2).reshape(-1, n, n)
    s = np.einsum('ijk->i', out)
    return np.repeat(np.repeat(s.reshape(h//n, w//n), n, axis=0), n, axis=1)
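You can use this solution to split your array into square sub-arrays of any size that divides it evenly, sum them, and then repeat the sums back to the original size: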
>>> sum_chunks(4, np.arange(64).reshape(8,8))
array([[216, 216, 216, 216, 280, 280, 280, 280],
[216, 216, 216, 216, 280, 280, 280, 280],
[216, 216, 216, 216, 280, 280, 280, 280],
[216, 216, 216, 216, 280, 280, 280, 280],
[728, 728, 728, 728, 792, 792, 792, 792],
[728, 728, 728, 728, 792, 792, 792, 792],
[728, 728, 728, 728, 792, 792, 792, 792],
[728, 728, 728, 728, 792, 792, 792, 792]])
>>> sum_chunks(2, np.arange(64).reshape(8,8))
array([[ 18, 18, 26, 26, 34, 34, 42, 42],
[ 18, 18, 26, 26, 34, 34, 42, 42],
[ 82, 82, 90, 90, 98, 98, 106, 106],
[ 82, 82, 90, 90, 98, 98, 106, 106],
[146, 146, 154, 154, 162, 162, 170, 170],
[146, 146, 154, 154, 162, 162, 170, 170],
[210, 210, 218, 218, 226, 226, 234, 234],
[210, 210, 218, 218, 226, 226, 234, 234]])
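sum_chunks takes a single block size n, so it only handles square blocks. A rectangular variant could look like the sketch below. The name sum_chunks_rect and its (bh, bw, x) signature are hypothetical and not part of the answer, and it uses a plain .sum over the in-block axes instead of einsum.

```python
import numpy as np

def sum_chunks_rect(bh, bw, x):
    """Hypothetical variant of sum_chunks for bh x bw blocks (not from the answer)."""
    h, w = x.shape
    if h % bh or w % bw:
        raise ValueError("block shape must divide the array shape evenly")
    # Shape (h//bh, bh, w//bw, bw): axes 1 and 3 index positions inside a block.
    blocks = x.reshape(h // bh, bh, w // bw, bw)
    s = blocks.sum(axis=(1, 3))          # one sum per block, shape (h//bh, w//bw)
    return np.repeat(np.repeat(s, bh, axis=0), bw, axis=1)

x = np.arange(64).reshape(8, 8)
out = sum_chunks_rect(2, 8, x)
print(out[:, 0])   # [120 120 376 376 632 632 888 888]
```

With bh == bw this should give the same result as sum_chunks(n, x) for the same n.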
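【Discussion】:
Is 'arr' the array 'x' from the post?
@ShanZhengYang I have updated my answer with a general function to help you handle sub-arrays of different sizes.
Could you check whether the above is correct? sum_chunks(4,4, x) does not seem to give the expected sums for my input.
【Answer 3】: Here is something with performance in mind. It uses np.broadcast_to to do the replication part after reshaping and summing: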
def sum_chunks_broadcasted(x, M, N): # M,N : no. of blocks along height and width
    m,n = x.shape
    s = x.reshape(M,m//M,N,n//N).sum((1,3),keepdims=True)
    return np.broadcast_to(s,(M,m//M,N,n//N)).reshape(m,n)
Sample run:
In [143]: x = np.arange(48).reshape(8,6)
In [144]: x
Out[144]:
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35],
[36, 37, 38, 39, 40, 41],
[42, 43, 44, 45, 46, 47]])
In [145]: sum_chunks_broadcasted(x, M=2, N=3) # 2x3 total windows
Out[145]:
array([[ 76, 76, 92, 92, 108, 108],
[ 76, 76, 92, 92, 108, 108],
[ 76, 76, 92, 92, 108, 108],
[ 76, 76, 92, 92, 108, 108],
[268, 268, 284, 284, 300, 300],
[268, 268, 284, 284, 300, 300],
[268, 268, 284, 284, 300, 300],
[268, 268, 284, 284, 300, 300]])
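The intermediate shapes in sum_chunks_broadcasted may be easier to follow when spelled out step by step. The sketch below unrolls the function on the same 8x6 input; the variable names blocks, s, tiled and out are illustrative only.

```python
import numpy as np

x = np.arange(48).reshape(8, 6)
M, N = 2, 3                      # number of blocks along height and width
m, n = x.shape

blocks = x.reshape(M, m // M, N, n // N)        # shape (2, 4, 3, 2)
s = blocks.sum(axis=(1, 3), keepdims=True)      # shape (2, 1, 3, 1)
tiled = np.broadcast_to(s, blocks.shape)        # read-only view, no data copied
out = tiled.reshape(m, n)                       # the reshape makes the actual copy

print(s.shape)                   # (2, 1, 3, 1)
print(s[:, 0, :, 0])             # one sum per block
# [[ 76  92 108]
#  [268 284 300]]
print(tiled.flags.writeable)     # False
```

Note that M and N are the number of blocks, not the block size, so each block here is 4 rows by 2 columns.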
Timings
Comparison against the other generic vectorized approaches, @user3483203's sum_chunks and @jdehesa's sum_submatrices, on a large array with various window shapes and counts:
1) Set up the input array:
In [83]: x = np.random.rand(8000, 8000)
2) 4x4 total windows:
In [152]: %timeit sum_submatrices(x, 8000//4, cols=8000//4)
1 loop, best of 3: 271 ms per loop
In [153]: %timeit sum_chunks(8000//4, x)
1 loop, best of 3: 372 ms per loop
In [154]: %timeit sum_chunks_broadcasted(x, M=4, N=4)
10 loops, best of 3: 81 ms per loop
3) 40x40 total windows:
In [155]: %timeit sum_submatrices(x, 8000//40, cols=8000//40)
1 loop, best of 3: 271 ms per loop
In [156]: %timeit sum_chunks(8000//40, x)
1 loop, best of 3: 385 ms per loop
In [157]: %timeit sum_chunks_broadcasted(x, M=40, N=40)
10 loops, best of 3: 84 ms per loop
4) 400x400 total windows:
In [158]: %timeit sum_submatrices(x, 8000//400, cols=8000//400)
1 loop, best of 3: 318 ms per loop
In [159]: %timeit sum_chunks(8000//400, x)
1 loop, best of 3: 396 ms per loop
In [160]: %timeit sum_chunks_broadcasted(x, M=400, N=400)
10 loops, best of 3: 123 ms per loop
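Outside IPython, a comparison along these lines can be rerun with the standard timeit module. The sketch below is not the benchmark above: it uses a smaller array, and the two functions (sum_blocks_repeat and sum_blocks_broadcast, both hypothetical names) only isolate the np.repeat versus np.broadcast_to expansion step. Absolute numbers will depend on the machine and NumPy version.

```python
import timeit
import numpy as np

def sum_blocks_repeat(x, M, N):
    # Reduce to an (M, N) array of block sums, then expand with np.repeat.
    m, n = x.shape
    s = x.reshape(M, m // M, N, n // N).sum(axis=(1, 3))
    return np.repeat(np.repeat(s, m // M, axis=0), n // N, axis=1)

def sum_blocks_broadcast(x, M, N):
    # Same reduction, expanded with np.broadcast_to as in sum_chunks_broadcasted.
    m, n = x.shape
    s = x.reshape(M, m // M, N, n // N).sum(axis=(1, 3), keepdims=True)
    return np.broadcast_to(s, (M, m // M, N, n // N)).reshape(m, n)

x = np.random.rand(800, 800)     # smaller than the 8000x8000 array used above

# Both versions should agree before their speed is compared.
results_match = np.allclose(sum_blocks_repeat(x, 4, 4), sum_blocks_broadcast(x, 4, 4))
print(results_match)

for fn in (sum_blocks_repeat, sum_blocks_broadcast):
    # Best of 3 runs, 5 calls per run, reported per call.
    t = min(timeit.repeat(lambda: fn(x, 4, 4), number=5, repeat=3)) / 5
    print(f"{fn.__name__}: {t * 1e3:.2f} ms per call")
```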
【Answer 4】: It can also be done with numpy like this (not the prettiest):
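import numpy as np

dx = 8
dy = 8
x_subs = 2  # number of blocks along axis 0
y_subs = 2  # number of blocks along axis 1
arr = np.arange(dx * dy).reshape(dy, dx)
sums = [
    [second_split.sum() for second_split in np.split(first_split, y_subs, axis=1)]
    for first_split in np.split(arr, x_subs, axis=0)
]
# Repeat each sum by the block size (not the full dimension) to get back to the shape of arr.
sums_filled = np.repeat(np.repeat(sums, dy // x_subs, axis=0), dx // y_subs, axis=1)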
I am not familiar with "stride tricks", but this solution may be similar to jdehesa's.
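As a quick check, the split-based approach can be run against the expected result from the question. This sketch assumes the repeat counts are the block sizes (rows per block and columns per block); the names row_blocks, col_blocks, block_h and block_w are illustrative.

```python
import numpy as np

arr = np.arange(64).reshape(8, 8)
row_blocks, col_blocks = 2, 2            # number of blocks along each axis

# np.split raises ValueError if an axis cannot be divided into equal parts.
sums = np.array([
    [block.sum() for block in np.split(band, col_blocks, axis=1)]
    for band in np.split(arr, row_blocks, axis=0)
])

block_h = arr.shape[0] // row_blocks     # rows per block
block_w = arr.shape[1] // col_blocks     # columns per block
filled = np.repeat(np.repeat(sums, block_h, axis=0), block_w, axis=1)

print(sums)
# [[216 280]
#  [728 792]]
print(filled.shape)   # (8, 8)
```

The Python-level loops run once per block, so this is likely to be slower than the reshape-based answers when there are many small blocks.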