Getting rid of empty and zeros arrays in a list of arrays in Python
【Title】: Getting rid of empty and zeros arrays in a list of arrays in Python 【Posted】: 2013-12-22 10:50:36 【Question】: I am working with some Python data in the form of a list of arrays:
LA=
[array([ 99.08322813, 253.42371683, 300.792029 ])
array([ 51.55274095, 106.29707418, 0])
array([0, 0 ,0 , 0, 0])
array([ 149.07283952, 191.45513754, 251.19610503, 393.50806493, 453.56783459])
array([ 105.61643877, 442.76668729, 450.37335607])
array([ 348.84179544])
array([], dtype=float64)
array([0, 0 , 0])
array([ 295.05603151, 0, 451.77083268, 500.81771919])
array([ 295.05603151, 307.37232315, 451.77083268, 500.81771919])
array([ 91.86758237, 148.70156948, 488.70648486, 507.31389766])
array([ 353.68691095])
array([ 208.21919198, 246.57665959, 0, 251.33820305, 394.34266882])
array([], dtype=float64)]
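In my data I get some empty arrays:
array([], dtype=float64)
and arrays filled with zeros:
array([0, 0, 0])
How can I get rid of both kinds of arrays in a simple, automated way, so that I get: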
LA=
[array([ 99.08322813, 253.42371683, 300.792029 ])
array([ 51.55274095, 106.29707418, 0])
array([ 149.07283952, 191.45513754, 251.19610503, 393.50806493, 453.56783459])
array([ 105.61643877, 442.76668729, 450.37335607])
array([ 348.84179544])
array([ 295.05603151, 0, 451.77083268, 500.81771919])
array([ 295.05603151, 307.37232315, 451.77083268, 500.81771919])
array([ 91.86758237, 148.70156948, 488.70648486, 507.31389766])
array([ 353.68691095])
array([ 208.21919198, 246.57665959, 0, 251.33820305, 394.34266882])
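Finally, I would also like to remove the zeros inside the remaining arrays, keeping the list-of-arrays format, to get: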
LA=
[array([ 99.08322813, 253.42371683, 300.792029 ])
array([ 51.55274095, 106.29707418])
array([ 149.07283952, 191.45513754, 251.19610503, 393.50806493, 453.56783459])
array([ 105.61643877, 442.76668729, 450.37335607])
array([ 348.84179544])
array([ 295.05603151, 451.77083268, 500.81771919])
array([ 295.05603151, 307.37232315, 451.77083268, 500.81771919])
array([ 91.86758237, 148.70156948, 488.70648486, 507.31389766])
array([ 353.68691095])
array([ 208.21919198, 246.57665959, 251.33820305, 394.34266882])
Thanks in advance.
【Comments】:
Have you tried anything? Show us your code.
【Answer 1】: A list comprehension should take care of the first part:
[x for x in LA if x.any()]
You can use compress to do the second part:
[x.compress(x) for x in LA if x.any()]
A faster version based on Ashwini's idea (count_nonzero has to be imported from numpy):
[x.compress(x) for x in LA if count_nonzero(x)]
Timing:
In [89]: %timeit [x.compress(x) for x in LA if count_nonzero(x)] #clear winner
10000 loops, best of 3: 20.2 µs per loop
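The comprehensions above can be tried end to end with a small sketch like the following. The sample list is a shortened stand-in for the question's data (the values are rounded and chosen for illustration only), and the variable names `kept` and `cleaned` are not from the original answer.

```python
import numpy as np

# Shortened sample in the same shape as the question's data (illustrative values).
LA = [
    np.array([99.08, 253.42, 300.79]),
    np.array([51.55, 106.30, 0.0]),
    np.array([0.0, 0.0, 0.0]),
    np.array([], dtype=np.float64),
    np.array([295.06, 0.0, 451.77, 500.82]),
]

# First part: drop arrays that are empty or contain only zeros.
# x.any() is False in both cases.
kept = [x for x in LA if x.any()]

# Second part: also strip the zeros inside the arrays that remain.
# compress() keeps the elements where the condition is truthy, so passing
# the array itself as the condition keeps its nonzero entries.
cleaned = [x.compress(x) for x in LA if x.any()]

for arr in cleaned:
    print(arr)
```

Both comprehensions build new lists and leave `LA` itself unchanged.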
【Answer 2】: Using NumPy and a list comprehension:
>>> from numpy import *
Solution 1:
>>> [x[x!=0] for x in LA if len(x) and len(x[x!=0])]
[array([ 99.08322813, 253.42371683, 300.792029 ]),
array([ 51.55274095, 106.29707418]),
array([ 149.07283952, 191.45513754, 251.19610503, 393.50806493,
453.56783459]),
array([ 105.61643877, 442.76668729, 450.37335607]),
array([ 348.84179544]),
array([ 295.05603151, 451.77083268, 500.81771919]),
array([ 295.05603151, 307.37232315, 451.77083268, 500.81771919]),
array([ 91.86758237, 148.70156948, 488.70648486, 507.31389766]),
array([ 353.68691095]),
array([ 208.21919198, 246.57665959, 251.33820305, 394.34266882])]
Solution 2:
>>> [x[x!=0] for x in LA if count_nonzero(x)]
[array([ 99.08322813, 253.42371683, 300.792029 ]),
array([ 51.55274095, 106.29707418]),
array([ 149.07283952, 191.45513754, 251.19610503, 393.50806493,
453.56783459]),
array([ 105.61643877, 442.76668729, 450.37335607]),
array([ 348.84179544]),
array([ 295.05603151, 451.77083268, 500.81771919]),
array([ 295.05603151, 307.37232315, 451.77083268, 500.81771919]),
array([ 91.86758237, 148.70156948, 488.70648486, 507.31389766]),
array([ 353.68691095]),
array([ 208.21919198, 246.57665959, 251.33820305, 394.34266882])]
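As a rough sketch, Solution 2 can be wrapped in a small function that uses an explicit `import numpy as np` instead of the star import. The name `drop_zeros` and the sample data are assumptions made for illustration and are not part of the original answer or of NumPy.

```python
import numpy as np

def drop_zeros(arrays):
    """Return a new list without empty or all-zero arrays and without zero entries.

    `drop_zeros` is a hypothetical helper name, not a NumPy function.
    """
    # np.count_nonzero(x) is 0 for an empty array as well as for an
    # all-zero one, so no separate len(x) check is needed.
    return [x[x != 0] for x in arrays if np.count_nonzero(x)]

# Illustrative sample, not the question's full data.
sample = [
    np.array([51.5, 106.3, 0.0]),
    np.array([0, 0, 0]),
    np.array([], dtype=np.float64),
    np.array([348.8]),
]
result = drop_zeros(sample)
print(result)
```

The boolean mask `x != 0` returns a new array each time, so the input arrays are not modified.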
Timing comparison:
In [56]: %timeit [x[x!=0] for x in LA if len(x) and len(x[x!=0])]
10000 loops, best of 3: 176 µs per loop
In [88]: %timeit [x[x!=0] for x in LA if count_nonzero(x)]
10000 loops, best of 3: 89.7 µs per loop
#@gnibbler's solution:
In [82]: %timeit [x.compress(x) for x in LA if x.any()]
10000 loops, best of 3: 138 µs per loop
Timing results for larger arrays:
In [140]: LA = [resize(x, 10**5) for x in LA]
In [142]: %timeit [x[x!=0] for x in LA if len(x) and len(x[x!=0])]
10 loops, best of 3: 26.7 ms per loop
In [143]: %timeit [x[x!=0] for x in LA if count_nonzero(x) > 0]
10 loops, best of 3: 26 ms per loop
In [144]: %timeit [x.compress(x) for x in LA if x.any()]
10 loops, best of 3: 42.7 ms per loop
In [145]: %timeit [x.compress(x) for x in LA if count_nonzero(x)]
10 loops, best of 3: 45.8 ms per loop
In [146]: %timeit [x[x!=0] for x in LA if x.any()]
10 loops, best of 3: 22.9 ms per loop
In [147]: %timeit [x[x!=0] for x in LA if count_nonzero(x)]
10 loops, best of 3: 26.2 ms per loop
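The measurements above were taken with IPython's %timeit. A comparable check outside IPython might look like the sketch below, which uses the standard timeit module. The test data, the array size of 1000 and the labels in `candidates` are assumptions for illustration; absolute numbers will differ by machine and NumPy version, so no expected timings are given.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, illustrative data only

# Hypothetical test data: a mix of normal, all-zero and empty arrays.
LA = [
    rng.uniform(1.0, 500.0, size=1000),
    np.zeros(1000),
    np.array([], dtype=np.float64),
    np.concatenate([rng.uniform(1.0, 500.0, size=999), [0.0]]),
]

# The four variants compared in the timings above.
candidates = {
    "mask + any": lambda: [x[x != 0] for x in LA if x.any()],
    "mask + count_nonzero": lambda: [x[x != 0] for x in LA if np.count_nonzero(x)],
    "compress + any": lambda: [x.compress(x) for x in LA if x.any()],
    "compress + count_nonzero": lambda: [x.compress(x) for x in LA if np.count_nonzero(x)],
}

timings = {}
for name, func in candidates.items():
    # Best of 3 repeats of 200 calls each, reported per call.
    best = min(timeit.repeat(func, number=200, repeat=3)) / 200
    timings[name] = best
    print(f"{name:26s} {best * 1e6:8.1f} µs per call")
```

Since the ranking changed between the small and the resized arrays in the results above, it is worth timing with array sizes close to the real data.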
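【Comments】:
Can you time my answer as well?
@gnibbler The compress one took about 138 µs.
When you use count_nonzero, you don't need the len(x) check.
Aha... any is the slow part. count_nonzero is much faster.
@gnibbler I actually expected it to be faster, i.e. it should short-circuit the way Python's any does. By the way, I resized all the items to 100000 and timed everything again. This time [x[x!=0] for x in LA if x.any()] was the fastest and, surprisingly, [x.compress(x) for x in LA if count_nonzero(x)] was the slowest.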