Suppress Scientific Notation in Numpy When Creating Array From Nested List
【Title】Suppress Scientific Notation in Numpy When Creating Array From Nested List 【Posted】2012-03-19 20:54:15 【Question】I have a nested Python list that looks like this:
my_list = [[3.74, 5162, 13683628846.64, 12783387559.86, 1.81],
[9.55, 116, 189688622.37, 260332262.0, 1.97],
[2.2, 768, 6004865.13, 5759960.98, 1.21],
[3.74, 4062, 3263822121.39, 3066869087.9, 1.93],
[1.91, 474, 44555062.72, 44555062.72, 0.41],
[5.8, 5006, 8254968918.1, 7446788272.74, 3.25],
[4.5, 7887, 30078971595.46, 27814989471.31, 2.18],
[7.03, 116, 66252511.46, 81109291.0, 1.56],
[6.52, 116, 47674230.76, 57686991.0, 1.43],
[1.85, 623, 3002631.96, 2899484.08, 0.64],
[13.76, 1227, 1737874137.5, 1446511574.32, 4.32],
[13.76, 1227, 1737874137.5, 1446511574.32, 4.32]]
Then I import Numpy and set the print options to (suppress=True). When I create an array:
my_array = numpy.array(my_list)
I can't for the life of me suppress the scientific notation:
[[ 3.74000000e+00 5.16200000e+03 1.36836288e+10 1.27833876e+10
1.81000000e+00]
[ 9.55000000e+00 1.16000000e+02 1.89688622e+08 2.60332262e+08
1.97000000e+00]
[ 2.20000000e+00 7.68000000e+02 6.00486513e+06 5.75996098e+06
1.21000000e+00]
[ 3.74000000e+00 4.06200000e+03 3.26382212e+09 3.06686909e+09
1.93000000e+00]
[ 1.91000000e+00 4.74000000e+02 4.45550627e+07 4.45550627e+07
4.10000000e-01]
[ 5.80000000e+00 5.00600000e+03 8.25496892e+09 7.44678827e+09
3.25000000e+00]
[ 4.50000000e+00 7.88700000e+03 3.00789716e+10 2.78149895e+10
2.18000000e+00]
[ 7.03000000e+00 1.16000000e+02 6.62525115e+07 8.11092910e+07
1.56000000e+00]
[ 6.52000000e+00 1.16000000e+02 4.76742308e+07 5.76869910e+07
1.43000000e+00]
[ 1.85000000e+00 6.23000000e+02 3.00263196e+06 2.89948408e+06
6.40000000e-01]
[ 1.37600000e+01 1.22700000e+03 1.73787414e+09 1.44651157e+09
4.32000000e+00]
[ 1.37600000e+01 1.22700000e+03 1.73787414e+09 1.44651157e+09
4.32000000e+00]]
If I create a simple numpy array directly:
new_array = numpy.array([1.5, 4.65, 7.845])
I have no problem, and it prints as follows:
[ 1.5 4.65 7.845]
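Does anyone know what my problem is?
【Comments】:
numpy.set_printoptions controls how numpy arrays are printed. However, there is no option to completely disable scientific notation. It is switching because your values range from 1e-2 to 1e9. If your range were smaller, scientific notation would not be used to display them. Why does it matter how they are displayed with print? If you want to save the data, use savetxt, etc.
Not quite what you are asking, but by using numpy.round (even with high precision) I was able to get rid of scientific notation that looked like 7.00000000e+00 in a matrix reconstructed from an SVD. Because of the scientific notation(?), it previously would not assert equality. I mention it because np.set_printoptions(suppress=True) did not solve this for me.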
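On the equality point in the comment above: a matrix rebuilt from its SVD usually differs from the original by tiny floating-point errors, and that is independent of how the array is printed. A minimal sketch, assuming a small example matrix that is not from the thread, compares with a tolerance instead of rounding first:

```python
import numpy as np

# Example matrix chosen for illustration only (an assumption, not thread data).
A = np.array([[7.0, 2.0],
              [1.0, 3.0]])

# Decompose and rebuild the matrix from its singular value decomposition.
U, s, Vt = np.linalg.svd(A)
reconstructed = U @ np.diag(s) @ Vt

# Exact comparison may fail because of rounding error in the reconstruction.
exact_match = np.array_equal(A, reconstructed)

# Tolerance-based comparison ignores differences at the level of rounding error.
close_match = np.allclose(A, reconstructed)

print(exact_match, close_match)
```

Print options only change the text representation, so they cannot make two slightly different arrays compare equal; np.allclose (or numpy.round, as the commenter did) works on the values themselves.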
【Answer 1】:
This is what you need:
np.set_printoptions(suppress=True)
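Here is the documentation.
【Comments】:
Can you at least provide a summary?
In my case, it still uses scientific notation.
@ZloySmiertniy, use a formatter, as in Eric's answer below. I used np.set_printoptions(formatter={'all': lambda x: str(x)})
【Answer 2】:
Python: force-suppress all exponential notation when printing numpy ndarrays, and wrangle text justification, rounding and print options:
What follows is an explanation of what is going on; scroll to the bottom for the code demos.
Passing the parameter suppress=True to the function set_printoptions only works for numbers that fit in the default 8-character space allotted to them, like this: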
import numpy as np
np.set_printoptions(suppress=True) #prevent numpy exponential
#notation on print, default False
# tiny med large
a = np.array([1.01e-5, 22, 1.2345678e7]) #notice how index 2 is 8
#digits wide
print(a) #prints [ 0.0000101 22. 12345678. ]
However, if you pass in a number more than 8 characters wide, exponential notation is imposed again, like this:
np.set_printoptions(suppress=True)
a = np.array([1.01e-5, 22, 1.2345678e10]) #notice how index 2 is 10
#digits wide, too wide!
#exponential notation where we've told it not to!
print(a) #prints [1.01000000e-005 2.20000000e+001 1.23456780e+10]
numpy has a choice between chopping your number in half, and thus misrepresenting it, or forcing exponential notation; it chooses the latter.
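The switch described above can be checked without touching the global print options. A small sketch, assuming np.array2string with its suppress_small argument behaves like set_printoptions(suppress=True) for a single call (the variable names are mine):

```python
import numpy as np

# Largest value is about 1.2e7: fixed-point notation is still used.
small_range = np.array([1.01e-5, 22, 1.2345678e7])
# Largest value is about 1.2e10: numpy falls back to exponential notation.
large_range = np.array([1.01e-5, 22, 1.2345678e10])

# suppress_small=True mirrors suppress=True, but only for this call.
small_text = np.array2string(small_range, suppress_small=True)
large_text = np.array2string(large_range, suppress_small=True)

print(small_text)
print(large_text)
```

In the NumPy versions I have looked at, the fallback seems to kick in once the largest absolute value reaches about 1e8, which matches the "8 characters" rule of thumb; treat the exact threshold as an assumption.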
set_printoptions(formatter=...) comes to the rescue and lets you specify options for printing and rounding. Tell set_printoptions to just print a bare float:
np.set_printoptions(suppress=True,
    formatter={'float_kind': '{:f}'.format})
a = np.array([1.01e-5, 22, 1.2345678e30]) #notice how index 2 is 30
#digits wide.
#Ok good, no exponential notation in the large numbers:
print(a) #prints [0.000010 22.000000 1234567799999999979944197226496.000000]
We have force-suppressed exponential notation, but the output is not rounded or justified, so specify extra formatting options:
np.set_printoptions(suppress=True,
    formatter={'float_kind': '{:0.2f}'.format}) #float, 2 units
                                                #precision right, 0 on left
a = np.array([1.01e-5, 22, 1.2345678e30]) #notice how index 2 is 30
#digits wide
print(a) #prints [0.00 22.00 1234567799999999979944197226496.00]
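The drawback of force-suppressing all exponential notation in ndarrays is that if your ndarray gets a huge float value near infinity in it and you print it, you are going to get blasted with a page full of digits.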
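One way to soften that drawback is a formatter that only uses fixed-point notation below some magnitude. This is a sketch rather than part of the original answer: bounded_float and its limit of 1e15 are hypothetical names and values picked for illustration.

```python
import numpy as np

def bounded_float(value, limit=1e15):
    """Hypothetical helper: fixed-point below `limit`, exponent form otherwise."""
    if abs(value) < limit:
        return '{:0.2f}'.format(value)
    # Huge values (and nan/inf) fall through to the compact exponent form.
    return '{:0.2e}'.format(value)

a = np.array([1.01e-5, 22, 1.2345678e30])

# Apply the formatter for this call only; global print options stay untouched.
text = np.array2string(a, formatter={'float_kind': bounded_float})
print(text)
```

The same callable can be passed to np.set_printoptions(formatter={'float_kind': bounded_float}) if the behaviour should apply globally.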
Full example, Demo 1:
from pprint import pprint
import numpy as np
#chaotic python list of lists with very different numeric magnitudes
my_list = [[3.74, 5162, 13683628846.64, 12783387559.86, 1.81],
[9.55, 116, 189688622.37, 260332262.0, 1.97],
[2.2, 768, 6004865.13, 5759960.98, 1.21],
[3.74, 4062, 3263822121.39, 3066869087.9, 1.93],
[1.91, 474, 44555062.72, 44555062.72, 0.41],
[5.8, 5006, 8254968918.1, 7446788272.74, 3.25],
[4.5, 7887, 30078971595.46, 27814989471.31, 2.18],
[7.03, 116, 66252511.46, 81109291.0, 1.56],
[6.52, 116, 47674230.76, 57686991.0, 1.43],
[1.85, 623, 3002631.96, 2899484.08, 0.64],
[13.76, 1227, 1737874137.5, 1446511574.32, 4.32],
[13.76, 1227, 1737874137.5, 1446511574.32, 4.32]]
#convert python list of lists to numpy ndarray called my_array
my_array = np.array(my_list)
#This little recursive helper function is meant to convert all nested
#ndarrays to a python list of lists so that pretty printer knows what to do.
def arrayToList(arr):
    #Note: type(np.array) is the type of the np.array factory function, not
    #np.ndarray, so this test is False for my_array and the array is returned
    #unchanged; pprint then prints numpy's own repr using the formatter below.
    if type(arr) == type(np.array):
        #If the passed type is an ndarray then convert it to a list and
        #recursively convert all nested types
        return arrayToList(arr.tolist())
    else:
        #if item isn't an ndarray leave it as is.
        return arr
#suppress exponential notation, define an appropriate float formatter
#specify stdout line width and let pretty print do the work
np.set_printoptions(suppress=True,
    formatter={'float_kind': '{:16.3f}'.format}, linewidth=130)
pprint(arrayToList(my_array))
Prints:
array([[ 3.740, 5162.000, 13683628846.640, 12783387559.860, 1.810],
[ 9.550, 116.000, 189688622.370, 260332262.000, 1.970],
[ 2.200, 768.000, 6004865.130, 5759960.980, 1.210],
[ 3.740, 4062.000, 3263822121.390, 3066869087.900, 1.930],
[ 1.910, 474.000, 44555062.720, 44555062.720, 0.410],
[ 5.800, 5006.000, 8254968918.100, 7446788272.740, 3.250],
[ 4.500, 7887.000, 30078971595.460, 27814989471.310, 2.180],
[ 7.030, 116.000, 66252511.460, 81109291.000, 1.560],
[ 6.520, 116.000, 47674230.760, 57686991.000, 1.430],
[ 1.850, 623.000, 3002631.960, 2899484.080, 0.640],
[ 13.760, 1227.000, 1737874137.500, 1446511574.320, 4.320],
[ 13.760, 1227.000, 1737874137.500, 1446511574.320, 4.320]])
Full example, Demo 2:
import numpy as np
#chaotic python list of lists with very different numeric magnitudes
# very tiny medium size large sized
# numbers numbers numbers
my_list = [[0.000000000074, 5162, 13683628846.64, 1.01e10, 1.81],
[1.000000000055, 116, 189688622.37, 260332262.0, 1.97],
[0.010000000022, 768, 6004865.13, -99e13, 1.21],
[1.000000000074, 4062, 3263822121.39, 3066869087.9, 1.93],
[2.91, 474, 44555062.72, 44555062.72, 0.41],
[5, 5006, 8254968918.1, 7446788272.74, 3.25],
[0.01, 7887, 30078971595.46, 27814989471.31, 2.18],
[7.03, 116, 66252511.46, 81109291.0, 1.56],
[6.52, 116, 47674230.76, 57686991.0, 1.43],
[1.85, 623, 3002631.96, 2899484.08, 0.64],
[13.76, 1227, 1737874137.5, 1446511574.32, 4.32],
[13.76, 1337, 1737874137.5, 1446511574.32, 4.32]]
import sys
#convert python list of lists to numpy ndarray called my_array
my_array = np.array(my_list)
#following two lines do the same thing, showing that np.savetxt can
#correctly handle python lists of lists and numpy 2D ndarrays.
np.savetxt(sys.stdout, my_list, '%19.2f')
np.savetxt(sys.stdout, my_array, '%19.2f')
Prints:
0.00 5162.00 13683628846.64 10100000000.00 1.81
1.00 116.00 189688622.37 260332262.00 1.97
0.01 768.00 6004865.13 -990000000000000.00 1.21
1.00 4062.00 3263822121.39 3066869087.90 1.93
2.91 474.00 44555062.72 44555062.72 0.41
5.00 5006.00 8254968918.10 7446788272.74 3.25
0.01 7887.00 30078971595.46 27814989471.31 2.18
7.03 116.00 66252511.46 81109291.00 1.56
6.52 116.00 47674230.76 57686991.00 1.43
1.85 623.00 3002631.96 2899484.08 0.64
13.76 1227.00 1737874137.50 1446511574.32 4.32
13.76 1337.00 1737874137.50 1446511574.32 4.32
0.00 5162.00 13683628846.64 10100000000.00 1.81
1.00 116.00 189688622.37 260332262.00 1.97
0.01 768.00 6004865.13 -990000000000000.00 1.21
1.00 4062.00 3263822121.39 3066869087.90 1.93
2.91 474.00 44555062.72 44555062.72 0.41
5.00 5006.00 8254968918.10 7446788272.74 3.25
0.01 7887.00 30078971595.46 27814989471.31 2.18
7.03 116.00 66252511.46 81109291.00 1.56
6.52 116.00 47674230.76 57686991.00 1.43
1.85 623.00 3002631.96 2899484.08 0.64
13.76 1227.00 1737874137.50 1446511574.32 4.32
13.76 1337.00 1737874137.50 1446511574.32 4.32
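Notice that rounding is consistent at 2 units of precision, and that exponential notation is suppressed in both the very large e+x and the very small e-x ranges.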
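If the formatted text is needed as a string rather than on stdout, the same np.savetxt call can write to an in-memory buffer. A minimal sketch, assuming a reasonably recent NumPy that accepts text-mode file objects such as io.StringIO; the two-row array is a shortened stand-in for my_list:

```python
import io
import numpy as np

# Shortened stand-in for the nested list used in the demo above.
rows = np.array([[0.000000000074, 5162, 13683628846.64],
                 [13.76, 1337, 1737874137.5]])

# Write into an in-memory text buffer instead of sys.stdout.
buffer = io.StringIO()
np.savetxt(buffer, rows, fmt='%19.2f')
text = buffer.getvalue()

print(text)
```

The result is plain text with one line per row, so it can be logged, compared in tests, or written to a file later.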
【Comments】:
【Answer 3】: For 1D and 2D arrays you can use np.savetxt to print using a specific format string:
>>> import sys
>>> import numpy
>>> x = numpy.arange(20).reshape((4,5))
>>> numpy.savetxt(sys.stdout, x, '%5.2f')
0.00 1.00 2.00 3.00 4.00
5.00 6.00 7.00 8.00 9.00
10.00 11.00 12.00 13.00 14.00
15.00 16.00 17.00 18.00 19.00
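Your options with numpy.set_printoptions or numpy.array2string in v1.3 are pretty clunky and limited (for example, there is no way to suppress scientific notation for large numbers). It looks like this will change in future versions, with numpy.set_printoptions(formatter=..) and numpy.array2string(style=..).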
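For readers on a NumPy version that already has the formatter argument, the savetxt example can be approximated with array2string. This is a sketch under that assumption; I cast the array to float so the 'float_kind' formatter applies, and the style argument mentioned above is left out.

```python
import numpy as np

# Same data as the savetxt example, cast to float so 'float_kind' applies.
x = np.arange(20).reshape((4, 5)).astype(float)

# Format every float with a printf-style pattern, for this call only.
text = np.array2string(x, formatter={'float_kind': lambda v: '%5.2f' % v})
print(text)
```

Unlike savetxt, this keeps the brackets of the array repr and also works for arrays with more than two dimensions.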
【Comments】:
【Answer 4】: You could write a function that converts scientific notation to regular notation, something like
def sc2std(x):
    s = str(x)
    if 'e' in s:
        num,ex = s.split('e')
        if '-' in num:
            negprefix = '-'
        else:
            negprefix = ''
        num = num.replace('-','')
        if '.' in num:
            dotlocation = num.index('.')
        else:
            dotlocation = len(num)
        newdotlocation = dotlocation + int(ex)
        num = num.replace('.','')
        if (newdotlocation < 1):
            return negprefix+'0.'+'0'*(-newdotlocation)+num
        if (newdotlocation > len(num)):
            return negprefix+ num + '0'*(newdotlocation - len(num))+'.0'
        return negprefix + num[:newdotlocation] + '.' + num[newdotlocation:]
    else:
        return s
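A shorter route to the same kind of conversion is to let the standard library do the digit shuffling. This is a different technique from sc2std, offered only as a sketch; to_plain is a hypothetical name, and trailing-zero handling differs slightly from the function above (no '.0' is appended to whole numbers).

```python
from decimal import Decimal

def to_plain(x):
    """Hypothetical alternative to sc2std: render a float without an exponent."""
    # repr() gives the shortest string that round-trips the float; Decimal
    # parses it exactly, and the 'f' format expands any exponent.
    return format(Decimal(repr(float(x))), 'f')

print(to_plain(1.5e-05))
print(to_plain(1e16))
print(to_plain(-2.5e-07))
```

Either function could be plugged into np.set_printoptions(formatter={'float_kind': ...}) to apply it to every element of an array.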
【Comments】: