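Learning NumPy: Complete, Detailed Code for Basic Operations, Data Types, Array Operations, Copies and Views, Indexing, Slicing and Iteration, Shape Manipulation, Universal Functions, and Linear Algebra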
Posted by 报告,今天也有好好学习
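Hi everyone, I'm back with more to share. My last few blog posts (links at the end) have all been about turning my old study materials into posts, and they are packed with practical content that I hope you will find useful.
This post collects the NumPy code you are most likely to need in everyday work, together with the output of each snippet, to help you understand it better.
I have also added comments that explain what each piece of code does. If you would like the documents and code used in this post, feel free to like, bookmark and share it, then leave your email address in a comment; I will send them to you as soon as I see it.
You can also treat this post as a small NumPy handbook: whenever you need a particular piece of NumPy, press Ctrl+F to search for it. If you are not going to read it now, bookmark it for later.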
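1. Basic Operations
1.1 Creating Arrays
import numpy as np # run the cell with Shift + Enter
# np.array converts a Python list into a NumPy array
l = [1,2,3,4,5]
# NumPy array
nd1 = np.array(l) # type part of a name (e.g. arr) and press Tab for auto-completion
print(nd1)
display(nd1) # display() is provided by Jupyter/IPython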
[1 2 3 4 5]
array([1, 2, 3, 4, 5])
nd2 = np.zeros(shape = (3,4),dtype = np.int16) # Shift + Tab shows a function's signature and documentation
nd2
array([[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]], dtype=int16)
nd3 = np.ones(shape = (3,5),dtype=np.float32)
nd3 # in Jupyter, the value of the last line of a cell is displayed automatically
array([[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.]], dtype=float32)
# Three-dimensional array
nd4 = np.full(shape = (3,4,5),fill_value=3.1415926) # an array of the given shape filled with the given value
nd4
array([[[3.1415926, 3.1415926, 3.1415926, 3.1415926, 3.1415926],
[3.1415926, 3.1415926, 3.1415926, 3.1415926, 3.1415926],
[3.1415926, 3.1415926, 3.1415926, 3.1415926, 3.1415926],
[3.1415926, 3.1415926, 3.1415926, 3.1415926, 3.1415926]],
[[3.1415926, 3.1415926, 3.1415926, 3.1415926, 3.1415926],
[3.1415926, 3.1415926, 3.1415926, 3.1415926, 3.1415926],
[3.1415926, 3.1415926, 3.1415926, 3.1415926, 3.1415926],
[3.1415926, 3.1415926, 3.1415926, 3.1415926, 3.1415926]],
[[3.1415926, 3.1415926, 3.1415926, 3.1415926, 3.1415926],
[3.1415926, 3.1415926, 3.1415926, 3.1415926, 3.1415926],
[3.1415926, 3.1415926, 3.1415926, 3.1415926, 3.1415926],
[3.1415926, 3.1415926, 3.1415926, 3.1415926, 3.1415926]]])
nd5 = np.random.randint(0,100,size = 20) # random integers from 0 up to (not including) 100
nd5
array([26, 97, 32, 88, 28, 65, 77, 74, 68, 97, 32, 35, 46, 55, 33, 83, 63,
6, 77, 5])
nd6 = np.random.rand(3,5) # uniform random numbers between 0 and 1
nd6
array([[0.64501117, 0.58522154, 0.90132264, 0.99409845, 0.63923959],
[0.2164321 , 0.53874694, 0.54988461, 0.82581533, 0.42652412],
[0.01025381, 0.49834132, 0.71353756, 0.44433708, 0.05175048]])
nd7 = np.random.randn(3,5) # normal distribution with mean 0 and standard deviation 1
display(nd7)
array([[-1.34974864, -0.35255807, -0.06337357, -0.39990286, 0.18276669],
[ 0.42686337, -0.29675634, -0.66351388, 0.15499455, 0.22191029],
[-2.24510816, -0.25372978, 0.61602861, -0.53877681, 1.8443575 ]])
nd8 = np.random.normal(loc = 175,scale = 10,size = (3,5)) # normal distribution with mean 175 and standard deviation 10
print(nd8)
[[154.35099611 188.63428445 178.86129064 173.99374674 173.92688007]
[173.48768953 185.57252565 172.63843251 192.40089968 177.04776165]
[166.25486758 198.20977267 162.28102209 167.1159521 183.41324182]]
nd9 = np.arange(1,100,step = 10) # arithmetic sequence on a half-open interval: 100 itself is excluded
nd9
array([ 1, 11, 21, 31, 41, 51, 61, 71, 81, 91])
nd10 = np.linspace(1,100,num = 19) # arithmetic sequence on a closed interval; num is the number of elements
nd10
array([ 1. , 6.5, 12. , 17.5, 23. , 28.5, 34. , 39.5, 45. ,
50.5, 56. , 61.5, 67. , 72.5, 78. , 83.5, 89. , 94.5,
100. ])
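The endpoint behaviour of np.arange and np.linspace is easy to mix up, so here is a small sketch (not from the original notebook) that checks it directly. The names stepped, spaced and step are my own.

```python
import numpy as np

# np.arange takes a step; the stop value is excluded
stepped = np.arange(1, 100, 10)

# np.linspace takes the number of points; the stop value is included by default
spaced = np.linspace(1, 100, num=19)

# Spacing between neighbouring points: (100 - 1) / (19 - 1)
step = spaced[1] - spaced[0]

print(stepped[-1], spaced[-1], step)
```

In short, reach for arange when you know the step and for linspace when you know how many points you want.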
1.2 Inspecting Array Attributes
import numpy as np
nd = np.random.randn(5,3)
nd
array([[ 0.72059556, -1.95187973, -0.56373137],
[-1.73917205, -1.16500837, 1.42147895],
[ 0.38178684, -1.44311435, -0.47301186],
[-1.02896549, 0.05223197, 0.05234227],
[ 0.21264316, -1.0635056 , -0.98622601]])
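# Shape of the array: shape = (5,3)
nd.shape
(5, 3)
nd.dtype # element data type: float64, i.e. 64 bits per element (one bit holds a single 0 or 1)
dtype('float64')
nd.size # total number of elements across all dimensions: 5*3 = 15
15
nd.ndim # number of dimensions
2
nd.itemsize # size of one element in bytes: 8
# float64 is 64 bits and 1 byte is 8 bits, so 64/8 = 8 bytes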
8
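These attributes fit together: the memory taken by the data is the number of elements times the size of one element, which NumPy also exposes as nbytes. A minimal sketch, using a zero-filled array of the same shape as a stand-in (total_bytes is my own name):

```python
import numpy as np

nd = np.zeros((5, 3))  # float64 by default, same shape as the example above

# Memory of the data buffer = number of elements * bytes per element
total_bytes = nd.size * nd.itemsize

print(nd.nbytes, total_bytes)
```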
1.3 Reading and Writing Files
nd1 = np.random.randint(0,100,size = (3,5))
nd2 = np.random.randn(3,5)
display(nd1,nd2)
array([[17, 9, 81, 22, 35],
[33, 3, 76, 56, 39],
[51, 7, 98, 68, 76]])
array([[-1.00157588, -1.42826357, -0.0595288 , 1.52754491, -0.36709515],
[-1.27106787, 0.10364645, 0.32066376, -1.66321598, -1.25959691],
[-1.71740637, -0.25518009, -0.81794158, 0.76914636, 1.14322894]])
np.save('./data',nd1) # save a single array to a file; the .npy suffix is added automatically
np.load('./data.npy') # so load it back from data.npy
array([[17, 9, 81, 22, 35],
[33, 3, 76, 56, 39],
[51, 7, 98, 68, 76]])
# Save several arrays into one file
np.savez('./data.npz',a = nd1,abc = nd2) # a and abc are keys that you choose yourself when saving
data = np.load('./data.npz')
data
<numpy.lib.npyio.NpzFile at 0x2352497f7c8>
data['a'] # look up an array by its key (a string in quotes)
array([[17, 9, 81, 22, 35],
[33, 3, 76, 56, 39],
[51, 7, 98, 68, 76]])
data['abc']
array([[-1.00157588, -1.42826357, -0.0595288 , 1.52754491, -0.36709515],
[-1.27106787, 0.10364645, 0.32066376, -1.66321598, -1.25959691],
[-1.71740637, -0.25518009, -0.81794158, 0.76914636, 1.14322894]])
# data['www'] # nothing was saved under this key, so it cannot be retrieved
np.savez_compressed('./data2.npz',x = nd1,y = nd2)
np.load('./data2.npz')['x']
array([[17, 9, 81, 22, 35],
[33, 3, 76, 56, 39],
[51, 7, 98, 68, 76]])
np.savetxt(fname = './data.txt',# file name
X = nd1, # data
fmt='%0.2f', # format
delimiter=',')# delimiter
np.savetxt(fname = './data.csv',# file name
X = nd1, # data
fmt='%d', # format
delimiter=';')# delimiter
np.loadtxt('./data.csv',delimiter=';')
array([[17., 9., 81., 22., 35.],
[33., 3., 76., 56., 39.],
[51., 7., 98., 68., 76.]])
np.loadtxt('./data.txt',delimiter=',')
array([[17., 9., 81., 22., 35.],
[33., 3., 76., 56., 39.],
[51., 7., 98., 68., 76.]])
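To convince yourself that each format gives back what was saved, a round trip is a handy check. The sketch below is my own addition: it writes into a temporary directory so nothing is left on disk, and the file names single.npy, bundle.npz and table.csv are made up for the example.

```python
import os
import tempfile

import numpy as np

ints = np.arange(6).reshape(2, 3)
floats = np.linspace(0.0, 1.0, 6).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp:
    npy_path = os.path.join(tmp, "single.npy")
    npz_path = os.path.join(tmp, "bundle.npz")
    txt_path = os.path.join(tmp, "table.csv")

    # One array per .npy file
    np.save(npy_path, ints)
    loaded_ints = np.load(npy_path)

    # Several arrays per .npz file, each stored under its own key
    np.savez(npz_path, a=ints, abc=floats)
    with np.load(npz_path) as bundle:
        keys = sorted(bundle.files)
        loaded_floats = bundle["abc"]

    # Text files lose the dtype, so ask for integers when reading back
    np.savetxt(txt_path, ints, fmt="%d", delimiter=",")
    loaded_text = np.loadtxt(txt_path, delimiter=",", dtype=int)

print(keys)
```

Opening the .npz file in a with block closes it as soon as the arrays have been read.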
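2. Data Types
# int8, int16, int32, int64; the u in uint8 means unsigned
# float16, float32, float64
# str: string type
# int8 can represent 2**8 = 256 values, signed: -128 ~ 127
# uint8 also represents 256 values, unsigned, so non-negative only: 0 ~ 255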
np.array([2,4,7],dtype = np.int8)
array([2, 4, 7], dtype=int8)
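np.array([-3,-7,255,108,0,256],dtype = np.uint8) # out-of-range values wrap around modulo 256; recent NumPy versions raise OverflowError here instead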
array([253, 249, 255, 108, 0, 0], dtype=uint8)
np.random.randint(0,100,size = 10,dtype = 'int64')
array([ 8, 36, 88, 80, 20, 52, 83, 42, 18, 1], dtype=int64)
nd = np.random.rand(10,2)
nd
array([[0.91420288, 0.83364378],
[0.88974876, 0.33297581],
[0.40118288, 0.07479842],
[0.16937616, 0.26712123],
[0.94525666, 0.34948455],
[0.96704048, 0.61851364],
[0.90099123, 0.3941368 ],
[0.58999302, 0.40133028],
[0.95706455, 0.53783821],
[0.19142784, 0.38579803]])
nd.dtype
dtype('float64')
np.asarray(nd,dtype = 'float16')
array([[0.914 , 0.8335 ],
[0.8896 , 0.333 ],
[0.4011 , 0.07477],
[0.1694 , 0.267 ],
[0.9453 , 0.3494 ],
[0.967 , 0.6187 ],
[0.901 , 0.394 ],
[0.59 , 0.4014 ],
[0.957 , 0.5376 ],
[0.1914 , 0.3857 ]], dtype=float16)
nd.astype(dtype = np.float16)
array([[0.914 , 0.8335 ],
[0.8896 , 0.333 ],
[0.4011 , 0.07477],
[0.1694 , 0.267 ],
[0.9453 , 0.3494 ],
[0.967 , 0.6187 ],
[0.901 , 0.394 ],
[0.59 , 0.4014 ],
[0.957 , 0.5376 ],
[0.1914 , 0.3857 ]], dtype=float16)
nd = np.random.randn(1000,3) # the default data type is float64
np.save('./data1',nd)
np.save('./data2',nd.astype('float16')) # float16 uses a quarter of the bytes per element, so this file is much smaller
nd2 = np.array(list('abcdefghi'))
nd2
array(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'], dtype='<U1')
nd2.dtype
dtype('<U1')
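The integer ranges and the memory saving mentioned in this section can be checked in code. This is a minimal sketch of my own: it uses np.iinfo to query the limits and astype to convert an existing array, which, as far as I know, wraps out-of-range integers silently. All variable names are made up.

```python
import numpy as np

# Smallest and largest values each integer type can hold
int8_range = (np.iinfo(np.int8).min, np.iinfo(np.int8).max)
uint8_range = (np.iinfo(np.uint8).min, np.iinfo(np.uint8).max)

# astype converts an existing integer array; out-of-range values wrap modulo 256
wrapped = np.array([-3, -7, 255, 108, 0, 256]).astype(np.uint8)

# float64 -> float16 cuts the bytes per element from 8 to 2
big = np.zeros((1000, 3))
small = big.astype(np.float16)
ratio = big.nbytes // small.nbytes

print(int8_range, uint8_range, wrapped, ratio)
```

The smaller type trades precision for space, so it suits data where three or four significant digits are enough.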
3. Array Operations
3.1 Basic Arithmetic
# Addition, subtraction, multiplication, division and powers
nd1 = np.random.randint(0,10,size = 5)
nd2 = np.random.randint(0,10,size = 5)
display(nd1,nd2)
array([7, 9, 4, 6, 7])
array([0, 4, 6, 6, 0])
nd3 = nd1 - nd2 # returns a new array; the original arrays are unchanged
nd3 # nd3 holds the result of the operation
array([ 7, 5, -2, 0, 7])
nd1 * nd2 # element-wise multiplication
array([ 0, 36, 24, 36, 0])
nd1 / nd2 # element-wise division
d:\\python\\lib\\site-packages\\ipykernel_launcher.py:1: RuntimeWarning: divide by zero encountered in true_divide
"""Entry point for launching an IPython kernel.
array([ inf, 2.25 , 0.66666667, 1. , inf])
nd1**nd2 # element-wise power
array([ 1, 6561, 4096, 46656, 1], dtype=int32)
2**3
8
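np.power(2,3) # 2 to the power of 3
8
np.power(nd1,nd2) # nd1 to the power of nd2, element by element
array([ 1, 6561, 4096, 46656, 1], dtype=int32)
np.log(100) # natural logarithm: the base is e, about 2.718
4.605170185988092
np.log10(1000) # base-10 logarithm; the result is 3
3.0
np.log2(1024) # base-2 logarithm; the result is 10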
10.0
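The division earlier in this section produced inf and a RuntimeWarning wherever the denominator was 0. If that is not what you want, one possible approach (a sketch of my own, not part of the original notebook) is to silence the warning with np.errstate, or to skip the zero positions with the where and out arguments of np.divide. The names num, den, raw and safe are made up.

```python
import numpy as np

num = np.array([7, 9, 4, 6, 7])
den = np.array([0, 4, 6, 6, 0])

# Option 1: keep the inf results but suppress the warning locally
with np.errstate(divide="ignore"):
    raw = num / den

# Option 2: divide only where the denominator is non-zero;
# positions that are skipped keep the value already stored in out (0.0 here)
safe = np.divide(num, den,
                 out=np.zeros(num.shape, dtype=float),
                 where=den != 0)

print(raw)
print(safe)
```

Which placeholder to use for the skipped positions (0, nan, or something else) depends on what the numbers mean in your data.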
3.2 Comparison Operations
display(nd1,nd2)
array([7, 9, 4, 6, 7])
array([0, 4, 6, 6, 0])
nd1 > nd2
array([ True, True, False, False, True])
nd1 < nd2
array([False, False, True, False, False])
nd1 >= nd2 # True wherever the element of nd1 is greater than or equal to the element at the same position in nd2
array([ True, True, False, True, True])
nd1 == nd2 # a double equals sign is a comparison: are the elements equal?
array([False, False, False, True, False])
3.3 Arithmetic Between Arrays and Scalars
nd1
array([7, 9, 4, 6, 7])
# Plain numbers such as 3, 4, 5 are scalars
nd1 + 10 # 10 is added at every position (broadcasting)
array([17, 19, 14, 16, 17])
nd1 - 1024
array([-1017, -1015, -1020, -1018, -1017])
nd1 * 256
array([1792, 2304, 1024, 1536, 1792])
nd1 / 1024
array([0.00683594, 0.00878906, 0.00390625, 0.00585938, 0.00683594])
# An array can be the denominator, as long as it contains no 0
1/nd1
array([0.14285714, 0.11111111, 0.25 , 0.16666667, 0.14285714])
1/np.array([1,3,0,5]) # 0 is not a valid denominator: the result at that position is inf
d:\\python\\lib\\site-packages\\ipykernel_launcher.py:1: RuntimeWarning: divide by zero encountered in true_divide
"""Entry point for launching an IPython kernel.
array([1. , 0.33333333, inf, 0.2 ])
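Broadcasting is not limited to scalars. As a small illustration of my own (the data and names are made up), a 1-D array with one value per column, or a column vector with one value per row, is stretched across a 2-D array in the same way.

```python
import numpy as np

scores = np.array([[10, 20, 30],
                   [40, 50, 60]])

bonus = np.array([1, 2, 3])        # shape (3,): one value per column
offset = np.array([[100], [200]])  # shape (2, 1): one value per row

by_column = scores + bonus   # bonus is repeated for every row
by_row = scores + offset     # offset is repeated for every column

print(by_column)
print(by_row)
```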
3.4 -=, += and *= Modify the Original Array In Place
display(nd1,nd2) # unchanged so far
array([7, 9, 4, 6, 7])
array([0, 4, 6, 6, 0])
nd1 -= 100 # nothing is printed because the original array was modified in place
nd1
array([-93, -91, -96, -94, -93])
nd2 +=100
nd2
array([100, 104, 106, 106, 100])
nd1 *= 3
nd1
array([-279, -273, -288, -282, -279])
# nd1 /= 10 # fails for an integer array: the float result cannot be stored back into it (float arrays do support /=)
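Whether /= works depends on the dtype of the array, not on arrays in general. The sketch below (my own, with made-up names) shows the failure on an integer array and two ways around it: convert to float first, or use floor division.

```python
import numpy as np

ints = np.array([7, 9, 4])

try:
    ints /= 2      # the float result cannot be written back into an int array
    raised = False
except TypeError:  # NumPy's casting error is a subclass of TypeError
    raised = True

floats = ints.astype(float)
floats /= 2        # fine: a float array can hold the float results

ints //= 2         # floor division produces integers, so it works in place

print(raised, floats, ints)
```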
4. Copies and Views
4.1 No Copy at All
a = np.random.randint(0,10,size = 5)
b = a # plain assignment
display(a,b)
array([7, 9, 3, 2, 9])
array([7, 9, 3, 2, 9])
a is b # True: after assignment, a and b are the same object
True
a[0] = 1024 # changing a also changes b
display(a,b)
array([1024, 9, 3, 2, 9])
array([1024, 9, 3, 2, 9])
4.2 Views (Shallow Copies)
a = np.random.randint(0,100,size = 5)
b = a.view() # a view (shallow copy): a new array object that shares a's data
display(a,b)
array([69, 16, 91, 6, 96])
array([69, 16, 91, 6, 96])
a is b # False: a and b are different objects
False
a.flags.owndata # a owns its data
True
b.flags.owndata # b is a shallow copy of a, so it does not own its data
False
a[0] = 1024
b[1] = 2048 # whichever one is modified, both arrays end up changed
display(a,b)
array([1024, 2048, 91, 6, 96])
array([1024, 2048, 91, 6, 96])
4.3 Deep Copies
a = np.random.randint(-100,0,size = 10)
b = a.copy() # deep copy: a and b are now independent of each other
display(a,b)
array([-99, -5, -16, -5, -11, -72, -57, -69, -66, -96])
array([-99, -5, -16, -5, -11, -72, -57, -69, -66, -96])
display(a is b)
display(a.flags.owndata)
display(b.flags.owndata) # b owns its own data
False
True
True
a[0] = 1024
b[2] = 2048 # the two arrays no longer affect each other
display(a,b)
array([1024, -5, -16, -5, -11, -72, -57, -69, -66, -96])
array([ -99, -5, 2048, -5, -11, -72, -57, -69, -66, -96])
a = np.arange(1e8) # 0 up to (not including) 100 million: a very large array, about 800 MB as float64
a
array([0.0000000e+00, 1.0000000e+00, 2.0000000e+00, ..., 9.9999997e+07,
9.9999998e+07, 9.9999999e+07])
b = a[[1,3,5,7,9,99]].copy() # keep just the elements that are needed; the full array takes a lot of memory
del a # delete the original array to free that memory
b
array([ 1., 3., 5., 7., 9., 99.])
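Besides flags.owndata, NumPy offers two other ways to tell the three cases apart: the base attribute and np.shares_memory. A minimal sketch of my own, with made-up variable names:

```python
import numpy as np

a = np.arange(10)

alias = a          # plain assignment: the very same object
view = a.view()    # new object, shared data
sliced = a[3:7]    # basic slicing also returns a view
copied = a.copy()  # new object with its own data

view_shares = np.shares_memory(a, view)
slice_shares = np.shares_memory(a, sliced)
copy_shares = np.shares_memory(a, copied)

print(alias is a, view.base is a, view_shares, slice_shares, copy_shares)
```

For a view, base points back to the array that owns the data; for an array that owns its data, base is None.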
5. Indexing, Slicing, and Iteration
5.1 Basic Indexing and Slicing
a = np.random.randint(0,30,size = 10)
a
array([28, 19, 9, 26, 28, 4, 14, 24, 8, 3])
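a[3] # a single element
a[[1,3,5]] # several elements at once
array([19, 26, 4])
a[0:3] # half-open: from index 0 up to, but not including, 3
array([28, 19, 9])
a[:3] # with nothing before the colon, the slice starts at index 0
array([28, 19, 9])
a[5:9] # from index 5 up to, but not including, 9
array([ 4, 14, 24, 8])
a[5:] # with nothing after the colon, the slice runs to the end
array([ 4, 14, 24, 8, 3])
a[::2] # every second element
array([28, 9, 28, 14, 8])
a[3::3] # starting at index 3, every third element
array([26, 14, 3])
a[::-1] # a negative step walks backwards, so the array is reversed
array([ 3, 8, 24, 14, 4, 28, 26, 9, 19, 28])
a
array([28, 19, 9, 26, 28, 4, 14, 24, 8, 3])
a[::-2] # reversed, every second element
array([ 3, 24, 4, 26, 19])
a[5::-3] # backwards from index 5, every third element
array([4, 9])
a[1:7:2] # from index 1 up to, but not including, 7, every second element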
array([19, 26, 4])
b = np.random.randint(0,30,size = (10,10))
b # 2-D array: indexing and slicing multi-dimensional data follows the same rules as above
array([[25, 23, 13, 28, 17, 13, 8, 16, 25, 3],
[20, 7, 6, 26, 26, 13, 24, 0, 20, 18],
[ 1, 24, 10, 25, 24, 21, 16, 4, 26, 29],
[12, 0, 18, 20, 9, 16, 14, 19, 19, 20],
[28, 18, 29, 7, 7, 15, 5, 13, 13, 7],
[28, 28, 25, 13, 7, 8, 5, 16, 8, 2],
[27, 9, 2, 25, 14, 7, 26, 5, 14, 11],
[ 6, 14, 10, 20, 24, 28, 10, 0, 24, 12],
[ 8, 21, 22, 21, 24, 6, 25, 21, 12, 26],
[ 0, 19, 8, 5, 20, 1, 3, 3, 15, 27]])
b[1]
array([20, 7, 6, 26, 26, 13, 24, 0, 20, 18])
b[[0,3,5]]
array([[25, 23, 13, 28, 17, 13, 8, 16, 25, 3],
[12, 0, 18, 20, 9, 16, 14, 19, 19, 20],
[28, 28, 25, 13, 7, 8, 5, 16, 8, 2]])
b[1,6]
24
b[3,[2,5,6]] # for multi-dimensional arrays, separate the index for each axis with a comma
array([18, 16, 14])
b[2:7,1::3] # rows: from index 2 up to, but not including, 7; columns: every third one starting at index 1
array([[24, 24, 4],
[ 0, 9, 19],
[18, 7, 13],
[28, 7, 16],
[ 9, 14, 5]])
b[-1,-1] # a negative index counts from the end
27
b[-2,[-2,-3,-4]]
array([12, 21, 25])
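One thing that often surprises people: when both axes are indexed with lists, NumPy pairs the lists up element by element instead of selecting a rectangular block. The sketch below (my own example on a small made-up grid) contrasts that with np.ix_, which builds the row-by-column combination.

```python
import numpy as np

grid = np.arange(16).reshape(4, 4)

# The two lists are paired up: elements (0, 1) and (3, 2)
paired = grid[[0, 3], [1, 2]]

# np.ix_ crosses the lists: rows 0 and 3 with columns 1 and 2
block = grid[np.ix_([0, 3], [1, 2])]

print(paired)
print(block)
```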
5.2 Fancy Indexing and Indexing Tricks
a = np.arange(20)
a
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19])
b = a[3:7] # a slice is a view of a, not a deep copy
b
array([3, 4, 5, 6])
b[0] = 1024
display(a,b)
array([ 0, 1, 2, 1024, 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14, 15, 16, 17, 18, 19])
array([1024, 4, 5, 6])
a = np.arange(20)
# Fancy indexing returns a deep copy of the data
b = a[[3,4,5,6]] # fancy indexing: the index is itself a list or array
display(a,b)
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19])
array([3, 4, 5, 6])
b[0] = 1024
display(a,b)
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19])
array([1024, 4, 5, 6])
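The copy behaviour applies when a fancy index is used to read data. When it appears on the left-hand side of an assignment, the values are written into the original array. A short sketch of my own (names are made up):

```python
import numpy as np

a = np.arange(10)

picked = a[[3, 4, 5]]  # reading with a fancy index gives a copy
picked[0] = 1024       # so this leaves a untouched
unchanged = a[3] == 3

a[[3, 4, 5]] = 0       # assigning through a fancy index writes into a itself

print(unchanged, a)
```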
a = np.random.randint(0,151,size = (100,3)) # scores (0 to 150) of 100 students in 3 exams: Python, Math, English
a
array([[100, 128, 122],
[110, 1, 6],
[120, 100, 108],
[ 57, 112, 145],
[ 46, 121, 130],
[141, 20, 25],
[ 7, 98, 15],
[141, 145, 121],
[ 1, 29, 117],
[ 94, 62, 46],
[ 35, 135, 101],
[ 96, 38, 137],
[ 97, 114, 60],
[121, 113, 38],
[ 4, 85, 107],
[ 13, 79, 12],
[129, 92, 150],
[ 54, 38, 34],
[109, 55, 88],
[ 1, 52, 23],
[ 16, 80, 146],
[122, 72, 126],
[ 39, 76, 52],
[144, 68, 69],
[121, 141, 147],
[141, 71, 110],
[100, 40, 108],
[ 91, 121, 65],
[ 3, 44, 105],
[ 79, 61, 75],
[117, 146, 88],
[ 3, 2, 71],
[ 78, 24, 86],
[125, 62, 93],
[ 53, 88, 132],
[ 96, 75, 36],
[102, 119, 97],
[ 24, 48, 78],
[104, 21, 150],
[ 8, 34, 30],
[108, 56, 58],
[ 22, 144, 100],
[ 53, 115, 94],
[ 52, 104, 9],
[ 59, 2, 4],
[ 85, 48, 138],
[119, 21, 73],
[ 9, 31, 77],
[ 21, 3, 132],
[ 43, 113, 39],
[ 51, 5, 134],
[ 37, 97, 123],
[ 0, 92, 73],
[ 37, 86, 13],
[ 48, 78, 128],
[ 74, 56, 48],
[138, 105, 68],
[129, 129, 103],
[ 48, 42, 14],
[ 50, 102, 123],
[ 6, 97, 16],
[ 22, 88, 122],
[ 98, 23, 137],
[ 95, 74, 20],
[ 20, 111, 132],
[ 67, 9, 28],
[ 74, 1, 79],
[ 62, 8, 27],
[ 24, 12, 22],
[ 58, 111, 130],
[ 29, 23, 28],
[ 10, 30, 38],
[113, 62, 122],
[ 70, 141, 97],
[137, 106, 53],
[ 76, 91, 101],
[128, 22, 101],
[134, 75, 100],
[114, 92, 79],
[103, 20, 145],
[105, 26, 51],
[ 34, 51, 18],
[117, 115, 31],
[ 7, 119, 75],
[ 23, 74, 149],
[ 56, 15, 11],
[143, 73, 148],
[ 11, 22, 26],
[ 18, 113, 120],
[ 37, 150, 21],
[115, 89, 13],
[134, 108, 98],
[115, 58, 4],
[114, 58, 29],
[ 26, 135, 10],
[ 82, 147, 147],
[ 56, 13, 39],
[ 21, 125, 134],
[120, 71, 32],
[143, 46, 124]])
cond = a >= 120 # element-wise comparison
# Filter the data with the condition: every individual score of at least 120 is returned, whichever subject it is in
a[cond]
array([128, 122, 120, 145, 121, 130, 141, 141, 145, 121, 135, 137, 121,
129, 150, 146, 122, 126, 144, 121, 141, 147, 141, 121, 146, 125,
132, 150, 144, 138, 132, 134, 123, 128, 138, 129, 129, 123, 122,
137, 132, 130, 122, 141, 137, 128, 134, 145, 149, 143, 148, 120,
150, 134, 135, 147, 147, 125, 134, 120, 143, 124])
# Booleans behave as numbers: True = 1, False = 0
# Multiply the conditions of the three subjects together
# The result marks students who scored at least 120 in all three subjects
cond2 = cond[:,0]*cond[:,1]*cond[:,2]
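The idea of multiplying the per-subject conditions can be tried on a tiny, hand-written score table. This sketch is my own illustration with made-up scores and names; it also shows two equivalent spellings, the & operator and all(axis=1), which some readers may find clearer.

```python
import numpy as np

# Four hypothetical students, three subjects each
scores = np.array([[130, 125, 121],
                   [150,  90, 140],
                   [119, 120, 120],
                   [120, 120, 120]])

cond = scores >= 120

# True * True = True, anything * False = False
by_product = cond[:, 0] * cond[:, 1] * cond[:, 2]

# The same row-wise "and", written two other ways
by_and = cond[:, 0] & cond[:, 1] & cond[:, 2]
by_all = cond.all(axis=1)

# Use the 1-D mask to select whole rows (students)
top = scores[by_all]

print(by_all)
print(top)
```

Indexing with a 1-D mask over the rows keeps each student's three scores together, unlike indexing with the full 2-D mask, which flattens the matches.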