Pandas Data Structures
First, import the libraries:
```python
import numpy as np
import pandas as pd
```
Series
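1. A Series is a one-dimensional labeled array that can hold any data type. It consists of a data part and an axis-label part (the index).
2. It can be created from a list, dict, ndarray, scalar value, and other data types.
3. It supports value access and vectorized operations.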
From ndarray
```python
s = pd.Series(np.random.randn(5), index='a b c d e'.split(' '))
s
```
a 0.299422
b 0.593082
c -2.120001
d -2.062322
e -1.493702
dtype: float64
```python
s.index
```
Index(['a', 'b', 'c', 'd', 'e'], dtype='object')
```python
s.name = 'test'
s.name
```
'test'
From dict
```python
# Build a dict that maps the labels 'a'..'e' to 0..4
d = {}
index = 'abcde'
for i in range(5):
    d[index[i]] = i

pd.Series(d)
```
a 0
b 1
c 2
d 3
e 4
dtype: int64
```python
pd.Series(d, index='e d c b f'.split(' '))
```
e 4.0
d 3.0
c 2.0
b 1.0
f NaN
dtype: float64
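The float64 dtype in the output above comes from the missing label: 'f' is not a key of the dict, so pandas fills that position with NaN, and NaN forces the integer values to be upcast to float. A minimal sketch of this effect, assuming a small hypothetical dict `d_demo` rather than the `d` built above:

```python
import numpy as np
import pandas as pd

# Hypothetical two-entry dict, used only for this illustration
d_demo = {'a': 0, 'b': 1}

# Without an explicit index, the dict keys become the labels and the ints stay ints
s_int = pd.Series(d_demo)

# With an explicit index, values are looked up by label;
# 'f' has no matching key, so it becomes NaN and the dtype becomes float
s_reindexed = pd.Series(d_demo, index=['b', 'f'])

print(s_int.dtype)
print(s_reindexed.dtype)
print(s_reindexed)
```

If the integer dtype matters, one option is to pass only labels that exist in the dict.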
From scalar value
```python
pd.Series(5., index=['a', 'b', 'c'])
```
a 5.0
b 5.0
c 5.0
dtype: float64
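Item 2 of the Series overview also names plain Python lists as a source, which the examples above do not cover. A short sketch of that case, assuming made-up values and labels (`s_default` and `s_labeled` are illustrative names only):

```python
import pandas as pd

# A list supplies the values; without `index`, pandas assigns the labels 0..n-1
s_default = pd.Series([10, 20, 30])

# Passing `index` attaches explicit labels; its length must match the list
s_labeled = pd.Series([10, 20, 30], index=['x', 'y', 'z'])

print(s_default.index.tolist())
print(s_labeled['y'])
```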
ndarray-like operation
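```python
# Positional access; plain s[0] on a label-indexed Series is deprecated in recent pandas
s.iloc[0]
```
0.29942203654066651
```python
s[:3]
```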
a 0.299422
b 0.593082
c -2.120001
Name: test, dtype: float64
```python
s[s > 0]
```
a 0.299422
b 0.593082
Name: test, dtype: float64
```python
# Select by position; s[[1, 3]] relies on the deprecated positional fallback
s.iloc[[1, 3]]
```
b 0.593082
d -2.062322
Name: test, dtype: float64
```python
np.exp(s)
```
a 1.349079
b 1.809557
c 0.120032
d 0.127158
e 0.224540
Name: test, dtype: float64
dict-like operation
```python
s['a']
```
0.29942203654066651
```python
s['a'] = 1.
s
```
a 1.000000
b 0.593082
c -2.120001
d -2.062322
e -1.493702
Name: test, dtype: float64
```python
'a' in s
```
True
```python
s.get('a')
```
1.0
```python
s.get('f', 1)
```
1
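Unlike `get`, square-bracket access with a label that is not in the index raises `KeyError`, just as with a dict. A small sketch of the difference, assuming a hypothetical Series `s_demo` rather than the `s` used above:

```python
import numpy as np
import pandas as pd

# Hypothetical Series with labels 'a', 'b', 'c'
s_demo = pd.Series(np.arange(3.0), index=['a', 'b', 'c'])

# Bracket access with a missing label raises KeyError
try:
    s_demo['f']
    raised = False
except KeyError:
    raised = True

# get() returns None when the label is missing and no default is given
missing = s_demo.get('f')

# get() returns the supplied default when the label is missing
fallback = s_demo.get('f', np.nan)

print(raised, missing, fallback)
```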
Vectorized operations and label alignment
```python
s + s
```
a 2.000000
b 1.186164
c -4.240002
d -4.124643
e -2.987405
Name: test, dtype: float64
```python
s * 2
```
a 2.000000
b 1.186164
c -4.240002
d -4.124643
e -2.987405
Name: test, dtype: float64
```python
s[1:] + s[:-1]
```
a NaN
b 1.186164
c -4.240002
d -4.124643
e NaN
Name: test, dtype: float64
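The NaN values at a and e appear because the two operands are aligned on the union of their labels: a exists only in `s[:-1]` and e only in `s[1:]`, so neither has a partner to add. One possible way to handle this, sketched with a hypothetical `s_demo` of fixed values, is to supply a fill value with `Series.add` or to drop the unmatched labels:

```python
import pandas as pd

# Hypothetical Series with fixed values so the results are reproducible
s_demo = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=list('abcde'))

left = s_demo[1:]    # labels b, c, d, e
right = s_demo[:-1]  # labels a, b, c, d

# Plain addition aligns on labels; labels present on one side only become NaN
aligned = left + right

# add() with fill_value treats a label missing on one side as 0 before adding
filled = left.add(right, fill_value=0)

# Alternatively, keep only the labels that both operands share
dropped = aligned.dropna()

print(aligned)
print(filled)
print(dropped)
```

Which option fits depends on whether a missing label should count as zero or as unknown.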