panda2
Posted 小麦粒
'''pandas Index objects are responsible for holding the axis labels, e.g. those of a Series'''
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
obj = Series(range(3), index=['a', 'b', 'c'])
index = obj.index
index
index[1:]
'''Index objects are immutable, so item assignment raises TypeError'''
index[1] = 'd'
'''immutability lets an Index be shared safely; a new one can be built with pd.Index'''
index = pd.Index(np.arange(3))
obj2 = Series([1.5,-2.5,0],index=index)
obj2
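'''inspecting an index: use `is` to test identity and `in` to test membership'''
obj2.index is index
'''frame3 is not defined in this post; it is assumed to be a DataFrame with state-name columns'''
'Ohio' in frame3.columns
'2002' in obj2.index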
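The checks above can be tried in a small self-contained sketch. It rebuilds `index` and `obj2` as in the notes; the flag names (`mutated`, `has_one`, `has_2002`) are only illustrative. Whether `obj2.index is index` evaluates to `True` may depend on the pandas version, so the sketch also compares the labels with `equals`.

```python
import numpy as np
import pandas as pd

index = pd.Index(np.arange(3))
obj2 = pd.Series([1.5, -2.5, 0], index=index)

# Item assignment on an Index raises TypeError because an Index is immutable.
try:
    index[1] = 10
    mutated = True
except TypeError:
    mutated = False

# `is` tests object identity; `equals` compares the labels themselves.
same_object = obj2.index is index
same_labels = obj2.index.equals(index)

# `in` tests membership among the labels (here the integers 0, 1, 2).
has_one = 1 in obj2.index
has_2002 = '2002' in obj2.index

print(mutated, same_object, same_labels, has_one, has_2002)
```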
'''Essential functionality'''
'''reindexing'''
obj = Series([4.5, 7.2, -5.3, 3.6], index=['d', 'b', 'a', 'c'])
obj2 = obj.reindex(['a', 'b', 'c', 'd', 'e'])
obj2
'''fill missing values with a constant'''
obj.reindex(['a', 'b', 'c', 'd', 'e'], fill_value=0.0)
'''for ordered data, forward-fill the missing values'''
obj3 = Series(['blue', 'green', 'black'], index=[0, 2, 4])
obj3.reindex(np.arange(5), method='ffill')
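To make the three reindexing variants easier to compare, here is a minimal runnable sketch using the same data as the notes. The result names (`with_nan`, `with_zero`, `filled`) are made up for illustration.

```python
import numpy as np
import pandas as pd

obj = pd.Series([4.5, 7.2, -5.3, 3.6], index=['d', 'b', 'a', 'c'])
new_labels = ['a', 'b', 'c', 'd', 'e']

# Plain reindex: existing values are rearranged, the new label 'e' gets NaN.
with_nan = obj.reindex(new_labels)

# fill_value replaces the NaN that reindexing would introduce.
with_zero = obj.reindex(new_labels, fill_value=0.0)

# method='ffill' carries the last seen value forward into the gaps.
obj3 = pd.Series(['blue', 'green', 'black'], index=[0, 2, 4])
filled = obj3.reindex(np.arange(5), method='ffill')

print(with_nan)
print(with_zero)
print(filled)
```

Forward filling needs a monotonic index, which is why it suits ordered data.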
'''reindex can alter the rows, the columns, or both in a DataFrame'''
frame = DataFrame(np.arange(9).reshape(3, 3), index=['a', 'b', 'c'], columns=['Ohio', 'Texas', 'California'])
frame.reindex(['a', 'b', 'c', 'd'])
frame.reindex(columns=['Ohio', 'Texas', 'California', 'NewYork'])
months = ['APR', 'MAY', 'JUN', 'JUL', 'AUG']
frame.reindex(columns=months)
label = ['a', 'b', 'c', 'd', 'e']
states = ['Ohio', 'Texas', 'California', 'NewYork']
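'''when only row labels are passed, method='ffill' fills along the rows (axis 0)'''
frame.reindex(label, method='ffill')
'''select a sub-matrix; .ix was removed in pandas 1.0 and .loc raises KeyError for missing labels, so reindex is used'''
frame.reindex(index=['a', 'b', 'd'], columns=states)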
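The sketch below shows row, column, and combined reindexing on the same `frame`, assuming a pandas version of 1.0 or later, where `.loc` raises `KeyError` for labels that are not present. The names `rows`, `cols`, `both`, and `loc_raises` are illustrative.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(9).reshape(3, 3),
                     index=['a', 'b', 'c'],
                     columns=['Ohio', 'Texas', 'California'])
states = ['Ohio', 'Texas', 'California', 'NewYork']

# Reindex the rows only: the new row 'd' is all NaN.
rows = frame.reindex(['a', 'b', 'c', 'd'])

# Reindex the columns only: the new column 'NewYork' is all NaN.
cols = frame.reindex(columns=states)

# Reindex both axes at once to pull out a sub-matrix with new labels.
both = frame.reindex(index=['a', 'b', 'd'], columns=states)

# .loc does not invent missing labels; it raises KeyError instead.
try:
    frame.loc[['a', 'b', 'd'], states]
    loc_raises = False
except KeyError:
    loc_raises = True

print(both)
print(loc_raises)
```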
'''dropping entries from an axis'''
obj = Series(np.arange(5.), index=['a', 'b', 'c', 'd', 'e'])
new_obj = obj.drop('c')
new_obj
'''dropping from a DataFrame'''
data = DataFrame(np.arange(16).reshape(4, 4), index=['Ohio', 'Colorado', 'Utah', 'New York'], columns=['one', 'two', 'three', 'four'])
'''drop rows by index label'''
data.drop(['Colorado', 'Utah'])
'''drop a column (axis=1)'''
data.drop('two', axis=1)
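One point worth checking in a quick sketch: `drop` returns a new object and leaves the original untouched. The result names below are illustrative.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(16).reshape(4, 4),
                    index=['Ohio', 'Colorado', 'Utah', 'New York'],
                    columns=['one', 'two', 'three', 'four'])

# Drop rows by label (axis=0 is the default).
rows_dropped = data.drop(['Colorado', 'Utah'])

# Drop a column by passing axis=1.
col_dropped = data.drop('two', axis=1)

print(rows_dropped)
print(col_dropped)
print(data.shape)  # the original frame still has all rows and columns
```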
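'''indexing, selection, and filtering'''
obj = Series(np.arange(4.), index=['a', 'b', 'c', 'd'])
'''a Series can be indexed like an array, by position or by label, for one value or several'''
obj['b']
'''use iloc for integer positions; plain obj[1] on a labeled Series is deprecated in recent pandas'''
obj.iloc[1]
obj[1:2]
obj[['a', 'c', 'd']]
obj.iloc[[1, 3]]
obj[obj < 2]
'''label slices include the end point'''
obj['b':'c'] = 5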
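The difference between label-based and position-based selection on a Series can be seen in this short sketch. It uses `iloc` for every positional access, which is an editorial choice rather than the original notes' style; the result names are illustrative.

```python
import numpy as np
import pandas as pd

obj = pd.Series(np.arange(4.), index=['a', 'b', 'c', 'd'])

by_label = obj['b']            # select by label
by_pos = obj.iloc[1]           # select by integer position
pos_slice = obj.iloc[1:2]      # positional slice: end point excluded
label_slice = obj['b':'c']     # label slice: end point included
mask = obj[obj < 2]            # boolean filtering

# Assigning through a label slice sets every element in the slice.
obj['b':'c'] = 5

print(by_label, by_pos)
print(list(pos_slice.index), list(label_slice.index))
print(obj)
```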
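data = DataFrame(np.arange(16).reshape(4, 4), index=['Ohio', 'Colorado', 'Utah', 'New York'], columns=['one', 'two', 'three', 'four'])
'''indexing with [] selects columns, but only along a single dimension'''
data['two']
data[['three', 'one']]
'''.ix was removed in pandas 1.0; use .loc for labels and .iloc for integer positions'''
data.loc['Ohio']
data[data['three'] > 5]
data[:2]
'''set every value in data that is less than 5 to 0'''
data[data < 5] = 0
'''select values by label or by position'''
data.loc['Colorado', 'two']
data.loc['Colorado', ['two', 'three']]
data.loc[['Colorado', 'Utah'], ['three', 'four']]
data.iloc[2]
data.loc[:'Utah', 'two']
data.iloc[:2]['two']
data.loc[data.three > 5].iloc[:, :3]
'''select rows by label and reorder columns by position'''
data.loc[['Colorado', 'Utah'], data.columns[[3, 0, 1]]]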
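Since `.ix` is no longer available in current pandas, the following sketch shows one possible way to express the same selections with `.loc` (labels) and `.iloc` (positions). The translations of the mixed label/position cases are assumptions about what the original `.ix` calls intended, and the result names are illustrative.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(16).reshape(4, 4),
                    index=['Ohio', 'Colorado', 'Utah', 'New York'],
                    columns=['one', 'two', 'three', 'four'])

cell = data.loc['Colorado', 'two']        # one value by row and column label
row_by_pos = data.iloc[2]                 # third row by position
label_slice = data.loc[:'Utah', 'two']    # label slice, end point included

# Boolean row filter by label logic, then the first three columns by position.
mixed = data.loc[data.three > 5].iloc[:, :3]

# Rows by label, columns picked and reordered by position.
reordered = data.loc[['Colorado', 'Utah'], data.columns[[3, 0, 1]]]

print(cell)
print(row_by_pos.name)
print(label_slice.tolist())
print(mixed)
print(reordered)
```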
'''arithmetic and data alignment'''
s1 = Series([7.3, -2.5, 3.4, 1.5], index=['a', 'c', 'd', 'e'])
s2 = Series([-2.1, 3.6, -1.5, 4, 3.1], index=['a', 'c', 'e', 'f', 'g'])
'''labels that do not overlap produce NaN in the result'''
s1+s2
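A short sketch of the alignment rule: the result index is the union of both indexes, and only labels present in both Series get a numeric value. The name `total` is illustrative.

```python
import pandas as pd

s1 = pd.Series([7.3, -2.5, 3.4, 1.5], index=['a', 'c', 'd', 'e'])
s2 = pd.Series([-2.1, 3.6, -1.5, 4, 3.1], index=['a', 'c', 'e', 'f', 'g'])

# Values are matched by label, not by position.
total = s1 + s2

print(total)
# 'a', 'c', 'e' exist in both Series; 'd', 'f', 'g' exist in only one and become NaN.
print(total.isna())
```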
'''DataFrame'''
df1 = DataFrame(np.arange(9.).reshape(3, 3), columns=list('bcd'), index=['Ohio', 'Texas', 'Colorado'])
df2 = DataFrame(np.arange(12.).reshape(4, 3), columns=list('bde'), index=['Utah', 'Ohio', 'Texas', 'Oregon'])
df1+df2
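'''with +, a cell missing from either frame is NaN in the result'''
df1.add(df2, fill_value=0)
'''with fill_value=0, a cell present in only one frame is treated as 0 in the other; cells missing from both stay NaN'''
'''reindex'''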
df1.reindex(columns=df2.columns,fill_value=0)
df1 = DataFrame(np.arange(12.).reshape(3, 4), columns=list('abcd'))
df2 = DataFrame(np.arange(20.).reshape(4, 5), columns=list('abcde'))
df1.add(df2,fill_value=0)
df1.mul(df2,fill_value=0)
df1.div(df2,fill_value=0)
df1.sub(df2,fill_value=0)
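The effect of `fill_value` is easiest to see by counting missing cells. This sketch reuses the state-labeled frames from earlier in the notes; `plain` and `filled` are illustrative names.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(9.).reshape(3, 3), columns=list('bcd'),
                   index=['Ohio', 'Texas', 'Colorado'])
df2 = pd.DataFrame(np.arange(12.).reshape(4, 3), columns=list('bde'),
                   index=['Utah', 'Ohio', 'Texas', 'Oregon'])

# Plain addition: only cells present in both frames get a value.
plain = df1 + df2

# fill_value=0: a cell present in just one frame is added to 0;
# a cell absent from both frames is still NaN.
filled = df1.add(df2, fill_value=0)

print(plain)
print(filled)
print(int(plain.notna().sum().sum()), int(filled.isna().sum().sum()))
```

The same `fill_value` argument is accepted by `sub`, `mul`, and `div`.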