pandas: merging two multi-level Series
【Title】pandas merging two multi-level series 【Posted】2016-11-24 13:01:13
【Question】I have two multi-level Series and want to merge them on both index levels. The first Series looks like this:
                          # of restaurants
BORO     CUISINE
BRONX    American                      425
         Chinese                       330
         Pizza                         206
BROOKLYN American                     1254
         Chinese                       750
         Cafe/Coffee/Tea               350
The second has more rows and looks like this:
                          # of votes
BORO     CUISINE
BRONX    American               2425
         Caribbean               320
         Chinese                3130
         Pizza                  3336
BROOKLYN American              21254
         Caribbean              2320
         Chinese                7250
         Cafe/Coffee/Tea        3350
         Pizza                 13336
【Answer 1】Setup:
import pandas as pd

# Tuple keys become the two levels of a MultiIndex.
s1 = pd.Series({('BRONX', 'American'): 425, ('BRONX', 'Chinese'): 330, ('BRONX', 'Pizza'): 206,
                ('BROOKLYN', 'American'): 1254, ('BROOKLYN', 'Chinese'): 750,
                ('BROOKLYN', 'Cafe/Coffee/Tea'): 350})
s2 = pd.Series({('BRONX', 'American'): 2425, ('BRONX', 'Caribbean'): 320, ('BRONX', 'Chinese'): 3130,
                ('BRONX', 'Pizza'): 3336, ('BROOKLYN', 'American'): 21254,
                ('BROOKLYN', 'Caribbean'): 2320, ('BROOKLYN', 'Chinese'): 7250,
                ('BROOKLYN', 'Cafe/Coffee/Tea'): 3350, ('BROOKLYN', 'Pizza'): 13336})
s1 = s1.rename_axis(['BORO','CUISINE']).rename('restaurants')
s2 = s2.rename_axis(['BORO','CUISINE']).rename('votes')
print(s1)
BORO      CUISINE
BRONX     American            425
          Chinese             330
          Pizza               206
BROOKLYN  American           1254
          Chinese             750
          Cafe/Coffee/Tea     350
Name: restaurants, dtype: int64
print(s2)
BORO      CUISINE
BRONX     American            2425
          Caribbean            320
          Chinese             3130
          Pizza               3336
BROOKLYN  American           21254
          Caribbean           2320
          Chinese             7250
          Cafe/Coffee/Tea     3350
          Pizza              13336
Name: votes, dtype: int64
If you need an inner join, use concat with the join parameter:
print(pd.concat([s1, s2], axis=1, join='inner'))
                          restaurants  votes
BORO     CUISINE
BRONX    American                 425   2425
         Chinese                  330   3130
         Pizza                    206   3336
BROOKLYN American                1254  21254
         Cafe/Coffee/Tea          350   3350
         Chinese                  750   7250
# join='outer' is the default, so it can be omitted
print(pd.concat([s1, s2], axis=1))
                          restaurants  votes
BORO     CUISINE
BRONX    American               425.0   2425
         Caribbean                NaN    320
         Chinese                330.0   3130
         Pizza                  206.0   3336
BROOKLYN American              1254.0  21254
         Cafe/Coffee/Tea        350.0   3350
         Caribbean                NaN   2320
         Chinese                750.0   7250
         Pizza                    NaN  13336
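The difference between the two joins is easier to see by counting rows and missing values. The following is a minimal sketch, assuming the same data as in the setup (rebuilt here so the snippet runs on its own). The names `inner`, `outer`, and `only_in_s2` are illustrative and not part of the original answer.

```python
import pandas as pd

names = ["BORO", "CUISINE"]
s1 = pd.Series({
    ("BRONX", "American"): 425, ("BRONX", "Chinese"): 330, ("BRONX", "Pizza"): 206,
    ("BROOKLYN", "American"): 1254, ("BROOKLYN", "Chinese"): 750,
    ("BROOKLYN", "Cafe/Coffee/Tea"): 350,
}).rename_axis(names).rename("restaurants")
s2 = pd.Series({
    ("BRONX", "American"): 2425, ("BRONX", "Caribbean"): 320, ("BRONX", "Chinese"): 3130,
    ("BRONX", "Pizza"): 3336, ("BROOKLYN", "American"): 21254,
    ("BROOKLYN", "Caribbean"): 2320, ("BROOKLYN", "Chinese"): 7250,
    ("BROOKLYN", "Cafe/Coffee/Tea"): 3350, ("BROOKLYN", "Pizza"): 13336,
}).rename_axis(names).rename("votes")

# Inner join keeps only the (BORO, CUISINE) pairs present in both Series.
inner = pd.concat([s1, s2], axis=1, join="inner")
# Outer join (the default) keeps every pair found in either Series.
outer = pd.concat([s1, s2], axis=1)

# Pairs that exist only in s2; these are the rows that get NaN for restaurants.
only_in_s2 = s2.index.difference(s1.index)

print(len(inner), len(outer))             # 6 9
print(list(only_in_s2))
print(outer["restaurants"].isna().sum())  # 3
print(outer["restaurants"].dtype)         # float, because NaN cannot be stored in an int64 column
```

If integer counts are needed after an outer join, one option is to fill the gaps first, for example `outer["restaurants"].fillna(0).astype(int)`, provided that treating a missing pair as zero restaurants is acceptable for the analysis.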
Another solution is to use merge together with reset_index:
# how='inner' is the default, so it can be omitted
print(pd.merge(s1.reset_index(), s2.reset_index(), on=['BORO', 'CUISINE']))
       BORO          CUISINE  restaurants  votes
0     BRONX         American          425   2425
1     BRONX          Chinese          330   3130
2     BRONX            Pizza          206   3336
3  BROOKLYN         American         1254  21254
4  BROOKLYN          Chinese          750   7250
5  BROOKLYN  Cafe/Coffee/Tea          350   3350
# outer join
print(pd.merge(s1.reset_index(), s2.reset_index(), on=['BORO', 'CUISINE'], how='outer'))
       BORO          CUISINE  restaurants  votes
0     BRONX         American        425.0   2425
1     BRONX          Chinese        330.0   3130
2     BRONX            Pizza        206.0   3336
3  BROOKLYN         American       1254.0  21254
4  BROOKLYN          Chinese        750.0   7250
5  BROOKLYN  Cafe/Coffee/Tea        350.0   3350
6     BRONX        Caribbean          NaN    320
7  BROOKLYN        Caribbean          NaN   2320
8  BROOKLYN            Pizza          NaN  13336
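The merge result has a flat integer index, while the concat result keeps the MultiIndex. The sketch below, which is an illustration rather than part of the original answer, restores the index with `set_index` and checks that both approaches yield the same values. It assumes the same data as the setup, rebuilt so the snippet is self-contained; `merged_indexed` and `concatenated` are hypothetical names.

```python
import pandas as pd

keys = ["BORO", "CUISINE"]
s1 = pd.Series({
    ("BRONX", "American"): 425, ("BRONX", "Chinese"): 330, ("BRONX", "Pizza"): 206,
    ("BROOKLYN", "American"): 1254, ("BROOKLYN", "Chinese"): 750,
    ("BROOKLYN", "Cafe/Coffee/Tea"): 350,
}).rename_axis(keys).rename("restaurants")
s2 = pd.Series({
    ("BRONX", "American"): 2425, ("BRONX", "Caribbean"): 320, ("BRONX", "Chinese"): 3130,
    ("BRONX", "Pizza"): 3336, ("BROOKLYN", "American"): 21254,
    ("BROOKLYN", "Caribbean"): 2320, ("BROOKLYN", "Chinese"): 7250,
    ("BROOKLYN", "Cafe/Coffee/Tea"): 3350, ("BROOKLYN", "Pizza"): 13336,
}).rename_axis(keys).rename("votes")

# merge works on columns, so the index levels are first turned into columns.
merged = pd.merge(s1.reset_index(), s2.reset_index(), on=keys, how="outer")

# Put the key columns back into the index and sort both results the same way,
# since the two approaches do not promise the same row order.
merged_indexed = merged.set_index(keys).sort_index()
concatenated = pd.concat([s1, s2], axis=1).sort_index()

# Raises AssertionError if the two frames differ in labels or values.
# check_dtype=False compares values only and ignores column dtype differences.
pd.testing.assert_frame_equal(merged_indexed, concatenated, check_dtype=False)
print(merged_indexed.shape)  # (9, 2)
```

Choosing between the two is mostly a matter of the desired output shape: concat keeps the MultiIndex, whereas merge returns the keys as ordinary columns.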