Python Pandas: Join on unique column values and concatenate
Question (posted 2014-01-25 08:46:30): I have three Pandas DataFrames, df1, df2, and df3, as follows:
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'id': ['one', 'two', 'three'], 'score': [56, 45, 78]})
df2 = pd.DataFrame({'id': ['one', 'five', 'four'], 'score': [35, 81, 90]})
df3 = pd.DataFrame({'id': ['five', 'two', 'six'], 'score': [23, 66, 42]})
How can I join these DataFrames on id and then concatenate their columns together? The desired output is as follows:
# join_and_concatenate by id:
id     score(df1)  score(df2)  score(df3)
one    56          35          NaN
two    45          NaN         66
three  78          NaN         NaN
four   NaN         90          NaN
five   NaN         81          23
six    NaN         NaN         42
I found a related page that talks about merge(), concatenate(), and join(), but I'm not sure whether any of these does what I need.
Answer 1: There is probably a better way with concat, but this should work:
In [48]: pd.merge(df1, df2, how='outer', on='id').merge(df3, how='outer', on='id')
Out[48]:
      id  score_x  score_y  score
0    one       56       35    NaN
1    two       45      NaN     66
2  three       78      NaN    NaN
3   five      NaN       81     23
4   four      NaN       90    NaN
5    six      NaN      NaN     42

[6 rows x 4 columns]
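The score_x and score_y names in that output come from merge's default suffixes, which are applied only to column names that collide within a single merge. The second merge has no collision (score_x and score_y against score), so df3's column keeps the plain name score. A minimal sketch of that behaviour, where the suffix strings '_df1' and '_df2' are illustrative choices rather than anything from the original answer:

```python
import pandas as pd

df1 = pd.DataFrame({'id': ['one', 'two', 'three'], 'score': [56, 45, 78]})
df2 = pd.DataFrame({'id': ['one', 'five', 'four'], 'score': [35, 81, 90]})
df3 = pd.DataFrame({'id': ['five', 'two', 'six'], 'score': [23, 66, 42]})

# Both frames have a 'score' column, so the suffixes are appended to each side.
# '_df1' and '_df2' are illustrative; the default would be ('_x', '_y').
step1 = pd.merge(df1, df2, how='outer', on='id', suffixes=('_df1', '_df2'))

# No column in step1 is called 'score' any more, so df3's 'score'
# is carried over unchanged and no suffix is applied.
step2 = step1.merge(df3, how='outer', on='id')

print(list(step2.columns))
```

This is why a rename step is still needed afterwards if every score column should carry the name of its source frame.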
To get the output you want:
In [54]: merged = pd.merge(df1, df2, how='outer', on='id').merge(df3, how='outer', on='id')
In [55]: merged.set_index('id').rename(columns={'score_x': 'score(df1)', 'score_y': 'score(df2)', 'score': 'score(df3)'})
Out[55]:
       score(df1)  score(df2)  score(df3)
id
one            56          35         NaN
two            45         NaN          66
three          78         NaN         NaN
five          NaN          81          23
four          NaN          90         NaN
six           NaN         NaN          42

[6 rows x 3 columns]
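The answer hints that concat might offer a tidier route. One possible sketch of that idea, not taken from the original answer: index each frame by id, rename its score column up front, and let pd.concat align the columns on the index. The frames dictionary and its labels are assumptions made for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'id': ['one', 'two', 'three'], 'score': [56, 45, 78]})
df2 = pd.DataFrame({'id': ['one', 'five', 'four'], 'score': [35, 81, 90]})
df3 = pd.DataFrame({'id': ['five', 'two', 'six'], 'score': [23, 66, 42]})

# Hypothetical mapping from the desired column label to its source frame.
frames = {'score(df1)': df1, 'score(df2)': df2, 'score(df3)': df3}

# Turn each frame into a Series indexed by id and named after its label.
columns = [df.set_index('id')['score'].rename(label)
           for label, df in frames.items()]

# axis=1 places the Series side by side, aligned on the id index;
# the default join='outer' keeps every id and fills the gaps with NaN.
result = pd.concat(columns, axis=1)

print(result)
```

This form scales to any number of frames without chaining merges, but it relies on id being unique within each frame; with duplicate ids, the merge approach above is the safer choice.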