Python Pandas: How to add another name of multiindex?
【Title】: Python Pandas: How to add another name of multiindex?
【Posted】: 2021-07-28 00:33:06
【Question】: I am playing with crypto data in Pandas. After merging several data frames, I got this:
timestamp open high low close volume open high low close volume
0 1620202740000 54945.31 54987.01 54945.30 54978.49 118.239 54945.31 54987.01 54945.30 54978.49 4345
1 1620202800000 54978.49 55054.00 54972.04 55027.12 337.619 54945.31 54987.01 54945.30 54978.49 134.239
2 1620202860000 55027.12 55041.05 54950.05 54951.96 131.414 54945.31 54987.01 54945.30 54978.49 14358.239
3 1620202920000 54951.96 55067.36 54951.95 55063.78 176.529 54945.31 54987.01 54945.30 54978.49 1148.239
4 1620202980000 55063.79 55064.00 55000.00 55014.39 107.082 54945.31 54987.01 54945.30 54978.49 18.239
I would like to add another index level on top of the columns, so that it looks like this:
btc btc btc btc btc eth eth eth eth eth
timestamp open high low close volume open high low close volume
0 1620202740000 54945.31 54987.01 54945.30 54978.49 118.239 54945.31 54987.01 54945.30 54978.49 4345
1 1620202800000 54978.49 55054.00 54972.04 55027.12 337.619 54945.31 54987.01 54945.30 54978.49 134.239
2 1620202860000 55027.12 55041.05 54950.05 54951.96 131.414 54945.31 54987.01 54945.30 54978.49 14358.239
3 1620202920000 54951.96 55067.36 54951.95 55063.78 176.529 54945.31 54987.01 54945.30 54978.49 1148.239
4 1620202980000 55063.79 55064.00 55000.00 55014.39 107.082 54945.31 54987.01 54945.30 54978.49 18.239
That way I could easily add more columns like this:
for x in ['btc', 'eth']:
    df.loc[:, (x, 'fast_ema_1min')] = df[x]['close'].rolling(window=1).mean()
    df.loc[:, (x, 'slow_ema_20min')] = df[x]['close'].rolling(window=20).mean()
Can anyone advise? Thanks.
【Solution 1】: For completeness, if you split the column names with expand=True, they expand into a MultiIndex:
df = df.set_index('timestamp')
df.columns = [pre+col for pre,col in zip(['btc_']*5 + ['eth_']*5, df.columns)]
df.columns = df.columns.str.split('_', expand=True)
# btc eth
# open high low close volume open high low close volume
# timestamp
# 1620202740000 54945.31 54987.01 54945.30 54978.49 118.239 54945.31 54987.01 54945.3 54978.49 4345.000
# 1620202800000 54978.49 55054.00 54972.04 55027.12 337.619 54945.31 54987.01 54945.3 54978.49 134.239
# 1620202860000 55027.12 55041.05 54950.05 54951.96 131.414 54945.31 54987.01 54945.3 54978.49 14358.239
# 1620202920000 54951.96 55067.36 54951.95 55063.78 176.529 54945.31 54987.01 54945.3 54978.49 1148.239
# 1620202980000 55063.79 55064.00 55000.00 55014.39 107.082 54945.31 54987.01 54945.3 54978.49 18.239
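The prefix-and-split approach above can be tried on a reduced frame. The following is only a minimal sketch: the two-field frame and its values are made up for illustration and stand in for the merged data from the question.

```python
import pandas as pd

# Hypothetical miniature of the merged frame: duplicate column names, made-up values.
df = pd.DataFrame(
    [[1620202740000, 54945.31, 118.239, 3400.10, 4345.0],
     [1620202800000, 54978.49, 337.619, 3410.20, 134.239]],
    columns=["timestamp", "close", "volume", "close", "volume"],
)
df = df.set_index("timestamp")

# Prefix every column with its asset name, then split on the underscore.
prefixes = ["btc_"] * 2 + ["eth_"] * 2
df.columns = [pre + col for pre, col in zip(prefixes, df.columns)]
df.columns = df.columns.str.split("_", expand=True)

print(df.columns.tolist())
```

Note that this relies on the original column names containing no underscore of their own. A name such as fast_ema_1min would be split into more than two levels.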
【Solution 2】: You can create the MultiIndex in a couple of ways, for example with MultiIndex.from_arrays:
import pandas as pd

new_columns = pd.MultiIndex.from_arrays([
    (["btc"] * 5) + (["eth"] * 5),
    df.columns[1:],  # exclude "timestamp" from our new columns
])
new_df = df.set_index("timestamp").set_axis(new_columns, axis=1)
print(new_df)
btc eth
open high low close volume open high low close volume
timestamp
1620202740000 54945.31 54987.01 54945.30 54978.49 118.239 54945.31 54987.01 54945.3 54978.49 4345.000
1620202800000 54978.49 55054.00 54972.04 55027.12 337.619 54945.31 54987.01 54945.3 54978.49 134.239
1620202860000 55027.12 55041.05 54950.05 54951.96 131.414 54945.31 54987.01 54945.3 54978.49 14358.239
1620202920000 54951.96 55067.36 54951.95 55063.78 176.529 54945.31 54987.01 54945.3 54978.49 1148.239
1620202980000 55063.79 55064.00 55000.00 55014.39 107.082 54945.31 54987.01 54945.3 54978.49 18.239
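To show what the extra level makes possible, here is a small sketch of selecting by asset or by field once the columns are a MultiIndex. The reduced frame, its values, and the level names "asset" and "field" are assumptions added for illustration; they are not part of the original answer.

```python
import pandas as pd

# Hypothetical miniature of the merged frame with made-up values.
df = pd.DataFrame(
    [[1620202740000, 54945.31, 118.239, 3400.10, 4345.0],
     [1620202800000, 54978.49, 337.619, 3410.20, 134.239]],
    columns=["timestamp", "close", "volume", "close", "volume"],
)

new_columns = pd.MultiIndex.from_arrays(
    [["btc"] * 2 + ["eth"] * 2, df.columns[1:]],
    names=["asset", "field"],  # level names are an assumption, not in the answer
)
new_df = df.set_index("timestamp").set_axis(new_columns, axis=1)

btc = new_df["btc"]                                  # all columns of one asset
closes = new_df.xs("close", axis=1, level="field")   # one field across all assets
print(closes)
```

Naming the levels is optional, but it lets xs refer to a level by name instead of by position.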
Alternatively, you can use MultiIndex.from_product like this:
new_columns = pd.MultiIndex.from_product([
    ["btc", "eth"],
    ["open", "high", "low", "close", "volume"],
])
# same as above
new_df = df.set_index("timestamp").set_axis(new_columns, axis=1)
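With the two-level columns in place, per-asset derived columns of the kind the question asks about can be added with tuple keys. This is a sketch under stated assumptions: the frame, its values, and the column name "sma_2" are hypothetical, and a window of 2 is used only so that the tiny example produces non-NaN results.

```python
import pandas as pd

# Hypothetical frame built directly with MultiIndex columns; values are made up.
new_columns = pd.MultiIndex.from_product([["btc", "eth"], ["close", "volume"]])
df = pd.DataFrame(
    [[1.0, 10.0, 100.0, 5.0],
     [2.0, 20.0, 200.0, 6.0],
     [3.0, 30.0, 300.0, 7.0]],
    index=pd.Index([1620202740000, 1620202800000, 1620202860000], name="timestamp"),
    columns=new_columns,
)

# Add one derived column per asset, addressed by an (asset, field) tuple.
for asset in ["btc", "eth"]:
    df[(asset, "sma_2")] = df[(asset, "close")].rolling(window=2).mean()

print(df[("btc", "sma_2")].tolist())
```

rolling(...).mean() computes a simple moving average. If an exponential moving average is what the column names in the question intend, ewm(span=...).mean() is the pandas call for that.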
【Comments】:
Thanks @Cameron, it works like a charm :)