Pandas changing order of columns after data retrieval
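【Title】: Pandas changing order of columns after data retrieval 【Posted】: 2017-06-16 23:19:30 【Question】: I want to rename the columns of a pandas DataFrame, but I found that the column order changes after the data is retrieved. The following code specifies sector ETF symbols and fetches the data from Yahoo Finance.
The problem is that once I run the code, 'XLY', for example, is no longer the first series in the DataFrame, so I can't just run sec_perf.columns = ['Name1', 'Name2', etc] as I normally would, because it would not name the columns correctly. What am I messing up here?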
import pandas as pd
import pandas_datareader.data as web
import datetime

end = datetime.date.today()

secs = ['XLY', 'XLP', 'XLE',
        'XLF', 'XLV', 'XLI',
        'XLB', 'XLRE', 'XLK', 'XLU']

sec_perf = web.DataReader(secs, 'yahoo',
                          start=datetime.datetime(2016, 12, 31),
                          end=end)['Adj Close']
【Answer 1】: A solution using reindex:
print(sec_perf.reindex(columns=secs))
XLY XLP XLE XLF XLV XLI \
Date
2017-01-03 81.879997 51.900002 76.169998 23.510000 69.839996 62.590000
2017-01-04 82.970001 51.900002 76.010002 23.700001 70.389999 62.959999
2017-01-05 82.910004 52.070000 75.820000 23.459999 70.750000 62.779999
2017-01-06 83.320000 52.119999 75.889999 23.540001 70.949997 63.139999
2017-01-09 83.250000 51.700001 74.790001 23.379999 71.250000 62.650002
2017-01-10 83.550003 51.439999 74.110001 23.430000 71.500000 62.910000
2017-01-11 83.730003 51.540001 74.910004 23.580000 70.779999 63.240002
2017-01-12 83.650002 51.490002 74.599998 23.379999 70.849998 62.980000
2017-01-13 83.959999 51.520000 74.379997 23.510000 70.919998 63.220001
2017-01-17 84.099998 52.250000 74.839996 22.950001 70.559998 62.730000
2017-01-18 83.949997 52.430000 74.669998 23.139999 70.470001 62.970001
2017-01-19 83.690002 52.240002 74.260002 23.040001 70.019997 63.430000
2017-01-20 83.930000 52.580002 74.540001 23.150000 69.839996 63.439999
2017-01-23 83.989998 52.560001 73.750000 23.000000 69.550003 63.090000
2017-01-24 84.680000 52.910000 74.559998 23.290001 69.070000 63.720001
2017-01-25 85.199997 52.900002 74.949997 23.680000 69.720001 64.389999
2017-01-26 85.330002 52.669998 75.010002 23.740000 69.180000 64.540001
2017-01-27 85.059998 52.380001 74.230003 23.650000 69.750000 64.489998
2017-01-30 84.959999 52.340000 72.870003 23.459999 69.410004 63.939999
XLB XLRE XLK XLU
Date
2017-01-03 49.990002 30.850000 48.790001 48.450001
2017-01-04 50.720001 31.240000 48.959999 48.630001
2017-01-05 50.570000 31.400000 49.040001 48.680000
2017-01-06 50.619999 31.400000 49.400002 48.830002
2017-01-09 50.610001 31.200001 49.389999 48.189999
2017-01-10 50.639999 30.809999 49.400002 48.040001
2017-01-11 51.049999 30.639999 49.630001 48.540001
2017-01-12 50.950001 30.760000 49.509998 48.580002
2017-01-13 50.869999 30.690001 49.660000 48.509998
2017-01-17 50.639999 30.940001 49.470001 49.040001
2017-01-18 50.959999 31.010000 49.599998 48.980000
2017-01-19 50.639999 30.709999 49.529999 48.549999
2017-01-20 51.090000 30.900000 49.799999 48.639999
2017-01-23 51.189999 31.090000 49.889999 48.389999
2017-01-24 52.509998 31.100000 50.200001 48.380001
2017-01-25 52.860001 30.910000 50.680000 48.380001
2017-01-26 53.000000 30.889999 50.540001 48.400002
2017-01-27 52.810001 30.629999 50.740002 48.389999
2017-01-30 52.270000 30.459999 50.330002 48.430000
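To try the reindex approach without downloading anything, here is a minimal sketch. It assumes a small hand-built frame standing in for sec_perf, with its columns already in alphabetical order as in the question; the prices are copied from the first two rows above, and the new column names are only illustrative.

```python
import pandas as pd

# Requested order, as in the question (shortened to three symbols).
secs = ['XLY', 'XLP', 'XLE']

# Hypothetical stand-in for the downloaded data: columns arrive sorted.
sec_perf = pd.DataFrame(
    {'XLE': [76.169998, 76.010002],
     'XLP': [51.900002, 51.900002],
     'XLY': [81.879997, 82.970001]},
    index=pd.to_datetime(['2017-01-03', '2017-01-04']),
)

# reindex returns a new frame with columns in the requested order;
# the original frame is left untouched.
ordered = sec_perf.reindex(columns=secs)

# With the order fixed, positional renaming is safe again.
# These names are illustrative, not from the original post.
ordered.columns = ['Discretionary', 'Staples', 'Energy']
print(ordered)
```

Because reindex does not modify sec_perf in place, the result has to be assigned to a name before the columns are renamed.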
【Answer 2】: Use reindex_axis:
sec_perf.reindex_axis(secs, 1)
You could also use sec_perf[secs] to do the same thing, but we tested this a while back and determined that reindex_axis was the quickest.
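One caveat for readers on current pandas: as far as I know, reindex_axis was deprecated in pandas 0.21 and removed in pandas 1.0, so the call above only runs on older versions. The sketch below compares the two alternatives that remain, sec_perf[secs] and reindex(columns=secs), on a hand-built stand-in frame. It makes no claim about speed; it only shows that they agree when every label exists and differ when one is missing. The label 'MISSING' is a made-up placeholder.

```python
import pandas as pd

secs = ['XLY', 'XLP', 'XLE']

# Hypothetical stand-in for the downloaded data, columns sorted alphabetically.
sec_perf = pd.DataFrame({'XLE': [76.169998], 'XLP': [51.900002], 'XLY': [81.879997]})

# When every requested label exists, both approaches give the same frame.
by_selection = sec_perf[secs]
by_reindex = sec_perf.reindex(columns=secs)
same = by_selection.equals(by_reindex)

# They differ when a label is absent ('MISSING' is a placeholder, not a real column).
wanted = secs + ['MISSING']

# reindex inserts an all-NaN column for the absent label.
padded = sec_perf.reindex(columns=wanted)

# Plain list selection raises KeyError instead (behaviour of recent pandas versions).
try:
    sec_perf[wanted]
    selection_failed = False
except KeyError:
    selection_failed = True

print(same, selection_failed)
```

If a ticker may fail to download, the KeyError from plain selection is arguably the safer behaviour, since reindex would silently produce an empty column.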
【Answer 3】: You can rename the columns using a dictionary, so it works regardless of their position in the DataFrame:
df = df.rename(columns={'oldName1': 'newName1', 'oldName2': 'newName2'})
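Applied to the ETF symbols from the question, the dictionary approach might look like the sketch below. The frame is a hand-built stand-in for the downloaded data and the new names are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Hypothetical stand-in for the downloaded data, columns sorted alphabetically.
df = pd.DataFrame({'XLE': [76.169998], 'XLP': [51.900002], 'XLY': [81.879997]})

# Map each symbol to its new name; the order of the dict entries does not matter,
# because rename matches on the existing labels rather than on position.
new_names = {'XLY': 'Discretionary', 'XLP': 'Staples', 'XLE': 'Energy'}

renamed = df.rename(columns=new_names)
print(list(renamed.columns))  # ['Energy', 'Staples', 'Discretionary']
```

Note that rename only changes the labels. The columns stay in their alphabetical positions, so a separate reordering step is still needed if the display order matters.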