Pandas changing order of columns after data retrieval

【Title】: Pandas changing order of columns after data retrieval 【Posted】: 2017-06-16 23:19:30 【Question】:

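I want to rename the columns of a pandas DataFrame, but I found that the order of the columns changes after the data is retrieved. The code below specifies sector ETF symbols and fetches the data from Yahoo Finance.

The problem is that once I run the code, 'XLY', for example, is no longer the first series in the DataFrame, so I can't just run sec_perf.columns = ['Name1', 'Name2', etc] as I normally would, because it would not name the columns correctly. What am I messing up here?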

import datetime

import pandas as pd
import pandas_datareader.data as web

# Sector ETF symbols, in the order the columns are expected to appear
secs = ['XLY', 'XLP', 'XLE',
        'XLF', 'XLV', 'XLI',
        'XLB', 'XLRE', 'XLK', 'XLU']

end = datetime.date.today()

# Fetch adjusted close prices from Yahoo Finance for every symbol
sec_perf = web.DataReader(secs, 'yahoo',
                          start=datetime.datetime(2016, 12, 31),
                          end=end)['Adj Close']

【Comments】:

【Answer 1】:

A solution using reindex:

print (sec_perf.reindex(columns=secs))
                  XLY        XLP        XLE        XLF        XLV        XLI  \
Date                                                                           
2017-01-03  81.879997  51.900002  76.169998  23.510000  69.839996  62.590000   
2017-01-04  82.970001  51.900002  76.010002  23.700001  70.389999  62.959999   
2017-01-05  82.910004  52.070000  75.820000  23.459999  70.750000  62.779999   
2017-01-06  83.320000  52.119999  75.889999  23.540001  70.949997  63.139999   
2017-01-09  83.250000  51.700001  74.790001  23.379999  71.250000  62.650002   
2017-01-10  83.550003  51.439999  74.110001  23.430000  71.500000  62.910000   
2017-01-11  83.730003  51.540001  74.910004  23.580000  70.779999  63.240002   
2017-01-12  83.650002  51.490002  74.599998  23.379999  70.849998  62.980000   
2017-01-13  83.959999  51.520000  74.379997  23.510000  70.919998  63.220001   
2017-01-17  84.099998  52.250000  74.839996  22.950001  70.559998  62.730000   
2017-01-18  83.949997  52.430000  74.669998  23.139999  70.470001  62.970001   
2017-01-19  83.690002  52.240002  74.260002  23.040001  70.019997  63.430000   
2017-01-20  83.930000  52.580002  74.540001  23.150000  69.839996  63.439999   
2017-01-23  83.989998  52.560001  73.750000  23.000000  69.550003  63.090000   
2017-01-24  84.680000  52.910000  74.559998  23.290001  69.070000  63.720001   
2017-01-25  85.199997  52.900002  74.949997  23.680000  69.720001  64.389999   
2017-01-26  85.330002  52.669998  75.010002  23.740000  69.180000  64.540001   
2017-01-27  85.059998  52.380001  74.230003  23.650000  69.750000  64.489998   
2017-01-30  84.959999  52.340000  72.870003  23.459999  69.410004  63.939999   

                  XLB       XLRE        XLK        XLU  
Date                                                    
2017-01-03  49.990002  30.850000  48.790001  48.450001  
2017-01-04  50.720001  31.240000  48.959999  48.630001  
2017-01-05  50.570000  31.400000  49.040001  48.680000  
2017-01-06  50.619999  31.400000  49.400002  48.830002  
2017-01-09  50.610001  31.200001  49.389999  48.189999  
2017-01-10  50.639999  30.809999  49.400002  48.040001  
2017-01-11  51.049999  30.639999  49.630001  48.540001  
2017-01-12  50.950001  30.760000  49.509998  48.580002  
2017-01-13  50.869999  30.690001  49.660000  48.509998  
2017-01-17  50.639999  30.940001  49.470001  49.040001  
2017-01-18  50.959999  31.010000  49.599998  48.980000  
2017-01-19  50.639999  30.709999  49.529999  48.549999  
2017-01-20  51.090000  30.900000  49.799999  48.639999  
2017-01-23  51.189999  31.090000  49.889999  48.389999  
2017-01-24  52.509998  31.100000  50.200001  48.380001  
2017-01-25  52.860001  30.910000  50.680000  48.380001  
2017-01-26  53.000000  30.889999  50.540001  48.400002  
2017-01-27  52.810001  30.629999  50.740002  48.389999  
2017-01-30  52.270000  30.459999  50.330002  48.430000  
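To try this without calling Yahoo Finance, here is a minimal sketch on a made-up DataFrame whose columns arrive in alphabetical order. The three-ticker list, the prices, and the new column names are illustrative assumptions, not data from the question.

```python
import pandas as pd

# Desired column order (a shortened, illustrative version of the ticker list).
secs = ['XLY', 'XLP', 'XLE']

# Made-up frame standing in for the downloaded data; columns arrive sorted.
sec_perf = pd.DataFrame(
    {'XLE': [76.1, 76.0], 'XLP': [51.9, 51.9], 'XLY': [81.8, 82.9]},
    index=pd.to_datetime(['2017-01-03', '2017-01-04']),
)

# reindex returns a new frame with the columns in the requested order.
ordered = sec_perf.reindex(columns=secs)

# With the order restored, positional renaming lines up as intended.
# The names below are hypothetical placeholders.
ordered.columns = ['Name1', 'Name2', 'Name3']
print(ordered)
```

Here 'Name1' ends up on the values that came from 'XLY', because the frame was put back into the order of secs before the names were assigned.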

【Comments】:

【Answer 2】:

Use reindex_axis

sec_perf.reindex_axis(secs, 1)

You can also use sec_perf[secs] to do the same thing. But we did this comparison a while ago and determined that reindex_axis was the fastest.
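As far as I know, reindex_axis was deprecated in pandas 0.21 and removed in 1.0, so on a current install the two options are sec_perf[secs] and reindex(columns=secs). The sketch below compares them on a made-up one-row frame (the values and the extra 'XLU' label are assumptions for illustration). It also shows where they differ: how a label that is missing from the frame is handled.

```python
import pandas as pd

secs = ['XLY', 'XLP', 'XLE']

# Made-up data; columns are deliberately not in the order of secs.
sec_perf = pd.DataFrame({'XLE': [76.1], 'XLP': [51.9], 'XLY': [81.8]})

# Both approaches give the same reordered frame when every label exists.
by_selection = sec_perf[secs]
by_reindex = sec_perf.reindex(columns=secs)
same = by_selection.equals(by_reindex)

# A label that is absent from the frame: reindex adds an all-NaN column,
# while list selection raises KeyError in recent pandas versions.
with_missing = sec_perf.reindex(columns=secs + ['XLU'])
try:
    sec_perf[secs + ['XLU']]
    selection_raised = False
except KeyError:
    selection_raised = True

print(same, selection_raised)
```

If a ticker might fail to download, the KeyError from list selection is arguably the safer behavior, since it surfaces the gap instead of silently producing an empty column.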

【Comments】:

【Answer 3】:

You can rename the columns with a dictionary, so the result does not depend on their position in the DataFrame:

df = df.rename(columns={'oldName1': 'newName1', 'oldName2': 'newName2'})
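Applied to the tickers from the question, a minimal sketch might look like this. The frame contents and the sector names in the mapping are illustrative assumptions.

```python
import pandas as pd

# Made-up frame whose columns are not in the order the tickers were requested.
df = pd.DataFrame({'XLE': [76.1], 'XLP': [51.9], 'XLY': [81.8]})

# Old name -> new name; the order of the dictionary entries does not matter.
names = {
    'XLY': 'Consumer Discretionary',
    'XLP': 'Consumer Staples',
    'XLE': 'Energy',
}

# rename matches on the label, so each column gets the right name
# wherever it sits in the frame.
renamed = df.rename(columns=names)
print(list(renamed.columns))
# prints ['Energy', 'Consumer Staples', 'Consumer Discretionary']
```

Note that rename leaves the column order as it was; it only changes the labels. Labels that do not appear in the dictionary are left unchanged.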

【Comments】:
