pandas.concat() does not fill the columns

Posted: 2017-08-25 16:31:39

Question:

I am trying to create dummy data as follows:

import numpy as np
import pandas as pd    

def dummy_historical(seclist, dates, startvalues):

    dfHist = pd.DataFrame(0, index=[0], columns=seclist)

    for sec in seclist:

        # (works fine)
        svalue   = startvalues[sec].max()

        # this creates a random sequence of 84 rows and 1 column (works fine)
        dfRandom = pd.DataFrame(np.random.randint(svalue-10,svalue+10, size=(dates.size, 1 )), index=dates, columns=[sec])

        # does not work
        dfHist[sec] = pd.concat([ dfHist[sec] , dfRandom ])

    return dfHist

When I print dfHist, it shows only the first row (the same as right after initialization), so nothing has been filled in.


Here is some sample data:

seclist = ['AAPL', 'GOOGL']

# use any number for startvalues

dates = pd.DatetimeIndex(['2017-01-05', '2017-01-06', '2017-01-07', '2017-01-08',
                   '2017-01-09', '2017-01-10', '2017-01-11', '2017-01-12',
                   '2017-01-13', '2017-01-14', '2017-01-15', '2017-01-16',
                   '2017-01-17', '2017-01-18', '2017-01-19', '2017-01-20',
                   '2017-01-21', '2017-01-22', '2017-01-23', '2017-01-24',
                   '2017-01-25', '2017-01-26', '2017-01-27', '2017-01-28',
                   '2017-01-29', '2017-01-30', '2017-01-31', '2017-02-01',
                   '2017-02-02', '2017-02-03', '2017-02-04', '2017-02-05',
                   '2017-02-06', '2017-02-07', '2017-02-08', '2017-02-09',
                   '2017-02-10', '2017-02-11', '2017-02-12', '2017-02-13',
                   '2017-02-14', '2017-02-15', '2017-02-16', '2017-02-17',
                   '2017-02-18', '2017-02-19', '2017-02-20', '2017-02-21',
                   '2017-02-22', '2017-02-23', '2017-02-24', '2017-02-25',
                   '2017-02-26', '2017-02-27', '2017-02-28', '2017-03-01',
                   '2017-03-02', '2017-03-03', '2017-03-04', '2017-03-05',
                   '2017-03-06', '2017-03-07', '2017-03-08', '2017-03-09',
                   '2017-03-10', '2017-03-11', '2017-03-12', '2017-03-13',
                   '2017-03-14', '2017-03-15', '2017-03-16', '2017-03-17',
                   '2017-03-18', '2017-03-19', '2017-03-20', '2017-03-21',
                   '2017-03-22', '2017-03-23', '2017-03-24', '2017-03-25',
                   '2017-03-26', '2017-03-27', '2017-03-28', '2017-03-29'],
                  dtype='datetime64[ns]', freq='D')

Comments:

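Can you give an example of the input you pass to this function?

I added a data example above. Use whatever start values you like (that part already works).

Answer 1: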

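If you want to concatenate columns, you need to pass axis=1 to concat. Also, you don't need to initialize your DataFrame with data at the start (unless you want the row of 0 values):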

def dummy_historical(seclist, dates, startvalues):

    dfHist = pd.DataFrame()

    for sec in seclist:
        svalue   = startvalues[sec].max()   
        dfRandom = pd.DataFrame(np.random.randint(svalue-10,svalue+10, size=(dates.size, 1 )), index=dates, columns=[sec])
        dfHist = pd.concat([ dfHist , dfRandom ], axis=1)

    return dfHist
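
As a rough illustration of why the original attempt seemed to do nothing, here is a small sketch on toy data (the names `left`, `right`, and `target` and all values are made up for this example, not taken from the question). It shows that concat stacks rows by default and only places frames side by side with axis=1, and that assigning a Series to a column aligns on the frame's existing index, so labels the frame does not already have are dropped:

```python
import pandas as pd

# Toy inputs; names and values are made up for illustration.
left = pd.DataFrame({"AAPL": [1, 2, 3]}, index=["a", "b", "c"])
right = pd.DataFrame({"GOOGL": [4, 5, 6]}, index=["a", "b", "c"])

# Default axis=0 stacks rows; columns missing on one side become NaN.
stacked = pd.concat([left, right])
# axis=1 places the frames side by side, aligned on the index.
side_by_side = pd.concat([left, right], axis=1)

print(stacked.shape)       # (6, 2)
print(side_by_side.shape)  # (3, 2)

# Column assignment aligns on the target's existing index, so labels
# that the target does not have are silently dropped.
target = pd.DataFrame(0, index=[0], columns=["AAPL"])
target["AAPL"] = pd.Series([7, 8, 9], index=[0, 1, 2])
print(len(target))  # 1
```

This is likely what happened with dfHist in the question: it was created with the single index label 0, so assigning into dfHist[sec] could not add the date rows.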

You can even avoid concat altogether with something more concise, for example:

def generate(sec):
    svalue = startvalues[sec].max()
    return np.random.randint(svalue-10,svalue+10, size=dates.size)

# build all columns at once from a dict comprehension (note the braces)
dfHist = pd.DataFrame({sec: generate(sec) for sec in seclist}, index=dates)
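
For anyone who wants to run this end to end, a self-contained sketch of the dictionary-based approach might look like the following. The `startvalues` frame here is an assumption (the question only says "use any number"), and `pd.date_range` is used as a shorthand for the 84 daily dates listed in the question:

```python
import numpy as np
import pandas as pd

seclist = ["AAPL", "GOOGL"]
# Same 84 daily dates as in the question, generated instead of listed.
dates = pd.date_range("2017-01-05", "2017-03-29", freq="D")
# Hypothetical start values; any numbers would do.
startvalues = pd.DataFrame({"AAPL": [150], "GOOGL": [800]})

def dummy_historical(seclist, dates, startvalues):
    def generate(sec):
        svalue = startvalues[sec].max()
        # Random integers in [svalue - 10, svalue + 10), one per date.
        return np.random.randint(svalue - 10, svalue + 10, size=dates.size)

    # One column per security, all sharing the same date index.
    return pd.DataFrame({sec: generate(sec) for sec in seclist}, index=dates)

dfHist = dummy_historical(seclist, dates, startvalues)
print(dfHist.shape)  # (84, 2)
```

Building the frame in one call avoids growing it inside the loop, which is usually simpler than repeated concat when all columns share one index.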
