Function to turn single Pandas dataframe into multi-year dataframe

Posted 2023-03-12


[Title]: Function to turn single Pandas dataframe into multi-year dataframe [Posted]: 2017-10-18 20:37:45 [Question]:

I have this Pandas dataframe, which is a snapshot of a single year:

import pandas as pd

data = pd.DataFrame({'ID': (1, 2),
                     'area': (2, 3),
                     'population': (100, 200),
                     'demand': (100, 200)})

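I want to turn it into a time series in which population grows by 10% per year and demand grows by 20% per year. In this example, I extend it by two more years.

This should be the output (note: it includes an added "year" column):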

output = pd.DataFrame({'ID': (1, 2, 1, 2, 1, 2),
                       'year': (1, 1, 2, 2, 3, 3),
                       'area': (2, 3, 2, 3, 2, 3),
                       'population': (100, 200, 110, 220, 121, 242),
                       'demand': (100, 200, 120, 240, 144, 288)})

[Comments]:

[Answer 1]:

Set up the variables:

k = 5     #Number of years to forecast
a = 1.20 #Demand Growth
b = 1.10 #Population Growth

Build the forecast dataframe:

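# Scale demand by a**i and population by b**i for each year i, tagging each copy with its year
scaled = pd.concat([data[['demand', 'population']].mul([a**i, b**i]).assign(year=i + 1)
                    for i in range(k)])
# Re-attach ID and area by index, then order the rows by year
df_out = (data[['ID', 'area']].merge(scaled, left_index=True, right_index=True)
                              .sort_values(by='year'))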

print(df_out)

Output:

   ID  area  demand  population  year
0   1     2  100.00      100.00     1
1   2     3  200.00      200.00     1
0   1     2  120.00      110.00     2
1   2     3  240.00      220.00     2
0   1     2  144.00      121.00     3
1   2     3  288.00      242.00     3
0   1     2  172.80      133.10     4
1   2     3  345.60      266.20     4
0   1     2  207.36      146.41     5
1   2     3  414.72      292.82     5
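
The same compounding can also be written as an explicit loop, which may be easier to follow than the one-liner. The following is only a minimal sketch, not part of the original answer: the function name `project_growth` and its `rates` and `years` parameters are hypothetical, and year 1 is assumed to be the unchanged snapshot, as in the question.

```python
import pandas as pd


def project_growth(data, rates, years):
    """Repeat `data` once per year, compounding each column named in `rates`.

    `rates` maps a column name to its yearly growth factor (1.10 means +10%).
    Year 1 is the unchanged snapshot. (Hypothetical helper for illustration.)
    """
    frames = []
    for i in range(years):
        frame = data.copy()
        for col, factor in rates.items():
            # Compound growth: snapshot value times factor**(years elapsed)
            frame[col] = data[col] * factor ** i
        # Put the year label right after the ID column
        frame.insert(1, 'year', i + 1)
        frames.append(frame)
    return pd.concat(frames, ignore_index=True)


data = pd.DataFrame({'ID': (1, 2),
                     'area': (2, 3),
                     'population': (100, 200),
                     'demand': (100, 200)})

result = project_growth(data, {'population': 1.10, 'demand': 1.20}, years=3)
print(result)
```

Because the factors are floats, values such as 121 may carry tiny floating-point error; rounding before display or comparison avoids surprises.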

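[Comments]:

[Answer 2]:

Create a numpy array with the growth factors [1.2, 1.1] (demand, population), repeat it, and take the cumprod.
Prepend a row of [1.0, 1.0] to account for the initial condition.
Multiply by the values of a conveniently unstacked pd.Series.
Pass the result to the pd.DataFrame constructor.
Clean up the indices and so on.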

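import numpy as np

k = 5
cols = ['ID', 'area']
# Cumulative growth factors: a row of ones for the snapshot, then k compounded rows.
# Columns are repeated to line up with (demand, demand, population, population).
cum_ret = np.vstack(
    [np.ones((1, 2)), np.array([[1.2, 1.1]])[[0] * k].cumprod(0)]
)[:, [0, 0, 1, 1]]
# Fix the column order so that 1.2 applies to demand and 1.1 to population
s = data.set_index(cols)[['demand', 'population']].unstack(cols)

pd.DataFrame(
    cum_ret * s.values,
    columns=s.index
).stack(cols).reset_index(cols).reset_index(drop=True)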

    ID  area   demand  population
0    1     2  100.000     100.000
1    2     3  200.000     200.000
2    1     2  120.000     110.000
3    2     3  240.000     220.000
4    1     2  144.000     121.000
5    2     3  288.000     242.000
6    1     2  172.800     133.100
7    2     3  345.600     266.200
8    1     2  207.360     146.410
9    2     3  414.720     292.820
10   1     2  248.832     161.051
11   2     3  497.664     322.102
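
The growth-factor matrix in this answer can also be built by broadcasting powers instead of stacking ones on top of a cumprod. The sketch below is an illustration under stated assumptions rather than the answerer's code: the names `rates`, `factors`, `stacked`, and `year` are hypothetical, and the factor order (demand, population) is taken from the code above. It also shows one possible way to derive a year label, which the output above does not include.

```python
import numpy as np

k = 5                          # number of growth steps after the snapshot
rates = np.array([1.2, 1.1])   # demand, population (assumed order)

# Row i holds rates**i, so row 0 is all ones (the unchanged snapshot)
factors = rates ** np.arange(k + 1)[:, None]

# The same matrix built the way the answer does it: ones stacked on a cumprod
stacked = np.vstack([np.ones((1, 2)), np.array([rates])[[0] * k].cumprod(0)])

# The two constructions should agree up to floating-point rounding
print(np.allclose(factors, stacked))

# Year labels for two rows per year (k + 1 years in total)
year = np.repeat(np.arange(1, k + 2), 2)
print(year)
```

Note that with k = 5 this approach yields six years of rows (the snapshot plus five growth steps), whereas the first answer treats k as the total number of years.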

[Comments]:
