How to append columns to an existing Excel sheet using pandas in Python


【Title】: How to append columns to an existing Excel sheet using pandas in Python 【Posted】: 2017-08-25 09:25:48 【Question】:
import pandas as pd

from pandas import ExcelWriter

trans=pd.read_csv('HMIS-DICR-2011-12-Manipur-Bishnupur.csv')
df=trans[["April 10-11","May 10-11","June 10-11","July 10-11","August 10-11","September 10-11","October 10-11","November 10-11","December 10-11","January 10-11","February 10-11","March 10-11","April 11-12","May 11-12","June 11-12","July 11-12","August 11-12","September 11-12","October 11-12","November 11-12","December 11-12","January 11-12","February 11-12","March 11-12"]]
# the context manager saves the file on exit (ExcelWriter.save() was removed in pandas 2.0)
with ExcelWriter('manipur1.xlsx') as writer1:
    df.to_excel(writer1, sheet_name='Sheet1', index=False)

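This code successfully writes the data to Sheet1. But how can I append the data from another DataFrame (df), read from a different file (shown below), to the existing sheet (Sheet1) of the 'manipur1' Excel file?

For example, my DataFrame looks like this: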

trans=pd.read_csv('HMIS-DICR-2013-2014-Manipur-Bishnupur.csv')
df=trans[["April 12-13","May 12-13","June 12-13","July 12-13","August 12-13","September 12-13","October 12-13","November 12-13","December 12-13","January 12-13","February 12-13","March 12-13","April 13-14","May 13-14","June 13-14","July 13-14","August 13-14","September 13-14","October 13-14","November 13-14","December 13-14","January 13-14","February 13-14","March 13-14"]]

【Comments】:

【Answer 1】:

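The only way to append new data to an existing Excel file with pandas is to load the existing data into pandas, append the new data, and save the concatenated DataFrame again.

To preserve the existing sheets that should stay unchanged, you need to iterate over the whole workbook and process every sheet. The sheets to be changed and appended to are defined in the to_update dictionary.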

import pandas as pd

# get data to be appended (the 2013-2014 file from the question holds these columns)
trans = pd.read_csv('HMIS-DICR-2013-2014-Manipur-Bishnupur.csv')
df_append = trans[["April 12-13","May 12-13","June 12-13","July 12-13","August 12-13","September 12-13","October 12-13","November 12-13","December 12-13","January 12-13","February 12-13","March 12-13","April 13-14","May 13-14","June 13-14","July 13-14","August 13-14","September 13-14","October 13-14","November 13-14","December 13-14","January 13-14","February 13-14","March 13-14"]]

# define what sheets to update
to_update = {"Sheet1": df_append}

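# load every existing sheet into memory before the file is reopened for writing
file_name = 'manipur1.xlsx'
sheets = pd.read_excel(file_name, sheet_name=None)  # dict: sheet name -> DataFrame

# write and update; the context manager saves the file on exit
with pd.ExcelWriter(file_name) as excel_writer:
    for sheet, sheet_df in sheets.items():
        append_df = to_update.get(sheet)

        if append_df is not None:
            # axis=1 places the frames side by side, aligning rows on the index
            sheet_df = pd.concat([sheet_df, append_df], axis=1)

        # sheets without an entry in to_update are written back unchanged
        sheet_df.to_excel(excel_writer, sheet_name=sheet, index=False)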

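However, any layout/formatting in the existing Excel file will be lost. If you want to keep the formatting, you can use openpyxl, but that is more complicated.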
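As a rough illustration of the openpyxl route, here is a minimal sketch that writes the new columns directly into the existing worksheet, so the other sheets and the existing cell formatting are left alone. The helper name append_columns is hypothetical, and the sketch assumes the sheet has a single header row in row 1, already contains data, and that the DataFrame has no missing values.

```python
import pandas as pd
from openpyxl import load_workbook


def append_columns(path, sheet_name, df):
    """Write df's columns to the right of the data already in sheet_name.

    Hypothetical helper: assumes headers live in row 1 and the sheet is
    not empty (openpyxl reports max_column == 1 for an empty sheet too).
    """
    wb = load_workbook(path)
    ws = wb[sheet_name]
    first_col = ws.max_column + 1  # first free column, 1-based

    for offset, (name, series) in enumerate(df.items()):
        col = first_col + offset
        ws.cell(row=1, column=col, value=name)  # header cell
        # tolist() converts NumPy scalars to plain Python values
        for row, value in enumerate(series.tolist(), start=2):
            ws.cell(row=row, column=col, value=value)

    wb.save(path)
```

With the names from the answer above, the call would be append_columns('manipur1.xlsx', 'Sheet1', df_append). Values are written by row position, so the new rows line up with the existing ones regardless of the DataFrame index.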

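【Comments】:

My workbook has 21 sheets, and after running this only 1 sheet was left; the others were deleted. Also, concat behaves like a join and appends rows as well. For example, df_append is 452 rows × 24 columns and df_temp is 452 rows × 12 columns, so the result of concat should be 452 × 36, but this code gives 904 × 36.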
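A doubled row count like this is what pd.concat(axis=1) produces when the two frames do not share index labels: rows are aligned on the index with an outer join, so unmatched labels from both sides are all kept. Whether that is the cause here is an assumption, since the comment does not show the indexes. The small frames below are made up purely to illustrate the effect and the reset_index fix.

```python
import pandas as pd

# two frames with the same number of rows but disjoint index labels
left = pd.DataFrame({"a": [1, 2, 3]})                        # index 0, 1, 2
right = pd.DataFrame({"b": [10, 20, 30]}, index=[3, 4, 5])   # index 3, 4, 5

# outer join on the index: every label from both frames gets its own row
misaligned = pd.concat([left, right], axis=1)

# resetting both indexes makes the rows line up by position instead
aligned = pd.concat(
    [left.reset_index(drop=True), right.reset_index(drop=True)],
    axis=1,
)

print(misaligned.shape)  # (6, 2)
print(aligned.shape)     # (3, 2)
```

Applied to the answer's loop, this would mean calling reset_index(drop=True) on both sheet_df and append_df before the concat.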
