Append multiple DataFrames to multiple existing Excel sheets
【Title】: Append multiple DataFrame to multiple existing excel sheets 【Posted】: 2021-11-21 17:15:18 【Question】: I have a list of multiple deliveries to multiple addresses.
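I built a list of pivot tables for each region (5 regions).
In Jupyter Notebook, using a "for" loop, each item of each list is displayed as a separate pivot table, one on top of the other, just the way I need it.
But how do I save them to an Excel file with 5 sheets?
I have tried everything: either only the last pivot created from each region's list gets saved, or the new data is saved and everything that already existed is deleted.
I currently create an empty sheet for each region, with only a title in column D, row 1.
expedition.xlsx (contains 5 sheets: Norte, Nordeste, Centro Oeste, Sudeste, Sul)
When I try to save, it ends up deleting the other sheets and keeping only "Norte".
I set up a rule that checks whether the cell in column D is empty: if it is filled, it moves down one row and tries again; if it is empty, it should in theory be filled with the DataFrame from the list.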
This image shows the Jupyter Notebook display, which is how I would like it to be saved in Excel (one pivot below the other, separated by 2 blank rows).
Using openpyxl I managed to make the rule work and fill the first column below the spacing with an example value 'aaaaaaa', without deleting the other sheets.
How can I write one pivot table below the other, for each region and each list item?
Code: https://pastebin.com/Cx3Zvf6D
import pandas as pd
import openpyxl
writer = pd.ExcelWriter("expedition.xlsx", engine='xlsxwriter')
# Creating the base sheet for each region, empty
pivot1 = pd.DataFrame({'Lista de Romaneio para Região Norte': [' ']})
pivot2 = pd.DataFrame({'Lista de Romaneio para Região Nordeste': [' ']})
pivot3 = pd.DataFrame({'Lista de Romaneio para Região Centro Oeste': ['']})
pivot4 = pd.DataFrame({'Lista de Romaneio para Região Sudeste': [' ']})
pivot5 = pd.DataFrame({'Lista de Romaneio para Região Sul': [' ']})
# creating a sheet in the spreadsheet for each region, with the title in column D, row 1
pivot1.to_excel(writer, sheet_name='Norte', index=False, startcol=3, freeze_panes=(1,0))
pivot2.to_excel(writer, sheet_name='Nordeste', index=False, startcol=3, freeze_panes=(1,0))
pivot3.to_excel(writer, sheet_name='Centro Oeste', index=False, startcol=3, freeze_panes=(1,0))
pivot4.to_excel(writer, sheet_name='Sudeste', index=False, startcol=3, freeze_panes=(1,0))
pivot5.to_excel(writer, sheet_name='Sul', index=False, startcol=3, freeze_panes=(1,0))
writer.close()
# List with the "keys" of each pivot table for each region
norte = ['PA_BEL', 'TO_PMW', 'AC_RBR']
nordeste = ['AL_MCZ', 'PB_JPA', 'BA_SSA', 'RN_NAT', 'PE_REC', 'CE_FOR', 'MA_IMP', 'MA_THE', 'PI_THE', 'BA_FEC']
centro_oeste = ['GO_GYN', 'DF_BSB', 'GO_BSB', 'MT_CGB', 'MS_CGR']
sudeste = ['ES_SRR', 'MG_BHZ', 'SP_PNM', 'SP_JDI', 'RJ_RIO', 'MG_UDI']
sul = ['RS_POA', 'PR_CWB', 'SC_CCM', 'RS_RIA', 'SC_FLN']
# example for the north (norte) region
if len(norte) > 0:
    frete_expresso_norte = 0
    for filial in norte:
        # creating a pivot table for each filial (key)
        pivot1 = df[df.Filial_Transportador == filial].pivot_table(
            index=['BU', 'Sold to Region', 'Filial_Transportador', 'Sold_to_Name', 'Sold to City', 'Delivery'],
            values=['Quantidade', 'Volume', 'Palete', 'Net Value'], aggfunc='sum',
            margins=True)
        # reorders columns and renames the "All" row of the pivot table to "Total Romaneio"
        ordem_das_colunas = ['Quantidade', 'Volume', 'Palete', 'Net Value']
        pivot1 = pivot1[ordem_das_colunas].rename(index=dict(All='Total Romaneio'))
        # creating subtotals and finding express shipping (if any)
        # .iloc[1] picks the second group by position (the filial itself, after the margins row)
        total_palete_norte = pivot1.groupby('Filial_Transportador')['Palete'].sum().iloc[1]
        total_net_value_norte = pivot1.groupby('Filial_Transportador')['Net Value'].sum().iloc[1]
        # save the pivot table (pivot1) in excel in the north sheet where it is blank
        # Here's where I want to put the code below, saving the pivot table before starting the creation of the next one.
        # code under construction
        # after saving it continues normally
        if total_palete_norte >= 29 or total_net_value_norte >= 2500000:
            frete_expresso_norte = frete_expresso_norte + 1
        else:
            pass
        display(pivot1)
else:
    expedicao_norte = 'Não há volume para ser expedido à região Norte'
# Code under construction to insert inside the loop:
import openpyxl
# opening the spreadsheet with specific name
n = 0  # 0 = Norte / 1 = Nordeste / 2 = Centro Oeste / 3 = Sudeste / 4 = Sul
planilha_cx = openpyxl.load_workbook("expedition.xlsx")
folhas = planilha_cx.sheetnames
folha = planilha_cx[folhas[n]]
# reading the cell
coluna = 4  # column D of the selected sheet
linha = 1  # start on the first line of the sheet
celula = folha.cell(linha, coluna).value
while celula is not None:  # looping while cell in column D is not blank
    celula = folha.cell(linha, coluna).value  # cell current value
    if celula is None:  # filling the cell if it is blank
        linha = linha + 2
        folha.cell(row=linha, column=1).value = 'aaaaaaa'  # inserts the word 'aaaaaaa' but doesn't work with pivot1
        planilha_cx.save("expedition.xlsx")
        break
    else:  # while the cell in column D is not blank, add +1 to row
        linha = linha + 1
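For the placeholder write above, one possible way to put a whole DataFrame into the sheet instead of 'aaaaaaa' is to write it cell by cell with openpyxl. This is only a sketch under assumptions: write_frame_below and its gap parameter are hypothetical names, and it uses the sheet's max_row to find the next free row rather than scanning column D as the code above does.

```python
import pandas as pd


def write_frame_below(ws, frame: pd.DataFrame, gap: int = 2) -> int:
    """Write `frame` into the openpyxl worksheet `ws`, leaving `gap`
    blank rows after the last used row. Returns the first row written."""
    start = ws.max_row + gap + 1
    # Turn the index levels into ordinary columns so they are written too.
    flat = frame.reset_index()
    rows = [list(flat.columns)] + [list(t) for t in flat.itertuples(index=False)]
    for r, values in enumerate(rows, start=start):
        for c, value in enumerate(values, start=1):
            ws.cell(row=r, column=c, value=value)
    return start
```

Inside the loop this could be called as write_frame_below(folha, pivot1), with a single planilha_cx.save(...) after the loop, so the workbook is loaded and saved once instead of once per pivot.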
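【Question comments】:
【Answer 1】: The problem is that each time you load the workbook you create a new object with new sheet names (which you have partly worked around). The trick is to let Python/pandas know that you are writing different sheets into the same workbook.
Try this code: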
import pandas as pd
from openpyxl import load_workbook
book = load_workbook(your_destination_file)
writer = pd.ExcelWriter(your_destination_file, engine='openpyxl')
writer.book = book
# tells pandas/python what the sheet names are
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
Your_dataframe.to_excel(writer, sheet_name=DesiredSheetname, startcol=3,
                        freeze_panes=(1,0))
writer.save()
The code above should solve your problem.
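As far as I know, recent pandas releases no longer allow assigning to writer.book or calling writer.save(). A rough equivalent of the same idea on pandas 1.4 or later is to open the writer in append mode with if_sheet_exists="overlay". The sketch below is an assumption-laden illustration, not the answer author's code: append_frames, frames_by_sheet and gap are hypothetical names, and each frame is placed below whatever the sheet already contains.

```python
import pandas as pd


def append_frames(path, frames_by_sheet, gap=2, startcol=0):
    """Append every DataFrame below the existing content of its sheet.

    frames_by_sheet maps a sheet name to a list of DataFrames.
    """
    with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                        if_sheet_exists="overlay") as writer:
        for sheet_name, frames in frames_by_sheet.items():
            for frame in frames:
                # writer.sheets maps sheet titles to openpyxl worksheets
                ws = writer.sheets.get(sheet_name)
                # startrow is 0-based, so max_row + gap leaves `gap` blank rows
                startrow = ws.max_row + gap if ws is not None else 0
                frame.to_excel(writer, sheet_name=sheet_name,
                               startrow=startrow, startcol=startcol)
```

With the question's data this might be called as append_frames("expedition.xlsx", {"Norte": pivots_norte}), where pivots_norte is a hypothetical list filled inside the loop. The other sheets are left untouched because the existing workbook is loaded rather than recreated.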
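【Comments】:
Thanks for the help, this should solve at least one of the problems. Do you know how to save the pivot table into the existing Excel file?
Pandas has a built-in pivot function you can use. Is it important to you that the data is live? You could consider saving your pandas output as CSV files and then loading them into Excel as tables with the "Get Data" feature. That way you are not writing the frames into Excel but importing them, and all your pictures, axes and so on stay in place.
Interesting, so I could create a CSV file for each pivot, have them feed the main Excel file, and delete them afterwards (it will run once a day). But first, about this built-in pivot function: how does it work? I am creating the pivots with the pivot_table function, but I can't copy them over.
I think you are better off with the latter. That way you can dump the 5 CSV files into the same location every day, let them overwrite the old files, and then just click "Refresh All" in Excel to update all 5 pivot tables.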
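The CSV route suggested in the comments could look roughly like this. It is a minimal sketch, assuming the pivots of each region are collected in a dict of lists; dump_region_csvs, frames_by_region and out_dir are hypothetical names that do not appear in the original post.

```python
from pathlib import Path

import pandas as pd


def dump_region_csvs(frames_by_region, out_dir):
    """Write one CSV per region, overwriting the file from the previous run."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = {}
    for region, frames in frames_by_region.items():
        if not frames:
            continue  # nothing to ship to this region
        # Stack the region's pivots one below the other.
        combined = pd.concat(frames)
        path = out_dir / f"{region}.csv"
        # utf-8-sig helps Excel detect the encoding of accented names.
        combined.to_csv(path, encoding="utf-8-sig")
        written[region] = path
    return written
```

Because the file names are fixed per region, each daily run replaces the previous files, and the Excel workbook only needs a refresh of its imported tables.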