SQLDF extracting the values and saving them to a text file

Posted: 2021-12-24 11:30:23

Question:

I load a DBF file into a data frame and run queries on it.

Here is the code.

from dbf import Table
import pandas as pd  # needed for pd.DataFrame and pd.option_context below
import pandasql as ps

dfPath1 = Table('filename.dbf')
dfPath1.open()

df1 = pd.DataFrame(dfPath1, columns=['column1', 'column2', 'column3', 'column4'])

hour1 = ps.sqldf("Select df1.date, df1.session_no AS 'session_number', SUM(df1.received) AS 'sales_for_12am', SUM(df1.taxes)+SUM(df1.auto_grat)+SUM(df1.discount) AS 'gross_vat_sales', SUM(df1.taxes) AS 'total_vat', SUM(df1.discount) AS 'discount', SUM(df1.auto_grat) AS 'service_charge' From df1 Where open_time >= '00:00:00' And open_time < '00:59:59' And date= '" + str + "'")
with pd.option_context('display.max_rows', None, 'display.max_columns', None, 'display.width', None):
   hourly1 = hour1.fillna(0)
   print(hourly1)

hour2 = ps.sqldf("Select df1.date, df1.session_no AS 'session_number', SUM(df1.received) AS 'sales_for_1am', SUM(df1.taxes)+SUM(df1.auto_grat)+SUM(df1.discount) AS 'gross_vat_sales', SUM(df1.taxes) AS 'total_vat', SUM(df1.discount) AS 'discount', SUM(df1.auto_grat) AS 'service_charge' From df1 Where open_time >= '01:00:00' And open_time < '01:59:59' And date= '" + str + "'")
with pd.option_context('display.max_rows', None, 'display.max_columns', None, 'display.width', None):
   hourly2 = hour2.fillna(0)
   print(hourly2)

hour3 = ps.sqldf("Select df1.date, df1.session_no AS 'session_number', SUM(df1.received) AS 'sales_for_2am', SUM(df1.taxes)+SUM(df1.auto_grat)+SUM(df1.discount) AS 'gross_vat_sales', SUM(df1.taxes) AS 'total_vat', SUM(df1.discount) AS 'discount', SUM(df1.auto_grat) AS 'service_charge' From df1 Where open_time >= '02:00:00' And open_time < '02:59:59' And date= '" + str + "'")
with pd.option_context('display.max_rows', None, 'display.max_columns', None, 'display.width', None):
   hourly3 = hour3.fillna(0)
   print(hourly3)

hour4 = ps.sqldf("Select df1.date, df1.session_no AS 'session_number', SUM(df1.received) AS 'sales_for_3am', SUM(df1.taxes)+SUM(df1.auto_grat)+SUM(df1.discount) AS 'gross_vat_sales', SUM(df1.taxes) AS 'total_vat', SUM(df1.discount) AS 'discount', SUM(df1.auto_grat) AS 'service_charge' From df1 Where open_time >= '03:00:00' And open_time < '03:59:59' And date= '" + str + "'")
with pd.option_context('display.max_rows', None, 'display.max_columns', None, 'display.width', None):
   hourly4 = hour4.fillna(0)
   print(hourly4)

hour5 = ps.sqldf("Select df1.date, df1.session_no AS 'session_number', SUM(df1.received) AS 'sales_for_4am', SUM(df1.taxes)+SUM(df1.auto_grat)+SUM(df1.discount) AS 'gross_vat_sales', SUM(df1.taxes) AS 'total_vat', SUM(df1.discount) AS 'discount', SUM(df1.auto_grat) AS 'service_charge' From df1 Where open_time >= '04:00:00' And open_time < '04:59:59' And date= '" + str + "'")
with pd.option_context('display.max_rows', None, 'display.max_columns', None, 'display.width', None):
   hourly5 = hour5.fillna(0)
   print(hourly5)

data = [name,hour1.iloc[:,0],hour1.iloc[:,1],hour1.iloc[:,2],hour1.iloc[:,3],hour1.iloc[:,4],hour1.iloc[:,5],hour1.iloc[:,6]]
data2 = [name,hour2.iloc[:,0],hour2.iloc[:,1],hour2.iloc[:,2],hour2.iloc[:,3],hour2.iloc[:,4],hour2.iloc[:,5],hour2.iloc[:,6]]
data3 = [name,hour3.iloc[:,0],hour3.iloc[:,1],hour3.iloc[:,2],hour3.iloc[:,3],hour3.iloc[:,4],hour3.iloc[:,5],hour3.iloc[:,6]]
data4 = [name,hour4.iloc[:,0],hour4.iloc[:,1],hour4.iloc[:,2],hour4.iloc[:,3],hour4.iloc[:,4],hour4.iloc[:,5],hour4.iloc[:,6]]
data5 = [name,hour5.iloc[:,0],hour5.iloc[:,1],hour5.iloc[:,2],hour5.iloc[:,3],hour5.iloc[:,4],hour5.iloc[:,5],hour5.iloc[:,6]]

hour1['name'] = name
hour1.to_csv('sample_output.txt', index=False, sep=' ')

Then I get an error like this: KeyError: "None of [Int64Index([0], dtype='int64')] are in the [columns]"

This is the output I want in the text file: "2020-01-01 943 527.0 56.46 56.46 0.0 0.0"

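Comments:

You could look at the simpledbf package: ***.com/questions/41898561/…

I have already looked at and tried all of those, but I had trouble getting the dbf file into a data frame. So far, SQLDF is the approach that lets me get the dbf file into a data frame.

Why is this tagged R?

Answer 1: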

The problem is most likely in the following line:

data = [name,hour1[[0]],hour1[[1]],hour1[[2]],hour1[[3]],hour1[[4]],hour1[[5]],hour1[[6]]]

You can access these columns by position with iloc:

data = [name,hour1.iloc[:,0],hour1.iloc[:,1],hour1.iloc[:,2],hour1.iloc[:,3],hour1.iloc[:,4],hour1.iloc[:,5],hour1.iloc[:,6]]
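To illustrate why the original line fails, here is a minimal sketch on a made-up one-row frame. The frame is an assumption standing in for the sqldf result; its values are borrowed from the desired output in the question. With string column labels, `hour1[[0]]` looks for a column whose label is the integer 0, whereas `iloc` selects by position.

```python
import pandas as pd

# Hypothetical stand-in for the sqldf result (not the OP's real data)
hour1 = pd.DataFrame({
    'date': ['2020-01-01'],
    'session_number': [943],
    'sales_for_12am': [527.0],
})

# Label-based lookup: no column is labelled with the integer 0
try:
    hour1[[0]]
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

# Position-based lookup: the first column, whatever it is called
first_col = hour1.iloc[:, 0]   # a Series holding the whole column
first_val = hour1.iloc[0, 0]   # a scalar: first row, first column
```

Note that `hour1.iloc[:, 0]` is still a Series. Writing a list of Series to a file produces text such as "0 ... Name: date, dtype: object", so when a single value is needed, `hour1.iloc[0, 0]` is probably the better fit.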

That said, you can write the csv more easily with pandas.DataFrame.to_csv. For example:

# add a name column
hour1['name'] = name

# write to csv
hour1.to_csv('sample_output.txt', index=False, sep=' ')

With sep=' ', the values on each line of the output are separated by spaces, as the OP described.
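As a rough check of what `to_csv` produces, the sketch below builds a hypothetical one-row result with the column aliases from the question's query and renders it to a string instead of a file. The sample values are assumptions copied from the desired output line; `header=False` is added here only because that line has no header row.

```python
import pandas as pd

# Hypothetical one-row result using the aliases from the question's query
hour1 = pd.DataFrame({
    'date': ['2020-01-01'],
    'session_number': [943],
    'sales_for_12am': [527.0],
    'gross_vat_sales': [56.46],
    'total_vat': [56.46],
    'discount': [0.0],
    'service_charge': [0.0],
})

# With no path argument, to_csv returns the text instead of writing a file
text = hour1.fillna(0).to_csv(index=False, header=False, sep=' ')
print(text.strip())
```

One thing to keep in mind: with a space as the separator, any value that itself contains a space (a store name like "Sample Store", for instance) is wrapped in quotes by default.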


Alternatives:

The dbfread package: https://gist.github.com/jamespaultg/990e4650a384ade5c57a2eb56515ba62 https://dbfread.readthedocs.io/en/latest/exporting_data.html#pandas-data-frames

The DataFrame equivalent of that SQL query is:

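import pandas as pd

# hour of day (0-23) extracted from open_time
df1['open_time_hr'] = pd.to_datetime(df1['open_time']).dt.hour
# same expression as gross_vat_sales in the SQL query
df1['gross_vat_sales'] = df1.taxes + df1.auto_grat + df1.discount
# as_index=False keeps the grouping keys as columns, so index=False below does not drop them
df_agg = df1.groupby(['date', 'session_no', 'open_time_hr'], as_index=False).sum(numeric_only=True)
df_agg.to_csv('sample_output.txt', index=False, sep=' ')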
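The groupby approach can be tried on a few made-up rows. Everything in the sample frame below is an assumption: the column names follow the query in the question, and `open_time` is assumed to be an 'HH:MM:SS' string. A single groupby yields one row per hour, so the five near-identical queries are no longer needed.

```python
import pandas as pd

# Made-up rows; column names follow the query in the question
df1 = pd.DataFrame({
    'date': ['2020-01-01', '2020-01-01', '2020-01-01'],
    'session_no': [943, 943, 943],
    'open_time': ['00:10:00', '00:45:00', '01:05:00'],
    'received': [300.0, 227.0, 100.0],
    'taxes': [30.0, 26.46, 10.0],
    'auto_grat': [0.0, 0.0, 0.0],
    'discount': [0.0, 0.0, 0.0],
})

# Hour of day from the time string (format assumed to be HH:MM:SS)
df1['open_time_hr'] = pd.to_datetime(df1['open_time'], format='%H:%M:%S').dt.hour
df1['gross_vat_sales'] = df1['taxes'] + df1['auto_grat'] + df1['discount']

# One row per (date, session, hour); the keys stay as ordinary columns
hourly = (
    df1.groupby(['date', 'session_no', 'open_time_hr'], as_index=False)
       [['received', 'gross_vat_sales', 'taxes', 'discount', 'auto_grat']]
       .sum()
)
print(hourly)
```

Selecting the numeric columns explicitly before `.sum()` keeps the string column `open_time` out of the aggregation.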

Comments:

It produces a text file, but the values in the text file look like this: Sample Store,"0 None Name: date, dtype: object","0 None Name: session_number, dtype: object","0 None Name: sales_for_12am, dtype: object","0 None Name: gross_vat_sales, dtype: object","0 None Name: total_vat, dtype: object","0 None Name: discount, dtype: object","0 None Name: service_charge, dtype: object". The output in the text file should look like this: "2020-01-01 943 527.0 56.46 56.46 0.0 0.0"

This may be a problem with the SQL. What does it show when you print hourly1?

header = date session_number sales_for_12am gross_vat_sales total_vat discount service_charge, value = 0 0 0 0 0 0 0

I got it working, by the way. Thanks for your help.
