SQLDF: extracting values and saving them to a text file
[Title]: SQLDF extracting the values and saving it to a text file
[Posted]: 2021-12-24 11:30:23
[Question]: I load a DBF file into a DataFrame and run queries against it.
Here is the code.
import pandas as pd  # needed for pd.DataFrame and pd.option_context below
import pandasql as ps
from dbf import Table
dfPath1 = Table('filename.dbf')
dfPath1.open()
df1 = pd.DataFrame(dfPath1, columns=['column1', 'column2', 'column3', 'column4'])
hour1 = ps.sqldf("Select df1.date, df1.session_no AS 'session_number', SUM(df1.received) AS 'sales_for_12am', SUM(df1.taxes)+SUM(df1.auto_grat)+SUM(df1.discount) AS 'gross_vat_sales', SUM(df1.taxes) AS 'total_vat', SUM(df1.discount) AS 'discount', SUM(df1.auto_grat) AS 'service_charge' From df1 Where open_time >= '00:00:00' And open_time < '00:59:59' And date= '" + str + "'")
with pd.option_context('display.max_rows', None, 'display.max_columns', None, 'display.width', None):
    hourly1 = hour1.fillna(0)
    print(hourly1)
hour2 = ps.sqldf("Select df1.date, df1.session_no AS 'session_number', SUM(df1.received) AS 'sales_for_1am', SUM(df1.taxes)+SUM(df1.auto_grat)+SUM(df1.discount) AS 'gross_vat_sales', SUM(df1.taxes) AS 'total_vat', SUM(df1.discount) AS 'discount', SUM(df1.auto_grat) AS 'service_charge' From df1 Where open_time >= '01:00:00' And open_time < '01:59:59' And date= '" + str + "'")
with pd.option_context('display.max_rows', None, 'display.max_columns', None, 'display.width', None):
    hourly2 = hour2.fillna(0)
    print(hourly2)
hour3 = ps.sqldf("Select df1.date, df1.session_no AS 'session_number', SUM(df1.received) AS 'sales_for_2am', SUM(df1.taxes)+SUM(df1.auto_grat)+SUM(df1.discount) AS 'gross_vat_sales', SUM(df1.taxes) AS 'total_vat', SUM(df1.discount) AS 'discount', SUM(df1.auto_grat) AS 'service_charge' From df1 Where open_time >= '02:00:00' And open_time < '02:59:59' And date= '" + str + "'")
with pd.option_context('display.max_rows', None, 'display.max_columns', None, 'display.width', None):
    hourly3 = hour3.fillna(0)
    print(hourly3)
hour4 = ps.sqldf("Select df1.date, df1.session_no AS 'session_number', SUM(df1.received) AS 'sales_for_3am', SUM(df1.taxes)+SUM(df1.auto_grat)+SUM(df1.discount) AS 'gross_vat_sales', SUM(df1.taxes) AS 'total_vat', SUM(df1.discount) AS 'discount', SUM(df1.auto_grat) AS 'service_charge' From df1 Where open_time >= '03:00:00' And open_time < '03:59:59' And date= '" + str + "'")
with pd.option_context('display.max_rows', None, 'display.max_columns', None, 'display.width', None):
    hourly4 = hour4.fillna(0)
    print(hourly4)
hour5 = ps.sqldf("Select df1.date, df1.session_no AS 'session_number', SUM(df1.received) AS 'sales_for_4am', SUM(df1.taxes)+SUM(df1.auto_grat)+SUM(df1.discount) AS 'gross_vat_sales', SUM(df1.taxes) AS 'total_vat', SUM(df1.discount) AS 'discount', SUM(df1.auto_grat) AS 'service_charge' From df1 Where open_time >= '04:00:00' And open_time < '04:59:59' And date= '" + str + "'")
with pd.option_context('display.max_rows', None, 'display.max_columns', None, 'display.width', None):
    hourly5 = hour5.fillna(0)
    print(hourly5)
data = [name,hour1.iloc[:,0],hour1.iloc[:,1],hour1.iloc[:,2],hour1.iloc[:,3],hour1.iloc[:,4],hour1.iloc[:,5],hour1.iloc[:,6]]
data2 = [name,hour2.iloc[:,0],hour2.iloc[:,1],hour2.iloc[:,2],hour2.iloc[:,3],hour2.iloc[:,4],hour2.iloc[:,5],hour2.iloc[:,6]]
data3 = [name,hour3.iloc[:,0],hour3.iloc[:,1],hour3.iloc[:,2],hour3.iloc[:,3],hour3.iloc[:,4],hour3.iloc[:,5],hour3.iloc[:,6]]
data4 = [name,hour4.iloc[:,0],hour4.iloc[:,1],hour4.iloc[:,2],hour4.iloc[:,3],hour4.iloc[:,4],hour4.iloc[:,5],hour4.iloc[:,6]]
data5 = [name,hour5.iloc[:,0],hour5.iloc[:,1],hour5.iloc[:,2],hour5.iloc[:,3],hour5.iloc[:,4],hour5.iloc[:,5],hour5.iloc[:,6]]
hour1['name'] = name
hour1.to_csv('sample_output.txt', index=False, sep=' ')
Then I get an error like this: KeyError: "None of [Int64Index([0], dtype='int64')] are in the [columns]"
This is the output I want in the text file: "2020-01-01 943 527.0 56.46 56.46 0.0 0.0"
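[Comments on the question]:
You could take a look at the simpledbf package: ***.com/questions/41898561/…
I have already looked at and tried all of those, but I had a hard time getting the DBF file into a DataFrame. So far, SQLDF is the one that manages to load the DBF file into a DataFrame.
Why is this tagged R?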
[Answer 1]:
The problem is most likely in the following line:
data = [name,hour1[[0]],hour1[[1]],hour1[[2]],hour1[[3]],hour1[[4]],hour1[[5]],hour1[[6]]]
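Selecting with hour1[[0]] looks for a column labelled 0, which does not exist here. You can use iloc to access the columns by position instead: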
data = [name,hour1.iloc[:,0],hour1.iloc[:,1],hour1.iloc[:,2],hour1.iloc[:,3],hour1.iloc[:,4],hour1.iloc[:,5],hour1.iloc[:,6]]
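The difference between the two lookups can be reproduced on a small frame. The following is a minimal sketch, assuming a one-row result with named columns similar to what the query returns; the sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical one-row result standing in for the frame returned by ps.sqldf
hour1 = pd.DataFrame({
    "date": ["2020-01-01"],
    "session_number": [943],
    "sales_for_12am": [527.0],
})

# Label-based lookup: there is no column *named* 0, so this raises KeyError
try:
    hour1[[0]]
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

# Position-based lookup: the first column, whatever it is called
first_column = hour1.iloc[:, 0]

# A single scalar (row 0, column 0) rather than a whole Series
first_value = hour1.iloc[0, 0]

print(label_lookup_failed, first_value)
```

Note that hour1.iloc[:, 0] is still a whole Series, not a single value. If such objects are written to a file as they are, their printed form (index, name, dtype) is what ends up in the file, so hour1.iloc[0, 0] is the safer choice when one value per field is wanted.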
That said, you can write to CSV more easily with pandas.DataFrame.to_csv.
For example,
# add a name column
hour1['name'] = name
# write to csv
hour1.to_csv('sample_output.txt', index=False, sep=' ')
With sep=' ', the values on each line of the output are separated by spaces, as the OP described.
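As a rough check of what to_csv produces, here is a small sketch that writes to an in-memory buffer instead of a file. The frame is hypothetical: it reuses the values from the desired output line in the question, and header=False is an assumption on my part, added because the desired output shows no column-name row.

```python
import io
import pandas as pd

# Hypothetical query result; the values are the ones from the desired output
hour1 = pd.DataFrame({
    "date": ["2020-01-01"],
    "session_number": [943],
    "sales_for_12am": [527.0],
    "gross_vat_sales": [56.46],
    "total_vat": [56.46],
    "discount": [0.0],
    "service_charge": [0.0],
})

buffer = io.StringIO()  # stands in for a file such as 'sample_output.txt'
# header=False drops the column-name row; index=False drops the row index
hour1.fillna(0).to_csv(buffer, sep=" ", index=False, header=False)
line = buffer.getvalue().strip()
print(line)
```

One thing to keep in mind: with a space as the separator, any value that itself contains a space (a store name such as "Sample Store", for instance) is wrapped in double quotes by default so that the file can still be parsed.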
Alternatives:
The dbfread package: https://gist.github.com/jamespaultg/990e4650a384ade5c57a2eb56515ba62
https://dbfread.readthedocs.io/en/latest/exporting_data.html#pandas-data-frames
The DataFrame equivalent of that SQL query is:
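import pandas as pd
# bucket each row by the hour of its open_time
df1['open_time_hr'] = pd.to_datetime(df1['open_time']).dt.hour
# same definition as the gross_vat_sales alias in the SQL query
df1['gross_vat_sales'] = df1.taxes + df1.auto_grat + df1.discount
# as_index=False keeps the grouping keys as columns, so index=False below does not drop them
value_cols = ['received', 'gross_vat_sales', 'taxes', 'discount', 'auto_grat']
df_agg = df1.groupby(['date', 'session_no', 'open_time_hr'], as_index=False)[value_cols].sum()
df_agg.to_csv('sample_output.txt', index=False, sep=' ')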
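To see the grouping approach on concrete numbers, here is a small self-contained sketch. The rows are made up; the column names follow the SQL query in the question, the open_time format '%H:%M:%S' is an assumption based on the time literals in that query, and the name gross_vat_sales follows the SQL alias.

```python
import pandas as pd

# Made-up rows using the column names from the SQL query in the question
df1 = pd.DataFrame({
    "date": ["2020-01-01"] * 3,
    "session_no": [943, 943, 943],
    "open_time": ["00:10:00", "00:45:30", "01:05:00"],
    "received": [300.0, 227.0, 100.0],
    "taxes": [30.0, 26.46, 10.0],
    "auto_grat": [0.0, 0.0, 5.0],
    "discount": [0.0, 0.0, 0.0],
})

# Hour of day (0-23) for each row; assumes open_time looks like 'HH:MM:SS'
df1["open_time_hr"] = pd.to_datetime(df1["open_time"], format="%H:%M:%S").dt.hour
df1["gross_vat_sales"] = df1["taxes"] + df1["auto_grat"] + df1["discount"]

value_columns = ["received", "gross_vat_sales", "taxes", "discount", "auto_grat"]
# as_index=False keeps date, session_no and open_time_hr as ordinary columns
hourly = (
    df1.groupby(["date", "session_no", "open_time_hr"], as_index=False)[value_columns]
    .sum()
)
print(hourly)
```

A single groupby like this yields one row per hour that actually has data, which replaces the five near-identical queries. Hours without any rows simply do not appear in the result.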
[Comments on the answer]:
It generates a text file, but the values in the file look like this: Sample Store,"0 None Name: date, dtype: object","0 None Name: session_number, dtype: object","0 None Name: sales_for_12am, dtype: object","0 None Name: gross_vat_sales, dtype: object","0 None Name: total_vat, dtype: object","0 None Name: discount, dtype: object","0 None Name: service_charge, dtype: object" The values in the text file should look like this: "2020-01-01 943 527.0 56.46 56.46 0.0 0.0"
This may be a problem with the SQL. What does it show when you print hourly1?
header = date session_number sales_for_12am gross_vat_sales total_vat discount service_charge, value = 0 0 0 0 0 0 0
I got it done, by the way. Thanks for your help.