Merging CSV files in Python using a Python dictionary
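Hello, I am trying to create a new CSV file by merging specific fields from two CSV files based on a common column (primary key). I did the same thing in PowerShell and it worked, but it was very slow: merging files of 5000+ rows took more than 30 minutes, so I am now trying it in Python. I am new to this, so please go easy on me.
The two files are infile.csv and checkfile.csv, and the columns of the output file are based on the columns of infile.csv. The code looks up values in checkfile.csv, creates outfile.csv, copies the columns from infile.csv, and needs to overwrite the values of two fields with the corresponding values from checkfile.csv. Details follow.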
infile.csv -
"StockNumber","SKU","ChannelProfileID","CostPrice"
"10m_s-vid#APTIIAMZ","2VV-10",3746,0.33
"10m_s-vid#CSE","2VV-10",3746,0.98
"1RR-01#CSE","1RR-01",3746
"1RR-01#PCAWS","1RR-01",3746,
"1m_s-vid_ext#APTIIAMZ","2VV-101",3746,0.42
checkfile.csv
ProductCode, Description, Supplier, CostPrice, RRPPrice, Stock, Manufacturer, SupplierProductCode, ManuCode, LeadTime
2VV-03,3MTR BLACK SVHS M - M GOLD CABLE - B/Q 100,Cables Direct Ltd,0.43,,930,CDL,2VV-03,2VV-03,1
2VV-05,5MTR BLACK SVHS M - M GOLD CABLE - B/Q 100,Cables Direct Ltd,0.54,,1935,CDL,2VV-05,2VV-05,1
2VV-10,10MTR BLACK SVHS M - M GOLD CABLE - B/Q 50,Cables Direct Ltd,0.86,,1991,CDL,2VV-10,2VV-10,1
The outfile.csv I get is -
StockNumber,SKU,ChannelProfileID,CostPrice
10m_s-vid#APTIIAMZ,2VV-10,"(' ',)",
10m_s-vid#CSE,2VV-10,"(' ',)",
1RR-01#CSE,1RR-01,"(' ',)",
1RR-01#PCAWS,1RR-01,"(' ',)",
1m_s-vid_ext#APTIIAMZ,2VV-101,"(' ',)",
But the outfile.csv I need is -
StockNumber,SKU,ChannelProfileID,CostPrice
10m_s-vid#APTIIAMZ,2VV-10,1991,0.86
10m_s-vid#CSE,2VV-10,1991,0.86
1RR-01#CSE,1RR-01
1RR-01#PCAWS,1RR-01
1m_s-vid_ext#APTIIAMZ,2VV-101
And finally the code -
import csv

with open('checkfile.csv', 'rb') as checkfile:
    checkreader = csv.DictReader(checkfile)
    product_result = dict(
        ((v['ProductCode'], v[' Stock']), (v['ProductCode'], v[' CostPrice'])) for v in checkreader
    )

with open('infile.csv', 'rb') as infile:
    with open('outfile.csv', 'wb') as outfile:
        reader = csv.DictReader(infile)
        writer = csv.DictWriter(outfile, reader.fieldnames)
        writer.writeheader()
        for item in reader:
            result = product_result.get(item['SKU'], " ")
            item['ChannelProfileID'] = result,
            item['CostPrice'] = result
            writer.writerow(item)
Answer
You can make it a lot simpler:
import csv

# Index checkfile.csv by ProductCode so each SKU lookup is a single dict access.
with open('checkfile.csv', 'rb') as checkfile:
    product_result = {
        record['ProductCode']: record for record in csv.DictReader(checkfile)}

with open('infile.csv', 'rb') as infile:
    with open('outfile.csv', 'wb') as outfile:
        reader = csv.DictReader(infile)
        writer = csv.DictWriter(outfile, reader.fieldnames)
        writer.writeheader()
        for item in reader:
            record = product_result.get(item['SKU'], None)
            if record:
                # The header names keep the leading space found in checkfile.csv.
                item['ChannelProfileID'] = record[' Stock']  # ???
                item['CostPrice'] = record[' CostPrice']
            else:
                item['ChannelProfileID'] = None
                item['CostPrice'] = None
            writer.writerow(item)
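I am not sure about the line I marked with the ??? comment.
Also, if you really do want to produce a broken CSV, feel free to omit the else clause.
I tested it with StringIO objects. It produces the result you specified, but with trailing commas on the rows that have no match in checkfile.
I used a Python 2.7 dict comprehension because you tagged your question python-2.7.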
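The StringIO test described above can be reproduced roughly as follows. This is only a sketch under a few assumptions: it targets Python 3 rather than the Python 2.7 of the answer, the helper name `merge` is made up for illustration, and the sample data is trimmed from the question. It also passes `skipinitialspace=True`, which is not in the answer, so that the checkfile headers are read as `Stock` and `CostPrice` without the leading space.

```python
import csv
import io

# Sample data copied from the question (both files trimmed to a few rows).
CHECKFILE = (
    "ProductCode, Description, Supplier, CostPrice, RRPPrice, Stock, "
    "Manufacturer, SupplierProductCode, ManuCode, LeadTime\n"
    "2VV-05,5MTR BLACK SVHS M - M GOLD CABLE - B/Q 100,Cables Direct Ltd,"
    "0.54,,1935,CDL,2VV-05,2VV-05,1\n"
    "2VV-10,10MTR BLACK SVHS M - M GOLD CABLE - B/Q 50,Cables Direct Ltd,"
    "0.86,,1991,CDL,2VV-10,2VV-10,1\n"
)
INFILE = (
    '"StockNumber","SKU","ChannelProfileID","CostPrice"\n'
    '"10m_s-vid#APTIIAMZ","2VV-10",3746,0.33\n'
    '"1RR-01#CSE","1RR-01",3746\n'
)


def merge(checkfile, infile, outfile):
    """Hypothetical helper: copy infile rows, overwriting two fields from checkfile."""
    # skipinitialspace=True drops the blank after each comma, so the header
    # names become 'Stock' and 'CostPrice' instead of ' Stock' and ' CostPrice'.
    lookup = {
        row["ProductCode"]: row
        for row in csv.DictReader(checkfile, skipinitialspace=True)
    }
    reader = csv.DictReader(infile)
    writer = csv.DictWriter(outfile, reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for item in reader:
        match = lookup.get(item["SKU"])
        # None is written as an empty field, which gives the trailing commas.
        item["ChannelProfileID"] = match["Stock"] if match else None
        item["CostPrice"] = match["CostPrice"] if match else None
        writer.writerow(item)


out = io.StringIO()
merge(io.StringIO(CHECKFILE), io.StringIO(INFILE), out)
print(out.getvalue())
```

When working with real files in Python 3, the files would be opened in text mode with `newline=''` instead of the `'rb'` and `'wb'` modes used above for Python 2.7.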
Another answer
import csv

# Map each ProductCode to a (Stock, CostPrice) pair.
product_result = {}
with open('checkfile.csv', 'rb') as checkfile:
    checkreader = csv.DictReader(checkfile)
    for v in checkreader:
        product_result[v['ProductCode']] = (v[' Stock'], v[' CostPrice'])

with open('infile.csv', 'rb') as infile:
    with open('outfile.csv', 'wb') as outfile:
        reader = csv.DictReader(infile)
        writer = csv.DictWriter(outfile, reader.fieldnames)
        writer.writeheader()
        for item in reader:
            result = product_result.get(item['SKU'])
            if result:
                item['ChannelProfileID'], item['CostPrice'] = result
            else:
                item['ChannelProfileID'] = item['CostPrice'] = None
            writer.writerow(item)
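For reference, the `"(' ',)"` values in the question's output seem to come from two details of the original code: the `dict(...)` call builds tuple keys, so a plain SKU string never matches, and the trailing comma after `result` wraps the fallback value in a one-element tuple. The following Python sketch reproduces both effects with a single hand-written row standing in for `csv.DictReader` output; the variable names `rows`, `broken` and `value` are illustrative only.

```python
# One row shaped like csv.DictReader output for checkfile.csv (values are strings).
rows = [{"ProductCode": "2VV-10", " Stock": "1991", " CostPrice": "0.86"}]

# Same expression as in the question: each generated item is a pair of tuples,
# so dict() uses the first tuple as the key and the second as the value.
broken = dict(
    ((v["ProductCode"], v[" Stock"]), (v["ProductCode"], v[" CostPrice"]))
    for v in rows
)
print(broken)

# Looking up by the bare SKU string misses, so the default " " is returned.
result = broken.get("2VV-10", " ")

# The trailing comma turns the assignment into a one-element tuple.
value = result,
print(str(value))
```

Keying the dictionary by `ProductCode` alone, as both answers do, avoids the failed lookup, and assigning without the trailing comma avoids the tuple.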
Another answer
import csv
import glob

# Variables
total_record = []
headerCount = 0

# Append every CSV file in the current directory, keeping only the first header.
for file in glob.glob("*.csv"):
    print(file)
    with open(file, 'r') as f:
        reader = csv.reader(f)
        list_record = list(reader)
        if headerCount == 0:
            headerCount = 1
            total_record.extend(list_record)
        else:
            list_record.pop(0)  # drop the repeated header row
            total_record.extend(list_record)

with open('combine.csv', 'w') as csvFile:
    writer = csv.writer(csvFile)
    writer.writerows(total_record)
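Note that the snippet above concatenates files rather than joining them on a key, and that a second run would also pick up combine.csv itself because it matches `*.csv`. A possible variation, sketched here for Python 3, skips the output file and streams rows instead of collecting them in a list. The function name `combine_csv` and the temporary demo files are assumptions made for illustration.

```python
import csv
import glob
import os
import tempfile


def combine_csv(pattern, out_path):
    """Hypothetical helper: concatenate matching CSV files, keeping one header."""
    out_abs = os.path.abspath(out_path)
    # Resolve the inputs before opening the output, and skip the output file
    # itself in case it matches the pattern.
    paths = [p for p in sorted(glob.glob(pattern))
             if os.path.abspath(p) != out_abs]
    header_written = False
    with open(out_path, "w", newline="") as out_file:
        writer = csv.writer(out_file)
        for path in paths:
            with open(path, newline="") as in_file:
                reader = csv.reader(in_file)
                header = next(reader, None)  # None when the file is empty
                if header is not None and not header_written:
                    writer.writerow(header)
                    header_written = True
                writer.writerows(reader)  # remaining rows, streamed
    return paths


# Small demo in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name, text in [("a.csv", "id,qty\n1,10\n"), ("b.csv", "id,qty\n2,20\n")]:
        with open(os.path.join(tmp, name), "w", newline="") as f:
            f.write(text)
    target = os.path.join(tmp, "combine.csv")
    combine_csv(os.path.join(tmp, "*.csv"), target)
    combine_csv(os.path.join(tmp, "*.csv"), target)  # rerun: output is skipped
    with open(target, newline="") as f:
        combined = f.read()

print(combined)
```

Like the original, this assumes every input file starts with the same header row.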