Split txt file by Year and ID and rename each new txt file as "Year_ID.txt"
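【Title】: Split txt file by Year and ID and rename each new txt file as "Year_ID.txt"
【Posted】: 2021-09-29 08:24:36
【Question】: I have a bunch of txt files (comma-separated), and I want to split each file into separate text files using the common group identifiers in column 1 (Year) and column 3 (ID). I also want to save each new file as "Column1_Column3.txt". I do not want to keep any headers in these files. I have tried many scripts and suggestions from other questions, but nothing seems to work. I am new to Python, so any suggestions would be very helpful. Thank you very much.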
File format:
1.0,9.0,0.0,0.0,5.0,13.2,143.2,993.8529934630001,18.005554199200002,92.5999984741,0.0,0.0,159.882055791
1.0,9.0,0.0,1.0,5.0,13.3,142.8,992.4,19.0,91.5013544438,0.0,0.0,202.645072402
1.0,9.0,0.0,2.0,5.0,13.4,142.5,989.0,21.2,90.4027104135,0.0,0.0,235.39787781
1.0,9.0,0.0,3.0,5.0,13.5,142.2,986.5,22.7,89.3040663832,0.0,0.0,268.74681081200004
1.0,11.0,1.0,1.0,5.0,11.5,175.6,995.6,18.7,18.5200004578,0.0,0.0,680.61138846
1.0,11.0,1.0,5.0,5.0,12.2,174.1,988.9,23.4,18.5200004578,0.0,0.0,645.040646961
1.0,11.0,1.0,6.0,5.0,12.4,173.9,986.5,24.9,18.5200004578,0.0,0.0,654.7981628169999
1.0,9.0,2.0,4.0,5.0,10.7,146.8,986.0,23.2,68.3182237413,0.0,0.0,364.724300756
1.0,9.0,2.0,5.0,5.0,10.8,146.2,982.9,25.0,66.8777792189,0.0,0.0,317.156397048
So my output should be:
File 1:
1.0,9.0,0.0,0.0,5.0,13.2,143.2,993.8529934630001,18.005554199200002,92.5999984741,0.0,0.0,159.882055791
1.0,9.0,0.0,1.0,5.0,13.3,142.8,992.4,19.0,91.5013544438,0.0,0.0,202.645072402
1.0,9.0,0.0,2.0,5.0,13.4,142.5,989.0,21.2,90.4027104135,0.0,0.0,235.39787781
File 2:
1.0,11.0,1.0,1.0,5.0,11.5,175.6,995.6,18.7,18.5200004578,0.0,0.0,680.61138846
1.0,11.0,1.0,5.0,5.0,12.2,174.1,988.9,23.4,18.5200004578,0.0,0.0,645.040646961
1.0,11.0,1.0,6.0,5.0,12.4,173.9,986.5,24.9,18.5200004578,0.0,0.0,654.7981628169999
File 3:
1.0,9.0,2.0,4.0,5.0,10.7,146.8,986.0,23.2,68.3182237413,0.0,0.0,364.724300756
1.0,9.0,2.0,5.0,5.0,10.8,146.2,982.9,25.0,66.8777792189,0.0,0.0,317.156397048
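【Answer 1】: Assumptions:
- All entries are uniform
- The entries are held in a two-dimensional list
- Every entry has a length of at least 3 (so that it includes both fields used for splitting)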
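One small concern:
In File1, is the second entry supposed to have "2055791" in front of it? That would mean the list entries are not quite uniform enough for what you want. If that is the case, I suggest cleaning the data beforehand, or adding to this code so that it can ignore such values.
import csv

# grab the full list: read every row into a 2-D list
# ("input.txt" is a placeholder name, not taken from the question)
with open("input.txt", newline="") as src:
    full_list = [row for row in csv.reader(src) if row]

# grab every distinct (column 1, column 3) pair once, in order of first appearance
keys = list(dict.fromkeys((row[0], row[2]) for row in full_list))

# sort the rows into one file per pair, named "<column1>_<column3>.txt", with no header
for i, j in keys:
    separate_list = [entry for entry in full_list if entry[0] == i and entry[2] == j]
    with open(str(i) + "_" + str(j) + ".txt", "w") as file:
        for item in separate_list:
            # join the fields back into a comma-separated line
            file.write(",".join(item) + "\n")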
That should be enough.
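As a variation on the script above, the grouping can also be done in a single pass with a dictionary keyed by (column 1, column 3), so the full list does not have to be rescanned for every group. This is only a minimal sketch, not the answerer's code: the function name `split_by_year_and_id`, the `out_dir` parameter, and the decision to skip rows with fewer than three fields are assumptions made for illustration.

```python
import csv
from collections import defaultdict
from pathlib import Path


def split_by_year_and_id(in_path, out_dir="."):
    """Split one comma-separated file into '<col1>_<col3>.txt' files.

    Hypothetical helper: the name and parameters are illustrative only.
    Returns the list of paths that were written.
    """
    groups = defaultdict(list)  # (col1, col3) -> rows, in first-seen order
    with open(in_path, newline="") as src:
        for row in csv.reader(src):
            # Assumption: rows with fewer than 3 fields (e.g. blank lines) are skipped
            if len(row) < 3:
                continue
            groups[(row[0].strip(), row[2].strip())].append(row)

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for (year, id_), rows in groups.items():
        out_path = out_dir / f"{year}_{id_}.txt"
        with open(out_path, "w", newline="") as dst:
            # Only data rows are written, so the output has no header line
            csv.writer(dst, lineterminator="\n").writerows(rows)
        written.append(out_path)
    return written
```

With the sample data from the question, the keys are taken verbatim from the file, so the outputs would be named `1.0_0.0.txt`, `1.0_1.0.txt`, and `1.0_2.0.txt`. Because each output is opened in `"w"` mode, calling the function on several input files that share a Year and ID would overwrite earlier output unless each call gets its own `out_dir`.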
【Comments】:
Hi dperry5910, thank you very much for your feedback. I will try this script. The 2055791 value is just a copy-paste error in this post; it actually belongs to row 1 of the file. So the format is uniform.