Remove all but the first line containing a given string
[Posted]: 2021-07-11 10:49:32
[Question]:
I have a log file named duration.log with the following output:
2021-04-15 20:25:45.639181: --- DURATION: 0:00:02.928309 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 ---
2021-04-15 20:25:48.756914: --- DURATION: 0:00:03.000727 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 ---
2021-04-15 20:25:51.948027: --- DURATION: 0:00:03.068122 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 ---
2021-04-15 20:25:55.075158: --- DURATION: 0:00:02.987064 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 ---
2021-04-15 20:25:58.274715: --- DURATION: 0:00:03.063948 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 ---
2021-04-15 20:26:01.753367: --- DURATION: 0:00:03.273167 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 ---
2021-04-15 20:26:05.001949: --- DURATION: 0:00:03.050073 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 ---
2021-04-15 20:26:08.206065: --- DURATION: 0:00:03.073367 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 ---
2021-04-16 09:03:24.188722: --- DURATION: 0:00:21.995238 --- ROUTE NAME: None --- HEADLESS: 0 ---
2021-04-16 09:03:50.434883: --- DURATION: 0:00:26.140902 --- ROUTE NAME: None --- HEADLESS: 0 ---
2021-04-16 09:04:18.552286: --- DURATION: 0:00:27.793468 --- ROUTE NAME: None --- HEADLESS: 0 ---
2021-04-16 09:06:27.015632: --- DURATION: 0:00:33.688867 --- ROUTE NAME: 210416-829-PK-JDoe --- HEADLESS: 0 ---
2021-04-16 09:07:10.487733: --- DURATION: 0:00:42.421573 --- ROUTE NAME: 210416-830-PK-JDoe --- HEADLESS: 0 ---
2021-04-16 09:07:39.247244: --- DURATION: 0:00:28.391001 --- ROUTE NAME: 210416-831-PK-JDoe --- HEADLESS: 0 ---
2021-04-16 09:08:06.292683: --- DURATION: 0:00:26.790946 --- ROUTE NAME: 210416-832-PK-JDoe --- HEADLESS: 0 ---
2021-04-16 09:08:29.929427: --- DURATION: 0:00:19.462734 --- ROUTE NAME: 210416-833-PK-JDoe --- HEADLESS: 0 ---
2021-04-16 09:08:53.306396: --- DURATION: 0:00:23.140965 --- ROUTE NAME: 210416-834-PK-JDoe --- HEADLESS: 0 ---
I removed the duplicate lines with awk '!seen[$0]++' duration.log, following
How to delete duplicate lines in a file without sorting it in Unix?.
Now, how do I delete all but the first line containing the string 210415-821-PK-JDoe? awk or any other bash tool is fine.
Update:
I am looking for the following output:
2021-04-15 20:25:45.639181: --- DURATION: 0:00:02.928309 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 ---
2021-04-16 09:03:24.188722: --- DURATION: 0:00:21.995238 --- ROUTE NAME: None --- HEADLESS: 0 ---
2021-04-16 09:03:50.434883: --- DURATION: 0:00:26.140902 --- ROUTE NAME: None --- HEADLESS: 0 ---
2021-04-16 09:04:18.552286: --- DURATION: 0:00:27.793468 --- ROUTE NAME: None --- HEADLESS: 0 ---
2021-04-16 09:06:27.015632: --- DURATION: 0:00:33.688867 --- ROUTE NAME: 210416-829-PK-JDoe --- HEADLESS: 0 ---
2021-04-16 09:07:10.487733: --- DURATION: 0:00:42.421573 --- ROUTE NAME: 210416-830-PK-JDoe --- HEADLESS: 0 ---
2021-04-16 09:07:39.247244: --- DURATION: 0:00:28.391001 --- ROUTE NAME: 210416-831-PK-JDoe --- HEADLESS: 0 ---
2021-04-16 09:08:06.292683: --- DURATION: 0:00:26.790946 --- ROUTE NAME: 210416-832-PK-JDoe --- HEADLESS: 0 ---
2021-04-16 09:08:29.929427: --- DURATION: 0:00:19.462734 --- ROUTE NAME: 210416-833-PK-JDoe --- HEADLESS: 0 ---
2021-04-16 09:08:53.306396: --- DURATION: 0:00:23.140965 --- ROUTE NAME: 210416-834-PK-JDoe --- HEADLESS: 0 ---
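[Comments on the question]:
Thanks for showing your efforts. Could you also post the expected output for your sample in the question, so it is clearer what you are after? Thanks, cheers.
Please check the updated question.
[Answer 1]:
Based on the samples you have shown, please try the following.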
awk '/210415-821-PK-JDoe/ && ++count>1{next} 1' Input_file
Or, as per Sundeep's comment, it can be written as:
awk '!/210415-821-PK-JDoe/ || !count++' Input_file
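The shorter form relies on short-circuit evaluation: for a line that does not contain the string, the left side is already true, so the line is printed and count is left alone. For a matching line, !count++ is true only the first time (count is still 0) and false afterwards. As a rough sketch, the same logic can be wrapped so the string is passed as a parameter. The function name keep_first_match is made up for illustration, and index() is swapped in for the regex match so the string is compared literally; note that awk -v still interprets backslash escapes in the value.

```shell
# keep_first_match is a hypothetical helper name, not part of the original answer.
# Usage: keep_first_match STRING [FILE...]   (reads stdin when no FILE is given)
keep_first_match() {
  needle=$1
  shift
  # index($0, s) is a literal substring search, so characters such as "."
  # in the string are not treated as regex metacharacters.
  # Non-matching lines print because the left side is true; matching lines
  # print only while count is still 0.
  awk -v s="$needle" '!index($0, s) || !count++' "$@"
}
```

For the log in the question this would be called as keep_first_match 210415-821-PK-JDoe duration.log.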
Explanation: a detailed explanation of the first command above.
awk '                                  ##Starting awk program from here.
/210415-821-PK-JDoe/ && ++count>1{     ##Checking condition: if the line contains 210415-821-PK-JDoe AND count is greater than 1, then do the following.
  next                                 ##next skips all further statements for this line, so it is not printed.
}
1                                      ##1 prints the current line.
' Input_file                           ##Mentioning Input_file name here.
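Note that these commands only print the filtered lines to standard output; Input_file itself is not changed. If the goal is to actually update duration.log, one possible approach is to write to a temporary file and copy it back. The sketch below assumes mktemp is available (it is common but not strictly POSIX), and keep_first_match_inplace is a hypothetical name, not something from the answer.

```shell
# keep_first_match_inplace is a hypothetical helper, not from the original answer.
# Usage: keep_first_match_inplace STRING FILE
keep_first_match_inplace() {
  needle=$1
  file=$2
  tmp=$(mktemp) || return 1
  # Filter into a temporary file first: redirecting straight back to
  # "$file" would truncate it before awk has read it.
  if awk -v s="$needle" 'index($0, s) && ++count > 1 { next } 1' "$file" > "$tmp"; then
    # Copy the contents back instead of using mv, so the original file
    # keeps its permissions and inode.
    cat "$tmp" > "$file"
    status=$?
  else
    status=1
  fi
  rm -f "$tmp"
  return "$status"
}
```

Example call: keep_first_match_inplace 210415-821-PK-JDoe duration.log. It is worth keeping a backup of the log before trying this.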
[Comments]:
You can simplify this to awk '!/210415-821-PK-JDoe/ || !c++'
@Sundeep, sure, thank you. I have added that solution now, cheers.