Remove all but the first line containing a given string

【Posted】: 2021-07-11 10:49:32 【Question】:

I have a log file named duration.log with output like this:

2021-04-15 20:25:45.639181: --- DURATION: 0:00:02.928309 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 --- 
2021-04-15 20:25:48.756914: --- DURATION: 0:00:03.000727 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 --- 
2021-04-15 20:25:51.948027: --- DURATION: 0:00:03.068122 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 --- 
2021-04-15 20:25:55.075158: --- DURATION: 0:00:02.987064 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 --- 
2021-04-15 20:25:58.274715: --- DURATION: 0:00:03.063948 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 --- 
2021-04-15 20:26:01.753367: --- DURATION: 0:00:03.273167 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 --- 
2021-04-15 20:26:05.001949: --- DURATION: 0:00:03.050073 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 --- 
2021-04-15 20:26:08.206065: --- DURATION: 0:00:03.073367 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 --- 
2021-04-16 09:03:24.188722: --- DURATION: 0:00:21.995238 --- ROUTE NAME: None --- HEADLESS: 0 --- 
2021-04-16 09:03:50.434883: --- DURATION: 0:00:26.140902 --- ROUTE NAME: None --- HEADLESS: 0 --- 
2021-04-16 09:04:18.552286: --- DURATION: 0:00:27.793468 --- ROUTE NAME: None --- HEADLESS: 0 --- 
2021-04-16 09:06:27.015632: --- DURATION: 0:00:33.688867 --- ROUTE NAME: 210416-829-PK-JDoe --- HEADLESS: 0 --- 
2021-04-16 09:07:10.487733: --- DURATION: 0:00:42.421573 --- ROUTE NAME: 210416-830-PK-JDoe --- HEADLESS: 0 --- 
2021-04-16 09:07:39.247244: --- DURATION: 0:00:28.391001 --- ROUTE NAME: 210416-831-PK-JDoe --- HEADLESS: 0 --- 
2021-04-16 09:08:06.292683: --- DURATION: 0:00:26.790946 --- ROUTE NAME: 210416-832-PK-JDoe --- HEADLESS: 0 --- 
2021-04-16 09:08:29.929427: --- DURATION: 0:00:19.462734 --- ROUTE NAME: 210416-833-PK-JDoe --- HEADLESS: 0 --- 
2021-04-16 09:08:53.306396: --- DURATION: 0:00:23.140965 --- ROUTE NAME: 210416-834-PK-JDoe --- HEADLESS: 0 --- 

I removed the duplicate lines with awk '!seen[$0]++' duration.log, as described in "How to delete duplicate lines in a file without sorting it in Unix?".

Now, how can I remove every line containing the string 210415-821-PK-JDoe except the first one? awk or any other bash tool is fine.

Update:

I am looking for the following output:

2021-04-15 20:25:45.639181: --- DURATION: 0:00:02.928309 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 --- 
2021-04-16 09:03:24.188722: --- DURATION: 0:00:21.995238 --- ROUTE NAME: None --- HEADLESS: 0 --- 
2021-04-16 09:03:50.434883: --- DURATION: 0:00:26.140902 --- ROUTE NAME: None --- HEADLESS: 0 --- 
2021-04-16 09:04:18.552286: --- DURATION: 0:00:27.793468 --- ROUTE NAME: None --- HEADLESS: 0 --- 
2021-04-16 09:06:27.015632: --- DURATION: 0:00:33.688867 --- ROUTE NAME: 210416-829-PK-JDoe --- HEADLESS: 0 --- 
2021-04-16 09:07:10.487733: --- DURATION: 0:00:42.421573 --- ROUTE NAME: 210416-830-PK-JDoe --- HEADLESS: 0 --- 
2021-04-16 09:07:39.247244: --- DURATION: 0:00:28.391001 --- ROUTE NAME: 210416-831-PK-JDoe --- HEADLESS: 0 --- 
2021-04-16 09:08:06.292683: --- DURATION: 0:00:26.790946 --- ROUTE NAME: 210416-832-PK-JDoe --- HEADLESS: 0 --- 
2021-04-16 09:08:29.929427: --- DURATION: 0:00:19.462734 --- ROUTE NAME: 210416-833-PK-JDoe --- HEADLESS: 0 --- 
2021-04-16 09:08:53.306396: --- DURATION: 0:00:23.140965 --- ROUTE NAME: 210416-834-PK-JDoe --- HEADLESS: 0 --- 

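【Question comments】:

Thanks for showing your effort. Could you also post a sample of the expected output in your question, so that it is clearer what you are after? Thanks, cheers.

Please check the updated question.

【Answer 1】: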

Based on the samples you have shown, please try the following.

awk '/210415-821-PK-JDoe/ && ++count>1{next} 1'  Input_file

Or, following Sundeep's comment, it can be written as:

awk '!/210415-821-PK-JDoe/ || !count++'  Input_file

Explanation: a detailed breakdown of the first command.

awk '                                ##Start of the awk program.
/210415-821-PK-JDoe/ && ++count>1{   ##If the line contains 210415-821-PK-JDoe AND count is now greater than 1 (not the first such line), do the following.
  next                               ##next skips all further statements for this line, so it is not printed.
}
1                                    ##1 is an always-true pattern, so the current line is printed.
'  Input_file                        ##Input_file name goes here.
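
The commands above hard-code the search string as a regular expression. As a minimal sketch, assuming you would rather pass the string in as a parameter and match it literally, the same idea can be wrapped in a shell function. The function name keep_first_match and the awk variables pat and seen are hypothetical and not part of the original answer; index() is used in place of a regex match so that characters such as . are not treated as metacharacters.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original answer): copy standard input to
# standard output, keeping only the first line that contains the literal
# string given as $1. Lines without the string are always kept.
keep_first_match() {
    # index($0, pat) == 0 -> the line does not contain pat: print it.
    # !seen++             -> true only for the first line that contains pat.
    awk -v pat="$1" 'index($0, pat) == 0 || !seen++'
}

# Shortened sample lines in the same shape as duration.log.
keep_first_match '210415-821-PK-JDoe' <<'EOF'
20:25:45 --- ROUTE NAME: 210415-821-PK-JDoe ---
20:25:48 --- ROUTE NAME: 210415-821-PK-JDoe ---
09:03:24 --- ROUTE NAME: None ---
09:06:27 --- ROUTE NAME: 210416-829-PK-JDoe ---
EOF
```

With this sample the second line is dropped and the other three are printed in their original order. On the real file it would be run as keep_first_match 210415-821-PK-JDoe < duration.log. One caveat: awk -v interprets backslash escapes in the assigned value, so a search string containing backslashes would need extra care.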

【Comments】:

You can simplify this to awk '!/210415-821-PK-JDoe/ || !c++'

@Sundeep, sure, thank you. I have added that solution now, cheers.
