Print the data in a log file that matches a condition

Posted 2023-04-18

【Title】: Print the data that matches the condition in a log file 【Posted】: 2020-08-29 15:32:12 【Question】:

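log.txt is shown below. The data comes in sets, each containing an ID, detection_time, Age and Height. The first part of my task, which I have already done with a shell script, is to print every ID that appears again in log.txt within 15 seconds of its first occurrence. The second part is to also print the detection_time, Age and Height of each ID that meets this condition, along with the ID itself. The second part is where I am stuck, because I don't know how to design the algorithm.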

ID = 4231
detection_time = 1595556730 
Age = 25
Height = 182
ID = 3661
detection_time = 1595556737
Age = 24
Height = 182
ID = 2654
detection_time = 1595556740
Age = 22
Height = 184    
ID = 3661
detection_time = 1595556746
Age = 27
Height = 175
ID = 4231
detection_time = 1595556752
Age = 25

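For example, in the log above, ID 3661 first appears at time 1595556737 and appears again at 1595556746, only 9 seconds after the first occurrence. It therefore meets my condition of an ID reappearing within 15 seconds. After running the shell script, the output I want is ID 3661 together with its latest detection_time, Age and Height, which for ID 3661 is: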

the matched ID is 3661
detection_time = 1595556746
Age = 27
Height = 175

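Here is my code. I use an associative array arr with the ID as key and detection_time as value. Running this script with the log file above as input prints The matched ID is 3661, without the detection_time, Age and Height for that ID, which is where I am stuck. Can anyone help me with this? Thanks.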

#!/bin/bash
input="/tmp/log.txt"
declare -A arr
while read -r line
do

if [[ $line =~ ID ]] ; then
 id=$(echo $line | awk -F " " '{print $3}')
elif [[ $line =~ detection_time ]] ; then
 dtime=$(echo $line | awk -F " " '{print $3}')
 if  [[ arr["$id"] -ge $((dtime - 15)) ]]; then
  echo 'The matched ID is' "$id"
 fi
 arr["$id"]=$dtime
fi
done < "$input"

【Comments】:

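I haven't fully understood what you want yet, but I can at least give you a hint for parsing a line: if [[ $line =~ ^ID[[:space:]]=[[:space:]]([[:digit:]]+)$ ]]; then ... This puts the ID into ${BASH_REMATCH[1]} and removes the need for awk here.

See also the earlier question from the same OP: ***.com/questions/63228001/…

Don't use a shell loop for this; see why-is-using-a-shell-loop-to-process-text-considered-bad-practice.

【Answer 1】: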

This task is a good fit for awk. The script below prints every record in which an ID reappears within 15 seconds (<= 15) of its previous detection.

File tst.awk:

$1=="ID" { if (p) print(rec); p=0; rec=""; id=$NF }
$1=="detection_time" { if ($NF - t[id] <= 15) p=1; t[id]=$NF }
{ rec = (rec? rec RS $0: $0) }
END { if (p) print(rec) }

Usage:

> awk -f tst.awk file
ID = 3661
detection_time = 1595556746
Age = 27
Height = 175
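
To get the exact format requested in the question (a "the matched ID is ..." header followed by the remaining fields), the same idea can be wrapped in a small shell function. This is only a sketch: the function name print_matches, the optional window argument and the awk variable names are assumptions made for illustration and are not part of the answer above. Unlike tst.awk, it only compares timestamps for an ID that has already been seen, and it strips trailing whitespace from each line.

```shell
#!/bin/bash
# print_matches FILE [WINDOW]
# Hypothetical helper: prints each record whose ID was last detected
# at most WINDOW seconds earlier (default 15).
print_matches() {
    awk -v window="${2:-15}" '
        # Print the buffered record if it was flagged as a match.
        function emit() {
            if (matched) printf "the matched ID is %s\n%s\n", id, rec
            matched = 0; rec = ""
        }
        # Strip trailing whitespace so the output is clean.
        { sub(/[ \t\r]+$/, "") }
        # An ID line starts a new record: flush the previous one.
        $1 == "ID" { emit(); id = $NF; next }
        $1 == "detection_time" {
            # Compare only when this ID has been seen before.
            if ((id in last) && $NF - last[id] <= window) matched = 1
            last[id] = $NF
        }
        # Collect every non-ID line of the current record.
        { rec = (rec ? rec "\n" $0 : $0) }
        END { emit() }
    ' "$1"
}
```

Called as print_matches /tmp/log.txt on the log from the question, this should print the four lines the question asks for: the matched ID is 3661, then detection_time = 1595556746, Age = 27 and Height = 175.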

【Answer 2】:

I'm not sure whether this is exactly the output you expect.

#!/bin/bash
input="/tmp/log.txt"
declare -A arr
mode=""
while read -r line
do

if [[ $line =~ ^ID[[:space:]]=[[:space:]]([[:digit:]]+)$ ]] ; then
 id=${BASH_REMATCH[1]}
 [ -n "$mode" ] && printf "\n"
elif [[ $line =~ ^(Age|Height)[[:space:]]=[[:space:]]([[:digit:]]+)$ ]] && [ -n "$mode" ]; then
 printf "and %s is %s " "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
elif [[ $line =~ ^detection_time[[:space:]]=[[:space:]]([[:digit:]]+)$ ]] ; then
 dtime=${BASH_REMATCH[1]}
 if  [[ arr["$id"] -ge $((dtime - 15)) ]]; then
  printf 'The matched ID is %d at first time: %s ' "$id" "$( date -Iseconds -d "@${arr[$id]}" )"
  mode="1"
 else
  mode=""
 fi
 arr["$id"]=$dtime
fi
done < "$input"

For your input, this prints the following:

The matched ID is 3661 at first time: 2020-07-24T04:12:17+02:00 and Age is 27 and Height is 175

I hope this helps you adapt it to your needs.
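
If the output should instead follow the format from the question, one option in pure bash is to buffer the lines of each record and print them once the next ID line (or the end of the file) is reached. The following is a sketch under assumptions: the function name print_matches_bash and all variable names are made up for illustration, and bash 4 or later is assumed for associative arrays.

```shell
#!/bin/bash
# print_matches_bash FILE [WINDOW]
# Hypothetical helper: pure-bash variant that buffers each record and
# prints it if its ID was last detected at most WINDOW seconds earlier.
print_matches_bash() {
    local file=$1 window=${2:-15}
    local -A last_seen=()
    local line id="" rec="" matched=0 dtime prev

    while read -r line || [[ -n $line ]]; do
        if [[ $line =~ ^ID[[:space:]]*=[[:space:]]*([[:digit:]]+)$ ]]; then
            # A new record starts: print the previous one if it matched.
            if (( matched )); then
                printf 'the matched ID is %s\n%s' "$id" "$rec"
            fi
            id=${BASH_REMATCH[1]}
            rec=""
            matched=0
            continue
        fi
        if [[ -n $id && $line =~ ^detection_time[[:space:]]*=[[:space:]]*([[:digit:]]+)$ ]]; then
            dtime=${BASH_REMATCH[1]}
            prev=${last_seen[$id]:-}
            # Compare only when this ID has been seen before.
            if [[ -n $prev ]] && (( dtime - prev <= window )); then
                matched=1
            fi
            last_seen[$id]=$dtime
        fi
        # Collect every non-ID line of the current record.
        rec+="$line"$'\n'
    done < "$file"

    # Flush the last record of the file.
    if (( matched )); then
        printf 'the matched ID is %s\n%s' "$id" "$rec"
    fi
    return 0
}
```

For large logs the awk approach from the first answer is likely to be considerably faster; this variant mainly shows how the buffering could look without leaving bash.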
