How to get occurrences of a word in all files, with a count per directory instead of a single number?

Posted 2023-03-15

Asked: 2019-12-28 02:08:24

Question:

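I want to count the occurrences of a given word across all files, but per directory rather than as a single total. When I change into a particular directory, I can get the count with a simple grep foo error*.log | wc -l. With a directory structure like the one below, I want the count for each directory.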

Directory tree
.
├── dir1
│   ├── error2.log
│   └── error1.log
├── dir2
│   ├── error_123.log
│   └── error_234.log
└── dir3
    ├── error_12345.log
    └── error_23554.log

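Comments:

Maybe you could start with something like this: wc -w * 2>/dev/null | tail -n1 | read N _; echo $N;

Could you clarify where the word to search for is supplied? For example, grep -- "searchfor" * 2>/dev/null | wc -l | tail -n1 can be used to get the number of lines that contain the string searchfor across all files.

Answer 1: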

Update: the following command can be used on AIX:

#!/bin/bash

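# Visit every entry directly under the folder, skipping non-directories.
for name in /path/to/folder/* ; do
    if [ ! -d "$name" ] ; then
        continue
    fi
    # Put each word on its own line, then count the lines that contain SEARCH.
    # See: https://unix.stackexchange.com/a/398414/45365
    count="$(cat "$name"/error*.log | tr '[:space:]' '[\n*]' | grep -c 'SEARCH')"
    printf "%s %s\n" "$name" "$count"
done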
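
The counting pipeline in the script above can be tried in isolation. The following is a minimal sketch; the function name count_matching_words is hypothetical and introduced only for illustration. It reads text on standard input, puts every whitespace-separated word on its own line, and counts the lines that match the pattern.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original answer): count the
# whitespace-separated words on standard input that contain PATTERN.
count_matching_words() {
    pattern=$1
    # tr turns every whitespace character into a newline, so each word ends
    # up on its own line; grep -c then counts the matching lines.
    tr '[:space:]' '[\n*]' | grep -c -- "$pattern"
}
```

For example, printf 'foo bar foo\nfoo\n' | count_matching_words foo counts three words. Note that a word containing the pattern more than once (such as foofoo) is counted once, and that grep -c exits with a non-zero status when the count is 0.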

On GNU/Linux, using GNU findutils and GNU grep:

find /path/to/folder -maxdepth 1 -type d \
    -printf "%p " -exec bash -c 'grep -ro "SEARCH" "$1" | wc -l' _ {} \;

Replace SEARCH with the actual search term.
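
Where neither find -maxdepth nor grep -o is available, one possible alternative is to count matches with awk's gsub(), which returns the number of substitutions it made. The sketch below is an assumption-laden illustration, not the answer's own method: the function name count_word_per_dir is hypothetical, the word is interpreted as an awk regular expression, and it has not been tried on AIX.

```shell
#!/bin/sh
# Hypothetical helper: print "<directory> <count>" for each immediate
# subdirectory of ROOT, counting every occurrence of WORD (treated as an
# awk regular expression) in that directory's error*.log files.
count_word_per_dir() {
    word=$1
    root=$2
    for dir in "$root"/*/; do
        # Skip the unexpanded pattern when ROOT has no subdirectories.
        [ -d "$dir" ] || continue
        dir=${dir%/}
        # gsub() returns the number of substitutions made on the line;
        # replacing each match with itself ("&") leaves the text unchanged.
        count=$(cat "$dir"/error*.log 2>/dev/null |
            awk -v w="$word" '{ n += gsub(w, "&") } END { print n + 0 }')
        printf '%s %s\n' "$dir" "$count"
    done
}
```

It would be called as count_word_per_dir foo /path/to/folder. Unlike the tr and grep -c pipeline, this counts every match, so a word such as foofoo contributes two to the total.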

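Comments:

AIX has no -maxdepth; I believe we would have to give -prune a list of directories. I also tried the command on Linux, but it did not work as expected. Is something missing here?

There is no grep -o on AIX either, so the whole solution is moot.

@cnu I added a command that should work on both GNU/Linux and AIX. Unfortunately, I cannot try it on AIX myself. Let me know whether it works for you.

The full command produces the following error: find: 0652-009 There is a missing conjunction. The find /path/to/folder/* -type d -prune -print part lists all the directories, but the rest of the command produces nothing. I am not sure what the problem is, since bash is also installed on AIX.

Maybe this helps? Otherwise I would need to ask for shell access :)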