Add up a column of numbers at the Unix shell

Posted 2023-03-24

【Title】: Add up a column of numbers at the Unix shell 【Asked】: 2010-10-29 21:22:40 【Question】:

Given a list of files in files.txt, I can get a list of their sizes like this:

cat files.txt | xargs ls -l | cut -c 23-30

This produces a column of sizes, one number per line.

How can I get the total of all those numbers?

【Answer 1】:

I would use "du" instead.

$ cat files.txt | xargs du -c | tail -1
4480    total

If you just want the number:

cat files.txt | xargs du -c | tail -1 | awk '{print $1}'
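
One caveat with this approach: when the list is long, xargs may split it across several du invocations, and each invocation prints its own total line, so tail -1 only sees the last batch. A minimal sketch that adds up every total line instead. The function name sum_du_totals is made up for illustration, and it assumes that du separates size and name with a tab and that no listed file is literally named total:

```shell
# Hypothetical helper: reads `du -c` output on stdin and adds up every
# per-invocation "total" line, not just the last one.
sum_du_totals() {
    awk -F '\t' '$2 == "total" { sum += $1 } END { print sum + 0 }'
}

# Usage (assumes files.txt holds one path per line, without spaces):
#   xargs du -c < files.txt | sum_du_totals
```

The result is still disk usage in du's block units, not the byte size of the files.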

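【Comments】:

Disk usage != file size. du reports disk usage. I think the -b switch makes du do what I need.

@0x6adb015 Good to know. Thanks, I hadn't realized that.

This is a useful answer for the OP's specific reason for adding up a column of numbers, but it falls short for the general case of adding numbers. (I use "du" all the time myself, but I came here looking for command-line math. :-))

This will not work when files.txt is large. If the number of arguments piped to xargs reaches a certain threshold, xargs splits them across several invocations of du. The total shown at the end is the total of the last du invocation, not of the whole list.

【Answer 2】: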

Here goes:

cat files.txt | xargs ls -l | cut -c 23-30 | 
  awk '{total = total + $1} END {print total}'

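【Comments】:

Using awk is a good idea, but why keep cut? It is a predictable column number, so use ... | xargs ls -l | awk '{total = total + $5} END {print total}'

You're right, of course. It was just easier to append to the end of what was already there :-)

@dmckee's answer has one bracket too many :)

To make it a little shorter, you can use total+=$1 instead of total = total + $1

【Answer 3】: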

Pipe to gawk:

 cat files.txt | xargs ls -l | cut -c 23-30 | gawk 'BEGIN { sum = 0 } // { sum = sum + $0 } END { print sum }'

【Answer 4】:

If you only want to use shell scripting, without awk or another interpreter, you can use the following script:

#!/bin/bash

total=0

for number in `cat files.txt | xargs ls -l | cut -c 23-30`; do
   let total=$total+$number
done

echo $total

【Answer 5】:

Here's mine:

cat files.txt | xargs ls -l | cut -c 23-30 | sed -e :a -e '$!N;s/\n/+/;ta' | bc
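
To see what the sed stage contributes, it helps to run it on its own: the loop joins all input lines with +, producing a single expression for bc to evaluate. A small illustration on made-up sample numbers:

```shell
# Join three sample lines into one "+"-separated expression.
#   :a        defines a label
#   $!N       appends the next input line (on every line except the last)
#   s/\n/+/   replaces the embedded newline with a plus sign
#   ta        jumps back to :a whenever the substitution succeeded
expr_text=$(printf '1\n2\n3\n' | sed -e :a -e '$!N;s/\n/+/;ta')
echo "$expr_text"    # prints 1+2+3
```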

【Comments】:

+1 for proving once and for all that there are languages uglier than perl :)

【Answer 6】:

In ksh:

echo " 0 $(ls -l $(<files.txt) | awk '{print $5}' | tr '\n' '+') 0" | bc

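【Comments】:

Nice to skip cut, but you are overlooking awk's ability to do math...

【Answer 7】:

Instead of using cut to get the file size from the output of ls -l, you can use awk directly:

$ cat files.txt | xargs ls -l | awk '{total += $5} END {print "Total:", total, "bytes"}'

Awk interprets "$5" as the fifth column. This is the column of the ls -l output that gives the file size.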
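
The same idea generalizes to any column. A sketch of a small wrapper that sums the Nth whitespace-separated field of its input; the name sum_field and its argument are hypothetical, and the sample lines are made-up data shaped like ls -l output:

```shell
# Hypothetical helper: `sum_field N` adds up field N of every line on stdin.
sum_field() {
    awk -v col="$1" '{ total += $col } END { print total + 0 }'
}

# Field 5 is the size column in these sample lines.
printf '%s\n' \
    '-rw-r--r-- 1 user group 100 Jan  1 00:00 a.txt' \
    '-rw-r--r-- 1 user group 250 Jan  1 00:00 b.txt' | sum_field 5
```

Passing the column through -v keeps the awk program in single quotes, so the shell never has to splice a value into the script text.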

【Answer 8】:

#
#       @(#) addup.sh 1.0 90/07/19
#
#       Copyright (C) <heh> SjB, 1990
#       Adds up a column (default=last) of numbers in a file.
#       95/05/16 updated to allow (999) negative style numbers.


case $1 in
-[0-9])
        COLUMN=`echo $1 | tr -d -`
        shift
;;
*)
        COLUMN="NF"
;;
esac

echo "Adding up column .. $COLUMN .. of file(s) .. $*"

nawk  '{ OFMT="%.2f"                                       # 1 "%12.2f"
          x = '$COLUMN'                                   # 2
          neg = index($x, "$")                            # 3
          if (neg > 0) X = gsub("\\$", "", $x)
          neg = index($x, ",")                            # 4
          if (neg > 1) X = gsub(",", "", $x)
          neg = index($x, "(")                            # 8 neg (123 & change
          if (neg > 0) X = gsub("\\(", "", $x)
          if (neg > 0) $x = (-1 * $x)                     # it to "-123.00"
          neg = index($x, "-")                            # 5
          if (neg > 1) $x = (-1 * $x)                     # 6
          t += $x                                         # 7
          print "x is <<<", $x+0, ">>> running balance:", t
        }' $*


# 1.  set numeric format to eliminate rounding errors
# 1.1 had to reset numeric format from 12.2f to .2f 95/05/16
#     when a computed number is assigned to a variable ( $x = (-1 * $x) )
#     it causes $x to use the OFMT so -1.23 = "________-1.23" vs "-1.23"
#     and that causes my #5 (negative check) to not work correctly because
#     the index returns a number >1 and to the neg neg than becomes a positive
#     this only occurs if the number happened to b a "(" neg number
# 2.  find the field we want to add up (comes from the shell or defaults
#     to the last field "NF") in the file
# 3.  check for a dollar sign ($) in the number - if there get rid of it
#     so we may add it correctly - $12 $1$2 $1$2$ $$1$$2$$ all = 12
# 4.  check for a comma (,) in the number - if there get rid of it so we
#     may add it correctly - 1,2 12, 1,,2 1,,2,, all = 12   (,12=0)
# 5.  check for negative numbers
# 6.  if x is a negative number in the form 999- "make" it a recognized
#     number like -999 - if x is a negative number like -999 already
#     the test fails (y is not >1) and this "true" negative is not made
#     positive
# 7.  accumulate the total
# 8.  if x is a negative number in the form (999) "make it a recognized
#     number like -999
# * Note that a (-9) (neg neg number) returns a postive
# * Mite not work rite with all forms of all numbers using $-,+. etc. *

【Answer 9】:

The cat-based pipelines won't work if the file names contain spaces. Here is a perl one-liner instead.

perl -nle 'chomp; $x+=(stat($_))[7]; END{print $x}' files.txt
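
The same concern can be handled in plain shell by reading the list one line at a time instead of letting xargs split on whitespace. A sketch, assuming files.txt holds one path per line; total_size is a made-up name, and wc -c is used here to get the byte count rather than stat:

```shell
# Hypothetical helper: `total_size LIST` prints the combined byte count of
# the files named in LIST (one path per line; paths may contain spaces).
total_size() {
    total=0
    while IFS= read -r path; do
        [ -f "$path" ] || continue                 # skip blank or missing entries
        total=$(( total + $(wc -c < "$path") ))    # add this file's byte count
    done < "$1"
    echo "$total"
}

# Usage: total_size files.txt
```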

【Answer 10】:

TMTOWTDI: Perl has a file size operator (-s)

perl -lne '$t+=-s;END{print $t}' files.txt

【Answer 11】:

I like to use...

echo "
1
2
3 " | sed -e 's,$, + p,g' | dc

This shows the running sum after each line...

Applied to this case:

ls -ld $(< file.txt) | awk '{print $5}' | sed -e 's,$, + p,g' | dc

The total is the last value...

【Answer 12】:

python3 -c"import os; print(sum(os.path.getsize(f) for f in open('files.txt').read().split()))"

Or, if you just want to sum the numbers, pipe them into:

python3 -c"import sys; print(sum(int(x) for x in sys.stdin))"

【Comments】:

... | python -c'import sys; print(sum(int(x) for x in sys.stdin))' once Python 2 goes away at the end of this year.

don@oysters:~/Documents$ cat tax | python3 -c"import sys; print(sum(int(x) for x in sys.stdin))" Traceback (most recent call last): File "<string>", line 1, in <module> File "<string>", line 1, in <genexpr> ValueError: invalid literal for int() with base 10: '\n'

【Answer 13】:

... | paste -sd+ - | bc

is the shortest one I have found (from the UNIX Command Line blog).

Edit: added the - argument for portability, thanks to @Dogbert and @Owen.

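【Comments】:

Nice. The trailing - is needed on Solaris as well.

alias sum="paste -sd+ - | bc" added to my shell completion, thanks mate

@slf, careful, you have just shadowed /usr/bin/sum

Beware, bc is not available on some systems! awk, on the other hand, is (I believe) required for POSIX compliance.

@donbright, make sure every line of the input file contains exactly one number and nothing else. You can debug this by leaving off the | bc and visually inspecting the output for syntax errors (it should have the form "a + b + c + ...").

【Answer 14】:

The whole ls -l and then cut approach is rather convoluted when you have stat. It also depends on the exact format of ls -l (it did not work until I changed the column numbers for cut).

Also, fixed the useless use of cat.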

<files.txt  xargs stat -c %s | paste -sd+ - | bc

【Comments】:

Huh. I have been using Unix for 32 years and never knew that <infile command is the same as command <infile (and puts things in a better order).

【Answer 15】:

If you don't have bc installed, try

echo $(( $(... | paste -sd+ -) ))

instead of

... | paste -sd+ - | bc

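$( ) returns the output of the command inside it

$(( 1+2 )) returns the result of evaluating the arithmetic expression

echo prints it to the screen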
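
Put together, the three pieces behave like this on a few made-up numbers. This is only a sketch, and it relies on the shell's integer-only arithmetic, so decimal values would not work here:

```shell
# paste joins the lines with "+", $( ) captures that text,
# and $(( )) evaluates it as an integer expression.
joined=$(printf '1\n2\n3\n' | paste -sd+ -)
echo "$joined"          # prints 1+2+3
echo $(( $joined ))     # prints 6
```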

【Answer 16】:

In my opinion, the simplest solution is the "expr" Unix command:

s=0; 
for i in `cat files.txt | xargs ls -l | cut -c 23-30`
do
   s=`expr $s + $i`
done
echo $s

【Answer 17】:

Pure bash

total=0; for i in $(cat files.txt | xargs ls -l | cut -c 23-30); do 
total=$(( $total + $i )); done; echo $total

【Answer 18】:

sizes=( $(cat files.txt | xargs ls -l | cut -c 23-30) )
total=$(( $(IFS="+"; echo "${sizes[*]}") ))

Or you can add up the sizes as you read them

declare -i total=0
while read x; do total+=x; done < <( cat files.txt | xargs ls -l | cut -c 23-30 )

If you don't care about byte sizes and block counts are good enough, then just

declare -i total=0
while read s junk; do total+=s; done < <( cat files.txt | xargs ls -s )

【Answer 19】:

cat files.txt | awk '{ total += $1 } END { print total }'

You can use awk to do the same thing. It handles decimals, and non-numeric lines simply count as 0:

$ cat files.txt
1
2.3
3.4
ew
1

$ cat files.txt | awk '{ total += $1 } END { print total }'
7.7
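
Strictly speaking, awk treats a non-numeric field such as ew as 0 rather than skipping it, and a value like 12abc would count as 12. If that matters, a stricter variant can test each line against a number pattern first. A sketch, with the function name sum_numeric invented for this example:

```shell
# Hypothetical helper: sums only lines whose first field looks like a plain
# decimal number, and reports how many non-empty lines were ignored.
sum_numeric() {
    awk '
        $1 ~ /^-?[0-9]+(\.[0-9]+)?$/ { total += $1; next }
        NF                           { skipped++ }
        END { print total + 0, skipped + 0 }
    '
}

# Same sample data as above: prints the sum, then the count of ignored lines.
printf '1\n2.3\n3.4\new\n1\n' | sum_numeric    # prints 7.7 1
```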

Or you can use the ls command and compute human-readable output

$ ls -l | awk '{ sum += $5 } END { hum[1024^3]="Gb"; hum[1024^2]="Mb"; hum[1024]="Kb"; for (x=1024^3; x>=1024; x/=1024) { if (sum>=x) { printf "%.2f %s\n",sum/x,hum[x]; break; } } if (sum<1024) print "1kb"; }'
15.69 Mb

$ ls -l *.txt | awk '{ sum += $5 } END { hum[1024^3]="Gb"; hum[1024^2]="Mb"; hum[1024]="Kb"; for (x=1024^3; x>=1024; x/=1024) { if (sum>=x) { printf "%.2f %s\n",sum/x,hum[x]; break; } } if (sum<1024) print "1kb"; }'
2.10 Mb

【Comments】:

You don't even need the pipe: awk '{ total += $1 } END { print total }' files.txt is faster

【Answer 20】:

If you have R, you can use:

> ... | Rscript -e 'print(sum(scan("stdin")));'
Read 4 items
[1] 2232320

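Because I am comfortable with R, I actually have several aliases for things like this, so I can use them in bash without having to remember the syntax. For instance: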

alias Rsum=$'Rscript -e \'print(sum(scan("stdin")));\''

lets me do

> ... | Rsum
Read 4 items
[1] 2232320

Inspiration: Is there a way to get the min, max, median, and average of a list of numbers in a single command?

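【Answer 21】:

The most popular answer does not work when the beginning of the pipeline can produce 0 lines, because it ends up printing nothing instead of 0. You can get the correct behavior by always adding 0: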

... | (cat && echo 0) | paste -sd+ - | bc
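
An awk-based sum sidesteps the empty-input case without the extra cat, because total + 0 in the END block prints 0 even when no lines arrive. This is a different technique from the paste and bc pipeline above, shown only for comparison; sum_or_zero is a made-up name:

```shell
# Hypothetical helper: sums the first field of each line, printing 0 for
# empty input because the END block runs even when there are no records.
sum_or_zero() {
    awk '{ total += $1 } END { print total + 0 }'
}

printf '' | sum_or_zero          # empty input, prints 0
printf '4\n5\n' | sum_or_zero    # prints 9
```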

【Answer 22】:

paste does not need the - argument. As long as files.txt contains one or more valid file names, the following will do:

<files.txt xargs stat -c %s | paste -sd+ | bc

cat is not needed to insert the 0 when there are no files. Without the pipe, which may be more convenient in a script, you can use:

(xargs -a files.txt stat -c %s || echo 0) | paste -sd+ | bc

【Answer 23】:

... |xargs|tr \  +|bc
... |paste -sd+ -|bc

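The first command is only one character longer (note that there must be two spaces after the backslash!), but it handles the case where the column contains empty lines, whereas the second produces an invalid expression with extra plus signs.

For example: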

echo "2
3
5

" | paste -sd+ -

results in

2+3+5++

which bc cannot handle, while

echo "2
3
5

" | xargs | tr \  +

gives a valid expression

 2+3+5

which can be piped into bc to get the final result
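
Another way to make the paste variant tolerate blank lines is to filter them out first. A sketch using grep for the filtering; the sample numbers are made up, and the shell's $(( )) stands in for bc here, so it handles integers only:

```shell
# Drop empty or whitespace-only lines, then join what is left with "+".
expr_text=$(printf '2\n3\n5\n\n\n' | grep -v '^[[:space:]]*$' | paste -sd+ -)
echo "$expr_text"         # prints 2+3+5
echo $(( $expr_text ))    # prints 10
```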
