bash: read multiple columns from a CSV with a hash key
【Posted】: 2022-01-07 21:24:55
【Question】: I am trying to read a CSV file like the one below vertically (column by column) in order to insert the data into a Graphite/Carbon DB.
"No.","time","00:00:00","00:00:01","00:00:02","00:00:03","00:00:04","00:00:05","00:00:06","00:00:07","00:00:08","00:00:09","00:00:0A"
"1","2021/09/12 02:16",235,610,345,997,446,130,129,94,555,274,4
"2","2021/09/12 02:17",364,210,371,341,294,87,179,106,425,262,3
"3","2021/09/12 02:18",297,343,860,216,275,81,73,113,566,274,3
"4","2021/09/12 02:19",305,243,448,262,387,64,63,119,633,249,3
"5","2021/09/12 02:20",276,151,164,263,315,86,92,175,591,291,1
"6","2021/09/12 02:21",264,343,287,542,312,83,72,122,630,273,4
"7","2021/09/12 02:22",373,157,266,446,246,90,173,90,442,273,2
"8","2021/09/12 02:23",265,112,241,307,329,64,71,82,515,260,3
"9","2021/09/12 02:24",285,247,240,372,176,92,67,83,609,620,1
"10","2021/09/12 02:25",289,964,277,476,356,84,74,104,560,294,1
"11","2021/09/12 02:26",279,747,227,573,569,82,77,99,589,229,5
"12","2021/09/12 02:27",338,370,315,439,653,85,165,346,367,281,2
"13","2021/09/12 02:28",269,135,372,262,307,73,86,93,512,283,4
"14","2021/09/12 02:29",281,207,688,322,233,75,69,85,663,276,2
...
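I would like to generate one command per value for each column header 00:00:XX, using the date and time in column 2 and the value recorded at that time: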
echo "perf.$type.$serial.$object.00:00:00.TOTAL_IOPS" "235" "epoch time (2021/09/12 02:16)" | nc "localhost" "2004"
echo "perf.$type.$serial.$object.00:00:00.TOTAL_IOPS" "364" "epoch time (2021/09/12 02:17)" | nc "localhost" "2004"
...
echo "perf.$type.$serial.$object.00:00:01.TOTAL_IOPS" "610" "epoch time (2021/09/12 02:16)" | nc "localhost" "2004"
echo "perf.$type.$serial.$object.00:00:01.TOTAL_IOPS" "210" "epoch time (2021/09/12 02:17)" | nc "localhost" "2004"
.. etc..
I don't know where to start. I tried awk without success:
Trial 1: awk -F "," 'BEGIN{FS=","} NR==1{for(i=1;i<=NF;i++) header[i]=$i} {for(i=1;i<=NF;i++) print header[i]}' file.csv
Trial 2: awk '{time=$2; for(i=3;i<=NF;i++) time=time" "$i; print time}' file.csv
Thank you very much for your help.
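【Comments on the question】:
Please post the relevant expected output. Don't post it as a comment, an image, a table, or a link to an off-site service; use text and include it in your original question. Thanks.
【Answer 1】: In plain bash: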
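#!/bin/bash

{
    # First line: header names, with the double quotes removed
    IFS=',' read -ra header
    header=("${header[@]//\"}")
    nf=${#header[@]}
    row_nr=0
    # Remaining lines: one epoch timestamp per row, values appended per column
    while IFS=',' read -ra flds; do
        datetime[row_nr++]=$(date -d "${flds[1]//\"}" '+%s')
        for ((i = 2; i < nf; ++i)); do
            col[i]+=" ${flds[i]}"
        done
    done
} < file

# Walk the data column by column, then row by row
for ((i = 2; i < nf; ++i)); do
    v=(${col[i]})    # unquoted on purpose: split the column string into an array
    for ((j = 0; j < row_nr; ++j)); do
        printf 'echo "perf.$type.$serial.$object.%s.TOTAL_IOPS" "%s" "epoch time (%s)" | nc "localhost" "2004"\n' \
               "${header[i]}" "${v[j]}" "${datetime[j]}"
    done
done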
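The `date -d ... '+%s'` call in the loop above is the step that turns the second CSV column into an epoch timestamp. Here is a minimal sketch of that step in isolation, assuming GNU `date` (BSD/macOS `date` does not accept `-d` in this form). The helper name `to_epoch` is hypothetical, and `TZ=UTC` is set only to make the result reproducible:

```shell
#!/bin/bash
# to_epoch: hypothetical helper, not part of the answer above.
# Strips the double quotes from a CSV field and converts the remaining
# "YYYY/MM/DD HH:MM" string to seconds since the epoch.
# Assumes GNU date; TZ=UTC pins the time zone so the result is stable.
to_epoch() {
    local field=${1//\"/}
    TZ=UTC date -d "$field" '+%s'
}

to_epoch '"2021/09/12 02:16"'    # prints 1631412960 (string read as UTC)
```

Without `TZ=UTC`, `date` interprets the string in the local time zone, which is how the script above behaves.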
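【Comments】:
Thank you very much, it works. But how can I convert a date such as "2021/09/12 02:25" to an epoch value, as in date -d '2021/09/12 02:16' +"%s"?
@Indi59 I have edited the answer to reflect this requirement.
I just saw it. It's perfect and it works. Thank you very much, Nejat.
【Answer 2】: Would you please try the following: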
awk -F, '
NR==1 {                                 # process the header line
    for (i = 3; i <= NF; i++) {
        gsub(/"/, "", $i)               # remove double quotes
        tt[i-2] = $i                    # assign time array
    }
    next
}
{                                       # process the body
    gsub(/"/, "", $0)
    dt[NR - 1] = $2                     # assign datetime array
    for (i = 3; i <= NF; i++) {
        key[NR-1, i-2] = $i             # assign key values
    }
}
END {
    for (i = 1; i <= NF - 2; i++) {
        for (j = 1; j <= NR - 1; j++) {
            printf "echo \"perf.$type.$serial.$object.%s.TOTAL_IOPS\" \"%d\" \"epoch time (%s)\" | nc \"localhost\" \"2004\"\n", tt[i], key[j, i], dt[j]
        }
    }
}
' file.csv
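The awk script above prints the raw date string from column 2. As a rough sketch of how the same column-by-column traversal could emit lines with epoch timestamps instead, the function below pairs awk with GNU `date`. The function name `csv_to_metrics`, its `prefix` argument, the `perf.demo` prefix and the `metric value epoch` line layout are assumptions for illustration. Nothing is sent to `nc`; the lines are only printed.

```shell
#!/bin/bash
# csv_to_metrics: hypothetical helper. Reads the CSV on stdin and prints
# one "<prefix>.<header>.TOTAL_IOPS <value> <epoch>" line per cell,
# column by column. Assumes GNU date and no commas inside fields.
# TZ=UTC is an assumption made so the output is reproducible.
csv_to_metrics() {
    local prefix=$1 name value stamp
    awk -F, '
    { gsub(/"/, "") }                       # drop double quotes on every line
    NR == 1 { for (i = 3; i <= NF; i++) head[i] = $i; nf = NF; next }
    { dt[NR] = $2; for (i = 3; i <= NF; i++) cell[NR, i] = $i }
    END {
        for (i = 3; i <= nf; i++)           # outer loop: columns
            for (r = 2; r <= NR; r++)       # inner loop: data rows
                print head[i] "|" cell[r, i] "|" dt[r]
    }' |
    while IFS='|' read -r name value stamp; do
        printf '%s.%s.TOTAL_IOPS %s %s\n' \
            "$prefix" "$name" "$value" "$(TZ=UTC date -d "$stamp" '+%s')"
    done
}

# Usage with a two-row, two-column excerpt of the sample data:
printf '%s\n' \
    '"No.","time","00:00:00","00:00:01"' \
    '"1","2021/09/12 02:16",235,610' \
    '"2","2021/09/12 02:17",364,210' | csv_to_metrics perf.demo
# perf.demo.00:00:00.TOTAL_IOPS 235 1631412960
# perf.demo.00:00:00.TOTAL_IOPS 364 1631413020
# perf.demo.00:00:01.TOTAL_IOPS 610 1631412960
# perf.demo.00:00:01.TOTAL_IOPS 210 1631413020
```

Storing the column count in `nf` while reading the header avoids relying on `NF` inside `END`, where it reflects the last line read.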
【Comments】:
Thank you very much, it works too. I cannot mark your answer as accepted as well, but the awk code also helps me!
I'm saving this code in my library. Thank you very much, Tshiono!