Export from a while loop to a CSV file
【Posted】:2022-01-23 11:18:53
【Question】:Given the following script and dataset:
Script:
while IFS=","
read v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12 v13;
do if [ -z "$v12" ];
then echo "$v1,$v2,$v3,$v4,$v5,$v6,$v7,$v8,$v9,$v10,$v11,'unknown',$v13";
else echo "$v1, $v2,$v3,$v4,$v5,$v6,$v7,$v8,$v9,$v10,$v11,$v12,$v13";
fi;
done
>train3.csv
Dataset:
PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C
3,1,3,"Heikkinen, Miss. Laina",female,26,0,0,STON/O2. 3101282,7.925,,S
4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35,1,0,113803,53.1,C123,S
5,0,3,"Allen, Mr. William Henry",male,35,0,0,373450,8.05,,S
6,0,3,"Moran, Mr. James",male,,0,0,330877,8.4583,,Q
7,0,1,"McCarthy, Mr. Timothy J",male,54,0,0,17463,51.8625,E46,S
8,0,3,"Palsson, Master. Gosta Leonard",male,2,3,1,349909,21.075,,S
9,1,3,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27,0,2,347742,11.1333,,S
10,1,2,"Nasser, Mrs. Nicholas (Adele Achem)",female,14,1,0,237736,30.0708,,C
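I want to export the result to a CSV file named "train3.csv", but my approach doesn't work: it neither shows the changes nor saves a CSV file.
How can I fix this?
The expected result is: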
PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,'unknown',S
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C
3,1,3,"Heikkinen, Miss. Laina",female,26,0,0,STON/O2. 3101282,7.925,'unknown',S
4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35,1,0,113803,53.1,C123,S
5,0,3,"Allen, Mr. William Henry",male,35,0,0,373450,8.05,'unknown',S
6,0,3,"Moran, Mr. James",male,,0,0,330877,8.4583,'unknown',Q
7,0,1,"McCarthy, Mr. Timothy J",male,54,0,0,17463,51.8625,E46,S
8,0,3,"Palsson, Master. Gosta Leonard",male,2,3,1,349909,21.075,'unknown',S
9,1,3,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27,0,2,347742,11.1333,'unknown',S
10,1,2,"Nasser, Mrs. Nicholas (Adele Achem)",female,14,1,0,237736,30.0708,'unknown',C
This includes creating the new CSV file.
Thanks.
【Question comments】:
【Answer 1】:A slightly modified version of your code:
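#!/bin/bash
datafile='dataset.txt'
outputfile='train3.csv'
# The quoted Name holds one comma, so it fills v4 and v5 and Cabin lands in v12.
# The 12-column header line leaves v13 empty, so it is written with a trailing comma.
while IFS="," read -r v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12 v13
do
    if [[ -z "$v12" ]]
    then
        echo "$v1,$v2,$v3,$v4,$v5,$v6,$v7,$v8,$v9,$v10,$v11,'unknown',$v13"
    else
        echo "$v1,$v2,$v3,$v4,$v5,$v6,$v7,$v8,$v9,$v10,$v11,$v12,$v13"
    fi
# A single redirection after done opens (and truncates) the output file once.
done < "$datafile" >"$outputfile"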
A good reference on reading data from a file is https://mywiki.wooledge.org/BashFAQ/001
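The placement of the redirection is what decides whether the file receives any data. The sketch below is only an illustration with a throwaway loop; `scratch` and the two file names are hypothetical and not part of the original script.

```shell
#!/bin/bash
# scratch is a hypothetical temporary directory used only for this demonstration.
scratch=$(mktemp -d)

# As in the question: the redirection sits on its own line, so it is a
# separate command that only creates an empty file; the loop prints to stdout.
for i in 1 2 3; do
    echo "row $i"
done
>"$scratch/separate.csv"

# As in the answer: the redirection is attached to done, so the file is
# opened once and receives everything the loop prints.
for i in 1 2 3; do
    echo "row $i"
done >"$scratch/attached.csv"

wc -l <"$scratch/separate.csv"   # 0 lines
wc -l <"$scratch/attached.csv"   # 3 lines
```

This matches the symptom described in the question: the loop output appears on the terminal while train3.csv stays empty.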
【Comments】:
It would be better to move >>"$outputfile" outside the loop (fewer file opens and closes), right after done.
Indeed, thanks, edited.
【Answer 2】:
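Don't use Bash for this. Your input CSV contains quoted strings, and you probably have no guarantee that each quoted string contains exactly one comma. If one contains fewer or more commas, your code breaks.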
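To make the problem concrete, here is a small sketch that counts the pieces a naive comma split produces. `count_naive_fields` is a hypothetical helper, and the second input row is made up for illustration; it does not come from the dataset.

```shell
#!/bin/bash
# count_naive_fields is a hypothetical helper for this demonstration.
# awk -F, splits on every comma, quoted or not, just as IFS="," read does here.
count_naive_fields() {
    printf '%s\n' "$1" | awk -F, '{ print NF }'
}

# Row from the dataset: 12 columns, but the quoted Name adds one comma.
count_naive_fields '1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S'   # prints 13

# Made-up row whose Name has no comma: Cabin would land in v11, not v12.
count_naive_fields '11,0,3,"Name Without Comma",male,30,0,0,12345,9.5,,S'   # prints 12
```

Because the piece count depends on the data, a fixed variable such as v12 cannot be relied on to hold the Cabin column.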
Use a dedicated tool that handles quoted strings correctly instead. The easiest one to use is Perl with the DBD::CSV module. The following command installs it on Debian.
sudo apt-get install libdbd-csv-perl
Now you can use SQL to fix your CSV file.
#! /usr/bin/perl
use DBI;
$dbh = DBI->connect ("dbi:CSV:")
or die "Cannot connect: $DBI::errstr";
my $sth = $dbh->prepare ("UPDATE train3.csv SET cabin = ? WHERE cabin is null");
$sth->execute ("'unknown'");
$sth->finish;
$dbh->disconnect;
If you don't want to learn Perl, you can use the script as a ready-made command-line program. Save it as csv.pl and make it executable:
#! /usr/bin/perl
use DBI;
$dbh = DBI->connect ("dbi:CSV:")
or die "Cannot connect: $DBI::errstr";
my $sth = $dbh->prepare (shift);
$sth->execute (@ARGV);
$sth->finish;
$dbh->disconnect;
Then you can simply pass the query and its parameters:
./csv.pl 'UPDATE train3.csv SET cabin = ? WHERE cabin is null' \'unknown\'
Mind the quotes.
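The quoting in that command can be checked without Perl. In this sketch, `show_args` is a hypothetical helper that prints each argument the shell actually passes, one per line and wrapped in brackets.

```shell
#!/bin/bash
# show_args is a hypothetical helper: it prints every argument it receives in brackets.
show_args() {
    local arg
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# The query is single-quoted, so the shell passes it through as one argument.
# \'unknown\' escapes the single quotes, so they reach the program as literal characters.
show_args 'UPDATE train3.csv SET cabin = ? WHERE cabin is null' \'unknown\'
```

The second argument arrives as 'unknown' including the quote characters, which is the value the expected output shows in the Cabin column.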
【Comments】:
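【Answer 3】:Your script doesn't work because read doesn't know where to read from, and the redirection belongs after done.
I also improved the script with parameter expansion: ${parameter:-word} uses word when parameter is unset or empty.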
while IFS="," read -r v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12 v13; do
    echo "$v1,$v2,$v3,$v4,$v5,$v6,$v7,$v8,$v9,$v10,$v11,${v12:-'unknown'},$v13"
done <dataset.csv >train3.csv
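The expansion can be tried in isolation. This is a minimal sketch; `fill_blank` is a hypothetical helper that is not part of the answer.

```shell
#!/bin/bash
# fill_blank is a hypothetical helper: it prints its argument, or 'unknown'
# (including the single quotes) when the argument is unset or empty.
fill_blank() {
    local value=$1
    # Inside double quotes the single quotes are literal characters.
    echo "${value:-'unknown'}"
}

fill_blank ''      # empty Cabin     -> 'unknown'
fill_blank 'C85'   # populated Cabin -> C85
```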
You can avoid the while loop by using another tool:
awk -F, -v unknown="'unknown'" 'BEGIN {OFS=","} !$12 {$12=unknown} 1' < dataset.csv >train3.csv
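Both solutions are confused by the comma inside the quoted Name field (which is why field 12 rather than 11 is changed). If a name contains no comma, the wrong field is checked.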
When you know that Embarked is a field without commas, you can use
awk -F, -v unknown="'unknown'" '
BEGIN {OFS=","}
!$(NF-1) {$(NF-1)=unknown}
1' < dataset.csv >train3.csv
However, you should use a solution that really understands the CSV format, such as @ceving's answer.
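As a quick check of the $(NF-1) variant, the sketch below wraps it in a hypothetical function, `fill_cabin_awk`, and feeds it one row from the dataset plus one made-up row whose Name has no comma. It still assumes that Embarked, the last column, never contains a comma.

```shell
#!/bin/bash
# fill_cabin_awk is a hypothetical wrapper around the awk command above.
# It counts fields from the end, so commas inside Name do not shift the target.
fill_cabin_awk() {
    awk -F, -v unknown="'unknown'" '
    BEGIN {OFS=","}
    !$(NF-1) {$(NF-1)=unknown}
    1'
}

# First row is from the dataset; the second row is invented for the test.
printf '%s\n' \
    '1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S' \
    '11,0,3,"Name Without Comma",male,30,0,0,12345,9.5,,S' |
    fill_cabin_awk
```

Both rows come out with 'unknown' in the second-to-last position, while rows that already have a Cabin value pass through unchanged.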
【Comments】:
Looks good, but I need to run the script as ./script.sh < dataset.csv.
You can edit your question and explain how you invoke the script.
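One possible way to support that invocation is to let the loop read standard input instead of a named file. The following is only a sketch under the same assumption as the loops above (exactly one comma inside the quoted Name field); `fill_cabin` is a hypothetical function name, and the header pass-through is an addition that none of the answers include.

```shell
#!/bin/bash
# fill_cabin is a hypothetical name. It reads CSV from standard input and
# writes to standard output, so the caller chooses the source and the target.
# Assumption: every data row has exactly one comma inside the quoted Name field.
fill_cabin() {
    local header v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12 v13
    # Pass the 12-column header through unchanged.
    IFS= read -r header && printf '%s\n' "$header"
    while IFS=',' read -r v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12 v13; do
        printf '%s\n' "$v1,$v2,$v3,$v4,$v5,$v6,$v7,$v8,$v9,$v10,$v11,${v12:-'unknown'},$v13"
    done
}

# Demonstration on two lines taken from the dataset in the question.
fill_cabin <<'EOF'
PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S
EOF
```

In a script.sh that ends with the line fill_cabin >train3.csv instead of the here-document, running ./script.sh < dataset.csv would feed the dataset to the loop and write the result to train3.csv.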