sqlldr is failing due to more than 2 double quotes in a column
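Posted: 2019-03-19 13:48:56
Question: sqlldr fails when I put more than two double quotes in the NAME column (see the code below). In the data.txt file, the first 3 records fail because the NAME column (the second position in the record) has more than 2 double quotes. Please assist. I think I need to put some function into the control file for the NAME column.
You can run the code below to check: the first 3 records fail and the last record is inserted. I want to insert all records, with or without double quotes: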
--table creation
create table employee
(
name varchar2(4000)
);
--control file
load data
infile '/tmp/swetabh/data.txt'
into table employee
truncate
fields terminated by "," OPTIONALLY ENCLOSED by '"' TRAILING NULLCOLS
(
name
)
--data.txt (SEE ALL RECORDS ARE HAVING MORE THAN 2 DOUBLE QUOTE AND WANT TO INSERT DATA TO THE TABLE WITH OR WITHOUT DOUBLE QUOTE)
""start with double quote "
"end with double quote ""
"double quote " in the middle "
-bash-4.1$ sqlldr userid=xxx/xxx@xxxx control=/tmp/swetabh/control.txt log=/tmp/swetabh/control.log bad=/tmp/swetabh/bad.txt readsize=2000000000 bindsize=2000000000
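Comments:
There is some extended discussion of this here: ***.com/questions/41431545/…
Answer 1: Do you have any control over the extract part of the process? If so, can you have the double quotes inside the data doubled (escaped)? Or better yet, have the extract use a pipe as the delimiter and not use double quotes at all.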
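A minimal sketch of the first suggestion in Answer 1, assuming the extract can be changed so that every double quote inside a value is written twice. When a field is enclosed, SQL*Loader reads two adjacent enclosure characters as one literal character, so the original control file can keep its OPTIONALLY ENCLOSED clause. The data lines below are a hypothetical re-extract of the three sample records, not output from the asker's system, and `infile *` with `begindata` is used only to keep the example in one file.

```sql
-- Sketch only: assumes the extract doubles each embedded double quote.
load data
infile *
truncate
into table employee
fields terminated by "," OPTIONALLY ENCLOSED by '"' TRAILING NULLCOLS
(
name
)
begindata
"""start with double quote "
"end with double quote """
"double quote "" in the middle "
```

For the second suggestion (pipe delimiter, no enclosing quotes), the extract would write the values without the surrounding quotes and the field clause would shrink to `fields terminated by "|" TRAILING NULLCOLS`. That only works if the data itself can never contain a pipe character.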
Answer 2: How about applying the REPLACE function to replace double quotes (CHR(34)) with nothing (NULL)? Also, remove the OPTIONALLY ENCLOSED clause.
Control file:
load data
infile *
replace
into table employee
fields terminated by "," TRAILING NULLCOLS
(
name "replace(:name, chr(34), null)"
)
begindata
""start with double quote "
"end with double quote ""
"double quote " in the middle "
Test:
SQL> $sqlldr scott/tiger control=test04.ctl log=test04.log
SQL*Loader: Release 11.2.0.2.0 - Production on Uto Ožu 19 19:14:49 2019
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Commit point reached - logical record count 2
Commit point reached - logical record count 3
SQL> select * From employee;
NAME
--------------------------------------------------
start with double quote
end with double quote
double quote in the middle
SQL>
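Comments:
We have a comma as the terminator, and the strings may contain commas, so they are enclosed in double quotes. I have multiple columns in the file, and I cannot change the terminator. The problem occurs when there are more than two double quotes after a comma. Please copy the following data into your text editor:
"1","2col","3col " " "2","2col"," 3c,ol " " "3","2col"," 3co,l" " "4","2col","3co,l" "5","2col",""3co,l "" ""6","2col",""3c,ol ""3c,ol"""
create table employee(id number, cmets varchar2(4000), name varchar2(4000));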