Side effects of sqlldr direct path loading (direct=true)

Posted P10ZHUO


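The official explanation of direct=true

This continues from the previous article, "Best practices for loading a 19c PDB with sqlldr". Its final part, on speeding up the load, listed direct path loading (direct=true) as one option. A direct path load writes all records directly and is faster, but it has a side effect: it can leave indexes unusable. My Oracle Support (MOS) documents this in Common Maintenance Commands That Cause Indexes to Become Unusable (Doc ID 165917.1):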

Six types of maintenance operations can mark index partitions INDEX UNUSABLE. In all cases, you must rebuild the index partitions when the operation is complete.
1) Operations like Import Partition or conventional path SQL*Loader that offer an option to bypass local index maintenance. When the Import is complete, the affected local index partitions are marked IU.

2) Direct path SQL*Loader leaves affected local index partitions and global indexes in an IU state if the index is out of date with respect to the data that it indexes. The index can be out of date for the following reasons:

 a) The index was not maintained during the load due to a space management error (for example, out of extents ORA-1653 or ORA-1652).

b) The user requested the SKIP_INDEX_MAINTENANCE clause.

3) Partition maintenance operations like ALTER TABLE MOVE PARTITION that change rowids. These operations mark the affected local index partition and all global index partitions IU.

4) Partition maintenance operations like ALTER TABLE TRUNCATE PARTITION or DROP PARTITION that remove rows from the table. These operations mark the affected local index partition and all global index partitions IU.

5) Partition maintenance operations like ALTER TABLE SPLIT PARTITION that modify the partition definition of local indexes but do not automatically rebuild the index data to match the new definitions. These operations mark the affected local index partitions IU. ALTER TABLE SPLIT PARTITION also marks all global index partitions IU because it results in changes to rowids. 

6) Index maintenance operations like ALTER INDEX SPLIT PARTITION that modify the partitioning definition of the index but do not automatically rebuild the affected partitions. These operations mark the affected index partitions IU. However, if you split a USABLE partition of a global index, resulting partitions are created USABLE. If the partition that was split was marked IU, then so are the partitions resulting from the split. Note that dropping a partition of a global index that is either IU or is not empty causes the next partition of the index to become IU.

The tests below verify this.

sqlldr index invalidation tests

Create the test environment

SQL> conn zhuo/zhuo@orclpdb
Connected.
SQL> create table t2(id int);

Table created.
SQL> declare
  2  begin
  3  for i in 1..100 loop
  4  insert into t2(id) values(i);
  5  end loop;
  6  commit;
  7  end;
  8  /

PL/SQL procedure successfully completed.

SQL> select count(*) from t2;

  COUNT(*)
----------
       100

Contents of the file to be loaded:

[oracle@oracle12c ~]$ more t2.txt 
id
1
2
3 
4 
......
99
100
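The listing above elides most of the file. As a minimal sketch, assuming the file is simply a header line followed by the integers 1 to 100, it could be regenerated like this. The function name `gen_datafile` and the default output path are illustrative and not part of the original test.

```shell
#!/bin/sh
# Write a header line "id" followed by the integers 1..100, one per line.
# The function name and the default path ./t2.txt are assumptions for
# illustration; the article keeps the file at /home/oracle/t2.txt.
gen_datafile() {
    awk 'BEGIN { print "id"; for (i = 1; i <= 100; i++) print i }' > "$1"
}

gen_datafile "${DATAFILE:-./t2.txt}"
```

Set `DATAFILE=/home/oracle/t2.txt` before running it to write to the path the control files below expect.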

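In the control file, the load method can be INSERT, TRUNCATE, REPLACE or APPEND. We use APPEND. TRUNCATE and REPLACE delete the existing rows first, which amounts to loading into an empty table, and INSERT requires the table to be empty to begin with, so none of them can test a load into a table that already holds data.

direct=false && ordinary index

Create an ordinary (non-unique) index: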

SQL> create index idx_id on t2(id);

Index created.

direct=false is the default.
Create the control file:

OPTIONS (skip=1)     -- the first line of the file is the header "id", so skip it
LOAD DATA
CHARACTERSET 'ZHS16GBK'
INFILE '/home/oracle/t2.txt'
APPEND INTO TABLE zhuo.t2
FIELDS TERMINATED BY '&' OPTIONALLY ENCLOSED BY '"'
trailing nullcols
(id
)

Load:

[oracle@oracle12c ~]$ sqlldr userid=system/oracle@orclpdb control=/home/oracle/t2.ctl log=/home/oracle/t2.log

SQL*Loader: Release 12.2.0.1.0 - Production on Fri Sep 3 11:08:25 2021

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

ORA-28002: the password will expire within 6 days
Path used:      Conventional
Commit point reached - logical record count 64
Commit point reached - logical record count 101

Table ZHUO.T2:
  100 Rows successfully loaded.

Check the log file:
  /home/oracle/t2.log
for more information about the load.

Verify:

SQL> select count(*) from t2;

  COUNT(*)
----------
       200
SQL> select index_name,index_type,status from user_indexes where index_name='IDX_ID';
INDEX_NAME INDEX_TYPE STATUS
---------- ---------- --------
IDX_ID     NORMAL     VALID

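With direct=false and an ordinary index, loading duplicate values has no effect on the index.

direct=true && ordinary index

Create the control file:

OPTIONS (skip=1,direct=true)    -- with this, direct=true need not be given on the command line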
LOAD DATA
CHARACTERSET 'ZHS16GBK' 
INFILE '/home/oracle/t2.txt'
APPEND INTO TABLE zhuo.t2
FIELDS TERMINATED BY '&' OPTIONALLY ENCLOSED BY '"'
trailing nullcols
(id
)

Run the load:

[oracle@oracle12c ~]$ sqlldr userid=system/oracle@orclpdb control=/home/oracle/t2.ctl log=/home/oracle/t2.log

SQL*Loader: Release 12.2.0.1.0 - Production on Fri Sep 3 11:16:27 2021

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

ORA-28002: the password will expire within 6 days
Path used:      Direct

Load completed - logical record count 100.

Table ZHUO.T2:
  100 Rows successfully loaded.

Check the log file:
  /home/oracle/t2.log
for more information about the load.

Verify:

SQL> /

INDEX_NAME INDEX_TYPE STATUS
---------- ---------- --------
IDX_ID     NORMAL     VALID

SQL> select count(*) from t2;

  COUNT(*)
----------
       300

With direct=true and an ordinary index, loading duplicate values has no effect on the index.

direct=false && unique index

SQL> truncate table t2;

Table truncated.

SQL>  create unique index idx_id on t2(id);

Index created.

SQL> declare
  2  begin
  3  for i in 1..100 loop
  4  insert into t2(id) values(i);
  5  end loop;
  6  commit;
  7  end;
  8  /

PL/SQL procedure successfully completed.

The file to be loaded is unchanged.
direct=false is the default.
Create the control file:

OPTIONS (skip=1)
LOAD DATA
CHARACTERSET 'ZHS16GBK' 
INFILE '/home/oracle/t2.txt'
APPEND INTO TABLE zhuo.t2
FIELDS TERMINATED BY '&' OPTIONALLY ENCLOSED BY '"'
trailing nullcols
(id
)

Run the load:

[oracle@oracle12c ~]$ sqlldr userid=system/oracle@orclpdb control=/home/oracle/t2.ctl log=/home/oracle/t2.log

SQL*Loader: Release 12.2.0.1.0 - Production on Fri Sep 3 11:23:02 2021

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

ORA-28002: the password will expire within 6 days
Path used:      Conventional
Commit point reached - logical record count 64

Table ZHUO.T2:
  0 Rows successfully loaded.

Check the log file:
  /home/oracle/t2.log
for more information about the load.

0 rows were inserted.
Check the log:

Record 1: Rejected - Error on table ZHUO.T2.
ORA-00001: unique constraint (ZHUO.IDX_ID) violated

Record 2: Rejected - Error on table ZHUO.T2.
ORA-00001: unique constraint (ZHUO.IDX_ID) violated

......
Record 51: Rejected - Error on table ZHUO.T2.
ORA-00001: unique constraint (ZHUO.IDX_ID) violated


MAXIMUM ERROR COUNT EXCEEDED - Above statistics reflect partial run.

Table ZHUO.T2:
  0 Rows successfully loaded.
  51 Rows not loaded due to data errors.
  0 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.


Space allocated for bind array:                  16512 bytes(64 rows)
Read   buffer bytes: 1048576

Total logical records skipped:          1
Total logical records read:            64
Total logical records rejected:        51
Total logical records discarded:        0

Run began on Fri Sep 03 11:23:02 2021
Run ended on Fri Sep 03 11:23:02 2021

Elapsed time was:     00:00:00.07
CPU time was:         00:00:00.02

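Every entry is a unique index violation: all the values being loaded already exist in the table, so every insert is rejected. The run stops at the 51st rejection because the default error limit (errors) is 50.
Verify: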

SQL> select index_name,index_type,status from user_indexes where index_name='IDX_ID';

INDEX_NAME INDEX_TYPE STATUS
---------- ---------- --------
IDX_ID     NORMAL     VALID

SQL> select count(*) from t2;

  COUNT(*)
----------
       100

With direct=false and a unique index, loading duplicate values leaves the index unaffected, and no rows are loaded.

direct=true && unique index

Create the control file:
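When such loads run unattended, the two facts read off the log above (the load was aborted, and how many records were rejected) can be extracted mechanically. The following is a minimal sketch that relies only on the log lines shown above; the function names `load_aborted` and `rejected_count` are made up for illustration.

```shell
#!/bin/sh
# Succeeds (exit status 0) if the log shows that the load stopped because
# the error limit was exceeded.
load_aborted() {
    grep -q 'MAXIMUM ERROR COUNT EXCEEDED' "$1"
}

# Prints the number at the end of the "Total logical records rejected:" line.
rejected_count() {
    awk '/^Total logical records rejected:/ { print $NF }' "$1"
}

# Optional command-line use: pass the path of a sqlldr log file.
if [ $# -gt 0 ]; then
    if load_aborted "$1"; then
        echo "load aborted, rejected records: $(rejected_count "$1")"
    else
        echo "rejected records: $(rejected_count "$1")"
    fi
fi
```

For the log in this test it would report an aborted load with 51 rejected records.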

OPTIONS (skip=1,direct=true)
LOAD DATA
CHARACTERSET 'ZHS16GBK' 
INFILE '/home/oracle/t2.txt'
APPEND INTO TABLE zhuo.t2
FIELDS TERMINATED BY '&' OPTIONALLY ENCLOSED BY '"'
trailing nullcols
(id
)

Run the load:

[oracle@oracle12c ~]$ sqlldr userid=system/oracle@orclpdb control=/home/oracle/t2.ctl log=/home/oracle/t2.log

SQL*Loader: Release 12.2.0.1.0 - Production on Fri Sep 3 11:27:59 2021

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

ORA-28002: the password will expire within 6 days
Path used:      Direct

Load completed - logical record count 100.

Table ZHUO.T2:
  100 Rows successfully loaded.

Check the log file:
  /home/oracle/t2.log
for more information about the load.

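The load succeeds and all 100 rows are inserted. The values are duplicates, yet with a unique index in place they are still inserted.
Check the log: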

Control File:   /home/oracle/t2.ctl
Character Set ZHS16GBK specified for all input.

Data File:      /home/oracle/t2.txt
  Bad File:     /home/oracle/t2.bad
  Discard File:  none specified
 
 (Allow all discards)

Number to load: ALL
Number to skip: 1
Errors allowed: 50
Continuation:    none specified
Path used:      Direct

Table ZHUO.T2, loaded from every logical record.
Insert option in effect for this table: APPEND
TRAILING NULLCOLS option in effect

   Column Name                  Position   Len  Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
ID                                  FIRST     *   &  O(") CHARACTER            

The following index(es) on table ZHUO.T2 were processed:
index ZHUO.IDX_ID was made unusable due to:
ORA-00001: unique constraint (ZHUO.IDX_ID) violated

Table ZHUO.T2:
  100 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  0 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.

Bind array size not used in direct path.
Column array  rows :    5000
Stream buffer bytes:  256000
Read   buffer bytes: 1048576

Total logical records skipped:          1
Total logical records read:           100
Total logical records rejected:         0
Total logical records discarded:        0
Direct path multithreading optimization is disabled

Run began on Fri Sep 03 11:27:59 2021
Run ended on Fri Sep 03 11:27:59 2021

Elapsed time was:     00:00:00.11
CPU time was:         00:00:00.04

Key lines in the log:
The following index(es) on table ZHUO.T2 were processed:
index ZHUO.IDX_ID was made unusable due to:
ORA-00001: unique constraint (ZHUO.IDX_ID) violated
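In a direct path load the rows are written first and the index is maintained afterwards. When that maintenance hits the ORA-00001 violation, the index is marked unusable and the duplicate rows stay in the table.
Verify: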

SQL> select index_name,index_type,status from user_indexes where index_name='IDX_ID';

INDEX_NAME INDEX_TYPE STATUS
---------- ---------- --------
IDX_ID     NORMAL     UNUSABLE

SQL> select count(*) from t2;

  COUNT(*)
----------
       200

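With direct=true and a unique index, loading duplicate values leaves the index UNUSABLE, and all the duplicate rows are inserted.
Fix (the duplicate rows have to be removed before the unique index can be rebuilt; here the table is simply truncated):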

SQL> truncate table t2;

Table truncated.

SQL> alter index idx_id rebuild;

Index altered.
SQL> select index_name,index_type,status from user_indexes where index_name='IDX_ID';

INDEX_NAME INDEX_TYPE STATUS
---------- ---------- --------
IDX_ID     NORMAL     VALID
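If several indexes are affected, the rebuild statements can be derived from the sqlldr log rather than typed by hand. The sketch below assumes the log uses the "index ... was made unusable due to:" wording shown above; the function name `rebuild_statements` is hypothetical. For a unique index the duplicate rows still have to be removed first, otherwise the rebuild fails with ORA-01452 (duplicate keys found).

```shell
#!/bin/sh
# Print one "ALTER INDEX <owner>.<name> REBUILD;" statement for every index
# that the given SQL*Loader log reports as unusable. It matches lines like:
#   index ZHUO.IDX_ID was made unusable due to:
rebuild_statements() {
    sed -n 's/^index \([^ ]*\) was made unusable due to:.*$/ALTER INDEX \1 REBUILD;/p' "$1"
}

# Optional command-line use: pass the path of a sqlldr log file.
if [ $# -gt 0 ]; then
    rebuild_statements "$1"
fi
```

The output is meant to be reviewed and then run in SQL*Plus; the script itself never connects to the database.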

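Summary:

When a direct path load is used on a table that has a primary key or unique constraint, and the loaded data violates that uniqueness, SQL*Loader marks the related index unusable and carries on loading.
After the load finishes, the index is not made valid again automatically; the DBA has to intervene manually.

MOS also mentions another parameter, skip_index_maintenance: "do not maintain indexes, mark affected indexes as unusable (Default FALSE)".
It applies only when direct=true.

direct=true && ordinary index && skip_index_maintenance=true

Initialize the environment: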

SQL> drop index idx_id;

Index dropped.

SQL> create index idx_id on t2(id);

Index created.

SQL> select count(*) from t2;

  COUNT(*)
----------
       100

SQL> select index_name,index_type,status from user_indexes where index_name='IDX_ID';

INDEX_NAME INDEX_TYPE STATUS
---------- ---------- --------
IDX_ID     NORMAL     VALID

Control file:

OPTIONS (skip=1,direct=true,skip_index_maintenance=true)
LOAD DATA
CHARACTERSET 'ZHS16GBK' 
INFILE '/home/oracle/t2.txt'
APPEND INTO TABLE zhuo.t2
FIELDS TERMINATED BY '&' OPTIONALLY ENCLOSED BY '"'
trailing nullcols
(id
)

Load the data:

[oracle@oracle12c ~]$ sqlldr userid=system/oracle@orclpdb control=/home/oracle/t2.ctl log=/home/oracle/t2.log

SQL*Loader: Release 12.2.0.1.0 - Production on Fri Sep 3 11:49:01 2021

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

ORA-28002: the password will expire within 6 days
Path used:      Direct

Load completed - logical record count 100.

Table ZHUO.T2:
  100 Rows successfully loaded.

Check the log file:
  /home/oracle/t2.log
for more information about the load.

All 100 rows were loaded.
Check the log:

[oracle@oracle12c ~]$ more t2.log 

SQL*Loader: Release 12.2.0.1.0 - Production on Fri Sep 3 11:49:01 2021

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.


Control File:   /home/oracle/t2.ctl
Character Set ZHS16GBK specified for all input.

Data File:      /home/oracle/t2.txt
  Bad File:     /home/oracle/t2.bad
  Discard File:  none specified
 
 (Allow all discards)

Number to load: ALL
Number to skip: 1
Errors allowed: 50
Continuation:    none specified
Path used:      Direct

Table ZHUO.T2, loaded from every logical record.
Insert option in effect for this table: APPEND
TRAILING NULLCOLS option in effect

   Column Name                  Position   Len  Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
ID                                  FIRST     *   &  O(") CHARACTER            

The following index(es) on table ZHUO.T2 were processed:
index ZHUO.IDX_ID was made unusable due to:
SKIP_INDEX_MAINTENANCE option requested

Table ZHUO.T2:
  100 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  0 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.

Bind array size not used in direct path.
Column array  rows :    5000
Stream buffer bytes:  256000
Read   buffer bytes: 1048576

Total logical records skipped:          1
Total logical records read:           100
Total logical records rejected:         0
Total logical records discarded:        0
Direct path multithreading optimization is disabled

Run began on Fri Sep 03 11:49:01 2021
Run ended on Fri Sep 03 11:49:01 2021

Elapsed time was:     00:00:00.07
CPU time was:         00:00:00.02

Key lines in the log:
The following index(es) on table ZHUO.T2 were processed:
index ZHUO.IDX_ID was made unusable due to:
SKIP_INDEX_MAINTENANCE option requested
The log is explicit: the index became unusable because this parameter was used.
Verify:

SQL> /

INDEX_NAME INDEX_TYPE STATUS
---------- ---------- --------
IDX_ID     NORMAL     UNUSABLE

SQL> select count(*) from t2;

  COUNT(*)
----------
       200

direct=true && unique index && skip_index_maintenance=true

Initialize the environment:

SQL> drop index idx_id;

Index dropped.

SQL> truncate table t2;

Table truncated.

SQL> create unique index idx_id on t2(id);

Index created.

SQL> declare
  2  begin
  3  for i in 1..100 loop
  4  insert into t2(id) values(i);
  5  end loop;
  6  commit;
  7  end;
  8  /

PL/SQL procedure successfully completed.

The control file is the same as above:

OPTIONS (skip=1,direct=true,skip_index_maintenance=true)
LOAD DATA
CHARACTERSET 'ZHS16GBK' 
INFILE '/home/oracle/t2.txt'
APPEND INTO TABLE zhuo.t2
FIELDS TERMINATED BY '&' OPTIONALLY ENCLOSED BY '"'
trailing nullcols
(id
)

Load the data:

[oracle@oracle12c ~]$ sqlldr userid=system/oracle@orclpdb control=/home/oracle/t2.ctl log=/home/oracle/t2.log

SQL*Loader: Release 12.2.0.1.0 - Production on Fri Sep 3 11:56:26 2021

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

ORA-28002: the password will expire within 6 days
Path used:      Direct

Load completed - logical record count 100.

Table ZHUO.T2:
  100 Rows successfully loaded.

Check the log file:
  /home/oracle/t2.log
for more information about the load.

All 100 rows were loaded.
Check the log:

[oracle@oracle12c ~]$ cat t2.log 

SQL*Loader: Release 12.2.0.1.0 - Production on Fri Sep 3 11:56:26 2021

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.


Control File:   /home/oracle/t2.ctl
Character Set ZHS16GBK specified for all input.

Data File:      /home/oracle/t2.txt
  Bad File:     /home/oracle/t2.bad
  Discard File:  none specified
 
 (Allow all discards)

Number to load: ALL
Number to skip: 1
Errors allowed: 50
Continuation:    none specified
Path used:      Direct

Table ZHUO.T2, loaded from every logical record.
Insert option in effect for this table: APPEND
TRAILING NULLCOLS option in effect

   Column Name                  Position   Len  Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
ID                                  FIRST     *   &  O(") CHARACTER            

The following index(es) on table ZHUO.T2 were processed:
index ZHUO.IDX_ID was made unusable due to:
SKIP_INDEX_MAINTENANCE option requested

Table ZHUO.T2:
  100 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  0 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.

Bind array size not used in direct path.
Column array  rows :    5000
Stream buffer bytes:  256000
Read   buffer bytes: 1048576

Total logical records skipped:          1
Total logical records read:           100
Total logical records rejected:         0
Total logical records discarded:        0
Direct path multithreading optimization is disabled

Run began on Fri Sep 03 11:56:26 2021
Run ended on Fri Sep 03 11:56:26 2021

Elapsed time was:     00:00:00.06
CPU time was:         00:00:00.02

Key lines in the log:
The following index(es) on table ZHUO.T2 were processed:
index ZHUO.IDX_ID was made unusable due to:
SKIP_INDEX_MAINTENANCE option requested

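So with a unique index, the index is again made unusable as a side effect of this parameter. Even without the parameter, a unique index under direct=true becomes unusable when the loaded rows duplicate existing keys.
The two causes differ in precedence: under direct=true with SKIP_INDEX_MAINTENANCE, every index becomes unusable regardless of its type, because the parameter means exactly that indexes are not maintained.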

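Overall summary

Two causes of unusable indexes:
1) sqlldr in APPEND mode with direct=true (or parallel), when the loaded rows duplicate existing primary key or unique key values
2) sqlldr with direct=true && skip_index_maintenance=true

Direct path loading is faster, but it can leave indexes unusable. After a load reports success, do not rely on the sqlldr log alone when checking the result; also check the STATUS of the indexes on the target table, unique indexes in particular.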
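The log half of that check is easy to script so that a batch job fails loudly instead of reporting success. This is a minimal sketch built on the log wording seen in the tests above; the function name `report_unusable` is hypothetical. The index STATUS itself still has to be queried from user_indexes, as in the verification steps earlier.

```shell
#!/bin/sh
# Print every index that a SQL*Loader log marks as unusable, together with
# the reason given on the following log line. Returns 1 if at least one
# index was reported and 0 otherwise.
report_unusable() {
    awk '
        want_reason { print name ": " $0; want_reason = 0; next }
        /^index .* was made unusable due to:/ { name = $2; want_reason = 1; found = 1 }
        END { exit (found ? 1 : 0) }
    ' "$1"
}

# Optional command-line use: pass the path of a sqlldr log file.
if [ $# -gt 0 ]; then
    report_unusable "$1"
fi
```

Called right after sqlldr in a wrapper script, a non-zero status from `report_unusable` signals that a rebuild is needed even though the load itself reported success.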

