Unable to import a table from an RDBMS into Hive

Posted: 2021-09-27 12:28:37

Question:

I am trying to import a table from an RDBMS into Hive, but the import fails with an error. The sqoop eval command runs fine and is able to fetch records.

sqoop import "-Dorg.apache.sqoop.splitter.allow_text_splitter=false" \
--connect "jdbc:tibero:thin:@hostname:8629:DBI" \
--driver com.tmax.tibero.jdbc.TbDriver \
--username XXX --password XXXX \
--table DMSDBA.cmm_cadorg_Tb \
--hive-import \
--create-hive-table \
--hive-table DMSDBA.cmm_cadorg_Tb1

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/3.0.1.0-187/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/3.0.1.0-187/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
21/09/27 15:16:38 INFO sqoop.Sqoop: Running Sqoop version: 1.4.8.3.0.1.0-187
21/09/27 15:16:38 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
21/09/27 15:16:38 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
21/09/27 15:16:38 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
21/09/27 15:16:38 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
21/09/27 15:16:38 INFO manager.SqlManager: Using default fetchSize of 1000
21/09/27 15:16:38 INFO tool.CodeGenTool: Beginning code generation
21/09/27 15:16:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM DMSDBA.cmm_cadorg_Tb AS t WHERE 1=0
21/09/27 15:16:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM DMSDBA.cmm_cadorg_Tb AS t WHERE 1=0
21/09/27 15:16:39 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/3.0.1.0-187/hadoop-mapreduce
21/09/27 15:16:41 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/8fd46edea8696bb5156e637113626b10/DMSDBA.cmm_cadorg_Tb.jar
21/09/27 15:16:41 ERROR tool.ImportTool: Import failed: No primary key could be found for table DMSDBA.cmm_cadorg_Tb. Please specify one with --split-by or perform a sequential import with '-m 1'.

I also tried --split-by:

--table DMSDBA.cmm_cadorg_Tb \
--hive-import \
--create-hive-table \
--hive-table DMSDBA.cmm_cadorg_Tb1\
--split-by DORG_UPDT_EMP_NO
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/3.0.1.0-187/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/3.0.1.0-187/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
21/09/27 13:41:10 INFO sqoop.Sqoop: Running Sqoop version: 1.4.8.3.0.1.0-187
21/09/27 13:41:10 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
21/09/27 13:41:10 ERROR tool.BaseSqoopTool: Error parsing arguments for import:
21/09/27 13:41:10 ERROR tool.BaseSqoopTool: Unrecognized argument: DORG_UPDT_EMP_NO

I also tried passing the MapReduce mapper-count parameter (-m), but got an error again:

--table DMSDBA.cmm_cadorg_Tb \
--hive-import \
--create-hive-table \
--hive-table DMSDBA.cmm_cadorg_Tb1\
-m 4
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/3.0.1.0-187/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/3.0.1.0-187/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
21/09/27 15:16:38 INFO sqoop.Sqoop: Running Sqoop version: 1.4.8.3.0.1.0-187
21/09/27 15:16:38 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
21/09/27 15:16:38 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
21/09/27 15:16:38 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
21/09/27 15:16:38 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
21/09/27 15:16:38 INFO manager.SqlManager: Using default fetchSize of 1000
21/09/27 15:16:38 INFO tool.CodeGenTool: Beginning code generation
21/09/27 15:16:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM DMSDBA.cmm_cadorg_Tb AS t WHERE 1=0
21/09/27 15:16:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM DMSDBA.cmm_cadorg_Tb AS t WHERE 1=0
21/09/27 15:16:39 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/3.0.1.0-187/hadoop-mapreduce
21/09/27 15:16:41 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/8fd46edea8696bb5156e637113626b10/DMSDBA.cmm_cadorg_Tb.jar
21/09/27 15:16:41 ERROR tool.ImportTool: Import failed: No primary key could be found for table DMSDBA.cmm_cadorg_Tb. Please specify one with --split-by or perform a sequential import with '-m 1'.

Please help. Thank you very much in advance. If there are any problems with the formatting, please disregard them.

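Comments:

Update: after applying the split parameter, the problem above was resolved, but I am now facing a different issue. I have described all the details at issues.apache.org/jira/browse/HIVE-25567. Please help me resolve it.

Answer 1: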

Try passing a target directory.

sqoop import "-Dorg.apache.sqoop.splitter.allow_text_splitter=true" \
 --connect "jdbc:tibero:thin:@hostname:8629:DBI" \
 --driver com.tmax.tibero.jdbc.TbDriver \
 --target-dir /project/whatever \
 --username XXX --password XXXX \
 --table DMSDBA.cmm_cadorg_Tb \
 --hive-import \
 --create-hive-table \
 --hive-table DMSDBA.cmm_cadorg_Tb1
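
A side note on the "Unrecognized argument: DORG_UPDT_EMP_NO" error from the --split-by attempt: it is consistent with the missing space before the trailing backslash in "DMSDBA.cmm_cadorg_Tb1\". The shell deletes a backslash-newline pair, so if the continuation line was typed without a leading space (an assumption, since the "> " prompt is not part of the input), the table name and the next flag fuse into one word and the column name is left over as a stray argument. The sketch below shows the effect without needing Sqoop; count_args, T1, and COL are hypothetical stand-ins.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

# No space before the trailing backslash (as in the question's --split-by attempt):
# the shell removes backslash-newline, so "T1" and "--split-by" fuse into one word.
glued=$(count_args --hive-table T1\
--split-by COL)

# Space before the backslash: --split-by stays a separate argument.
spaced=$(count_args --hive-table T1 \
--split-by COL)

# prints: without space: 3 arguments, with space: 4 arguments
echo "without space: $glued arguments, with space: $spaced arguments"
```

With the space restored, --split-by DORG_UPDT_EMP_NO (or -m 1 for a sequential import, as the Sqoop error message suggests) should reach Sqoop as separate arguments. If the split column is a text column, allow_text_splitter probably also needs to be true, as in the command above.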
