Sqoop create-hive-table with a SQL Server non-default schema

Posted: 2016-08-11 08:40:46

Question:

HDP-2.4.2.0-258, installed using Ambari 2.2.2.0.

In SQL Server:

TABLE_CATALOG  TABLE_SCHEMA    TABLE_NAME
Management     Administration  SettingAttribute
Management     Administration  SettingAttributeGroup
Management     Administration  SettingAttributeValue
Management     Administration  SettingValue
Management     ape             DatabaseScriptLog
Management     ape             DatabaseLog
Management     Common          Language
Management     Common          ThirdPartyType
Management     Common          Country
Management     Company         DistributorCow
Management     Company         CustomerSetting
Management     Company         CustomerSettingAttributeValue

I can list the databases, and the tables within a schema:

-bash-4.2$ sqoop list-databases --connect 'jdbc:sqlserver://<IP>;database=FleetManagement' --username --password 
find: failed to restore initial working directory: Permission denied
16/08/11 11:25:39 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.4.2.0-258
16/08/11 11:25:39 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/08/11 11:25:39 INFO manager.SqlManager: Using default fetchSize of 1000
master
tempdb
model
msdb
Auth
FeatureToggle
FleetManagementCoach
LatestRuntime
FleetManagementThirdParty
VehicleDriverServicesFollowUp
FleetManagementCustomer
FleetManagementMessaging
FleetManagementSubscription
FleetManagementSupport
FleetManagementFollowUp
FleetManagementDatawarehouse
FleetManagement
FleetManagementPositioning

-bash-4.2$ sqoop list-tables --connect 'jdbc:sqlserver://<IP>;database=FleetManagement' --username --password  -- --schema Administration
find: failed to restore initial working directory: Permission denied
16/08/11 11:25:12 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.4.2.0-258
16/08/11 11:25:12 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/08/11 11:25:12 INFO manager.SqlManager: Using default fetchSize of 1000
16/08/11 11:25:12 INFO manager.SQLServerManager: We will use schema Administration
SettingAttribute
SettingAttributeGroup
SettingAttributeValue
SettingValue

Now, when I use create-hive-table, Sqoop fails to create the SettingAttribute table.

I tried the following command, without success:

sqoop create-hive-table --driver 'com.microsoft.sqlserver.jdbc.SQLServerDriver' --connection-manager org.apache.sqoop.manager.SQLServerManager --connect 'jdbc:sqlserver://<IP>;database=FleetManagement' --username  --password  --table 'Administration.SettingAttribute'

Output:

16/08/10 16:40:32 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.4.2.0-258
16/08/10 16:40:32 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/08/10 16:40:32 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
16/08/10 16:40:32 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
16/08/10 16:40:32 INFO manager.SqlManager: Using default fetchSize of 1000
16/08/10 16:40:33 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM [Administration.SettingAttribute] AS t WHERE 1=0
16/08/10 16:40:33 ERROR manager.SqlManager: Error executing statement: com.microsoft.sqlserver.jdbc.SQLServerException: Invalid object name 'Administration.SettingAttribute'.
com.microsoft.sqlserver.jdbc.SQLServerException: Invalid object name 'Administration.SettingAttribute'.
        at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:217)
        at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1655)
        at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.doExecutePreparedStatement(SQLServerPreparedStatement.java:440)
        at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement$PrepStmtExecCmd.doExecute(SQLServerPreparedStatement.java:385)
        at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7505)
        at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:2444)
        at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:191)
        at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:166)
        at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.executeQuery(SQLServerPreparedStatement.java:297)
        at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:758)
        at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:767)
        at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:270)
        at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:241)
        at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:227)
        at org.apache.sqoop.hive.TableDefWriter.getCreateTableStmt(TableDefWriter.java:126)
        at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:188)
        at org.apache.sqoop.tool.CreateHiveTableTool.run(CreateHiveTableTool.java:58)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:148)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:184)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:226)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:235)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:244)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/accumulo/lib/slf4j-log4j12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/08/10 16:40:34 INFO hive.HiveImport: Loading uploaded data into Hive
16/08/10 16:40:34 WARN conf.HiveConf: HiveConf of name hive.server2.enable.impersonation does not exist
Logging initialized using configuration in jar:file:/usr/hdp/2.4.2.0-258/hive/lib/hive-common-1.2.1000.2.4.2.0-258.jar!/hive-log4j.properties
NoViableAltException(307@[])
        at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:11578)
        at org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45881)
        at org.apache.hadoop.hive.ql.parse.HiveParser.columnNameType(HiveParser.java:38052)
        at org.apache.hadoop.hive.ql.parse.HiveParser.columnNameTypeList(HiveParser.java:36183)
        at org.apache.hadoop.hive.ql.parse.HiveParser.createTableStatement(HiveParser.java:5222)
        at org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:2648)
        at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1658)
        at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1117)
        at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:202)
        at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:316)
        at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1189)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1237)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1126)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1116)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:216)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:168)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:379)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:314)
        at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:412)
        at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:428)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:717)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:624)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:338)
        at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:245)
        at org.apache.sqoop.tool.CreateHiveTableTool.run(CreateHiveTableTool.java:58)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:148)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:184)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:226)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:235)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:244)
FAILED: ParseException line 1:63 cannot recognize input near ')' 'COMMENT' ''Imported by sqoop on 2016/08/10 16:40:33'' in column specification
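The logged query above wraps the whole value of --table in a single pair of brackets, so SQL Server looks for one object literally named 'Administration.SettingAttribute' rather than the table SettingAttribute in the schema Administration. The following sketch only illustrates that difference in quoting; quote_whole and quote_parts are hypothetical helpers written for this example and are not part of Sqoop.

```shell
#!/bin/sh
# Illustration only: both helpers are hypothetical, not Sqoop functions.

# Wrap the entire string in one pair of brackets, as in the logged query.
quote_whole() {
  printf '[%s]\n' "$1"
}

# Bracket the schema and the table separately, the usual way to write
# a two-part SQL Server name.
quote_parts() {
  # ${1%%.*} is the text before the first dot, ${1#*.} the text after it.
  printf '[%s].[%s]\n' "${1%%.*}" "${1#*.}"
}

quote_whole 'Administration.SettingAttribute'   # one identifier containing a dot
quote_parts 'Administration.SettingAttribute'   # schema and table as separate parts
```

The first form matches the name in the "Invalid object name" error; the second is what a schema-qualified reference would look like.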

This one failed as well:

sqoop create-hive-table --driver 'com.microsoft.sqlserver.jdbc.SQLServerDriver' --connection-manager org.apache.sqoop.manager.SQLServerManager --connect 'jdbc:sqlserver://<IP>;database=FleetManagement' -- --schema Administration --table 'SettingAttribute' --username  --password

Output:

16/08/10 16:42:37 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.4.2.0-258
16/08/10 16:42:37 ERROR tool.BaseSqoopTool: Error parsing arguments for create-hive-table:
16/08/10 16:42:37 ERROR tool.BaseSqoopTool: Unrecognized argument: --
16/08/10 16:42:37 ERROR tool.BaseSqoopTool: Unrecognized argument: --schema
16/08/10 16:42:37 ERROR tool.BaseSqoopTool: Unrecognized argument: Administration
16/08/10 16:42:37 ERROR tool.BaseSqoopTool: Unrecognized argument: --table
16/08/10 16:42:37 ERROR tool.BaseSqoopTool: Unrecognized argument: SettingAttribute
16/08/10 16:42:37 ERROR tool.BaseSqoopTool: Unrecognized argument: --username

create-hive-table does not seem to support the --schema option (the documentation does not mention it).

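Comments:

What does your empty '--' mean? I assume you specified values for the username and password parameters?

Well, it is a strange syntax (-- --schema) that I had to use when listing tables in a non-default SQL Server schema. For details, you can refer to: community.hortonworks.com/questions/50557/… That doesn't work with the HCatalog import ...

Does the user you are working with have access to the tables and schema? You can check with the sqoop-list-database and sqoop-list-tables commands.

Yes, I edited my question to include that. If I run a sqoop import with HCatalog, the files are created in HDFS but the HCatalog table is not created. I think I have the right access, though; that is a different question: ***.com/questions/38891139/…

Answer 1: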

Put "-- --schema" at the end of the statement.

sqoop create-hive-table --driver 'com.microsoft.sqlserver.jdbc.SQLServerDriver' --connection-manager org.apache.sqoop.manager.SQLServerManager --connect 'jdbc:sqlserver://<IP>;database=FleetManagement' --table 'SettingAttribute' --username --password -- --schema Administration
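As a minimal sketch of the ordering this answer suggests, the script below assembles the argument list with all regular options first and the bare "--" followed by --schema last, then prints it for review instead of running it. SQL_HOST, SQOOP_USER, SCHEMA and TABLE are placeholder variables introduced for this example, and -P is used in place of --password because the Sqoop warning in the logs recommends it. Whether create-hive-table accepts the extra argument on this Sqoop build has not been verified here.

```shell
#!/bin/sh
# Sketch only. SQL_HOST, SQOOP_USER, SCHEMA and TABLE are placeholders
# introduced for this example; substitute real values before use.
SQL_HOST="${SQL_HOST:-<IP>}"
SQOOP_USER="${SQOOP_USER:-<username>}"
SCHEMA="Administration"
TABLE="SettingAttribute"

# Regular tool options come first.
set -- create-hive-table \
  --driver com.microsoft.sqlserver.jdbc.SQLServerDriver \
  --connection-manager org.apache.sqoop.manager.SQLServerManager \
  --connect "jdbc:sqlserver://${SQL_HOST};database=FleetManagement" \
  --table "$TABLE" \
  --username "$SQOOP_USER" \
  -P

# Connector-specific extra arguments go last, after a bare "--".
set -- "$@" -- --schema "$SCHEMA"

# Print one argument per line for review; to execute: sqoop "$@"
printf '%s\n' sqoop "$@"
```

Building the list in the positional parameters keeps each argument as a separate word, so the semicolon in the JDBC URL needs no extra escaping when the command is eventually run as sqoop "$@".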
