Flag -useHCatalog not working

Posted 2023-04-18

【Title】: Flag -useHCatalog not working 【Published】: 2015-05-01 15:48:44 【Question】:

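I installed CDH5.4 on a single node following the instructions here. In addition, I put the hive-metastore in local mode using these instructions. Everything works, except when I try to connect Pig with HCatalog: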

➜  ~  pig -useHCatalog
log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
2015-05-01 15:45:08,657 [main] INFO  org.apache.pig.Main - Apache Pig version 0.12.0-cdh5.4.0 (rUnversioned directory) compiled Apr 21 2015, 12:19:15
2015-05-01 15:45:08,658 [main] INFO  org.apache.pig.Main - Logging error messages to: /home/itam/pig_1430495108571.log
2015-05-01 15:45:09,035 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-05-01 15:45:09,035 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-05-01 15:45:09,035 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost:8020
2015-05-01 15:45:09,940 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-05-01 15:45:09,941 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: localhost:8021
2015-05-01 15:45:09,941 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-05-01 15:45:09,999 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-05-01 15:45:10,001 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-05-01 15:45:10,088 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-05-01 15:45:10,089 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-05-01 15:45:10,125 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-05-01 15:45:10,126 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-05-01 15:45:10,160 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-05-01 15:45:10,162 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-05-01 15:45:10,194 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-05-01 15:45:10,195 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-05-01 15:45:10,227 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-05-01 15:45:10,228 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-05-01 15:45:10,261 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-05-01 15:45:10,262 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-05-01 15:45:10,295 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-05-01 15:45:10,296 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address

When I try to access the table:

grunt> a = load 'ufos' using org.apache.hcatalog.pig.HCatLoader();
2015-05-01 15:46:11,656 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve org.apache.hcatalog.pig.HCatLoader using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
Details at logfile: /home/itam/pig_1430495108571.log
grunt>

Hadoop version:

➜  ~  hadoop version
Hadoop 2.6.0-cdh5.4.0
Subversion http://github.com/cloudera/hadoop -r c788a14a5de9ecd968d1e2666e8765c5f018c271 
Compiled by jenkins on 2015-04-21T19:16Z
Compiled with protoc 2.5.0
From source with checksum cd78f139c66c13ab5cee96e15a629025
This command was run using /usr/lib/hadoop/hadoop-common-2.6.0-cdh5.4.0.jar

Update: I just tried Impala, and it does not see any tables:

➜  ~  impala-shell                                                                                                                                                                                                                             
/usr/lib/python2.7/dist-packages/pkg_resources.py:1049: UserWarning: /home/itam/.python-eggs is writable by group/others and vulnerable to attack when used with get_resource_filename. Consider a more secure location (set with .set_extraction_path or the PYTHON_EGG_CACHE environment variable).
  warnings.warn(msg, UserWarning)
Starting Impala Shell without Kerberos authentication
Connected to 6b512e41337d:21000
Server version: impalad version 2.2.0-cdh5 RELEASE (build 2ffd73a4255cefd521362ffe1cfb37463f67f75c)
Welcome to the Impala shell. Press TAB twice to see a list of available commands.

Copyright (c) 2012 Cloudera, Inc. All rights reserved.

(Shell build version: Impala Shell v2.2.0-cdh5 (2ffd73a) built on Tue Apr 21 12:09:21 PDT 2015)
[6b512e41337d:21000] > invalidate metadata;
Query: invalidate metadata
[6b512e41337d:21000] > show tables;
Query: show tables

Fetched 0 row(s) in 0.00s

But from beeline:

~  beeline -u jdbc:hive2://
scan complete in 2ms
Connecting to jdbc:hive2://
Connected to: Apache Hive (version 1.1.0-cdh5.4.0)
Driver: Hive JDBC (version 1.1.0-cdh5.4.0)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.1.0-cdh5.4.0 by Apache Hive
0: jdbc:hive2://> show tables;
OK
+-----------+--+
| tab_name  |
+-----------+--+
| ufos      |
+-----------+--+
1 row selected (0.701 seconds)

It works... What is going on?

Update: I am also running HCatalog:

➜  ~  sudo service hive-webhcat-server status
 * WEBHCat server is running

➜  ~  hcat -e "desc ufos"                    
OK
timestamp               string                  from deserializer   
city                    string                  from deserializer   
state                   string                  from deserializer   
shape                   string                  from deserializer   
duration                string                  from deserializer   
summary                 string                  from deserializer   
posted                  string                  from deserializer   
Time taken: 1.314 seconds

Update: The Impala problem occurred because I had not copied hive-site.xml to /etc/impala/conf. Once I did that, impala-shell worked correctly.
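The copy step described in this update can be sketched as a small shell helper. This is only an illustration: the function name copy_hive_site is hypothetical, and the default source directory /etc/hive/conf is an assumption (a common location on packaged installs); only /etc/impala/conf comes from the update itself.

```shell
# Hypothetical helper: copy hive-site.xml from a Hive conf directory into
# another service's conf directory.
# Assumption: /etc/hive/conf is where Hive keeps its config on this machine.
copy_hive_site() {
  src_dir="${1:-/etc/hive/conf}"
  dest_dir="${2:-/etc/impala/conf}"
  # Fail early if there is nothing to copy.
  if [ ! -f "$src_dir/hive-site.xml" ]; then
    echo "hive-site.xml not found in $src_dir" >&2
    return 1
  fi
  cp "$src_dir/hive-site.xml" "$dest_dir/"
}
```

On a real install the copy typically needs root privileges, and the Impala daemons probably have to be restarted before they pick up the new file.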

【Answer 1】:

The loader you are using is deprecated. You need to use org.apache.hive.hcatalog.pig.HCatLoader instead of org.apache.hcatalog.pig.HCatLoader.

From the documentation of org.apache.hcatalog.pig.HCatLoader:

Deprecated. Use/modify HCatLoader instead
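If existing Pig scripts still reference the old package, one way to update them is a plain text substitution. The following is a minimal sketch, not an official migration tool: the function name fix_hcat_loader is hypothetical, and it assumes the script text arrives on stdin.

```shell
# Hypothetical helper (not part of Pig or HCatalog): reads a Pig script on
# stdin and rewrites the deprecated package prefix org.apache.hcatalog.pig.
# to org.apache.hive.hcatalog.pig.
fix_hcat_loader() {
  sed 's/org\.apache\.hcatalog\.pig\./org.apache.hive.hcatalog.pig./g'
}

# Example: rewrite the statement from the question.
echo "a = load 'ufos' using org.apache.hcatalog.pig.HCatLoader();" | fix_hcat_loader
```

The substitution leaves lines that already use org.apache.hive.hcatalog.pig unchanged, so running it twice on the same script is harmless.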

【Answer 2】:

I ran into this problem on HDP 2.3 with Pig 0.15.

The package name of the HCatLoader() class is different in the Hortonworks distribution.

The following worked for me:

Use org.apache.hive.hcatalog.pig.HCatLoader()

instead of org.apache.hcatalog.pig.HCatLoader();

【Answer 3】:

As you started to see with the hive-site.xml file, you need to have it on the classpath.

As mentioned here:

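Workflow actions interacting with HCatalog require the following jars in the classpath: hcatalog-core.jar, webhcat-java-client.jar, hive-common.jar, hive-exec.jar, hive-metastore.jar, hive-serde.jar and libfb303.jar. hive-site.xml, which has the configuration to talk to the HCatalog server, also needs to be in the classpath. The correct version of the HCatalog and hive jars should be placed in the classpath based on the version of HCatalog installed on the cluster.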

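The jars can be added to the classpath of the action in one of the following ways.

You can place the jars and hive-site.xml in the system shared library. The shared library for a pig, hive or java action can be overridden to include the hcatalog shared libraries along with the action's shared library. Refer to Shared Libraries for more information. The oozie-sharelib-[version].tar.gz in the oozie distribution bundles the required HCatalog jars in a hcatalog sharelib. If you use a different version of HCatalog than the one bundled in the sharelib, copy the required HCatalog jars from that version into the sharelib.

You can place the jars and hive-site.xml in the workflow application lib/ path.

You can specify the location of the jar files in the archive tag and the hive-site.xml in the file tag of the corresponding pig, hive or java action.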

If you are going to use an Oozie coordinator, upload them to the coordinator path on HDFS.
