Pig-0.16.0 on Hadoop 2.7.2 - ERROR 1002: Unable to store alias
Posted: 2016-07-15 08:09:47
Problem description:
I have just started learning Pig. For this I installed a pseudo-distributed Hadoop 2.7.2 on Ubuntu 14.04 LTS, with Pig version 0.16.0. Below are my Pig and Hadoop configurations -
File: .bashrc
#===============================================================
# Hadoop Variable List
export JAVA_HOME=/usr/lib/jvm/java-9-oracle
export HADOOP_INSTALL=/home/hadoop/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export HADOOP_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib/native"
#===============================================================
# PIG variable
export PIG_HOME="/home/hadoop/pig"
export PIG_INSTALL="$PIG_HOME"
export PIG_CONF_DIR="$PIG_HOME/conf"
export PIG_CLASSPATH="$HADOOP_INSTALL/conf"
export HADOOPDIR="$HADOOP_INSTALL/conf"
export PATH="$PIG_HOME/bin:$PATH"
#===============================================================
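A side note on the configuration above: on Hadoop 2.x the configuration files live under $HADOOP_INSTALL/etc/hadoop rather than the old $HADOOP_INSTALL/conf directory, so the PIG_CLASSPATH and HADOOPDIR entries above may point at a directory that does not exist. A minimal corrected sketch, assuming the same install layout (this is an assumption, not something confirmed in the post):

# Pig needs the directory that actually holds core-site.xml / mapred-site.xml / yarn-site.xml
export PIG_CLASSPATH="$HADOOP_INSTALL/etc/hadoop"
export HADOOPDIR="$HADOOP_INSTALL/etc/hadoop"

After editing .bashrc, re-source it (source ~/.bashrc) and restart the Grunt shell so the new paths take effect.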
Below is the directory from which I run Pig:
-rw-rw-r-- 1 hadoop hadoop 540117 Jul 15 12:41 myfile.txt
hadoop@rajeev:~$ pwd
/home/hadoop
I have also copied this file to HDFS:
hadoop@rajeev:~$ hadoop fs -ls -R /user/hadoop
-rw-r--r-- 1 hadoop supergroup 540117 2016-07-15 12:48 /user/hadoop/myfile.txt
Now... when I run the following commands in the Grunt shell, it fails with an error!
grunt> a = load 'myfile.txt' as line;
grunt> store a into 'c.out';
2016-07-15 12:56:38,670 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases a
2016-07-15 12:56:38,670 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: a[1,4],a[-1,-1] C: R:
2016-07-15 12:56:38,684 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2016-07-15 12:56:38,685 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1468556821972_0006]
2016-07-15 12:56:53,959 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2016-07-15 12:56:53,959 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1468556821972_0006]
2016-07-15 12:57:25,722 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2016-07-15 12:57:25,722 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_1468556821972_0006 has failed! Stop running all dependent jobs
2016-07-15 12:57:25,722 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2016-07-15 12:57:25,726 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2016-07-15 12:57:25,786 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2016-07-15 12:57:25,839 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2016-07-15 12:57:25,841 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.7.2 0.16.0 hadoop 2016-07-15 12:56:36 2016-07-15 12:57:25 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_1468556821972_0006 a MAP_ONLY Message: Job failed! hdfs://localhost:9001/user/hadoop/c.out,
Input(s):
Failed to read data from "hdfs://localhost:9001/user/hadoop/myfile.txt"
Output(s):
Failed to produce result in "hdfs://localhost:9001/user/hadoop/c.out"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_1468556821972_0006
2016-07-15 12:57:25,842 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
I have tried to resolve this in other ways, including running Pig in local mode instead of MapReduce mode, but nothing seems to work. These two simple commands fail every time.
The error log file prints exactly the same messages as the console output shown above, from "Processing aliases a" through "MapReduceLauncher - Failed!".
Any help would be appreciated!
Answer 1:
Specify the full path to the file and the data type of the field you are loading.
a = load 'hdfs://localhost:9001/user/hadoop/myfile.txt' AS (line:chararray);
store a into 'hdfs://localhost:9001/user/hadoop/c.out';
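A quick way to confirm the fix from the Grunt shell is to check the inferred schema before storing and then read the output back from HDFS. This is only an illustrative sketch reusing the paths from the question; on success, fs -ls should show part files under c.out (e.g. part-m-00000 for this map-only job), though the exact file name may differ:

a = load 'hdfs://localhost:9001/user/hadoop/myfile.txt' AS (line:chararray);
describe a;        -- expected output: a: {line: chararray}
store a into 'hdfs://localhost:9001/user/hadoop/c.out';
fs -ls /user/hadoop/c.out
fs -cat /user/hadoop/c.out/part-m-00000

If the job still fails, the MapReduceLauncher output only reports "Job failed!"; the underlying exception ends up in the YARN container logs, which (assuming log aggregation is enabled) can be pulled with:

yarn logs -applicationId application_1468556821972_0006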
Comments:
Thank you. I had to reinstall all of the Hadoop and Pig files, and when I did that... together with your suggestion... it solved the problem... Thanks!