Fail to load data to Hortonworks Sandbox in Pig

[Title]: Fail to load data to Hortonworks Sandbox in Pig [Posted]: 2014-09-05 13:23:24 [Question]:

Hi, I'm new to Hadoop. The first time I ran the command LOAD 'Pig/iris.csv' using PigStorage (','), the following error popped up:

LOAD 'Pig/iris.csv' using PigStorage (',');
2014-09-05 06:04:04,853 [main] INFO org.apache.pig.Main - Apache Pig version 0.12.1.2.1.1.0-385 (rexported) compiled Apr 16 2014, 15:59:00
2014-09-05 06:04:04,885 [main] INFO org.apache.pig.Main - Logging error messages to: /dev/null
2014-09-05 06:04:07,077 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /usr/lib/hue/.pigbootup not found
2014-09-05 06:04:14,699 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2014-09-05 06:04:14,699 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-09-05 06:04:14,699 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://sandbox.hortonworks.com:8020

2014-09-05 06:05:11,826 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
grunt> LOAD 'Pig/iris.csv' using PigStorage (',');
2014-09-05 06:05:13,203 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Encountered " <IDENTIFIER> "LOAD "" at line 1, column 1.
Was expecting one of:
<EOF>
"cat" ...
"clear" ...
"fs" ...
"sh" ...
"cd" ...
"cp" ...
"copyFromLocal" ...
"copyToLocal" ...
"dump" ...
"\\d" ...
"describe" ...
"\\de" ...
"aliases" ...
"explain" ...
"\\e" ...
"help" ...
"history" ...
"kill" ...
"ls" ...
"mv" ...
"mkdir" ...
"pwd" ...
"quit" ...
"\\q" ...
"register" ...
"rm" ...
"rmf" ...
"set" ...
"illustrate" ...
"\\i" ...
"run" ...
"exec" ...
"scriptDone" ...
"" ...
"" ...
<EOL> ...
";" ...

Details at logfile: /dev/null

Does anyone know how to fix this?

[Comments]:

[Answer 1]:

LOAD creates a relation. You need to assign it to a variable so that you can use it later:

L = LOAD 'Pig/iris.csv' using PigStorage (',');

DUMP L;
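The parser error in the question fits this explanation: Grunt rejected the statement at line 1, column 1 because a bare LOAD is neither a Grunt command nor an assignment. As a slightly fuller sketch, the relation can also be given a schema and inspected before it is dumped. The alias names and the column names and types below are assumptions based on the standard iris dataset; the question does not show the file's layout.

```pig
-- Assumed schema: column names and types are guesses based on the standard
-- iris dataset; adjust them to match the actual file.
iris = LOAD 'Pig/iris.csv' USING PigStorage(',')
       AS (sepal_length:double, sepal_width:double,
           petal_length:double, petal_width:double, species:chararray);

-- DESCRIBE prints the schema of the relation without launching a job.
DESCRIBE iris;

-- Limit the output so that DUMP does not print the whole file.
first_rows = LIMIT iris 5;
DUMP first_rows;
```

If the file has a header row, it will be loaded as an ordinary record, and the numeric fields of that record will most likely come back as nulls.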

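[Comments]:

Thanks, it works! One thing I'm curious about in Pig: can I run a single line by highlighting it, as I can in R, or do I have to run the whole script at once?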
