Unable to start Hadoop (3.1.0) in pseudo-distributed mode on Ubuntu (16.04)

Posted: 2018-04-19 10:34:55

Question:

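I am trying to follow the getting started guide on the Apache Hadoop website, specifically the pseudo-distributed configuration section of the Getting started guide from Apache Hadoop 3.1.0.

However, I cannot start the Hadoop NameNode and DataNode. Can anyone advise? Even something I could run to debug or investigate further would help.

At the end of the log I see an error message (I am not sure whether it is significant or a red herring).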

    2018-04-18 14:15:40,003 INFO org.apache.hadoop.hdfs.StateChange: STATE* Network topology has 0 racks and 0 datanodes

    2018-04-18 14:15:40,006 INFO org.apache.hadoop.hdfs.StateChange: STATE* UnderReplicatedBlocks has 0 blocks

    2018-04-18 14:15:40,014 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Total number of blocks            = 0

    2018-04-18 14:15:40,014 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Number of invalid blocks          = 0

    2018-04-18 14:15:40,014 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Number of under-replicated blocks = 0

    2018-04-18 14:15:40,014 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Number of  over-replicated blocks = 0

    2018-04-18 14:15:40,014 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Number of blocks being written    = 0

    2018-04-18 14:15:40,014 INFO org.apache.hadoop.hdfs.StateChange: STATE* Replication Queue initialization scan for invalid, over- and under-replicated blocks completed in 11 msec

    2018-04-18 14:15:40,028 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting

    2018-04-18 14:15:40,028 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 9000: starting

    2018-04-18 14:15:40,029 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: NameNode RPC up at: localhost/127.0.0.1:9000

    2018-04-18 14:15:40,031 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required for active state

    2018-04-18 14:15:40,031 INFO org.apache.hadoop.hdfs.server.namenode.FSDirectory: Initializing quota with 4 thread(s)

    2018-04-18 14:15:40,033 INFO org.apache.hadoop.hdfs.server.namenode.FSDirectory: Quota initialization completed in 2 milliseconds name space=1 storage space=0 storage types=RAM_DISK=0, SSD=0, DISK=0, ARCHIVE=0, PROVIDED=0

    2018-04-18 14:15:40,037 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Starting CacheReplicationMonitor with interval 30000 milliseconds

    2018-04-18 14:15:40,232 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: RECEIVED SIGNAL 15: SIGTERM

    2018-04-18 14:15:40,236 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: RECEIVED SIGNAL 1: SIGHUP

    2018-04-18 14:15:40,236 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
    /************************************************************
    SHUTDOWN_MSG: Shutting down NameNode at c0315/127.0.1.1

I have confirmed that I can ssh localhost without a password prompt. I also ran the following steps from the Apache getting started guide mentioned above:

    $ bin/hdfs namenode -format
    $ sbin/start-dfs.sh

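But I cannot carry out step 3, which is to browse the NameNode web interface at http://localhost:9870/. When I run jps from the terminal prompt, all I get back is:

    14900 Jps

I was expecting to see a list of my nodes.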
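For comparison, a working pseudo-distributed start normally shows NameNode, DataNode and SecondaryNameNode next to Jps. The sketch below is one possible way to make that check explicit. It assumes the default start-dfs.sh behaviour, and missing_hdfs_daemons is a hypothetical helper name, not part of Hadoop.

```shell
#!/bin/sh
# Hypothetical helper: reads `jps` output on stdin and prints each HDFS
# daemon that is NOT listed. The three names below are an assumption
# based on what start-dfs.sh starts by default on a single node.
missing_hdfs_daemons() {
    jps_output=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode; do
        # jps prints "<pid> <MainClass>", so compare the second field exactly.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            printf '%s\n' "$daemon"
        fi
    done
}

# Usage (needs a JDK on PATH):
#   jps | missing_hdfs_daemons
```

With the output shown in the question, where only Jps is listed, this would report all three daemons as missing, which matches the NameNode shutting down right after start.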

I will attach the full log.

Can anyone help me debug this?

Java version ($ java --version):

java 9.0.4 
Java(TM) SE Runtime Environment (build 9.0.4+11) 
Java HotSpot(TM) 64-Bit Server VM (build 9.0.4+11, mixed mode)

EDIT1: I also repeated these steps with Java 8 and got the same error message.

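EDIT2: Following the suggestions in the comments below, I have checked that I am now definitely pointing at Java 8, and I have also commented out the localhost setting for 127.0.0.0 in my /etc/hosts file.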
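Because both the comments and EDIT2 revolve around /etc/hosts, it may help to see exactly which addresses the file assigns to localhost and to the machine's hostname (c0315 in the log above). This is a minimal sketch, assuming the usual hosts format of an address followed by one or more names; hosts_addresses_for is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print every address that a hosts-format file maps
# to the given name. The file defaults to /etc/hosts.
hosts_addresses_for() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v n="$name" '
        { sub(/#.*/, "") }    # strip comments, including commented-out entries
        { for (i = 2; i <= NF; i++) if ($i == n) print $1 }
    ' "$file"
}

# Usage:
#   hosts_addresses_for localhost
#   hosts_addresses_for "$(hostname)"
```

If the hostname resolves to 127.0.1.1 while fs.defaultFS points at localhost, that is the mismatch the first comment is referring to. Whether it explains the SIGTERM is not established in this thread.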

Ubuntu version:

$ lsb_release -a

No LSB modules are available.
Distributor ID: neon
Description: KDE neon User Edition 5.12
Release: 16.04
Codename: xenial

I have tried running the command bin/hdfs version:

Hadoop 3.1.0 
Source code repository https://github.com/apache/hadoop -r 16b70619a24cdcf5d3b0fcf4b58ca77238ccbe6d 
Compiled by centos on 2018-03-30T00:00Z 
Compiled with protoc 2.5.0 
From source with checksum 14182d20c972b3e2105580a1ad6990 
This command was run using /home/steelydan.com/roycecollige/Apps/hadoop-3.1.0/share/hadoop/common/hadoop-common-3.1.0.jar

When I try bin/hdfs groups, it does not return; instead it gives me:

018-04-18 15:33:34,590 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

When I try $ bin/hdfs lsSnapshottableDir:

lsSnapshottableDir: Call From c0315/127.0.1.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
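Connection refused means nothing accepted the TCP connection on localhost:9000, which fits the NameNode having shut down. One way to confirm whether anything is listening on that port is to filter the output of ss -ltn. The sketch below assumes the usual ss column layout, with the local address in the fourth column; port_is_listening is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads `ss -ltn` output on stdin and succeeds
# (exit status 0) if a listening socket's local address ends in ":<port>".
port_is_listening() {
    awk -v p="$1" '
        $1 == "LISTEN" {
            n = split($4, parts, ":")   # the port is the last ":"-separated piece
            if (parts[n] == p) found = 1
        }
        END { exit !found }
    '
}

# Usage:
#   ss -ltn | port_is_listening 9000 && echo "something is listening on 9000"
```

Running this once just after sbin/start-dfs.sh and again a few seconds later would show whether the port is opened briefly and then closed when the NameNode receives the signal.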

When I try $ bin/hdfs classpath:

/home/steelydan.com/roycecoolige/Apps/hadoop-3.1.0/etc/hadoop:/home/steelydan.com/roycecoolige/Apps/hadoop-3.1.0/share/hadoop/common/lib/:/home/steelydan.com/roycecoolige/Apps/hadoop-3.1.0/share/hadoop/common/:/home/steelydan.com/roycecoolige/Apps/hadoop-3.1.0/share/hadoop/hdfs:/home/steelydan.com/roycecoolige/Apps/hadoop-3.1.0/share/hadoop/hdfs/lib/:/home/steelydan.com/roycecoolige/Apps/hadoop-3.1.0/share/hadoop/hdfs/:/home/steelydan.com/roycecoolige/Apps/hadoop-3.1.0/share/hadoop/mapreduce/:/home/steelydan.com/roycecoolige/Apps/hadoop-3.1.0/share/hadoop/yarn:/home/steelydan.com/roycecoolige/Apps/hadoop-3.1.0/share/hadoop/yarn/lib/:/home/steelydan.com/roycecoolige/Apps/hadoop-3.1.0/share/hadoop/yarn/*

core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

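Comments:

You need to remove 127.0.1.1 from your /etc/hosts file, and ideally you should not use localhost for any services.

Also, Hadoop does not support Java 9. Please downgrade Java.

+1. See ***.com/questions/48113847/…, and especially issues.apache.org/jira/browse/HADOOP-11123, for the status of Java 9 support in Hadoop.

Thanks for the comments. I had installed Java 8, but I forgot to update my hadoop-env.sh, so my re-run was probably still referencing Java 9. I will try it when I am at work tomorrow. Thanks again.

OK - I can confirm that I am pointing at Java 8, but I still get the same message (see EDIT2). Does anyone know what this error message relates to, or is it a red herring? org.apache.hadoop.hdfs.server.namenode.NameNode: RECEIVED SIGNAL 15: SIGTERM

Answer 1: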

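I could not figure it out (I just tried again, because I miss NEON so much), but even though :9000 was not in use, the OS sent a SIGTERM in my case.

Sadly, the only way I found to solve this was to go back to stock Ubuntu.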
