Hadoop Demystified - 001


Doc by xvGe

What is Hadoop?

The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing.

Problems Hadoop solves:

-- Massive data storage

-- Massive data analysis

-- Resource management and scheduling

Creator: Doug Cutting

*********************************

(1) Hadoop core components and the file system concept:

*********************************

Distributions:

Apache: the official release.

Cloudera: stable, commercially supported, and recommended.

HDP: the Hortonworks distribution.

Hadoop core components:

-- HDFS: distributed file system

-- YARN: resource management and scheduling system

-- MapReduce: distributed computing framework

********************************

(2) How HDFS works and the file system concept:

********************************

1. Capacity scales out linearly

2. The replication mechanism provides reliable storage and high throughput

3. Thanks to the namenode, a client only needs to specify a path on HDFS

How it works:

1. Files are split into blocks for storage

2. Clients do not need to care about the distributed details; HDFS exposes a single abstract directory tree

3. Each file can be stored as multiple replicas

4. The mapping from a file to the physical locations of its blocks is maintained by a dedicated server, the namenode (see the sketch below)
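In practice this abstraction means a client only ever deals in paths. A minimal sketch (the file names here are made up for illustration):

hadoop fs -put ./big.log /logs/big.log     # upload: HDFS splits the file into blocks behind the scenes

hadoop fs -ls /logs                        # the client sees only the unified abstract directory tree

hdfs fsck /logs/big.log -files -blocks     # show how the namenode mapped the file to blocks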

***********************

(3) The basic idea of MapReduce:

***********************

1. A processing job is divided into two phases: a map phase and a reduce phase

2. The problems every distributed computation faces are solved once by the framework (jar distribution, task launching, fault tolerance, scheduling, grouping and transfer of intermediate results, ...)

MapReduce (offline/batch computation) is just one implementation of a distributed computing framework; similar frameworks include Storm (stream computation) and Spark (in-memory iterative computation).
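As a concrete illustration, the word-count example that ships with Hadoop runs both phases for you. A sketch, assuming the pseudo-distributed cluster built below and a made-up local file words.txt:

hadoop fs -mkdir -p /wordcount/input

hadoop fs -put ./words.txt /wordcount/input

hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar wordcount /wordcount/input /wordcount/output

hadoop fs -cat /wordcount/output/part-r-00000     # map tokenizes each line into (word, 1); reduce sums the counts per word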

********************

(4) Building a pseudo-distributed cluster:

********************

1. Configure the network:

-------------------------------------------------------------------------------------------------------

vim /etc/sysconfig/network     # edit the network configuration

NETWORKING=yes

HOSTNAME=node0

:wq

vim /etc/sysconfig/network-scripts/ifcfg-eth0    # edit the NIC configuration

DEVICE=eth0

TYPE=Ethernet

ONBOOT=yes

BOOTPROTO=none

IPADDR=192.168.10.3

PREFIX=24

GATEWAY=192.168.10.1    # the gateway must be in the same 192.168.10.0/24 subnet as IPADDR

:wq

/etc/init.d/network restart     # restart the network service

Shutting down interface eth0:                              [  OK  ]

Shutting down loopback interface:                          [  OK  ]

Bringing up loopback interface:                            [  OK  ]

Bringing up interface eth0:  Determining if ip address 192.168.10.3 is already in use for device eth0...

                                                           [  OK  ]

vim /etc/hosts   # edit the local hostname resolution file

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4

::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.10.3 node0

:wq

/etc/init.d/iptables stop    # stop the firewall

chkconfig iptables off       # disable firewall autostart on boot

chkconfig iptables --list    # check the firewall startup settings

iptables        0:off   1:off   2:off   3:off   4:off   5:off   6:off

vim /etc/selinux/config   # edit the SELinux configuration

SELINUX=disabled          # disable SELinux

:wq

reboot   # reboot the server
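After the reboot, it is worth confirming the changes took effect, for example:

hostname                         # should print node0

getenforce                       # should print Disabled

ip addr show eth0 | grep inet    # should show 192.168.10.3/24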


2. Install the JDK

-------------------------------------------------------------------------------------------------------------

mkdir /app/    # create the application directory

tar -zxvf ./jdk-8u131-linux-x64.tar.gz -C /app/     # extract the archive

ln -s /app/jdk1.8.0_131/ /app/jdk        # create a symlink

vim /etc/profile        # edit the environment variables

export JAVA_HOME=/app/jdk

export PATH=$PATH:$JAVA_HOME/bin

:wq

source /etc/profile     # reload the environment file

java                    # test the java command

Usage: java [-options] class [args...]

           (to execute a class)

   or  java [-options] -jar jarfile [args...]

           (to execute a jar file)

where options include:

    -d32          use a 32-bit data model if available

    -d64          use a 64-bit data model if available

    -server       to select the "server" VM

                  The default VM is server.


    -cp <class search path of directories and zip/jar files>

    -classpath <class search path of directories and zip/jar files>

                  A : separated list of directories, JAR archives,

                  and ZIP archives to search for class files.

    -D<name>=<value>

                  set a system property

    -verbose:[class|gc|jni]

                  enable verbose output

    -version      print product version and exit

    -version:<value>

                  Warning: this feature is deprecated and will be removed

                  in a future release.

                  require the specified version to run

    -showversion  print product version and continue

    -jre-restrict-search | -no-jre-restrict-search

                  Warning: this feature is deprecated and will be removed

                  in a future release.

                  include/exclude user private JREs in the version search

    -? -help      print this help message

    -X            print help on non-standard options

    -ea[:<packagename>...|:<classname>]

    -enableassertions[:<packagename>...|:<classname>]

                  enable assertions with specified granularity

    -da[:<packagename>...|:<classname>]

    -disableassertions[:<packagename>...|:<classname>]

                  disable assertions with specified granularity

    -esa | -enablesystemassertions

                  enable system assertions

    -dsa | -disablesystemassertions

                  disable system assertions

    -agentlib:<libname>[=<options>]

                  load native agent library <libname>, e.g. -agentlib:hprof

                  see also, -agentlib:jdwp=help and -agentlib:hprof=help

    -agentpath:<pathname>[=<options>]

                  load native agent library by full pathname

    -javaagent:<jarpath>[=<options>]

                  load Java programming language agent, see java.lang.instrument

    -splash:<imagepath>

                  show splash screen with specified image

See http://www.oracle.com/technetwork/java/javase/documentation/index.html for more details.


javac                      # test the javac command

Usage: javac <options> <source files>

where possible options include:

  -g                         Generate all debugging info

  -g:none                    Generate no debugging info

  -g:{lines,vars,source}     Generate only some debugging info

  -nowarn                    Generate no warnings

  -verbose                   Output messages about what the compiler is doing

  -deprecation               Output source locations where deprecated APIs are used

  -classpath <path>          Specify where to find user class files and annotation processors

  -cp <path>                 Specify where to find user class files and annotation processors

  -sourcepath <path>         Specify where to find input source files

  -bootclasspath <path>      Override location of bootstrap class files

  -extdirs <dirs>            Override location of installed extensions

  -endorseddirs <dirs>       Override location of endorsed standards path

  -proc:{none,only}          Control whether annotation processing and/or compilation is done.

  -processor <class1>[,<class2>,<class3>...] Names of the annotation processors to run; bypasses default discovery process

  -processorpath <path>      Specify where to find annotation processors

  -parameters                Generate metadata for reflection on method parameters

  -d <directory>             Specify where to place generated class files

  -s <directory>             Specify where to place generated source files

  -h <directory>             Specify where to place generated native header files

  -implicit:{none,class}     Specify whether or not to generate class files for implicitly referenced files

  -encoding <encoding>       Specify character encoding used by source files

  -source <release>          Provide source compatibility with specified release

  -target <release>          Generate class files for specific VM version

  -profile <profile>         Check that API used is available in the specified profile

  -version                   Version information

  -help                      Print a synopsis of standard options

  -Akey[=value]              Options to pass to annotation processors

  -X                         Print a synopsis of nonstandard options

  -J<flag>                   Pass <flag> directly to the runtime system

  -Werror                    Terminate compilation if warnings occur

  @<filename>                Read options and filenames from file

java -version                  # check the Java version

java version "1.8.0_131"

Java(TM) SE Runtime Environment (build 1.8.0_131-b11)

Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)


3. Install Hadoop

----------------------------------------------------------------------------------------------------------

tar -zxvf ./hadoop-2.4.1.tar.gz -C /app/              # extract the Hadoop archive

ln -s /app/hadoop-2.4.1/ /app/hadoop                  # create a symlink

##########################################################################################################

vim /app/hadoop/etc/hadoop/hadoop-env.sh

export JAVA_HOME=/app/jdk

:wq

##########################################################################################################

vim /app/hadoop/etc/hadoop/core-site.xml

<configuration>

    <!-- fs.defaultFS: the default file system URI, i.e. the address of the HDFS NameNode -->
    <property>

        <name>fs.defaultFS</name>

        <value>hdfs://node0:9000</value>

    </property>

    <!-- hadoop.tmp.dir: the directory where Hadoop stores its working files at runtime -->
    <property>

        <name>hadoop.tmp.dir</name>

        <value>/hadoop/tmpdata</value>

    </property>

</configuration>

:wq
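Note: the format step later creates /hadoop/tmpdata automatically as long as the path is writable, but you can also create it up front:

mkdir -p /hadoop/tmpdata     # pre-create the hadoop.tmp.dir directory (optional)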

##########################################################################################################

vim /app/hadoop/etc/hadoop/hdfs-site.xml

<configuration>

    <!-- dfs.replication: the number of replicas to store; the default is 3 -->
    <property>

        <name>dfs.replication</name>

        <value>1</value>

    </property>

</configuration>

:wq

##########################################################################################################

cp  /app/hadoop/etc/hadoop/mapred-site.xml.template /app/hadoop/etc/hadoop/mapred-site.xml

vim /app/hadoop/etc/hadoop/mapred-site.xml

<configuration>

    <!-- mapreduce.framework.name: run MapReduce on YARN -->
    <property>

        <name>mapreduce.framework.name</name>

        <value>yarn</value>

    </property>

</configuration>

:wq

##########################################################################################################

vim /app/hadoop/etc/hadoop/yarn-site.xml

<configuration>

    <!-- yarn.resourcemanager.hostname: the address of the YARN ResourceManager -->
    <property>

        <name>yarn.resourcemanager.hostname</name>

        <value>node0</value>

    </property>

    <!-- yarn.nodemanager.aux-services: how reducers fetch map output (the shuffle service) -->
    <property>

        <name>yarn.nodemanager.aux-services</name>

        <value>mapreduce_shuffle</value>

    </property>

</configuration>

:wq

##########################################################################################################

vim /etc/profile

export JAVA_HOME=/app/jdk

export HADOOP_HOME=/app/hadoop

export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

source /etc/profile
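A quick check that the new variables took effect:

hadoop version     # should report Hadoop 2.4.1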


4. Format the namenode

-------------------------------------------------------------------------------------------------------------

hdfs namenode -format

17/08/13 05:52:00 INFO namenode.NameNode: STARTUP_MSG:

/************************************************************

STARTUP_MSG: Starting NameNode

STARTUP_MSG:   host = node0/192.168.10.3

STARTUP_MSG:   args = [-format]

STARTUP_MSG:   version = 2.4.1

STARTUP_MSG:   classpath = /app/hadoop-2.4.1/etc/hadoop:/app/hadoop-2.4.1/share/hadoop/common/lib/log4j-1.2.17.jar: ... :/app/hadoop/contrib/capacity-scheduler/*.jar   (long classpath listing trimmed)

STARTUP_MSG:   build = http://svn.apache.org/repos/asf/hadoop/common -r 1604318; compiled by 'jenkins' on 2014-06-21T05:43Z

STARTUP_MSG:   java = 1.8.0_131

************************************************************/

17/08/13 05:52:00 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]

17/08/13 05:52:00 INFO namenode.NameNode: createNameNode [-format]

Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /app/hadoop-2.4.1/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.

It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.

17/08/13 05:52:01 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Formatting using clusterid: CID-0f84f197-b0d5-4cd1-a4e4-14a5acfa009e

17/08/13 05:52:01 INFO namenode.FSNamesystem: fsLock is fair:true

17/08/13 05:52:01 INFO namenode.HostFileManager: read includes:

HostSet(

)

17/08/13 05:52:01 INFO namenode.HostFileManager: read excludes:

HostSet(

)

17/08/13 05:52:01 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000

17/08/13 05:52:01 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true

17/08/13 05:52:01 INFO util.GSet: Computing capacity for map BlocksMap

17/08/13 05:52:01 INFO util.GSet: VM type       = 64-bit

17/08/13 05:52:01 INFO util.GSet: 2.0% max memory 966.7 MB = 19.3 MB

17/08/13 05:52:01 INFO util.GSet: capacity      = 2^21 = 2097152 entries

17/08/13 05:52:01 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false

17/08/13 05:52:01 INFO blockmanagement.BlockManager: defaultReplication         = 1

17/08/13 05:52:01 INFO blockmanagement.BlockManager: maxReplication             = 512

17/08/13 05:52:01 INFO blockmanagement.BlockManager: minReplication             = 1

17/08/13 05:52:01 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2

17/08/13 05:52:01 INFO blockmanagement.BlockManager: shouldCheckForEnoughRacks  = false

17/08/13 05:52:01 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000

17/08/13 05:52:01 INFO blockmanagement.BlockManager: encryptDataTransfer        = false

17/08/13 05:52:01 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000

17/08/13 05:52:01 INFO namenode.FSNamesystem: fsOwner             = root (auth:SIMPLE)

17/08/13 05:52:01 INFO namenode.FSNamesystem: supergroup          = supergroup

17/08/13 05:52:01 INFO namenode.FSNamesystem: isPermissionEnabled = true

17/08/13 05:52:01 INFO namenode.FSNamesystem: HA Enabled: false

17/08/13 05:52:01 INFO namenode.FSNamesystem: Append Enabled: true

17/08/13 05:52:02 INFO util.GSet: Computing capacity for map INodeMap

17/08/13 05:52:02 INFO util.GSet: VM type       = 64-bit

17/08/13 05:52:02 INFO util.GSet: 1.0% max memory 966.7 MB = 9.7 MB

17/08/13 05:52:02 INFO util.GSet: capacity      = 2^20 = 1048576 entries

17/08/13 05:52:02 INFO namenode.NameNode: Caching file names occuring more than 10 times

17/08/13 05:52:02 INFO util.GSet: Computing capacity for map cachedBlocks

17/08/13 05:52:02 INFO util.GSet: VM type       = 64-bit

17/08/13 05:52:02 INFO util.GSet: 0.25% max memory 966.7 MB = 2.4 MB

17/08/13 05:52:02 INFO util.GSet: capacity      = 2^18 = 262144 entries

17/08/13 05:52:02 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033

17/08/13 05:52:02 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0

17/08/13 05:52:02 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension     = 30000

17/08/13 05:52:02 INFO namenode.FSNamesystem: Retry cache on namenode is enabled

17/08/13 05:52:02 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis

17/08/13 05:52:02 INFO util.GSet: Computing capacity for map NameNodeRetryCache

17/08/13 05:52:02 INFO util.GSet: VM type       = 64-bit

17/08/13 05:52:02 INFO util.GSet: 0.029999999329447746% max memory 966.7 MB = 297.0 KB

17/08/13 05:52:02 INFO util.GSet: capacity      = 2^15 = 32768 entries

17/08/13 05:52:02 INFO namenode.AclConfigFlag: ACLs enabled? false

17/08/13 05:52:02 INFO namenode.FSImage: Allocated new BlockPoolId: BP-833512525-192.168.10.3-1502574722280

17/08/13 05:52:02 INFO common.Storage: Storage directory /hadoop/tmpdata/dfs/name has been successfully formatted.   # this line indicates the format succeeded

17/08/13 05:52:02 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0

17/08/13 05:52:02 INFO util.ExitUtil: Exiting with status 0

17/08/13 05:52:02 INFO namenode.NameNode: SHUTDOWN_MSG:

/************************************************************

SHUTDOWN_MSG: Shutting down NameNode at node0/192.168.10.3

************************************************************/


5. Start Hadoop

start-dfs.sh                      # start HDFS (start-dfs.sh and start-yarn.sh can run in either order)

Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /app/hadoop-2.4.1/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.

It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.

17/08/13 06:12:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Starting namenodes on [node0]

root@node0's password:            # enter the password

node0: starting namenode, logging to /app/hadoop-2.4.1/logs/hadoop-root-namenode-node0.out

root@localhost's password:        # enter the password

localhost: starting datanode, logging to /app/hadoop-2.4.1/logs/hadoop-root-datanode-node0.out

Starting secondary namenodes [0.0.0.0]

root@0.0.0.0's password:          # enter the password

0.0.0.0: starting secondarynamenode, logging to /app/hadoop-2.4.1/logs/hadoop-root-secondarynamenode-node0.out

Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /app/hadoop-2.4.1/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.

It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.

17/08/13 06:13:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

start-yarn.sh                     # start YARN

starting yarn daemons

resourcemanager running as process 31652. Stop it first.

root@localhost's password:        # enter the password

localhost: nodemanager running as process 31937. Stop it first.

jps                               # verify the daemons with jps

32864 SecondaryNameNode

31937 NodeManager

32707 DataNode

31652 ResourceManager

32584 NameNode

33064 Jps

http://192.168.10.3:50070 (HDFS web UI)

http://192.168.10.3:8088  (YARN ResourceManager web UI)
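As a quick smoke test of the running cluster (the paths here are arbitrary):

hadoop fs -put /etc/profile /profile     # upload a test file to HDFS

hadoop fs -ls /                          # the file should be listed, and also visible in the 50070 web UI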

*******************

(5) Passwordless SSH login:

*******************

Generate a key pair on the client:

ssh-keygen -t rsa

Generating public/private rsa key pair.

Enter file in which to save the key (/root/.ssh/id_rsa):

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in /root/.ssh/id_rsa.

Your public key has been saved in /root/.ssh/id_rsa.pub.

The key fingerprint is:

03:c8:7a:54:c7:a4:fc:74:cc:15:23:5b:ba:51:7b:b2 [email protected]

The key's randomart image is:

+--[ RSA 2048]----+

|      .oo . *.   |

|   . + o.o B o   |

|    + + . B o .  |

|   o   + . o +   |

|  . .   S . E    |

|   .     .       |

|                 |

|                 |

|                 |

+-----------------+

cd .ssh/

ll

total 12

-rw------- 1 root root 1675 Aug 13 07:11 id_rsa

-rw-r--r-- 1 root root  392 Aug 13 07:11 id_rsa.pub

-rw-r--r-- 1 root root 1180 Aug 13 06:11 known_hosts

Copy the client's public key file to the server:

ssh-copy-id 192.168.10.3

The authenticity of host '192.168.10.3 (192.168.10.3)' can't be established.

RSA key fingerprint is b9:21:f9:a4:33:de:3e:79:6e:69:45:01:e6:5d:47:54.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added '192.168.10.3' (RSA) to the list of known hosts.

root@192.168.10.3's password:

Now try logging into the machine, with "ssh '192.168.10.3'", and check in:


  .ssh/authorized_keys


to make sure we haven't added extra keys that you weren't expecting.


Test from the client:

ssh root@192.168.10.3

Last login: Sun Aug 13 04:44:30 2017 from 192.168.10.2
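For reference, ssh-copy-id is only a convenience wrapper: it appends the client's public key to the server's ~/.ssh/authorized_keys. A manual equivalent, assuming the default key path:

cat ~/.ssh/id_rsa.pub | ssh root@192.168.10.3 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys'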


In the same way, configure passwordless login from the server to itself:

ssh-keygen -t rsa     # use the RSA algorithm

Generating public/private rsa key pair.

Enter file in which to save the key (/root/.ssh/id_rsa):

Created directory '/root/.ssh'.

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in /root/.ssh/id_rsa.

Your public key has been saved in /root/.ssh/id_rsa.pub.

The key fingerprint is:

44:55:bb:1d:c4:b9:d8:e0:e5:6b:c2:58:19:f5:c0:57 root@node0

The key's randomart image is:

+--[ RSA 2048]----+

|        .....++.E|

|       .    oo=o.|

|        .  ..O.o.|

|       .    =o+. |

|        S  +. .. |

|          . o o  |

|             o   |

|                 |

|                 |

+-----------------+

[root@node0 ~]# ssh-copy-id 192.168.10.3

The authenticity of host '192.168.10.3 (192.168.10.3)' can't be established.

RSA key fingerprint is b9:21:f9:a4:33:de:3e:79:6e:69:45:01:e6:5d:47:54.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added '192.168.10.3' (RSA) to the list of known hosts.

root@192.168.10.3's password:

Now try logging into the machine, with "ssh '192.168.10.3'", and check in:


  .ssh/authorized_keys


to make sure we haven't added extra keys that you weren't expecting.


Adding hosts to known_hosts without typing yes:

Even with passwordless key authentication, the first connection between each pair of servers still requires typing yes several times to add each host's key to the known_hosts file. To suppress the prompt:

vim .ssh/config

StrictHostKeyChecking no

:wq

To avoid having to update known_hosts when a server's IP changes, or to avoid conflicts caused by a stale known_hosts entry:

vim .ssh/config

UserKnownHostsFile /dev/null

:wq
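The two options can also be combined and scoped to specific hosts with a Host stanza, for example:

vim ~/.ssh/config

Host node* 192.168.10.*
    StrictHostKeyChecking no
    UserKnownHostsFile /dev/null

:wq

chmod 600 ~/.ssh/config     # ssh refuses to use a config file that other users can write to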

/etc/init.d/sshd restart          # restart the sshd service (strictly, ~/.ssh/config is client-side and takes effect without a restart)

Stopping sshd:                                             [  OK  ]

Starting sshd:                                               [  OK  ]

This article originally appeared on the "帅帅的小哥哥" blog; please retain this attribution: http://xvjunjie.blog.51cto.com/12360960/1955934
