Hadoop Explained Simply - 001
Doc by xvGe
What is Hadoop?
The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing.
Problems Hadoop solves:
--Storage of massive data sets
--Analysis of massive data sets
--Resource management and scheduling
Creator: Doug Cutting
*********************************
(1) Hadoop core components and file system concepts:
*********************************
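Distributions:
Apache: the official release.
Cloudera: stable, with commercial support; recommended.
HDP: the distribution from Hortonworks.
Hadoop core:
--HDFS: distributed file system
--YARN: resource management and scheduling system
--MapReduce: distributed computing framework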
********************************
(2) HDFS implementation mechanism and file system concepts:
********************************
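1. Capacity scales linearly.
2. Replication gives reliable storage and high throughput.
3. With the NameNode in place, a client only needs to specify a path on HDFS.
Implementation mechanism:
1. Files are split into blocks for storage.
2. Clients do not need to care about the distributed details; HDFS presents a single abstract directory tree.
3. Every file can be stored as multiple replicas (replication is applied per block).
4. The mapping between an HDFS file and the physical locations of its blocks is managed by a dedicated server (the NameNode).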
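The block and replica mechanism above can be made concrete with a little arithmetic. The following is only an illustrative sketch, not HDFS code: the helper names block_count and raw_storage are hypothetical, and the figures assume the Hadoop 2.x default block size of 128 MB and the default replication factor of 3.

```shell
#!/bin/sh
# Illustrative model only; these helpers are hypothetical, not part of Hadoop.

# block_count SIZE_MB BLOCK_MB: number of blocks needed (ceiling division)
block_count() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# raw_storage SIZE_MB REPLICATION: total MB stored across the cluster
raw_storage() {
    echo $(( $1 * $2 ))
}

# A 300 MB file with 128 MB blocks and 3 replicas:
block_count 300 128    # prints 3 (two full blocks plus one partial block)
raw_storage 300 3      # prints 900
```

With dfs.replication set to 1, as in the pseudo-distributed setup later in this article, the same file would occupy only 300 MB of raw storage.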
***********************
(3) The basic idea of MapReduce:
***********************
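1. A processing job is split into two phases: the map phase and the reduce phase.
2. The problems common to all distributed computation are packaged into the framework (distributing jar files, launching tasks, task fault tolerance, scheduling, grouping and passing on intermediate results...).
MapReduce (offline batch computation) is only one implementation of a distributed computing framework; similar frameworks include Storm (stream computation) and Spark (in-memory iterative computation).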
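The two-phase idea can be imitated on a single machine with an ordinary shell pipeline. This is a rough analogy under stated assumptions, not how Hadoop runs jobs: map_phase, shuffle_phase and reduce_phase are hypothetical names, and sort stands in for the framework's grouping of intermediate results.

```shell
#!/bin/sh
# Single-machine analogy of a word count job; function names are hypothetical.

# map: emit "word<TAB>1" for every word read from stdin
map_phase() {
    awk '{ for (i = 1; i <= NF; i++) print $i "\t1" }'
}

# shuffle: sorting brings identical keys next to each other
shuffle_phase() {
    LC_ALL=C sort
}

# reduce: sum the counts of each key (input must be sorted by key)
reduce_phase() {
    awk -F '\t' '
        NR > 1 && $1 != key { print key "\t" sum; sum = 0 }
        { key = $1; sum += $2 }
        END { if (NR > 0) print key "\t" sum }
    '
}

printf 'hello hadoop\nhello hdfs\n' | map_phase | shuffle_phase | reduce_phase
# prints:
# hadoop	1
# hdfs	1
# hello	2
```

In a real job the framework takes over everything except the map and reduce logic: it ships the jar, starts the tasks on many nodes, and retries the ones that fail.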
********************
(4) Setting up a pseudo-distributed cluster:
********************
1. Configure the network parameters:
-------------------------------------------------------------------------------------------------------
vim /etc/sysconfig/network #edit the network configuration
NETWORKING=yes
HOSTNAME=node0
:wq
vim /etc/sysconfig/network-scripts/ifcfg-eth0 #edit the network interface configuration
DEVICE=eth0
TYPE=Ethernet
ONBOOT=yes
BOOTPROTO=none
IPADDR=192.168.10.3
PREFIX=24
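GATEWAY=192.168.1.1 #a gateway normally lies inside the interface's own subnet (192.168.10.0/24 here); adjust this value to your network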
:wq
/etc/init.d/network restart #restart the network service
Shutting down interface eth0: [ OK ]
Shutting down loopback interface: [ OK ]
Bringing up loopback interface: [ OK ]
Bringing up interface eth0: Determining if ip address 192.168.10.3 is already in use for device eth0...
[ OK ]
vim /etc/hosts #edit the local hostname resolution file
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.10.3 node0
:wq
/etc/init.d/iptables stop #stop the firewall
chkconfig iptables off #disable the firewall at boot
chkconfig iptables --list #check the firewall's startup status
iptables 0:off 1:off 2:off 3:off 4:off 5:off 6:off
vim /etc/selinux/config #edit the SELinux settings
SELINUX=disabled #disable SELinux
:wq
reboot #reboot the server
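The static address settings in step 1 can be sanity-checked before the network is restarted. The sketch below is not part of the original walkthrough: the helper names ip_to_int and same_subnet are hypothetical, and 192.168.10.1 is only an assumed example of a gateway inside the subnet. It tests whether two IPv4 addresses share a network for a given prefix length.

```shell
#!/bin/sh
# Hypothetical helpers for checking an interface address against a gateway.

# ip_to_int A.B.C.D: print the address as a single integer
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the dotted quad into four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# same_subnet ADDR1 ADDR2 PREFIX: succeed when both addresses share the network
same_subnet() {
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $(( a & mask )) -eq $(( b & mask )) ]
}

for gw in 192.168.10.1 192.168.1.1; do
    if same_subnet 192.168.10.3 "$gw" 24; then
        echo "$gw: same subnet"
    else
        echo "$gw: different subnet"
    fi
done
# prints:
# 192.168.10.1: same subnet
# 192.168.1.1: different subnet
```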
2. Deploy the JDK
-------------------------------------------------------------------------------------------------------------
mkdir /app/ #create the application directory
tar -zxvf ./jdk-8u131-linux-x64.tar.gz -C /app/ #extract the archive
ln -s /app/jdk1.8.0_131/ /app/jdk #create a symbolic link
vim /etc/profile #edit the environment variables
export JAVA_HOME=/app/jdk
export PATH=$PATH:$JAVA_HOME/bin
:wq
source /etc/profile #reload the environment variable file
java #test the java command
Usage: java [-options] class [args...]
(to execute a class)
or java [-options] -jar jarfile [args...]
(to execute a jar file)
where options include:
-d32 use a 32-bit data model if available
-d64 use a 64-bit data model if available
-server to select the "server" VM
The default VM is server.
-cp <class search path of directories and zip/jar files>
-classpath <class search path of directories and zip/jar files>
A : separated list of directories, JAR archives,
and ZIP archives to search for class files.
-D<name>=<value>
set a system property
-verbose:[class|gc|jni]
enable verbose output
-version print product version and exit
-version:<value>
Warning: this feature is deprecated and will be removed
in a future release.
require the specified version to run
-showversion print product version and continue
-jre-restrict-search | -no-jre-restrict-search
Warning: this feature is deprecated and will be removed
in a future release.
include/exclude user private JREs in the version search
-? -help print this help message
-X print help on non-standard options
-ea[:<packagename>...|:<classname>]
-enableassertions[:<packagename>...|:<classname>]
enable assertions with specified granularity
-da[:<packagename>...|:<classname>]
-disableassertions[:<packagename>...|:<classname>]
disable assertions with specified granularity
-esa | -enablesystemassertions
enable system assertions
-dsa | -disablesystemassertions
disable system assertions
-agentlib:<libname>[=<options>]
load native agent library <libname>, e.g. -agentlib:hprof
see also, -agentlib:jdwp=help and -agentlib:hprof=help
-agentpath:<pathname>[=<options>]
load native agent library by full pathname
-javaagent:<jarpath>[=<options>]
load Java programming language agent, see java.lang.instrument
-splash:<imagepath>
show splash screen with specified image
See http://www.oracle.com/technetwork/java/javase/documentation/index.html for more details.
javac #test the javac command
Usage: javac <options> <source files>
where possible options include:
-g Generate all debugging info
-g:none Generate no debugging info
-g:{lines,vars,source} Generate only some debugging info
-nowarn Generate no warnings
-verbose Output messages about what the compiler is doing
-deprecation Output source locations where deprecated APIs are used
-classpath <path> Specify where to find user class files and annotation processors
-cp <path> Specify where to find user class files and annotation processors
-sourcepath <path> Specify where to find input source files
-bootclasspath <path> Override location of bootstrap class files
-extdirs <dirs> Override location of installed extensions
-endorseddirs <dirs> Override location of endorsed standards path
-proc:{none,only} Control whether annotation processing and/or compilation is done.
-processor <class1>[,<class2>,<class3>...] Names of the annotation processors to run; bypasses default discovery process
-processorpath <path> Specify where to find annotation processors
-parameters Generate metadata for reflection on method parameters
-d <directory> Specify where to place generated class files
-s <directory> Specify where to place generated source files
-h <directory> Specify where to place generated native header files
-implicit:{none,class} Specify whether or not to generate class files for implicitly referenced files
-encoding <encoding> Specify character encoding used by source files
-source <release> Provide source compatibility with specified release
-target <release> Generate class files for specific VM version
-profile <profile> Check that API used is available in the specified profile
-version Version information
-help Print a synopsis of standard options
-Akey[=value] Options to pass to annotation processors
-X Print a synopsis of nonstandard options
-J<flag> Pass <flag> directly to the runtime system
-Werror Terminate compilation if warnings occur
@<filename> Read options and filenames from file
java -version #check the Java version
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)
3. Deploy Hadoop
----------------------------------------------------------------------------------------------------------
tar -zxvf ./hadoop-2.4.1.tar.gz -C /app/ #extract the Hadoop archive
ln -s /app/hadoop-2.4.1/ /app/hadoop #create a symbolic link
##########################################################################################################
vim /app/hadoop/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/app/jdk
:wq
##########################################################################################################
vim /app/hadoop/etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name> <!-- file system schema (URI) used by Hadoop: the address of the HDFS NameNode -->
<value>hdfs://node0:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name> <!-- directory where Hadoop stores the files it generates at runtime -->
<value>/hadoop/tmpdata</value>
</property>
</configuration>
:wq
##########################################################################################################
vim /app/hadoop/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name> <!-- number of replicas to store; the default is 3 -->
<value>1</value>
</property>
</configuration>
:wq
##########################################################################################################
cp /app/hadoop/etc/hadoop/mapred-site.xml.template /app/hadoop/etc/hadoop/mapred-site.xml
vim /app/hadoop/etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name> <!-- run MapReduce on YARN -->
<value>yarn</value>
</property>
</configuration>
:wq
##########################################################################################################
vim /app/hadoop/etc/hadoop/yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name> <!-- address of the YARN ResourceManager -->
<value>node0</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name> <!-- how reducers fetch map output -->
<value>mapreduce_shuffle</value>
</property>
</configuration>
:wq
##########################################################################################################
vim /etc/profile
export JAVA_HOME=/app/jdk
export HADOOP_HOME=/app/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
source /etc/profile
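After editing several XML files by hand, it helps to read a setting back and confirm it is what was intended. The helper below is a hypothetical sketch, not a Hadoop tool: get_property is an invented name, and it assumes <name> and <value> sit on separate lines, as in the files above. On a working installation, hdfs getconf -confKey fs.defaultFS reports the value Hadoop actually resolves.

```shell
#!/bin/sh
# get_property FILE NAME: print the <value> that follows <name>NAME</name>.
# Hypothetical helper; a line-based scan, not a real XML parser.
get_property() {
    awk -v name="$2" '
        index($0, "<name>" name "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$1"
}

# Demonstration on a throwaway copy of the core-site.xml settings.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://node0:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/hadoop/tmpdata</value>
</property>
</configuration>
EOF
get_property "$sample" fs.defaultFS      # prints hdfs://node0:9000
get_property "$sample" hadoop.tmp.dir    # prints /hadoop/tmpdata
rm -f "$sample"
```

Pointing the function at /app/hadoop/etc/hadoop/core-site.xml would check the real file.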
4. Format the NameNode
-------------------------------------------------------------------------------------------------------------
hdfs namenode -format
17/08/13 05:52:00 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = node0/192.168.10.3
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.4.1
STARTUP_MSG: classpath = /app/hadoop-2.4.1/etc/hadoop:/app/hadoop-2.4.1/share/hadoop/common/lib/log4j-1.2.17.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/commons-logging-1.1.3.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/jersey-json-1.9.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/jackson-core-asl-1.8.8.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/hadoop-auth-2.4.1.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/jsp-api-2.1.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/stax-api-1.0-2.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/xz-1.0.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/jetty-util-6.1.26.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/guava-11.0.2.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/commons-codec-1.4.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/jersey-server-1.9.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/asm-3.2.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/jasper-runtime-5.5.23.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/paranamer-2.3.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/snappy-java-1.0.4.1.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/servlet-api-2.5.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/commons-digester-1.8.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/jackson-mapper-asl-1.8.8.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/slf4j-api-1.7.5.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/commons-beanutils-core-1.8.0.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/commons-el-1.0.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/activation-1.1.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/jsr305-1.3.9.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/jettison-1.1.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/jasper-compiler-5.5.23.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/commons-beanutils-1.7.0.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/netty-3.6.2.Final.jar:/app/hadoop-2.4.1/share/hadoop/common/lib
/jaxb-impl-2.2.3-1.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/jsch-0.1.42.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/jackson-jaxrs-1.8.8.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/commons-io-2.4.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/junit-4.8.2.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/commons-httpclient-3.1.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/xmlenc-0.52.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/commons-net-3.1.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/jetty-6.1.26.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/commons-configuration-1.6.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/jersey-core-1.9.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/commons-math3-3.1.1.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/httpclient-4.2.5.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/jaxb-api-2.2.2.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/avro-1.7.4.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/jets3t-0.9.0.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/zookeeper-3.4.5.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/httpcore-4.2.5.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/hadoop-annotations-2.4.1.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/commons-cli-1.2.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/jackson-xc-1.8.8.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/commons-collections-3.2.1.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/commons-compress-1.4.1.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/mockito-all-1.8.5.jar:/app/hadoop-2.4.1/share/hadoop/common/lib/commons-lang-2.6.jar:/app/hadoop-2.4.1/share/hadoop/common/hadoop-common-2.4.1.jar:/app/hadoop-2.4.1/share/hadoop/common/hadoop-common-2.4.1-tests.jar:/app/hadoop-2.4.1/share/hadoop/common/hadoop-nfs-2.4.1.jar:/app/hadoop-2.4.1/share/hadoop/hdfs:/app/hadoop-2.4.1/share/hadoop/hdfs/lib/log4j-1.2.17.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/lib/commons-logging-1.1.3.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/lib/jacks
on-core-asl-1.8.8.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/lib/jsp-api-2.1.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/lib/jetty-util-6.1.26.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/lib/guava-11.0.2.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/lib/commons-codec-1.4.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/lib/jersey-server-1.9.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/lib/asm-3.2.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/lib/protobuf-java-2.5.0.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/lib/jasper-runtime-5.5.23.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/lib/servlet-api-2.5.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/lib/jackson-mapper-asl-1.8.8.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/lib/commons-el-1.0.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/lib/jsr305-1.3.9.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/lib/netty-3.6.2.Final.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/lib/commons-io-2.4.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/lib/xmlenc-0.52.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/lib/jetty-6.1.26.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/lib/jersey-core-1.9.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/lib/commons-cli-1.2.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/lib/commons-lang-2.6.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/hadoop-hdfs-2.4.1.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/hadoop-hdfs-2.4.1-tests.jar:/app/hadoop-2.4.1/share/hadoop/hdfs/hadoop-hdfs-nfs-2.4.1.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/log4j-1.2.17.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/commons-logging-1.1.3.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/jersey-json-1.9.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/jackson-core-asl-1.8.8.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/stax-api-1.0-2.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/xz-1.0.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/jetty-util-6.1.26.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/guava-11.0.2.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/commons-codec-1.4.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/guice-
3.0.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/jersey-server-1.9.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/asm-3.2.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/guice-servlet-3.0.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/protobuf-java-2.5.0.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/leveldbjni-all-1.8.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/jersey-client-1.9.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/jersey-guice-1.9.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/servlet-api-2.5.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/jackson-mapper-asl-1.8.8.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/jline-0.9.94.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/activation-1.1.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/jsr305-1.3.9.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/jettison-1.1.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/jaxb-impl-2.2.3-1.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/jackson-jaxrs-1.8.8.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/commons-io-2.4.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/commons-httpclient-3.1.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/jetty-6.1.26.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/jersey-core-1.9.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/jaxb-api-2.2.2.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/aopalliance-1.0.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/zookeeper-3.4.5.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/commons-cli-1.2.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/jackson-xc-1.8.8.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/javax.inject-1.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/commons-collections-3.2.1.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/commons-compress-1.4.1.jar:/app/hadoop-2.4.1/share/hadoop/yarn/lib/commons-lang-2.6.jar:/app/hadoop-2.4.1/share/hadoop/yarn/hadoop-yarn-server-tests-2.4.1.jar:/app/hadoop-2.4.1/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.4.1.jar:/app/hadoop-2.4.1/share/hadoop/yarn/hadoop-yarn-api-2.4.1.jar:/app/hadoop-2.4.1/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.
4.1.jar:/app/hadoop-2.4.1/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.4.1.jar:/app/hadoop-2.4.1/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.4.1.jar:/app/hadoop-2.4.1/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-2.4.1.jar:/app/hadoop-2.4.1/share/hadoop/yarn/hadoop-yarn-common-2.4.1.jar:/app/hadoop-2.4.1/share/hadoop/yarn/hadoop-yarn-client-2.4.1.jar:/app/hadoop-2.4.1/share/hadoop/yarn/hadoop-yarn-server-common-2.4.1.jar:/app/hadoop-2.4.1/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.4.1.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/lib/log4j-1.2.17.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/lib/jackson-core-asl-1.8.8.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/lib/xz-1.0.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/lib/guice-3.0.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/lib/jersey-server-1.9.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/lib/asm-3.2.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/lib/guice-servlet-3.0.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/lib/protobuf-java-2.5.0.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/lib/junit-4.10.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/lib/paranamer-2.3.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/lib/jersey-guice-1.9.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/lib/snappy-java-1.0.4.1.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/lib/jackson-mapper-asl-1.8.8.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/lib/hamcrest-core-1.1.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/lib/netty-3.6.2.Final.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/lib/commons-io-2.4.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/lib/jersey-core-1.9.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/lib/avro-1.7.4.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/lib/aopalliance-1.0.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/lib/hadoop-annotations-2.4.1.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/lib/javax.inject-1.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/lib/commons-compress-1.4.1.jar:/a
pp/hadoop-2.4.1/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.4.1-tests.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.4.1.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.4.1.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.4.1.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-2.4.1.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.4.1.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.4.1.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.4.1.jar:/app/hadoop-2.4.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar:/app/hadoop/contrib/capacity-scheduler/*.jar
STARTUP_MSG: build = http://svn.apache.org/repos/asf/hadoop/common -r 1604318; compiled by 'jenkins' on 2014-06-21T05:43Z
STARTUP_MSG: java = 1.8.0_131
************************************************************/
17/08/13 05:52:00 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
17/08/13 05:52:00 INFO namenode.NameNode: createNameNode [-format]
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /app/hadoop-2.4.1/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
17/08/13 05:52:01 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Formatting using clusterid: CID-0f84f197-b0d5-4cd1-a4e4-14a5acfa009e
17/08/13 05:52:01 INFO namenode.FSNamesystem: fsLock is fair:true
17/08/13 05:52:01 INFO namenode.HostFileManager: read includes:
HostSet(
)
17/08/13 05:52:01 INFO namenode.HostFileManager: read excludes:
HostSet(
)
17/08/13 05:52:01 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
17/08/13 05:52:01 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
17/08/13 05:52:01 INFO util.GSet: Computing capacity for map BlocksMap
17/08/13 05:52:01 INFO util.GSet: VM type = 64-bit
17/08/13 05:52:01 INFO util.GSet: 2.0% max memory 966.7 MB = 19.3 MB
17/08/13 05:52:01 INFO util.GSet: capacity = 2^21 = 2097152 entries
17/08/13 05:52:01 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
17/08/13 05:52:01 INFO blockmanagement.BlockManager: defaultReplication = 1
17/08/13 05:52:01 INFO blockmanagement.BlockManager: maxReplication = 512
17/08/13 05:52:01 INFO blockmanagement.BlockManager: minReplication = 1
17/08/13 05:52:01 INFO blockmanagement.BlockManager: maxReplicationStreams = 2
17/08/13 05:52:01 INFO blockmanagement.BlockManager: shouldCheckForEnoughRacks = false
17/08/13 05:52:01 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
17/08/13 05:52:01 INFO blockmanagement.BlockManager: encryptDataTransfer = false
17/08/13 05:52:01 INFO blockmanagement.BlockManager: maxNumBlocksToLog = 1000
17/08/13 05:52:01 INFO namenode.FSNamesystem: fsOwner = root (auth:SIMPLE)
17/08/13 05:52:01 INFO namenode.FSNamesystem: supergroup = supergroup
17/08/13 05:52:01 INFO namenode.FSNamesystem: isPermissionEnabled = true
17/08/13 05:52:01 INFO namenode.FSNamesystem: HA Enabled: false
17/08/13 05:52:01 INFO namenode.FSNamesystem: Append Enabled: true
17/08/13 05:52:02 INFO util.GSet: Computing capacity for map INodeMap
17/08/13 05:52:02 INFO util.GSet: VM type = 64-bit
17/08/13 05:52:02 INFO util.GSet: 1.0% max memory 966.7 MB = 9.7 MB
17/08/13 05:52:02 INFO util.GSet: capacity = 2^20 = 1048576 entries
17/08/13 05:52:02 INFO namenode.NameNode: Caching file names occuring more than 10 times
17/08/13 05:52:02 INFO util.GSet: Computing capacity for map cachedBlocks
17/08/13 05:52:02 INFO util.GSet: VM type = 64-bit
17/08/13 05:52:02 INFO util.GSet: 0.25% max memory 966.7 MB = 2.4 MB
17/08/13 05:52:02 INFO util.GSet: capacity = 2^18 = 262144 entries
17/08/13 05:52:02 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
17/08/13 05:52:02 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
17/08/13 05:52:02 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension = 30000
17/08/13 05:52:02 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
17/08/13 05:52:02 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
17/08/13 05:52:02 INFO util.GSet: Computing capacity for map NameNodeRetryCache
17/08/13 05:52:02 INFO util.GSet: VM type = 64-bit
17/08/13 05:52:02 INFO util.GSet: 0.029999999329447746% max memory 966.7 MB = 297.0 KB
17/08/13 05:52:02 INFO util.GSet: capacity = 2^15 = 32768 entries
17/08/13 05:52:02 INFO namenode.AclConfigFlag: ACLs enabled? false
17/08/13 05:52:02 INFO namenode.FSImage: Allocated new BlockPoolId: BP-833512525-192.168.10.3-1502574722280
17/08/13 05:52:02 INFO common.Storage: Storage directory /hadoop/tmpdata/dfs/name has been successfully formatted. #this line shows that formatting succeeded
17/08/13 05:52:02 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
17/08/13 05:52:02 INFO util.ExitUtil: Exiting with status 0
17/08/13 05:52:02 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at node0/192.168.10.3
************************************************************/
5. Start Hadoop
start-dfs.sh #start HDFS; start-dfs.sh and start-yarn.sh can be run in either order
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /app/hadoop-2.4.1/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
17/08/13 06:12:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [node0]
root@node0's password: #enter the password
node0: starting namenode, logging to /app/hadoop-2.4.1/logs/hadoop-root-namenode-node0.out
root@localhost's password: #enter the password
localhost: starting datanode, logging to /app/hadoop-2.4.1/logs/hadoop-root-datanode-node0.out
Starting secondary namenodes [0.0.0.0]
root@0.0.0.0's password: #enter the password
0.0.0.0: starting secondarynamenode, logging to /app/hadoop-2.4.1/logs/hadoop-root-secondarynamenode-node0.out
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /app/hadoop-2.4.1/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
17/08/13 06:13:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
start-yarn.sh #start YARN
starting yarn daemons
resourcemanager running as process 31652. Stop it first.
root@localhost's password: #enter the password
localhost: nodemanager running as process 31937. Stop it first.
jps #verify with the jps command
32864 SecondaryNameNode
31937 NodeManager
32707 DataNode
31652 ResourceManager
32584 NameNode
33064 Jps
http://192.168.10.3:50070 (HDFS management UI)
http://192.168.10.3:8088 (MapReduce/YARN management UI)
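The jps check above can be scripted so that a missing daemon is reported by name. This is a minimal sketch under stated assumptions: check_daemons is a hypothetical helper, and the list of five daemons is the set shown in the jps output of this pseudo-distributed setup.

```shell
#!/bin/sh
# check_daemons: read `jps` output on stdin, report each expected daemon that
# is absent, and return non-zero if any is missing. Hypothetical helper.
check_daemons() {
    jps_out=$(cat)
    missing=0
    for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # Compare the whole second column so that NameNode does not match
        # SecondaryNameNode.
        if ! printf '%s\n' "$jps_out" |
            awk -v d="$d" '$2 == d { ok = 1 } END { exit !ok }'; then
            echo "missing: $d"
            missing=1
        fi
    done
    return $missing
}

# Demonstration with the output captured above; on a live node use
# `jps | check_daemons` instead.
sample='32864 SecondaryNameNode
31937 NodeManager
32707 DataNode
31652 ResourceManager
32584 NameNode
33064 Jps'
printf '%s\n' "$sample" | check_daemons && echo "all five daemons are running"
```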
*******************
(5) Passwordless remote login over SSH:
*******************
Generate a key pair on the client:
ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
03:c8:7a:54:c7:a4:fc:74:cc:15:23:5b:ba:51:7b:b2 [email protected]
The key's randomart image is:
+--[ RSA 2048]----+
| .oo . *. |
| . + o.o B o |
| + + . B o . |
| o + . o + |
| . . S . E |
| . . |
| |
| |
| |
+-----------------+
cd .ssh/
ll
total 12
-rw------- 1 root root 1675 Aug 13 07:11 id_rsa
-rw-r--r-- 1 root root 392 Aug 13 07:11 id_rsa.pub
-rw-r--r-- 1 root root 1180 Aug 13 06:11 known_hosts
Copy the client's public key to the server:
ssh-copy-id 192.168.10.3
The authenticity of host '192.168.10.3 (192.168.10.3)' can't be established.
RSA key fingerprint is b9:21:f9:a4:33:de:3e:79:6e:69:45:01:e6:5d:47:54.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.10.3' (RSA) to the list of known hosts.
root@192.168.10.3's password:
Now try logging into the machine, with "ssh '192.168.10.3'", and check in:
.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
Test from the client (for example with ssh 192.168.10.3, as the message above suggests); the login should succeed without a password prompt:
Last login: Sun Aug 13 04:44:30 2017 from 192.168.10.2
In the same way, configure passwordless login from the server to itself:
ssh-keygen -t rsa #use the RSA algorithm
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
44:55:bb:1d:c4:b9:d8:e0:e5:6b:c2:58:19:f5:c0:57 root@node0
The key's randomart image is:
+--[ RSA 2048]----+
| .....++.E|
| . oo=o.|
| . ..O.o.|
| . =o+. |
| S +. .. |
| . o o |
| o |
| |
| |
+-----------------+
[root@node0 ~]# ssh-copy-id 192.168.10.3
The authenticity of host '192.168.10.3 (192.168.10.3)' can't be established.
RSA key fingerprint is b9:21:f9:a4:33:de:3e:79:6e:69:45:01:e6:5d:47:54.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.10.3' (RSA) to the list of known hosts.
root@192.168.10.3's password:
Now try logging into the machine, with "ssh '192.168.10.3'", and check in:
.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
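Skipping the interactive yes prompt that adds hosts to known_hosts:
SSH can authenticate without a password, but when many servers connect to one another you still have to type yes by hand on each first connection to add every host's identity to the known_hosts file. To skip the prompt:
vim .ssh/config
StrictHostKeyChecking no
:wq
To avoid having to update known_hosts after a server's IP address changes, or to avoid conflicts caused by a stale known_hosts file:
vim .ssh/config
UserKnownHostsFile /dev/null
:wq
/etc/init.d/sshd restart #restart the sshd service (not required for these options: ~/.ssh/config is read by the ssh client on every connection)
Stopping sshd: [ OK ]
Starting sshd: [ OK ]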
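The two client options above turn off host key verification for every host the user connects to. One possible refinement, sketched below, is to limit them to the cluster hosts with a Host block. This is an assumption about how you might want to scope the settings, not something the original article does; write_ssh_config is a hypothetical helper, and node0 and 192.168.10.3 are the host name and address used earlier.

```shell
#!/bin/sh
# write_ssh_config FILE HOST...: append a Host block that applies the two
# options only to the listed hosts. Hypothetical helper.
write_ssh_config() {
    cfg=$1
    shift
    {
        printf 'Host %s\n' "$*"
        printf '    StrictHostKeyChecking no\n'
        printf '    UserKnownHostsFile /dev/null\n'
    } >> "$cfg"
    chmod 600 "$cfg"   # keep the file private to its owner
}

# Demonstration against a temporary file rather than the real ~/.ssh/config.
demo_cfg=$(mktemp)
write_ssh_config "$demo_cfg" node0 192.168.10.3
cat "$demo_cfg"
rm -f "$demo_cfg"
# prints:
# Host node0 192.168.10.3
#     StrictHostKeyChecking no
#     UserKnownHostsFile /dev/null
```

To apply it for real, pass ~/.ssh/config as the first argument. Connections to any host outside the list keep the normal known_hosts checks.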
This article comes from the blog "帅帅的小哥哥"; please keep this source link: http://xvjunjie.blog.51cto.com/12360960/1955934