Installing a Hadoop Cluster on CentOS 7
Posted shiwaitaoyuan
Prepare three virtual machines with the IP addresses 192.168.220.10 (master), 192.168.220.11 (slave1), and 192.168.220.12 (slave2).
Download jdk-6u45-linux-x64.bin and hadoop-1.2.1-bin.tar.gz and place them in /usr/local/src/.
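Install the JDK (on every virtual machine)
1. Go to /usr/local/src/ and run ./jdk-6u45-linux-x64.bin (run chmod +x jdk-6u45-linux-x64.bin first if the file is not executable).
2. Edit ~/.bashrc and append these three lines to the end of the file:
export JAVA_HOME=/usr/local/src/jdk1.6.0_45
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin
3. Run source ~/.bashrc so the environment variables take effect.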
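The edit to ~/.bashrc can also be scripted so that running it again does not append duplicate lines. The following is a minimal sketch, not part of the original procedure; the function name add_java_env is hypothetical, and the JDK path is the one used above.

```shell
# add_java_env is a hypothetical helper name, not part of the JDK.
# It appends the three export lines to an rc file unless the
# JAVA_HOME export for this path is already there.
add_java_env() (
    rc_file=$1
    java_home=$2
    if grep -qF "export JAVA_HOME=$java_home" "$rc_file" 2>/dev/null; then
        return 0
    fi
    # $java_home is expanded now; the escaped variables are written
    # literally so the shell expands them when the rc file is sourced.
    cat >> "$rc_file" <<EOF
export JAVA_HOME=$java_home
export CLASSPATH=.:\$CLASSPATH:\$JAVA_HOME/lib
export PATH=\$PATH:\$JAVA_HOME/bin
EOF
)
```

For example, add_java_env ~/.bashrc /usr/local/src/jdk1.6.0_45 followed by source ~/.bashrc; java -version should then report the installed JDK.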
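Install Hadoop
Install Hadoop on the 192.168.220.10 machine.
1. Go to /usr/local/src/ and unpack the archive by running tar -zxf hadoop-1.2.1-bin.tar.gz
2. Edit the configuration files (in the conf directory of hadoop-1.2.1)
masters file: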
master
slaves file (one host per line):
slave1
slave2
core-site.xml:
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/src/hadoop-1.2.1/tmp</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.220.10:9000</value>
  </property>
</configuration>
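mapred-site.xml (the documented form of mapred.job.tracker is host:port; the http:// prefix is kept here as in the original setup):
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>http://192.168.220.10:9001</value>
  </property>
</configuration>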
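hdfs-site.xml (this cluster has only two DataNodes, so each block can have at most two replicas; with a value of 3, blocks will be reported as under-replicated):
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>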
hadoop-env.sh: append this line to the end of the file:
export JAVA_HOME=/usr/local/src/jdk1.6.0_45
3. Copy the /usr/local/src/hadoop-1.2.1 directory to the 192.168.220.11 and 192.168.220.12 machines.
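Steps 2 and 3 can be scripted. The sketch below is only an illustration: write_site_xml and push_to_slaves are hypothetical helper names, and the root login is assumed from the prompts shown in this post. By default push_to_slaves only prints the scp commands; set RUN=1 to execute them.

```shell
# Hypothetical helpers; names and layout are illustrative only.

# Write a Hadoop *-site.xml file from name/value pairs.
# Usage: write_site_xml FILE NAME VALUE [NAME VALUE ...]
# Values are written as-is (no XML escaping), which is enough for
# the plain paths, URLs and numbers used in this setup.
write_site_xml() (
    out=$1
    shift
    {
        echo '<configuration>'
        while [ "$#" -ge 2 ]; do
            echo '  <property>'
            echo "    <name>$1</name>"
            echo "    <value>$2</value>"
            echo '  </property>'
            shift 2
        done
        echo '</configuration>'
    } > "$out"
)

# Print (or, with RUN=1, execute) one scp command per slave that
# copies the Hadoop directory into the same parent directory there.
# Usage: push_to_slaves SRC_DIR HOST [HOST ...]
push_to_slaves() (
    src=$1
    shift
    for host in "$@"; do
        if [ "${RUN:-0}" = "1" ]; then
            scp -r "$src" "root@$host:$(dirname "$src")/"
        else
            echo "scp -r $src root@$host:$(dirname "$src")/"
        fi
    done
)
```

For example, write_site_xml conf/hdfs-site.xml dfs.replication 3 recreates the hdfs-site.xml shown above, and push_to_slaves /usr/local/src/hadoop-1.2.1 192.168.220.11 192.168.220.12 prints the two copy commands for review before they are run.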
Configure the hostname
Set the hostname of 192.168.220.10 to master:
1. Run hostname master
2. Edit /etc/hostname so that it contains:
master
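In the same way, set the hostname of 192.168.220.11 to slave1 and the hostname of 192.168.220.12 to slave2.
Configure the hosts file
Append the following lines to the end of /etc/hosts on all three machines:
192.168.220.10 master
192.168.220.11 slave1
192.168.220.12 slave2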
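Because the same three entries go onto every machine, the edit can be made repeatable. This is a minimal sketch under the assumption that a name already present in the file should be left alone; add_host_entry is a hypothetical helper name.

```shell
# add_host_entry is a hypothetical helper: append "IP NAME" to a hosts
# file unless some line already lists NAME as a hostname.
# Usage: add_host_entry HOSTS_FILE IP NAME
add_host_entry() (
    hosts_file=$1
    ip=$2
    name=$3
    # Fields 2..NF of a hosts line are hostnames; compare them exactly
    # so that "slave1" does not match "slave10".
    if awk -v n="$name" '{ for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' "$hosts_file"; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
)
```

Run as root on each machine, for example add_host_entry /etc/hosts 192.168.220.10 master, and likewise for slave1 and slave2. Afterwards ping -c 1 slave1 should resolve to 192.168.220.11.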
Configure SSH
1. On 192.168.220.10, run ssh-keygen. This creates a .ssh directory under ~ containing the files id_rsa and id_rsa.pub.
2. In ~/.ssh, copy id_rsa.pub to authorized_keys:
cp id_rsa.pub authorized_keys
3. Run ssh-keygen on 192.168.220.11 and on 192.168.220.12.
4. Append the contents of id_rsa.pub from 192.168.220.11 and 192.168.220.12 to the authorized_keys file on 192.168.220.10, one key per line, so that it looks like this:
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC9mGRhFOdcoHw9GUnKQmqThNKpsyah93Dtq/d8RICGWIHDRJ3GXd0sEcb743ejwbuCMmtlhheXcU0FuyA6Cm0jvMyvDfaPKArtxl6KT7Z93uC0VDCXDRomueux81HAIVjc7ZqlXwVeYs1LITxEeJykKlFOXvK7JexWhWGdMMADwxbFMbaNsZ9EwRxcFLFtNg65FQ+u8CIV9KR3D02kemwLCsP+xiRcgs+wirQPm5JM+2cJoLsVQBz3Hk335IsEhc1Xb9Cralo8Tt8gh/ho8K/1pVjvyW1b0LkP9HGNdwVYD9wkWdEJRkryLXBEXpjk4xu+riF+N4rOzJD root@master
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDn79fdfR/NjzPVD3NPj1vBBfQdVOrv7jeb4UJCOsd7xioPRiz8gOQnOmhu5C+GchbyGA+tg5pXwnNJTOO2wn32U4lOPndW0okN/wqyN4vgq/taJi7JgY/8rneBiGaIIdNIy/pAGlMwb53Qn766adetMhsxYMD2l4uxmbVVjzCRb8QP5EsAYTmmFOODzJsPm70uF3j1Q8zGavYg0wFSYR/yECQns4DBSuBJNxdGY6PskBXqurahwi5yaR3vWV1Ix4wtB6BYuQomEnGdzOSfrBMZ/yc5tXo0xmEfY7wFkize6z9Pm2E3oDoMR18YkwT1Cz6fHikVILA9cldtL root@slave1
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCydYCvASzCZggks4hMqOcYSGLO2eAvocWezNOMwspTfpJ105Jumb/vf5h6cRZeckq56IvhSV6t6mytk4pZoZjjZPSmWvCwLtMRMPShNbA3BYtj5V3WRKV8ZcMrNdD//U7iHHoJm57vI/m+XO42YSYjPw7JDkb8Ij9b6zgI3fyvbSSYeXb451PlyJLHdxIzRMAaZDSbAML9e7EO8VJB9Wf9bXpow4+VeP33it3kgMNUlHQtyqduSwYGxVVtGsUTJkxnuRsbWeeA1/pp8MNFKUgBTMALTVHByglgZqwGcbblJxsG832PIZNRECIFqorm6odftjnT4DR7/0yR root@slave2
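5. Copy the authorized_keys file from 192.168.220.10 to ~/.ssh on the 192.168.220.11 and 192.168.220.12 machines.
Once this is done, the three machines can log in to each other over SSH without a password.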
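The manual copy-and-paste in step 4 is easy to get wrong, since each key must stay on a single line. One possible way to script it is sketched below; merge_pubkeys is a hypothetical helper name, and it assumes the slaves' id_rsa.pub files have already been copied to the master under distinct names.

```shell
# merge_pubkeys is a hypothetical helper: add every key from the given
# public key files to an authorized_keys file, one key per line,
# skipping keys that are already present.
# Usage: merge_pubkeys AUTHORIZED_KEYS PUBFILE [PUBFILE ...]
merge_pubkeys() (
    auth=$1
    shift
    touch "$auth"
    for pub in "$@"; do
        # "|| [ -n "$key" ]" also picks up a last line without a newline.
        while IFS= read -r key || [ -n "$key" ]; do
            [ -n "$key" ] || continue
            grep -qxF -- "$key" "$auth" || printf '%s\n' "$key" >> "$auth"
        done < "$pub"
    done
    # Keep the file private: sshd in its default strict mode refuses
    # key files that group or others can write to.
    chmod 600 "$auth"
)
```

A quick check after distributing the file is to run ssh slave1 hostname on the master; it should print slave1 without asking for a password.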
That completes the Hadoop cluster configuration. Now let's try it out.
Format the namenode on 192.168.220.10
Go to /usr/local/src/hadoop-1.2.1/bin and run:
./hadoop namenode -format
The output is:
19/08/04 15:15:21 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = master/192.168.220.10
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 1.2.1
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG: java = 1.6.0_45
************************************************************/
19/08/04 15:15:21 INFO util.GSet: Computing capacity for map BlocksMap
19/08/04 15:15:21 INFO util.GSet: VM type = 64-bit
19/08/04 15:15:21 INFO util.GSet: 2.0% max memory = 1013645312
19/08/04 15:15:21 INFO util.GSet: capacity = 2^21 = 2097152 entries
19/08/04 15:15:21 INFO util.GSet: recommended=2097152, actual=2097152
19/08/04 15:15:22 INFO namenode.FSNamesystem: fsOwner=root
19/08/04 15:15:22 INFO namenode.FSNamesystem: supergroup=supergroup
19/08/04 15:15:22 INFO namenode.FSNamesystem: isPermissionEnabled=true
19/08/04 15:15:22 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
19/08/04 15:15:22 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
19/08/04 15:15:22 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
19/08/04 15:15:22 INFO namenode.NameNode: Caching file names occuring more than 10 times
19/08/04 15:15:23 INFO common.Storage: Image file /usr/local/src/hadoop-1.2.1/tmp/dfs/name/current/fsimage of size 110 bytes saved in 0 seconds.
19/08/04 15:15:23 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/usr/local/src/hadoop-1.2.1/tmp/dfs/name/current/edits
19/08/04 15:15:23 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/usr/local/src/hadoop-1.2.1/tmp/dfs/name/current/edits
19/08/04 15:15:23 INFO common.Storage: Storage directory /usr/local/src/hadoop-1.2.1/tmp/dfs/name has been successfully formatted.
19/08/04 15:15:23 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/192.168.220.10
************************************************************/
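Formatting does not start any daemons. Start them first (in Hadoop 1.2.1, by running ./start-all.sh from the same bin directory), then check the processes with jps:
[root@master bin]# jps
19905 JobTracker
19650 NameNode
19821 SecondaryNameNode
20202 Jps
Processes on 192.168.220.11:
9289 DataNode
9493 Jps
9391 TaskTracker
Processes on 192.168.220.12:
6823 DataNode
6923 TaskTracker
7057 Jps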
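Reading the jps listings by eye works for three machines, but the check can be automated. The sketch below assumes the role layout shown above (NameNode, SecondaryNameNode and JobTracker on the master; DataNode and TaskTracker on each slave); missing_daemons is a hypothetical helper name.

```shell
# missing_daemons is a hypothetical helper: read jps output on stdin
# and print each expected daemon name that does not appear in it.
# Usage: jps | missing_daemons NAME [NAME ...]
missing_daemons() (
    jps_output=$(cat)
    for daemon in "$@"; do
        # jps prints "<pid> <ClassName>"; compare the second field
        # exactly so NameNode does not match SecondaryNameNode.
        printf '%s\n' "$jps_output" \
            | awk -v d="$daemon" '$2 == d { ok = 1 } END { exit !ok }' \
            || printf '%s\n' "$daemon"
    done
)
```

On the master, jps | missing_daemons NameNode SecondaryNameNode JobTracker prints nothing when all three are running. On a slave, use DataNode TaskTracker as the expected names.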
A quick test:
Run ./hadoop fs -ls /
drwxr-xr-x - root supergroup 0 2019-08-04 15:15 /usr
Upload a file by running ./hadoop fs -put /root/w.txt /
View the file by running ./hadoop fs -cat /w.txt, which prints:
ddd
It works.
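The upload-and-read-back test can be wrapped in a small function that compares the two copies instead of relying on a visual check. This is a sketch only: hdfs_roundtrip is a hypothetical helper name, and the HADOOP variable is an assumption that defaults to the launcher path used in this post.

```shell
# hdfs_roundtrip is a hypothetical helper: upload a local file with
# "fs -put", read it back with "fs -cat", and succeed only if the
# contents are identical.
# HADOOP (an assumed variable) is the path to the hadoop launcher.
# Usage: hdfs_roundtrip LOCAL_FILE REMOTE_PATH
hdfs_roundtrip() (
    hadoop=${HADOOP:-/usr/local/src/hadoop-1.2.1/bin/hadoop}
    local_file=$1
    remote_path=$2
    "$hadoop" fs -put "$local_file" "$remote_path" || return 1
    [ "$("$hadoop" fs -cat "$remote_path")" = "$(cat "$local_file")" ]
)
```

For example, hdfs_roundtrip /root/w.txt /w.txt && echo OK. The upload step is expected to fail if the target already exists in HDFS, so use a fresh remote path on each run.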