Hadoop Cluster: Pitfalls of Cluster Setup, the SSH Chapter
Posted by zhang_xinxiu
When starting the Hadoop cluster, the slave node could never connect to the master, and every attempt to use the cluster failed with the same error, meaning the cluster had never actually been set up successfully:

org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop-master/192.168.1.130:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=100, sleepTime=10000 MILLISECONDS)

After a lot of debugging and reading, the root cause turned out to be the hosts configuration on the master node (a minimal working /etc/hosts is sketched after the environment listing below). This post summarizes the pitfalls hit over several days of tracking the problem down, in the hope that it saves later readers some time.
The cluster environment:
Master node:
OS: CentOS Linux release 7.3.1611 (Core)
Hostname: hadoop-master
IP: 192.168.1.130
Hadoop: 2.8.4
Java: 1.7.0
SSH: OpenSSH_7.4p1, OpenSSL 1.0.2k-fips
Slave node:
OS: Ubuntu 15.04
Hostname: hadoop-slave1
IP: 192.168.1.128
Hadoop: 2.8.4
Java: 1.7.0
SSH: OpenSSH_6.7p1 Ubuntu-5ubuntu1, OpenSSL 1.0.1f
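Given the addresses above, a minimal /etc/hosts sketch would look like the following, with the same two cluster entries on every node (the exact layout on your machines may differ). In particular, the master's own hostname must not be mapped to 127.0.0.1, otherwise the NameNode binds to the loopback interface and slaves can never reach hadoop-master:9000.
bash-4.2$ cat /etc/hosts
127.0.0.1       localhost localhost.localdomain
192.168.1.130   hadoop-master
192.168.1.128   hadoop-slave1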
1. Master and slave cannot communicate
Communication between master and slaves is a prerequisite for any Hadoop cluster, and the key piece is the SSH configuration: the nodes should authenticate each other with public keys (publickey authentication) so that no password has to be typed. Without passwordless SSH, every start of the cluster prompts for each slave node's password, which quickly becomes unworkable. The following is one way to set up mutual public-key authentication.
1.1 Passwordless public-key login between master and slaves
First, generate an SSH key pair:
bash-4.2$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:vjntK50PYjcoDpyIXY3N+SaSxAGrN3kz766xVpcJJtk hadoop@hadoop-master
The key's randomart image is:
+---[RSA 2048]----+
| . |
| o |
| . .o |
| . oo*E. |
|. + Oo=.So |
| + B *.o+. |
|. o B.+.Bo+. |
| .B =o=+o |
| .oo+ o+oo. |
+----[SHA256]-----+
Then copy the public key to the slave node. The first copy prompts for the slave account's password; after that, logins are passwordless:
bash-4.2$ ssh-copy-id -i id_rsa.pub hadoop@hadoop-slave1
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
hadoop@hadoop-slave1's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'hadoop@hadoop-slave1'"
and check to make sure that only the key(s) you wanted were added.
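If ssh-copy-id is not available on a system, the same result can be achieved by hand; this is a sketch assuming the key pair generated above and the hadoop account on the slave:
bash-4.2$ cat ~/.ssh/id_rsa.pub | ssh hadoop@hadoop-slave1 'mkdir -p ~/.ssh && chmod 700 ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys'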
Next, append the public key to the local authorized_keys as well, so that ssh to the local machine itself is also passwordless. After doing so, be sure to fix the permissions on authorized_keys (the commands below assume the current directory is ~/.ssh):
bash-4.2$ cat id_rsa.pub >> authorized_keys
bash-4.2$ chmod 644 authorized_keys
Finally, verify that the key works: if it does, ssh logs in without prompting for a password, as below.
bash-4.2$ ssh hadoop@hadoop-slave1
Welcome to Ubuntu 15.04 (GNU/Linux 3.19.0-15-generic x86_64)
* Documentation: https://help.ubuntu.com/
Your Ubuntu release is not supported anymore.
For upgrade information, please visit:
http://www.ubuntu.com/releaseendoflife
New release '15.10' available.
Run 'do-release-upgrade' to upgrade to it.
Last login: Mon Dec 3 12:39:40 2018 from localhost
Setup can still run into cases where public-key authentication refuses to work; a few common causes and fixes are collected below.
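A quick first check before editing any configuration: run the client in verbose mode. The debug output shows each authentication step; lines such as "Offering RSA public key: ..." and "Authentications that can continue: ..." reveal whether your key was offered at all and whether the server rejected it.
bash-4.2$ ssh -v hadoop@hadoop-slave1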
1.2 Common public-key authentication problems
1.2.1 Public-key authentication not enabled in ssh_config or sshd_config
On the client side, enable the authentication options in ssh_config:
bash-4.2$ sudo vim /etc/ssh/ssh_config
Host *
# GSSAPIAuthentication no
# GSSAPIAuthentication yes
# If this option is set to yes then remote X11 clients will have full access
# to the original X11 display. As virtually no X11 client supports the untrusted
# mode correctly we set this to yes.
# ForwardX11Trusted yes
# Send locale-related environment variables
# SendEnv LANG LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES
# SendEnv LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT
# SendEnv LC_IDENTIFICATION LC_ALL LANGUAGE
# SendEnv XMODIFIERS
SendEnv LANG LC_*
HashKnownHosts yes
GSSAPIAuthentication yes
GSSAPIDelegateCredentials no
On this CentOS 7 system, ssh_config did not have GSSAPIAuthentication enabled by default, so it was turned on explicitly; the Ubuntu slave had it enabled out of the box.
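To confirm what the client will actually use for a given host, newer OpenSSH versions (6.8 and later, so the OpenSSH 7.4 master above qualifies) can print the effective client configuration. After the edits above, the output should include:
bash-4.2$ ssh -G hadoop-slave1 | grep -E 'pubkeyauthentication|gssapiauthentication'
pubkeyauthentication yes
gssapiauthentication yes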
bash-4.2$ sudo vim /etc/ssh/sshd_config
PubkeyAuthentication yes
AllowUsers hadoop root
# The default is to check both .ssh/authorized_keys and .ssh/authorized_keys2
# but this is overridden so installations will only check .ssh/authorized_keys
AuthorizedKeysFile .ssh/authorized_keys
Make sure PubkeyAuthentication is enabled, and that AllowUsers includes every account that needs to log in (once an AllowUsers line exists, any user not listed is locked out). Finally, check AuthorizedKeysFile: .ssh/authorized_keys is the default path, and it only needs changing if your keys are stored elsewhere.
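On the server side, sshd's extended test mode prints the effective configuration, which is a handy way to verify the settings above without guessing (run as root; on CentOS the binary lives in /usr/sbin):
[root@hadoop-master .ssh]$ /usr/sbin/sshd -T | grep -E 'pubkeyauthentication|authorizedkeysfile'
pubkeyauthentication yes
authorizedkeysfile .ssh/authorized_keys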
After changing the configuration, restart sshd for it to take effect. On CentOS 7, either of the following works:
[root@hadoop-master .ssh]$ systemctl restart sshd
[root@hadoop-master .ssh]$ service sshd restart
On Ubuntu, restart (or just reload) the service:
hadoop@hadoop-slave1:/Library/hadoop/hadoop284/logs$ /etc/init.d/ssh restart
hadoop@hadoop-slave1:/root$ /etc/init.d/ssh reload
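Whichever way you restart it, confirm the daemon actually came back up before retesting (shown here for CentOS; on Ubuntu, service ssh status does the same):
[root@hadoop-master .ssh]$ systemctl is-active sshd
active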
1.2.2 authorized_keys file permissions and SELinux context
If ssh still prompts for a password after all of the above, the problem is usually the permissions or the SELinux security context on the .ssh directory and the authorized_keys file.
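First rule out ordinary permissions: with the default StrictModes yes, sshd silently ignores authorized_keys if the home directory, ~/.ssh, or the file itself is writable by group or others. A typical reset for the hadoop user (644 on authorized_keys, as used earlier, also satisfies sshd; 600 is simply stricter):
bash-4.2$ chmod 700 ~/.ssh
bash-4.2$ chmod 600 ~/.ssh/authorized_keys
bash-4.2$ chown -R hadoop:hadoop ~/.ssh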
[root@hadoop-master .ssh]$ ls -lsZ /home/hadoop/.ssh
total 16
-rw-r--r--. hadoop hadoop unconfined_u:object_r:ssh_home_t:s0 authorized_keys
-rw-------. hadoop hadoop unconfined_u:object_r:ssh_home_t:s0 id_rsa
-rw-r--r--. hadoop hadoop unconfined_u:object_r:ssh_home_t:s0 id_rsa.pub
-rw-r--r--. hadoop hadoop unconfined_u:object_r:ssh_home_t:s0 known_hosts
[root@hadoop-master .ssh]$ ls -lsZ /home/hadoop/.ssh/authorized_keys
-rwx------. hadoop hadoop unconfined_u:object_r:var_t:s0 /home/hadoop/.ssh/authorized_keys
Note the mismatch in the listings above: in one, authorized_keys carries the SELinux type ssh_home_t, in the other var_t. SELinux only allows sshd to read an authorized_keys file labeled ssh_home_t, so the var_t label must be corrected:
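You can confirm that SELinux is what is blocking the login by checking its mode; temporarily switching to permissive mode (for testing only) should make the key start working if the label is the problem:
[root@hadoop-master .ssh]$ getenforce
Enforcing
[root@hadoop-master .ssh]$ setenforce 0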
[root@hadoop-master .ssh]$ semanage fcontext -a -t ssh_home_t /home/hadoop/.ssh/authorized_keys
[root@hadoop-master .ssh]$ restorecon -r -vv /home/hadoop/.ssh
[root@hadoop-master .ssh]$ ls -laZ /home/hadoop/.ssh/authorized_keys
-rwx------. hadoop hadoop unconfined_u:object_r:ssh_home_t:s0 /home/hadoop/.ssh/authorized_keys
Once the context is restored, ssh logs in without a password and the problem is finally solved.