[Ambari Deployment] Recovering from a Failed HDFS HA Enablement
The shameless are invincible: learn to create problems, then solve them.
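While HDFS HA was being enabled, the process was interrupted, either deliberately or by accident, so the Secondary NameNode had not yet been deleted (see the small red box in the figure below). It is like bringing the new partner home before the divorce is final: in the end everything died overnight, and nobody knows what happened that night.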
Asking around revealed the following role layout:
hdc-data1: namenode,datanode,journalnode,
hdc-data2: namenode,datanode,journalnode,secondarynamenode
hdc-data3: namenode,datanode,journalnode
Inspect the roles
Clean up ZKFC
Clean up JOURNALNODE
Clean up the extra NAMENODE
Ambari HDFS HA rollback
curl -u admin:admin -H "X-Requested-By: ambari" -X GET http://zwshen86:8080/api/v1/clusters/bigdata/services/STORM
In this command, zwshen86 is the hostname of the Ambari Server (the port defaults to 8080), bigdata is the cluster name, and STORM is the service name.
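Every call in this post follows the same URL pattern, so it can help to build the URL in one place. The helper below is only a sketch: the function name `ambari_url` and the `AMBARI_PORT` variable are made up for illustration and are not part of Ambari.

```shell
# Hypothetical helper: build an Ambari REST API URL from its parts.
# AMBARI_PORT is an illustrative variable; it falls back to the default port 8080.
ambari_url() {
  local server="$1" cluster="$2" path="$3"
  printf 'http://%s:%s/api/v1/clusters/%s/%s\n' \
    "$server" "${AMBARI_PORT:-8080}" "$cluster" "$path"
}

# Prints the URL used in the STORM example.
ambari_url zwshen86 bigdata services/STORM
```

The result can be passed straight to curl, for example `curl -u admin:admin -H "X-Requested-By: ambari" -X GET "$(ambari_url node01 ocdp services/HDFS)"`.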
View the HDFS service information:
curl -u admin:admin -H "X-Requested-By: ambari" -X GET http://node01:8080/api/v1/clusters/ocdp/services/HDFS
node01,datanode,journalnode,
node02,datanode,journalnode,SECONDARY_NAMENODE
node03,datanode,journalnode
node04,namenode,
node05,namenode,
Rollback steps:
sudo su -l hdfs -c 'hdfs dfsadmin -safemode enter'
sudo su -l hdfs -c 'hdfs dfsadmin -saveNamespace'
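`-saveNamespace` is only accepted while the NameNode is in safe mode, so it may be worth confirming the mode before saving. The sketch below assumes that `hdfs dfsadmin -safemode get` prints a line containing `Safe mode is ON`; the helper name `safemode_is_on` is hypothetical.

```shell
# Hypothetical helper: succeed only if the text on stdin reports safe mode ON.
safemode_is_on() {
  grep -q 'Safe mode is ON'
}

# Intended use on a cluster node (shown as a comment, not executed here):
#   sudo su -l hdfs -c 'hdfs dfsadmin -safemode get' | safemode_is_on \
#     && sudo su -l hdfs -c 'hdfs dfsadmin -saveNamespace'
```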
Check which hosts carry each component:
curl -u admin:admin -i "http://node01:8080/api/v1/clusters/ocdp/host_components?HostRoles/component_name=NAMENODE"
curl -u admin:admin -i "http://node01:8080/api/v1/clusters/ocdp/host_components?HostRoles/component_name=SECONDARY_NAMENODE"
curl -u admin:admin -i "http://node01:8080/api/v1/clusters/ocdp/host_components?HostRoles/component_name=JOURNALNODE"
curl -u admin:admin -i "http://node01:8080/api/v1/clusters/ocdp/host_components?HostRoles/component_name=ZKFC"
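The responses are JSON, and only the host names matter for the later DELETE calls. A rough way to pull them out without extra tools is sketched below. The helper name `extract_hosts` is hypothetical, the sample text is hand-written in the shape of a host_components response rather than captured output, and the sed pattern assumes at most one `host_name` field per line.

```shell
# Hypothetical helper: print every host_name value found in JSON on stdin.
# Assumes at most one "host_name" field per input line.
extract_hosts() {
  sed -n 's/.*"host_name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Illustrative sample shaped like a host_components response (not real output).
sample='{
  "items" : [
    { "HostRoles" : { "component_name" : "NAMENODE", "host_name" : "node04" } },
    { "HostRoles" : { "component_name" : "NAMENODE", "host_name" : "node05" } }
  ]
}'

# Prints node04 and node05, one per line.
printf '%s\n' "$sample" | extract_hosts
```

Against a live server, drop `-i` from the curl call (or use `-s`) so that only the JSON body is piped into the helper.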
Delete ZKFC:
curl -u admin:admin -H "X-Requested-By: ambari" -X DELETE http://node01:8080/api/v1/clusters/ocdp/hosts/node05/host_components/ZKFC
curl -u admin:admin -H "X-Requested-By: ambari" -X DELETE http://node01:8080/api/v1/clusters/ocdp/hosts/node04/host_components/ZKFC
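DELETE calls are destructive, so a dry run that prints each command before anything is sent can be useful. This is a minimal sketch, not the author's procedure: `delete_component`, `DRY_RUN`, `AMBARI`, `CLUSTER` and `AUTH` are illustrative names, with defaults copied from the commands in this post.

```shell
# Illustrative defaults taken from the commands in this post.
AMBARI="${AMBARI:-http://node01:8080}"
CLUSTER="${CLUSTER:-ocdp}"
AUTH="${AUTH:-admin:admin}"

# Hypothetical helper: print the DELETE command (DRY_RUN=1, the default)
# or actually send it (DRY_RUN=0) for one component on one host.
delete_component() {
  local host="$1" component="$2"
  local url="$AMBARI/api/v1/clusters/$CLUSTER/hosts/$host/host_components/$component"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "curl -u $AUTH -H 'X-Requested-By: ambari' -X DELETE $url"
  else
    curl -u "$AUTH" -H "X-Requested-By: ambari" -X DELETE "$url"
  fi
}

# Dry run: prints the two ZKFC deletions without contacting the server.
for host in node04 node05; do
  delete_component "$host" ZKFC
done
```

Once the printed commands look right, rerun with `DRY_RUN=0`. The same helper covers the JOURNALNODE and NAMENODE deletions by changing the host list and component name.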
Enable SECONDARY_NAMENODE (register it on node02, move it to the INSTALLED state, then check its state):
curl -u admin:admin -H "X-Requested-By: ambari" -X POST -d '{"host_components" : [{"HostRoles":{"component_name":"SECONDARY_NAMENODE"}}] }' "http://node01:8080/api/v1/clusters/ocdp/hosts?Hosts/host_name=node02"
curl -u admin:admin -H "X-Requested-By: ambari" -X PUT -d '{"RequestInfo":{"context":"Enable Secondary NameNode"},"Body":{"HostRoles":{"state":"INSTALLED"}}}' http://node01:8080/api/v1/clusters/ocdp/hosts/node02/host_components/SECONDARY_NAMENODE
curl -u admin:admin -H "X-Requested-By: ambari" -X GET "http://node01:8080/api/v1/clusters/ocdp/host_components?HostRoles/component_name=SECONDARY_NAMENODE&fields=HostRoles/state"
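The PUT body is easy to mistype by hand. One option is to generate it, as in the sketch below. The helper name `state_payload` is hypothetical, and it does no JSON escaping, so the context text must not contain double quotes or backslashes.

```shell
# Hypothetical helper: build the JSON body for a host-component state change.
# No JSON escaping is done; keep quotes and backslashes out of the arguments.
state_payload() {
  local context="$1" state="$2"
  printf '{"RequestInfo":{"context":"%s"},"Body":{"HostRoles":{"state":"%s"}}}\n' \
    "$context" "$state"
}

# Prints the same body as the PUT request for SECONDARY_NAMENODE.
state_payload "Enable Secondary NameNode" INSTALLED
```

It can be used inline with curl as `-d "$(state_payload "Enable Secondary NameNode" INSTALLED)"`.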
Delete JOURNALNODE:
curl -u admin:admin -H "X-Requested-By: ambari" -X GET "http://node01:8080/api/v1/clusters/ocdp/host_components?HostRoles/component_name=JOURNALNODE"
curl -u admin:admin -H "X-Requested-By: ambari" -X DELETE http://node01:8080/api/v1/clusters/ocdp/hosts/node01/host_components/JOURNALNODE
curl -u admin:admin -H "X-Requested-By: ambari" -X DELETE http://node01:8080/api/v1/clusters/ocdp/hosts/node02/host_components/JOURNALNODE
curl -u admin:admin -H "X-Requested-By: ambari" -X DELETE http://node01:8080/api/v1/clusters/ocdp/hosts/node03/host_components/JOURNALNODE
Delete the extra NAMENODE:
curl -u admin:admin -H "X-Requested-By: ambari" -X GET "http://node01:8080/api/v1/clusters/ocdp/host_components?HostRoles/component_name=NAMENODE"
curl -u admin:admin -H "X-Requested-By: ambari" -X DELETE http://node01:8080/api/v1/clusters/ocdp/hosts/node05/host_components/NAMENODE
Restart the services.
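The post does not say how the restart was performed; the Ambari web UI is one route. Another possibility is the REST API, where a service is stopped by setting its state to INSTALLED and started by setting it to STARTED. The sketch below only prints the commands. The helper name `service_state_cmd` is hypothetical, and the server, cluster and credentials are copied from the earlier commands.

```shell
# Hypothetical helper: print the curl command that asks Ambari to move a whole
# service to a target state (INSTALLED = stopped, STARTED = running).
service_state_cmd() {
  local service="$1" state="$2"
  local body="{\"RequestInfo\":{\"context\":\"Set $service to $state\"},\"Body\":{\"ServiceInfo\":{\"state\":\"$state\"}}}"
  echo "curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT -d '$body' http://node01:8080/api/v1/clusters/ocdp/services/$service"
}

# A restart is a stop followed by a start; let the stop request finish
# before sending the start.
service_state_cmd HDFS INSTALLED
service_state_cmd HDFS STARTED
```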