HDP cluster troubleshooting notes
Posted by shanhua-fu
2019-04-23 14:16:21,769 WARN namenode.FSImage (EditLogFileInputStream.java:scanEditLog(359)) - Caught exception after scanning through 0 ops from /hadoop/hdfs/journal/hnscluster/current/edits_inprogress_0000000000554042931 while determining its valid length. Position was 815104
java.io.IOException: Can't scan a pre-transactional edit log.
at org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$LegacyReader.scanOp(FSEditLogOp.java:4974)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.scanNextOp(EditLogFileInputStream.java:245)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.scanEditLog(EditLogFileInputStream.java:355)
at org.apache.hadoop.hdfs.server.namenode.FileJournalManager$EditLogFile.scanLog(FileJournalManager.java:551)
Symptom: the JournalNode writes the WARN shown above to its log, and Ambari raises an alert that the JournalNode web UI is not accessible.
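To see which edit log files a JournalNode is complaining about, the path can be pulled out of the WARN line. The following is a minimal sketch, not part of the original notes: the function name extract_bad_edit_logs is hypothetical, and the log file location in the usage comment is an assumption about a typical HDP install.

```shell
#!/usr/bin/env bash
# Reads JournalNode log text on stdin and prints each edit log file named in a
# "Caught exception after scanning ... while determining its valid length" warning.
# The function name is hypothetical; it only matches the message format shown above.
extract_bad_edit_logs() {
  sed -n 's/.* ops from \(.*\) while determining its valid length.*/\1/p' | sort -u
}

# Usage (the log path is an assumption; adjust it to your installation):
#   extract_bad_edit_logs < /var/log/hadoop/hdfs/hadoop-hdfs-journalnode-$(hostname).log
```

Running it on each JournalNode shows which nodes report a damaged edits_inprogress file and therefore need the recovery steps.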
Solution:
On the affected node, move the fsimage edits directory (/hadoop/hdfs/journal/hnscluster/current) to a backup location.
Copy the same fsimage edits directory (/hadoop/hdfs/journal/hnscluster/current) from a healthy JournalNode to this node.
Start the JournalNodes, or start HDFS.
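The steps above can be sketched as two small shell helpers. This is an illustration under stated assumptions rather than a tested procedure: the function names, the backup location, and the hostname in the usage comment are hypothetical, hdfs:hadoop ownership is assumed (the usual HDP default), and the JournalNode on the affected node is assumed to be stopped before anything is moved.

```shell
#!/usr/bin/env bash
# Step 1: move the damaged edits directory to a backup location.
# Prints the backup path on success. Names here are hypothetical.
backup_journal_dir() {
  local dir="$1" backup_root="$2"
  local backup="${backup_root}/$(basename "$dir").$(date +%Y%m%d%H%M%S)"
  mkdir -p "$backup_root" && mv "$dir" "$backup" && echo "$backup"
}

# Step 2: copy the same directory from a healthy JournalNode and restore
# ownership (hdfs:hadoop is an assumption; check the owner on the healthy node).
copy_journal_dir_from() {
  local host="$1" dir="$2"
  scp -rp "${host}:${dir}" "$(dirname "$dir")/" &&
    chown -R hdfs:hadoop "$dir"
}

# Usage on the affected node (hostname and backup path are hypothetical):
#   backup_journal_dir /hadoop/hdfs/journal/hnscluster/current /tmp/journal_backup
#   copy_journal_dir_from jn2.example.com /hadoop/hdfs/journal/hnscluster/current
# Step 3: start the JournalNode again, for example from Ambari.
```

Keeping the damaged directory as a timestamped backup instead of deleting it leaves a way back if the copied directory turns out to be unusable.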
Under-replicated blocks
Solution:
Find the files that have under-replicated blocks:
hdfs fsck / | grep 'Under replicated' | awk -F':' '{print $1}' >> /tmp/under_replicated_files
Then fix them in a loop:
for hdfsfile in $(cat /tmp/under_replicated_files); do echo "Fixing $hdfsfile :"; hadoop fs -setrep 3 "$hdfsfile"; done
The output looks like this:
Fixing /user/hdfs/.staging/job_1547173493660_0405/job.jar :
Replication 3 set: /user/hdfs/.staging/job_1547173493660_0405/job.jar
Fixing /user/hdfs/.staging/job_1547173493660_0405/job.split :
Replication 3 set: /user/hdfs/.staging/job_1547173493660_0405/job.split
Fixing /user/hdfs/.staging/job_1547173493660_0481/job.jar :
Replication 3 set: /user/hdfs/.staging/job_1547173493660_0481/job.jar
Fixing /user/hdfs/.staging/job_1547173493660_0481/job.split :
Replication 3 set: /user/hdfs/.staging/job_1547173493660_0481/job.split
Fixing /user/hdfs/.staging/job_1547173493660_0483/job.jar :
Replication 3 set: /user/hdfs/.staging/job_1547173493660_0483/job.jar
Fixing /user/hdfs/.staging/job_1547173493660_0483/job.split :
Replication 3 set: /user/hdfs/.staging/job_1547173493660_0483/job.split
Fixing /user/hdfs/.staging/job_1547197402450_0021/job.jar :
Replication 3 set: /user/hdfs/.staging/job_1547197402450_0021/job.jar
Fixing /user/hdfs/.staging/job_1547197402450_0021/job.split :
Replication 3 set: /user/hdfs/.staging/job_1547197402450_0021/job.split
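The two commands above can also be written as a pipeline that skips the temporary file, lists each file only once even when several of its blocks are under-replicated, and copes with paths that contain spaces. This is a sketch: the function names and the HADOOP_CMD override (useful for a dry run) are hypothetical additions, not part of the original notes.

```shell
#!/usr/bin/env bash
# Reads `hdfs fsck /` output on stdin and prints each under-replicated file once.
list_under_replicated() {
  grep 'Under replicated' | awk -F':' '{print $1}' | sort -u
}

# Reads HDFS paths on stdin and sets the given replication factor on each one.
# HADOOP_CMD is a hypothetical override; it defaults to the real `hadoop` command.
set_replication() {
  local factor="$1"
  while IFS= read -r hdfsfile; do
    [ -n "$hdfsfile" ] || continue
    echo "Fixing $hdfsfile :"
    ${HADOOP_CMD:-hadoop} fs -setrep "$factor" "$hdfsfile"
  done
}

# Usage:
#   hdfs fsck / | list_under_replicated | set_replication 3
# Dry run that only prints the arguments that would be passed to hadoop:
#   hdfs fsck / | list_under_replicated | HADOOP_CMD=echo set_replication 3
```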