Fixing a DataNode Startup Failure

Posted by 雷学委

DataNode startup reports "missing NameNode address"

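Running start-all.sh reported no errors, but the NameNode web UI showed that the DataNode had not registered.
Checking the logs on the DataNode machine revealed the following problems.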

  • The DataNode process did not come up.
  • The NodeManager started, ran for a while, and then exited.
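
A quick way to confirm which daemons are actually alive on a worker is to look at the output of the JDK's `jps` tool, which prints one "pid MainClass" pair per line. The following is a minimal sketch that only parses such output; the function names and the set of expected daemons are assumptions made for illustration and should be adjusted to the cluster layout.

```python
# Daemons expected on a worker node. This set is an assumption for
# illustration; a combined master/worker host would list more.
EXPECTED_WORKER_DAEMONS = {"DataNode", "NodeManager"}


def running_daemons(jps_output: str) -> set:
    """Parse `jps` output ("<pid> <MainClass>" per line) into class names."""
    names = set()
    for line in jps_output.splitlines():
        parts = line.split()
        # Lines with only a pid (no class name available) are skipped.
        if len(parts) >= 2:
            names.add(parts[1])
    return names


def missing_daemons(jps_output: str, expected=EXPECTED_WORKER_DAEMONS) -> set:
    """Return the expected daemons that do not appear in the jps output."""
    return set(expected) - running_daemons(jps_output)
```

To use it, capture the text printed by `jps` on the worker and pass it to `missing_daemons`; an empty result means both daemons are still running at that moment.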

Error: java.io.IOException: No services to connect, missing NameNode address.

2021-05-15 16:31:40,824 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unable to get NameNode addresses.
2021-05-15 16:31:40,907 INFO org.eclipse.jetty.server.handler.ContextHandler: Stopped o.e.j.w.WebAppContext@33308786{/,null,UNAVAILABLE}{/datanode}
2021-05-15 16:31:40,921 INFO org.eclipse.jetty.server.AbstractConnector: Stopped ServerConnector@4b6e2263{HTTP/1.1,[http/1.1]}{localhost:0}
2021-05-15 16:31:40,921 INFO org.eclipse.jetty.server.handler.ContextHandler: Stopped o.e.j.s.ServletContextHandler@46b61c56{/static,file:///usr/local/hadoop/hadoop-3.2.1/share/hadoop/hdfs/webapps/static/,UNAVAILABLE}
2021-05-15 16:31:40,922 INFO org.eclipse.jetty.server.handler.ContextHandler: Stopped o.e.j.s.ServletContextHandler@36060e{/logs,file:///usr/local/hadoop/hadoop-3.2.1/logs/,UNAVAILABLE}
2021-05-15 16:31:40,950 INFO org.apache.hadoop.ipc.Server: Stopping server on 9867
2021-05-15 16:31:40,951 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping DataNode metrics system…
2021-05-15 16:31:40,951 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system stopped.
2021-05-15 16:31:40,952 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system shutdown complete.
2021-05-15 16:31:40,963 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Shutdown complete.
2021-05-15 16:31:40,964 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
java.io.IOException: No services to connect, missing NameNode address.
at org.apache.hadoop.hdfs.server.datanode.BlockPoolManager.refreshNamenodes(BlockPoolManager.java:165)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1441)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:501)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2806)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2714)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2756)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2900)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2924)
2021-05-15 16:31:40,967 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: java.io.IOException: No services to connect, missing NameNode address.
2021-05-15 16:31:40,988 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
*

Just before this error, the log also contains a warning:
Unable to get NameNode addresses
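
Taken together, the warning and the IOException indicate that the DataNode could not derive any NameNode RPC address from its configuration. A common cause is that fs.defaultFS in the DataNode's core-site.xml is missing or does not use the hdfs:// scheme, although the logs alone do not confirm which setting was at fault on this machine. The sketch below checks for that case under this assumption. The function names are hypothetical, and it only looks at fs.defaultFS (plus its deprecated alias fs.default.name) and the dfs.namenode.rpc-address / dfs.namenode.servicerpc-address keys.

```python
import xml.etree.ElementTree as ET


def load_props(xml_text: str) -> dict:
    """Read <property><name>/<value> pairs from a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name:
            props[name.strip()] = (value or "").strip()
    return props


def namenode_address_problem(core_site: dict, hdfs_site: dict):
    """Return a description of the problem, or None if an address looks configured."""
    # An explicit RPC address in hdfs-site.xml is enough on its own.
    rpc_prefixes = ("dfs.namenode.rpc-address", "dfs.namenode.servicerpc-address")
    for key, value in hdfs_site.items():
        if key.startswith(rpc_prefixes) and value:
            return None
    # Otherwise the address has to come from the default file system URI.
    default_fs = core_site.get("fs.defaultFS") or core_site.get("fs.default.name", "")
    if not default_fs:
        return "fs.defaultFS is not set (Hadoop falls back to file:///)"
    if not default_fs.startswith("hdfs://"):
        return f"fs.defaultFS is {default_fs!r}, not an hdfs:// URI"
    return None
```

Read core-site.xml and hdfs-site.xml from the DataNode's own configuration directory, pass their contents through `load_props`, and call `namenode_address_problem` on the two dictionaries. The check has to run against the files on the DataNode host, since a correct configuration on the NameNode host does not help if it was never copied to the workers.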

The NodeManager also died after a while

Full error stack

2021-05-15 16:47:25,535 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-05-15 16:47:49,580 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Cache Size Before Clean: 0, Total Deleted: 0, Public Deleted: 0, Private Deleted: 0
2021-05-15 16:47:56,541 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-05-15 16:47:57,542 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-05-15 16:47:58,546 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-05-15 16:47:59,547 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-05-15 16:48:00,549 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-05-15 16:48:01,552 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-05-15 16:48:02,556 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-05-15 16:48:03,558 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-05-15 16:48:04,559 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-05-15 16:48:05,563 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-05-15 16:48:05,564 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Unexpected error starting NodeStatusUpdater
java.net.ConnectException: Your endpoint configuration is wrong; For more details see: http://wiki.apache.org/hadoop/UnsetHostnameOrPort
at sun.reflect.GeneratedConstructorAccessor21.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:833)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:753)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1549)
at org.apache.hadoop.ipc.Client.call(Client.java:1491)
at org.apache.hadoop.ipc.Client.call(Client.java:1388)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
at com.sun.proxy.$Proxy73.registerNodeManager(Unknown Source)
at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:73)
at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
:
2021-05-15 16:48:05,602 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Public cache exiting
2021-05-15 16:48:05,602 WARN org.apache.hadoop.yarn.server.nodemanager.NodeResourceMonitorImpl: org.apache.hadoop.yarn.server.nodemanager.NodeResourceMonitorImpl is interrupted. Exiting.
2021-05-15 16:48:05,609 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NodeManager metrics system…
2021-05-15 16:48:05,610 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system stopped.
2021-05-15 16:48:05,610 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system shutdown complete.
2021-05-15 16:48:05,610 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.ConnectException: Your endpoint configuration is wrong; For more details see: http://wiki.apache.org/hadoop/UnsetHostnameOrPort
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:278)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:975)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1054)
Caused by: java.net.ConnectException: Your endpoint configuration is wrong; For more details see: http://wiki.apache.org/hadoop/UnsetHostnameOrPort
at sun.reflect.GeneratedConstructorAccessor21.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:833)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:753)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1549)
at org.apache.hadoop.ipc.Client.call(Client.java:1491)
at org.apache.hadoop.ipc.Client.call(Client.java:1388)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
at com.sun.proxy.$Proxy73.registerNodeManager(Unknown Source)
at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBCl…
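
In the NodeManager log, every retry targets 0.0.0.0:8031. Port 8031 is the default port of yarn.resourcemanager.resource-tracker.address, and 0.0.0.0 is the default value of yarn.resourcemanager.hostname, so the NodeManager appears to be running without a ResourceManager address in its yarn-site.xml. The sketch below scans a log for such wildcard retry targets and names the property to inspect. The function names are hypothetical, and the port table covers only the default ResourceManager ports.

```python
import re

# Default ResourceManager ports and the yarn-site.xml keys that control them.
RM_PORT_KEYS = {
    8030: "yarn.resourcemanager.scheduler.address",
    8031: "yarn.resourcemanager.resource-tracker.address",
    8032: "yarn.resourcemanager.address",
    8033: "yarn.resourcemanager.admin.address",
}

# Matches the "host/ip:port" target printed by org.apache.hadoop.ipc.Client.
RETRY_RE = re.compile(
    r"Retrying connect to server: (?P<host>[^/\s]+)/(?P<ip>[\d.]+):(?P<port>\d+)"
)


def wildcard_targets(log_text: str) -> set:
    """Return the ports of retry targets whose IP is the 0.0.0.0 wildcard."""
    ports = set()
    for match in RETRY_RE.finditer(log_text):
        if match.group("ip") == "0.0.0.0":
            ports.add(int(match.group("port")))
    return ports


def hints(log_text: str) -> list:
    """Build one human-readable hint per wildcard port found in the log."""
    out = []
    for port in sorted(wildcard_targets(log_text)):
        key = RM_PORT_KEYS.get(port, "the matching address property")
        out.append(
            f"port {port}: check yarn.resourcemanager.hostname or {key} in yarn-site.xml"
        )
    return out
```

Feeding the NodeManager log shown above into `hints` would point at the resource-tracker address. Setting yarn.resourcemanager.hostname to the ResourceManager's host name on every NodeManager is usually enough, because the per-service addresses default to that host name plus their standard port.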
