使用 tera gen 时,从 kv.local/172.20.12.168 到 localhost:8020 的调用因连接异常而失败
Posted
技术标签:
【中文标题】使用 tera gen 时,从 kv.local/172.20.12.168 到 localhost:8020 的调用因连接异常而失败【英文标题】:Call From kv.local/172.20.12.168 to localhost:8020 failed on connection exception, when using tera gen 【发布时间】:2017-06-01 09:46:58 【问题描述】:我正在使用 hadoop teragen 来检查使用 terasort 的 hadoop mapreduce 基准测试。 但是当我运行以下命令时,
hadoop jar /Users/**/Documents/hadoop-2.6.4/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar teragen -Dmapreduce.job.maps=100 1t 随机数据
我遇到了以下异常,
17/06/01 15:09:21 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
17/06/01 15:09:22 INFO client.RMProxy: Connecting to ResourceManager at /127.0.0.1:8032
17/06/01 15:09:23 INFO terasort.TeraSort: Generating -727379968 using 100
17/06/01 15:09:23 INFO mapreduce.JobSubmitter: number of splits:100
17/06/01 15:09:23 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1496303775726_0003
17/06/01 15:09:23 INFO impl.YarnClientImpl: Submitted application application_1496303775726_0003
17/06/01 15:09:23 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1496303775726_0003/
17/06/01 15:09:23 INFO mapreduce.Job: Running job: job_1496303775726_0003
17/06/01 15:09:27 INFO mapreduce.Job: Job job_1496303775726_0003 running in uber mode : false
17/06/01 15:09:27 INFO mapreduce.Job: map 0% reduce 0%
17/06/01 15:09:27 INFO mapreduce.Job: Job job_1496303775726_0003 failed with state FAILED due to: Application application_1496303775726_0003 failed 2 times due to AM Container for appattempt_1496303775726_0003_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://localhost:8088/proxy/application_1496303775726_0003/Then, click on links to logs of each attempt.
Diagnostics: Call From KV.local/172.20.12.168 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
java.net.ConnectException: Call From KV.local/172.20.12.168 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
at org.apache.hadoop.ipc.Client.call(Client.java:1473)
at org.apache.hadoop.ipc.Client.call(Client.java:1400)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy34.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy35.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1977)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1118)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:251)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:608)
at org.apache.hadoop.ipc.Client$Connection.setupiostreams(Client.java:706)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:369)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1522)
at org.apache.hadoop.ipc.Client.call(Client.java:1439)
... 31 more
如错误所示,它无法连接到 localhost:8020,但是当我检查 namenode Web UI 时,它显示 namenode 处于活动状态。请看下面的截图:
我发现了很多与此相关的帖子,但没有一个帮助我。我还检查了 hosts 文件,其中包含以下行:
127.0.0.1 本地主机
172.20.12.168 本地主机
谁能帮我解决这个问题?
【问题讨论】:
你能运行简单的wordcount例子吗?多少节点集群?我猜是单节点。 是的,简单的 wordcound 运行良好。它是一个单节点集群 主机名是什么? KV.local? 是的,它是 KV.local 然后在你的hosts文件中编辑第二行172.20.12.168 localhost
到172.20.12.168 KV.local
【参考方案1】:
以下过程帮助我解决了这个问题:
停止所有服务。
删除 hdfs-site.xml 中指定的 namenode 和 datanode 目录。
创建新的namenode和datanode目录并相应修改hdfs-site.xml。
在 core-site.xml 中,进行以下更改或添加以下属性:
<property>
<name>fs.defaultFS</name>
<value>hdfs://172.20.12.168/</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://172.20.12.168:8020</value>
</property>
在 hadoop-2.6.4/etc/hadoop/hadoop-env.sh 文件中进行如下修改:
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_91.jdk/Contents/Home
重启dfs、yarn和mr如下:
start-dfs.sh
start-yarn.sh
mr-jobhistory-daemon.sh start historyserver
【讨论】:
以上是关于使用 tera gen 时,从 kv.local/172.20.12.168 到 localhost:8020 的调用因连接异常而失败的主要内容,如果未能解决你的问题,请参考以下文章
将数据从 ADLS Gen2 加载到 Azure Synapse 时出错
从源代码构建 TensorFlow 时,生成 `gen_io_ops.py` 文件的 bazel 规则在哪里?
将 Learn You Some Erlang 教程从 gen_fsm 转换为 gen_statem