Spark Exit Status 134. What does it mean?

【Title】: Spark Exit Status 134. What does it mean? 【Posted】: 2021-01-25 17:37:38 【Question】:

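While running my job, some of my tasks fail with the error below, but the job as a whole completes successfully and exits. What does this mean? Can I trust the results?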

ExecutorLostFailure (executor 8 exited caused by one of the running tasks) Reason: Container from a bad node: container_1610292825631_0097_01_000013 on host: ip-xx-xxx-xx-xx.us.aws.xxxx.com. Exit status: 134. Diagnostics: e 44.0 (TID 16633)

Container exited with a non-zero exit code 134. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
/bin/bash: line 1: 16507 Aborted
Last 4096 bytes of stderr :
 task 422.0 in stage 44.0 (TID 16633)
21/01/25 17:25:50 INFO ShuffleBlockFetcherIterator: Getting 56 non-empty blocks including 12 local blocks and 44 remote blocks
21/01/25 17:25:50 INFO ShuffleBlockFetcherIterator: Started 7 remote fetches in 2 ms
21/01/25 17:25:50 INFO Executor: Finished task 422.0 in stage 44.0 (TID 16633). 6435 bytes result sent to driver
21/01/25 17:25:50 INFO CoarseGrainedExecutorBackend: Got assigned task 16639
21/01/25 17:25:50 INFO Executor: Running task 433.0 in stage 44.0 (TID 16639)
21/01/25 17:25:50 INFO ShuffleBlockFetcherIterator: Getting 95 non-empty blocks including 9 local blocks and 86 remote blocks
21/01/25 17:25:50 INFO ShuffleBlockFetcherIterator: Started 7 remote fetches in 1 ms
21/01/25 17:25:51 INFO Executor: Finished task 383.0 in stage 44.0 (TID 16579). 6478 bytes result sent to driver
21/01/25 17:25:51 INFO CoarseGrainedExecutorBackend: Got assigned task 16661
21/01/25 17:25:51 INFO Executor: Running task 471.0 in stage 44.0 (TID 16661)
21/01/25 17:25:51 INFO ShuffleBlockFetcherIterator: Getting 200 non-empty blocks including 30 local blocks and 170 remote blocks
21/01/25 17:25:51 INFO ShuffleBlockFetcherIterator: Started 6 remote fetches in 1 ms
21/01/25 17:25:52 INFO Executor: Finished task 319.0 in stage 44.0 (TID 16555). 6478 bytes result sent to driver
21/01/25 17:25:52 INFO CoarseGrainedExecutorBackend: Got assigned task 16675
21/01/25 17:25:52 INFO Executor: Running task 482.0 in stage 44.0 (TID 16675)
21/01/25 17:25:52 INFO ShuffleBlockFetcherIterator: Getting 25 non-empty blocks including 5 local blocks and 20 remote blocks
21/01/25 17:25:52 INFO ShuffleBlockFetcherIterator: Started 7 remote fetches in 1 ms
21/01/25 17:25:52 INFO Executor: Finished task 482.0 in stage 44.0 (TID 16675). 6435 bytes result sent to driver
21/01/25 17:25:52 INFO CoarseGrainedExecutorBackend: Got assigned task 16679
21/01/25 17:25:52 INFO Executor: Running task 491.0 in stage 44.0 (TID 16679)
21/01/25 17:25:52 INFO ShuffleBlockFetcherIterator: Getting 138 non-empty blocks including 19 local blocks and 119 remote blocks
21/01/25 17:25:52 INFO ShuffleBlockFetcherIterator: Started 7 remote fetches in 1 ms
21/01/25 17:25:52 INFO Executor: Finished task 433.0 in stage 44.0 (TID 16639). 6521 bytes result sent to driver
21/01/25 17:25:52 INFO CoarseGrainedExecutorBackend: Got assigned task 16684
21/01/25 17:25:52 INFO Executor: Running task 493.0 in stage 44.0 (TID 16684)
21/01/25 17:25:52 INFO ShuffleBlockFetcherIterator: Getting 190 non-empty blocks including 29 local blocks and 161 remote blocks
21/01/25 17:25:52 INFO ShuffleBlockFetcherIterator: Started 7 remote fetches in 1 ms
21/01/25 17:25:52 INFO Executor: Finished task 491.0 in stage 44.0 (TID 16679). 6435 bytes result sent to driver
21/01/25 17:25:52 INFO CoarseGrainedExecutorBackend: Got assigned task 16685
21/01/25 17:25:52 INFO Executor: Running task 500.0 in stage 44.0 (TID 16685)
21/01/25 17:25:52 INFO ShuffleBlockFetcherIterator: Getting 51 non-empty blocks including 12 local blocks and 39 remote blocks
21/01/25 17:25:52 INFO ShuffleBlockFetcherIterator: Started 7 remote fetches in 1 ms
21/01/25 17:25:54 INFO Executor: Finished task 500.0 in stage 44.0 (TID 16685). 6478 bytes result sent to driver
21/01/25 17:25:54 INFO CoarseGrainedExecutorBackend: Got assigned task 16714
21/01/25 17:25:54 INFO Executor: Running task 524.0 in stage 44.0 (TID 16714)
21/01/25 17:25:54 INFO ShuffleBlockFetcherIterator: Getting 114 non-empty blocks including 17 local blocks and 97 remote blocks
21/01/25 17:25:54 INFO ShuffleBlockFetcherIterator: Started 7 remote fetches in 1 ms
21/01/25 17:25:59 INFO Executor: Finished task 471.0 in stage 44.0 (TID 16661). 6478 bytes result sent to driver
21/01/25 17:25:59 INFO CoarseGrainedExecutorBackend: Got assigned task 16767
21/01/25 17:25:59 INFO Executor: Running task 536.0 in stage 44.0 (TID 16767)
21/01/25 17:25:59 INFO ShuffleBlockFetcherIterator: Getting 110 non-empty blocks including 16 local blocks and 94 remote blocks
21/01/25 17:25:59 INFO ShuffleBlockFetcherIterator: Started 5 remote fetches in 1 ms

【Comments】:

【Answer 1】:

TL;DR: You can trust the results.

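Spark has built-in fault tolerance: when a task fails, it is retried on another available node. Your failed tasks were rerun on other nodes/executors, and the results of those reruns are included in your final output. So yes, you can trust the results.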
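To make the retry behavior concrete, here is a toy sketch in plain Python. It is not Spark code: `ExecutorLost`, `run_with_retries`, and `flaky_task` are hypothetical names made up for illustration, and the executor names are invented. The limit of 4 attempts mirrors the default of Spark's `spark.task.maxFailures` setting; everything else is a simplifying assumption.

```python
from typing import Callable, List


class ExecutorLost(Exception):
    """Stands in for an executor dying mid-task (hypothetical, not a Spark class)."""


def run_with_retries(task: Callable[[str], int],
                     executors: List[str],
                     max_failures: int = 4) -> int:
    """Run `task`, moving to the next executor after each failure.

    Returns the result of the first successful attempt. If every one of the
    `max_failures` attempts fails, the whole job is aborted with an error.
    """
    failures = 0
    for attempt in range(max_failures):
        # Pick a different executor for each attempt (round-robin).
        executor = executors[attempt % len(executors)]
        try:
            return task(executor)
        except ExecutorLost:
            failures += 1
    raise RuntimeError(f"task failed {failures} times; job aborted")


def flaky_task(executor: str) -> int:
    """Fails on the bad executor, succeeds anywhere else."""
    if executor == "executor-8":
        raise ExecutorLost(executor)
    return 42


# The first attempt lands on the bad executor; the retry succeeds elsewhere.
result = run_with_retries(flaky_task, ["executor-8", "executor-3"])
print(result)
```

The point of the sketch is that a failed attempt contributes nothing to the output: only the successful rerun returns a value. If no attempt succeeds, the job fails outright instead of finishing with partial results.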

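As for the error itself, exit status 134 means the process was terminated by a SIGABRT signal (134 = 128 + 6, and 6 is the signal number of SIGABRT), which matches the "Aborted" line in prelaunch.err. As the error message says, this is probably because the container was launched on a blacklisted node (a bad node). A blacklisted node is one that YARN has marked as unsuitable for running containers.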
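The exit-status arithmetic can be checked with a few lines of Python. This is a minimal sketch that assumes a POSIX system (signal numbers differ on Windows) and the shell convention that a status above 128 means "terminated by signal `status - 128`"; `describe_exit_status` is a hypothetical helper name, not part of Spark or YARN.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret a shell-style exit status such as the one YARN reports."""
    if status > 128:
        signum = status - 128
        try:
            # Look up the symbolic name for this signal number.
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited with code {status}"


print(describe_exit_status(134))
print(describe_exit_status(0))
```

For status 134 the helper reports signal 6, which is SIGABRT on Linux.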
