ERROR: Timeout on the Spark engine during the broadcast join
The following error is reported when running a Spark query:
When the Spark engine runs applications and broadcast join is enabled, the Spark driver broadcasts the cache to the Spark executors running on data nodes in the Hadoop cluster. If broadcast join is enabled, applications might fail with an error similar to the following:

java.util.concurrent.TimeoutException: Futures timed out after [300 seconds]
	at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
	at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
	at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:201)
	at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecuteBroadcast(BroadcastExchangeExec.scala:123)
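For context, a minimal sketch of the kind of query that can hit this timeout is shown below; the SparkSession setup, the table names (orders, customers) and the join key are hypothetical illustrations, not taken from the original report.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

val spark = SparkSession.builder().appName("broadcast-join-demo").getOrCreate()

// Hypothetical tables: any join whose smaller side is below
// spark.sql.autoBroadcastJoinThreshold (or is explicitly hinted, as here)
// is executed as a broadcast join.
val orders = spark.table("orders")
val customers = spark.table("customers")

// The driver collects `customers` and broadcasts it to every executor; if building
// and sending the broadcast takes longer than spark.sql.broadcastTimeout
// (300 s by default), the TimeoutException shown above is thrown.
val joined = orders.join(broadcast(customers), Seq("customer_id"))
joined.count()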
Solutions (a combined configuration sketch in Scala follows the list)
1) Disable broadcast join:
set spark.sql.autoBroadcastJoinThreshold=-1
2) Increase the broadcast timeout (the default is 300 s):
set spark.sql.broadcastTimeout=2000
3) Increase the number of application attempts:
set spark.yarn.maxAppAttempts=2
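The same three settings can also be applied when the session is built. The sketch below is one way to do it, assuming the job runs on YARN; the property names are standard Spark/YARN keys and the values are the ones from the list above. Options 1) and 2) are alternatives, so normally only one of them is needed, and spark.yarn.maxAppAttempts is read at submit time, so it is usually passed to spark-submit rather than changed inside a running session.

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("broadcast-join-tuning")
  // 1) Disable automatic broadcast joins entirely.
  .config("spark.sql.autoBroadcastJoinThreshold", "-1")
  // 2) Or raise the broadcast timeout from the 300 s default to 2000 s.
  .config("spark.sql.broadcastTimeout", "2000")
  .getOrCreate()

// 3) spark.yarn.maxAppAttempts is an application-level YARN setting, e.g.:
//      spark-submit --conf spark.yarn.maxAppAttempts=2 ...

// The SQL form used above also works inside an existing session:
spark.sql("set spark.sql.autoBroadcastJoinThreshold=-1")
spark.sql("set spark.sql.broadcastTimeout=2000")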