Spark Run Modes: cluster and client

Posted by 诸葛萧晁


When you run SparkSubmit --class [mainClass], SparkSubmit invokes a childMainClass, which depends on the deploy mode:

1. client mode, childMainClass = mainClass

2. standalone cluster mode, childMainClass = org.apache.spark.deploy.Client

3. yarn cluster mode, childMainClass = org.apache.spark.deploy.yarn.Client

The childMainClass is a wrapper around mainClass. SparkSubmit always calls the childMainClass; in cluster mode, the childMainClass talks to the cluster and launches a process on one of the workers to run the mainClass.
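The selection rule above can be sketched as a small function. This is only an illustration of the mapping described in this post, not Spark's actual code (the real logic lives in Spark's Scala SparkSubmit class). The function name `child_main_class` and the `use_rest` parameter are hypothetical names chosen for this sketch.

```python
# Illustrative sketch only: mirrors the mapping listed in this post.
# child_main_class and use_rest are hypothetical names, not Spark APIs.

REST_CLIENT = "org.apache.spark.deploy.rest.RestSubmissionClient"
STANDALONE_CLIENT = "org.apache.spark.deploy.Client"
YARN_CLIENT = "org.apache.spark.deploy.yarn.Client"


def child_main_class(main_class, master, deploy_mode="client", use_rest=False):
    """Return the class that SparkSubmit would invoke for these settings."""
    # The legacy master string "yarn-cluster" implies cluster deploy mode.
    if master == "yarn-cluster":
        master, deploy_mode = "yarn", "cluster"

    # Client mode: the user's main class is run directly.
    if deploy_mode == "client":
        return main_class

    # Standalone cluster mode: a submission client wraps the main class.
    if master.startswith("spark://"):
        return REST_CLIENT if use_rest else STANDALONE_CLIENT

    # YARN cluster mode.
    if master == "yarn":
        return YARN_CLIENT

    raise ValueError(f"mode not covered by this sketch: {master!r}, {deploy_mode!r}")


if __name__ == "__main__":
    app = "org.apache.spark.examples.JavaWordCount"
    print(child_main_class(app, "yarn"))
    print(child_main_class(app, "yarn-cluster"))
    print(child_main_class(app, "spark://aa01:7077", "cluster"))
```

Only the modes covered in this post are handled; anything else raises an error rather than guessing.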
 
P.S. Use "spark-submit -v" to print debug information.
 
YARN client: spark-submit -v --class "org.apache.spark.examples.JavaWordCount" --master yarn JavaWordCount.jar
childMainClass: org.apache.spark.examples.JavaWordCount
YARN cluster: spark-submit -v --class "org.apache.spark.examples.JavaWordCount" --master yarn-cluster JavaWordCount.jar
childMainClass: org.apache.spark.deploy.yarn.Client
(Since Spark 2.0 the yarn-cluster master string is deprecated; use --master yarn --deploy-mode cluster instead.)
 
Standalone client: spark-submit -v --class "org.apache.spark.examples.JavaWordCount" --master spark://aa01:7077 JavaWordCount.jar
childMainClass: org.apache.spark.examples.JavaWordCount
Standalone cluster: spark-submit -v --class "org.apache.spark.examples.JavaWordCount" --master spark://aa01:7077 --deploy-mode cluster JavaWordCount.jar
childMainClass: org.apache.spark.deploy.rest.RestSubmissionClient (when the REST submission protocol is used; otherwise org.apache.spark.deploy.Client)
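The four commands above differ only in the --master value and the optional --deploy-mode flag. As a rough sketch, a helper could assemble the argument list for each case. The helper name `build_submit_command` and its parameters are hypothetical; the sketch uses only the flags shown in this post (-v, --class, --master, --deploy-mode) and does not run spark-submit itself.

```python
# Illustrative sketch: assembles the spark-submit argument list for the
# examples in this post. build_submit_command is a hypothetical helper.

def build_submit_command(main_class, master, app_jar, deploy_mode=None, verbose=True):
    """Return the spark-submit invocation as a list of arguments."""
    cmd = ["spark-submit"]
    if verbose:
        cmd.append("-v")  # print debug information
    cmd += ["--class", main_class, "--master", master]
    if deploy_mode is not None:
        # When omitted, spark-submit defaults to client mode.
        cmd += ["--deploy-mode", deploy_mode]
    cmd.append(app_jar)
    return cmd


if __name__ == "__main__":
    app = "org.apache.spark.examples.JavaWordCount"
    jar = "JavaWordCount.jar"
    examples = {
        "YARN client": build_submit_command(app, "yarn", jar),
        "Standalone client": build_submit_command(app, "spark://aa01:7077", jar),
        "Standalone cluster": build_submit_command(app, "spark://aa01:7077", jar, "cluster"),
    }
    for label, cmd in examples.items():
        print(f"{label}: {' '.join(cmd)}")
```

The returned list can be passed to a process launcher such as `subprocess.run` on a machine where spark-submit is installed.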
 
Taking standalone Spark as an example, here is the client-mode workflow. The mainClass runs in the driver application, which may reside outside the cluster.
