Spark Source Code Walkthrough: SparkContext Initialization

Posted by chengwuyouxin


SparkContext is the entry point of a Spark application: every Spark application creates a SparkContext to connect to the cluster and run its computation. During initialization, SparkContext builds several core components, including SparkEnv, SparkUI, TaskScheduler, and DAGScheduler; we will examine each of them in turn.
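For context, here is what a typical driver program that triggers this initialization looks like. This is a minimal sketch, assuming Spark is on the classpath; the `local[2]` master and the application name are arbitrary choices for illustration. Note that the two settings passed in here are exactly the ones the constructor validates first.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object WordCountDemo {
  def main(args: Array[String]): Unit = {
    // spark.master and spark.app.name are the two settings the
    // SparkContext constructor refuses to start without
    val conf = new SparkConf()
      .setMaster("local[2]")   // arbitrary: run locally with 2 threads
      .setAppName("demo-app")
    val sc = new SparkContext(conf)
    try {
      val counts = sc.parallelize(Seq("a", "b", "a")).countByValue()
      println(counts)
    } finally {
      sc.stop()  // release the context so another one can be created
    }
  }
}
```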

Let's walk through SparkContext's initialization. It begins by validating a number of configuration settings:

try {
    _conf = config.clone()
    _conf.validateSettings()
    if (!_conf.contains("spark.master")) {
      throw new SparkException("A master URL must be set in your configuration")
    }
    if (!_conf.contains("spark.app.name")) {
      throw new SparkException("An application name must be set in your configuration")
    }
    // log out spark.app.name in the Spark driver logs
    logInfo(s"Submitted application: $appName")
    // System property spark.yarn.app.id must be set if user code ran by AM on a YARN cluster
    if (master == "yarn" && deployMode == "cluster" && !_conf.contains("spark.yarn.app.id")) {
      throw new SparkException("Detected yarn cluster mode, but isn't running on a cluster. " +
        "Deployment to YARN is not supported directly by SparkContext. Please use spark-submit.")
    }
    if (_conf.getBoolean("spark.logConf", false)) {
      logInfo("Spark configuration:\n" + _conf.toDebugString)
    }
    // Set Spark driver host and port system properties. This explicitly sets the configuration
    // instead of relying on the default value of the config constant.
    _conf.set(DRIVER_HOST_ADDRESS, _conf.get(DRIVER_HOST_ADDRESS))
    _conf.setIfMissing("spark.driver.port", "0")

    _conf.set("spark.executor.id", SparkContext.DRIVER_IDENTIFIER)

    _jars = Utils.getUserJars(_conf)
    _files = _conf.getOption("spark.files").map(_.split(",")).map(_.filter(_.nonEmpty))
      .toSeq.flatten

    _eventLogDir =
      if (isEventLogEnabled) {
        val unresolvedDir = conf.get("spark.eventLog.dir", EventLoggingListener.DEFAULT_LOG_DIR)
          .stripSuffix("/")
        Some(Utils.resolveURI(unresolvedDir))
      } else {
        None
      }

    _eventLogCodec = {
      val compress = _conf.getBoolean("spark.eventLog.compress", false)
      if (compress && isEventLogEnabled) {
        Some(CompressionCodec.getCodecName(_conf)).map(CompressionCodec.getShortName)
      } else {
        None
      }
    }

    _listenerBus = new LiveListenerBus(_conf)
    _statusStore = AppStatusStore.createLiveStore(conf)
    listenerBus.addToStatusQueue(_statusStore.listener.get)
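The validation and defaulting logic above can be sketched in plain, self-contained Scala. `MiniConf` below is a hypothetical toy stand-in for SparkConf, written only to illustrate the semantics of `contains`/`validateSettings`-style checks and `setIfMissing`; it is not Spark's real class.

```scala
import scala.collection.mutable

// Hypothetical MiniConf: illustrates the checks in the constructor above.
class MiniConf {
  private val settings = mutable.Map[String, String]()

  def set(key: String, value: String): MiniConf = { settings(key) = value; this }

  // Only writes the value when the user has not set the key explicitly,
  // mirroring _conf.setIfMissing("spark.driver.port", "0")
  def setIfMissing(key: String, value: String): MiniConf = {
    if (!settings.contains(key)) settings(key) = value
    this
  }

  def contains(key: String): Boolean = settings.contains(key)
  def getOption(key: String): Option[String] = settings.get(key)

  // Mirrors the first two checks in the SparkContext constructor
  def validate(): Unit = {
    if (!contains("spark.master"))
      throw new IllegalStateException("A master URL must be set in your configuration")
    if (!contains("spark.app.name"))
      throw new IllegalStateException("An application name must be set in your configuration")
  }
}

object MiniConfDemo {
  def main(args: Array[String]): Unit = {
    val conf = new MiniConf().set("spark.master", "local[2]").set("spark.app.name", "demo")
    conf.validate()  // passes: both required settings are present

    // An explicit value survives a later setIfMissing
    conf.set("spark.driver.port", "7078")
    conf.setIfMissing("spark.driver.port", "0")
    println(conf.getOption("spark.driver.port"))  // Some(7078)

    // The same split-and-filter chain used for spark.files above:
    // empty entries from consecutive commas are dropped
    val files = Some("a.txt,,b.txt").map(_.split(",")).map(_.filter(_.nonEmpty)).toSeq.flatten
    println(files)
  }
}
```

The point of `setIfMissing` is that user-supplied values always win: the driver port only defaults to 0 (pick a free port) when nothing was configured.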

That covers the first stage of SparkContext initialization. For more on the topic, see the following articles:

[Spark Source Code] SparkContext Initialization: Creating and Starting the DAGScheduler

[Spark Source Code] SparkContext Initialization: Creating the TaskScheduler

Source Code Walkthrough | SparkContext

Spark Source Code: SparkContext

Dissecting the Spark Source: SparkContext