My Complete Walkthrough of the Spark Core Source: SparkContext




Key components and where they live (package or fully qualified class name):

Driver Program (SparkConf)   org.apache.spark.SparkConf
Master                       org.apache.spark.deploy.master
SparkContext                 org.apache.spark.SparkContext
Stage                        org.apache.spark.scheduler.Stage
Task                         org.apache.spark.scheduler.Task
DAGScheduler                 org.apache.spark.scheduler
TaskScheduler                org.apache.spark.scheduler.TaskScheduler
TaskSchedulerImpl            org.apache.spark.scheduler
Worker                       org.apache.spark.deploy.worker
Executor                     org.apache.spark.executor
BlockManager                 org.apache.spark.storage
TaskSet                      org.apache.spark.scheduler


Once its basic initialization is done, SparkContext creates the schedulers:

    // Create and start the scheduler
    val (sched, ts) = SparkContext.createTaskScheduler(this, master)
    _schedulerBackend = sched
    _taskScheduler = ts
    _dagScheduler = new DAGScheduler(this)
    _heartbeatReceiver.send(TaskSchedulerIsSet)
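The ordering in the snippet above can be modelled with a small standalone sketch. It is not Spark code: every name in it (SketchBackend, SketchTaskScheduler, SketchDagScheduler, SketchContext) is a hypothetical stand-in, and the final start() call is an assumption taken from the "Create and start the scheduler" comment rather than from the lines quoted here. It only records the order in which the pieces are wired together.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a scheduler backend; not Spark's SchedulerBackend.
trait SketchBackend {
  def start(): Unit
}

// Hypothetical stand-in for TaskSchedulerImpl: it is created first and
// learns about its backend later through initialize().
class SketchTaskScheduler(events: ArrayBuffer[String]) {
  private var backend: Option[SketchBackend] = None

  def initialize(b: SketchBackend): Unit = {
    backend = Some(b)
    events.append("scheduler.initialize")
  }

  def start(): Unit = {
    events.append("scheduler.start")
    backend.foreach(_.start())
  }
}

// Hypothetical stand-in for DAGScheduler: built only after the task scheduler exists.
class SketchDagScheduler(ts: SketchTaskScheduler, events: ArrayBuffer[String]) {
  events.append("dagScheduler.created")
}

object SketchContext {
  // Mirrors the shape of createTaskScheduler: returns (backend, scheduler).
  def createTaskScheduler(events: ArrayBuffer[String]): (SketchBackend, SketchTaskScheduler) = {
    val scheduler = new SketchTaskScheduler(events)
    val backend = new SketchBackend {
      def start(): Unit = events.append("backend.start")
    }
    scheduler.initialize(backend)
    (backend, scheduler)
  }

  // Replays the order from the walkthrough and returns the recorded events.
  def init(): Seq[String] = {
    val events = ArrayBuffer.empty[String]
    val (_, ts) = createTaskScheduler(events)
    new SketchDagScheduler(ts, events)
    events.append("heartbeatReceiver.notified")
    ts.start() // assumption: the schedulers are started after wiring
    events.toSeq
  }
}
```

Calling SketchContext.init() returns the event names in wiring order, which makes it easy to see that the backend and task scheduler are linked before the DAG scheduler is constructed.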

 

  /**
   * Create a task scheduler based on a given master URL.
   * Return a 2-tuple of the scheduler backend and the task scheduler.
   */
  private def createTaskScheduler(
      sc: SparkContext,
      master: String): (SchedulerBackend, TaskScheduler) = {


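    master match {
      case "local" =>

Note: the snippets that follow use localCluster and SparkDeploySchedulerBackend. In the Spark 1.x source these belong to the local-cluster branch of this match, not to the plain "local" branch, so these notes skip some lines in between.

Instantiate a TaskSchedulerImpl:

        val scheduler = new TaskSchedulerImpl(sc)

Build masterUrls by starting the local cluster:

        val masterUrls = localCluster.start()

Create the backend, which is said to be the critical piece:

        val backend = new SparkDeploySchedulerBackend(scheduler, sc, masterUrls)
        scheduler.initialize(backend)
        backend.shutdownCallback = (backend: SparkDeploySchedulerBackend) => {
          localCluster.stop()
        }
        (backend, scheduler)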
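To make the "master match" dispatch easier to picture, here is a minimal standalone sketch of matching a master URL with Scala regex extractors. It is a simplified model, not the Spark implementation: the DeployMode types, the MasterUrlSketch object, and the classify function are hypothetical names, and the regular expressions are approximations written for this example.

```scala
// Hypothetical result types describing which kind of deployment a master URL selects.
sealed trait DeployMode
case object LocalSingleThread extends DeployMode
final case class LocalThreads(threads: String) extends DeployMode
final case class LocalCluster(workers: Int, coresPerWorker: Int, memoryPerWorkerMb: Int)
  extends DeployMode
final case class Standalone(masterUrls: Seq[String]) extends DeployMode
final case class Unknown(master: String) extends DeployMode

object MasterUrlSketch {
  // Approximate patterns for this sketch; a Regex used as an extractor
  // must match the whole input string.
  private val LocalN = """local\[([0-9]+|\*)\]""".r
  private val LocalClusterRe =
    """local-cluster\[\s*([0-9]+)\s*,\s*([0-9]+)\s*,\s*([0-9]+)\s*\]""".r
  private val SparkRe = """spark://(.*)""".r

  def classify(master: String): DeployMode = master match {
    case "local" =>
      LocalSingleThread
    case LocalN(threads) =>
      LocalThreads(threads) // a number, or "*" for all cores
    case LocalClusterRe(workers, cores, memory) =>
      LocalCluster(workers.toInt, cores.toInt, memory.toInt)
    case SparkRe(hosts) =>
      // A comma-separated host list becomes one spark:// URL per master.
      Standalone(hosts.split(",").map("spark://" + _).toSeq)
    case other =>
      Unknown(other)
  }
}
```

For example, MasterUrlSketch.classify("local-cluster[2,1,1024]") yields LocalCluster(2, 1, 1024), which corresponds to the branch where a local cluster is started and its master URLs are handed to the backend.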


