Big Data Study Notes: Scala Implicit Conversions and Concurrent Programming (DT Big Data Dream Factory)

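A great deal of Spark code uses implicit conversions, implicit parameters, implicit classes, and implicit objects.

Without a solid grasp of them, you will struggle to read or write complex Spark code.

Concurrent programming matters just as much for a distributed, concurrent system like Spark: how to run work concurrently and efficiently, and how the concurrent parts communicate with each other (Actor, Akka).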

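==========Implicit conversion functions============

An implicit conversion function lets you specify how an object of one type is converted into an object of another type.

Why convert: suppose an interface has already been fixed.

Take File as an example: we would like to call a method such as File.dtSpark, which is not possible in Java.

In Scala we can "upgrade" File by converting it to another type, and then call the method on the converted type.

An example from the Spark source (the deprecated wrapper that delegates to the implicit conversion in the RDD companion object):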

  @deprecated("Replaced by implicit functions in the RDD companion object. This is " +
    "kept here only for backward compatibility.", "1.3.0")
  def rddToSequenceFileRDDFunctions[K <% Writable: ClassTag, V <% Writable: ClassTag](
      rdd: RDD[(K, V)]): SequenceFileRDDFunctions[K, V] = {
    val kf = implicitly[K => Writable]
    val vf = implicitly[V => Writable]
    // Set the Writable class to null and `SequenceFileRDDFunctions` will use Reflection to get it
    implicit val keyWritableFactory = new WritableFactory[K](_ => null, kf)
    implicit val valueWritableFactory = new WritableFactory[V](_ => null, vf)
    RDD.rddToSequenceFileRDDFunctions(rdd)
  }

// If no matching implicit conversion is found in the current scope, the compiler also searches the companion objects of the types involved, so the conversions defined in the RDD companion object need no import.
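The companion-object lookup described above can be illustrated with a minimal sketch. Feet, Meters, and CompanionLookupDemo are hypothetical names invented for this example; they are not part of Spark.

```scala
import scala.language.implicitConversions

// Hypothetical types, used only for illustration.
class Meters(val value: Double)

class Feet(val value: Double)

object Feet {
  // Defined in the companion of the source type, so it belongs to the
  // implicit scope of Feet and is found without any import.
  implicit def feetToMeters(f: Feet): Meters = new Meters(f.value * 0.3048)
}

object CompanionLookupDemo {
  // Accepts only Meters; passing a Feet triggers the implicit conversion.
  def lengthInMeters(m: Meters): Double = m.value

  // No import of Feet.feetToMeters is needed here.
  def tenFeet: Double = lengthInMeters(new Feet(10))
}
```

Calling CompanionLookupDemo.tenFeet converts the Feet value through Feet.feetToMeters even though nothing was imported, which is the same mechanism that makes the conversions in the RDD companion object available on any RDD.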

Now a hand-written example to step into this magical world: the simplest possible implicit conversion.

scala> class Engineer(val name:String,val salary:Double)
defined class Engineer

scala> class Person(val name:String)
defined class Person

scala> new Person("Spark").code
<console>:9: error: value code is not a member of Person
              new Person("Spark").code
                                  ^

=> Next step: give Engineer a code method

scala> class Engineer(val name:String,val salary:Double){
     | def code = println("Coding......")
     | }
defined class Engineer

scala> def toCode(p:Person){
     | p.code
     | }
<console>:9: error: value code is not a member of Person
       p.code
         ^

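When defining an implicit function you should normally write the return type explicitly, although it is not mandatory. Once the implicit conversion function is in scope, the error above disappears. The compiler selects a conversion by its parameter type (the type being converted from) and by whether the result type provides the missing member.

scala> implicit def person2Engineer(p:Person):Engineer = {
     | new Engineer(p.name,1000000)
     | }
warning: there were 1 feature warning(s); re-run with -feature for details
person2Engineer: (p: Person)Engineer

scala>  def toCode(p:Person){             // with the implicit conversion in scope, this now compiles
     | p.code
     | }
toCode: (p: Person)Unit

scala> toCode(new Person("Scala"))              // inside toCode, p.code is compiled as person2Engineer(p).code
Coding......

// Person and Engineer are unrelated types; the implicit conversion is the only thing that links them.

// When refactoring, the interface can stay unchanged while extra functionality is offered through implicits.

// Because that functionality is implicit, it can also be removed again at any time during refactoring.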

scala> import scala.io.Source
import scala.io.Source

scala> import java.io.File
import java.io.File

scala> class RichFile(val file:File){
     | def read = Source.fromFile(file.getPath()).mkString
     | }

scala> class FileImplicits(path:String) extends File(path)
defined class FileImplicits

scala> object FileImplicits{
     | implicit def file2RichFile(file:File) = new RichFile(file)
     | }
warning: there were 1 feature warning(s); re-run with -feature for details
defined module FileImplicits
warning: previously defined class FileImplicits is not a companion to object FileImplicits.
Companions must be defined together; you may wish to use :paste mode for this.

scala> val file = new File("F:/FH11000001201601121730231716790929A")
file: java.io.File = F:\FH11000001201601121730231716790929A

scala> println(file.read)

// file is a plain java.io.File, so file.read compiles only when file2RichFile is in scope, for example after import FileImplicits._
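As a hedged, self-contained variant of the RichFile example, the sketch below imports the conversion explicitly and reads a temporary file instead of the path used above. RichFileDemo and roundTrip are names assumed for illustration.

```scala
import java.io.File
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import scala.io.Source
import scala.language.implicitConversions

class RichFile(val file: File) {
  // Reads the whole file as one string and closes the source afterwards.
  def read: String = {
    val source = Source.fromFile(file, "UTF-8")
    try source.mkString finally source.close()
  }
}

object FileImplicits {
  implicit def file2RichFile(file: File): RichFile = new RichFile(file)
}

object RichFileDemo {
  import FileImplicits._ // brings file2RichFile into scope

  // Writes text to a temporary file and reads it back through the enriched API.
  def roundTrip(text: String): String = {
    val tmp = File.createTempFile("richfile-demo", ".txt")
    try {
      Files.write(tmp.toPath, text.getBytes(StandardCharsets.UTF_8))
      tmp.read // compiled as file2RichFile(tmp).read
    } finally {
      tmp.delete()
    }
  }
}
```

Because java.io.File has no companion object that Scala code can extend, the conversion has to be imported; declaring the return type of file2RichFile also follows the advice given earlier about implicit definitions.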

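==========Implicit parameters============

An implicit parameter does not have to be passed by hand; the compiler supplies a value of the required type automatically.

An implicit value from the surrounding context is injected into the implicit parameter. The compiler chooses it by the parameter's type, not by its name.

If no such value is in scope, the compiler also looks for one in the companion object of the implicit parameter's type.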

scala> class Level(level:Int)
defined class Level

scala> def toWorker(name:String)(implicit level:Level)
     | {
     | println(name+":"+level)
     | }
toWorker: (name: String)(implicit level: Level)Unit

scala>  def toWorker(name:String)(implicit l:Level)=println(name+":"+l.level)
<console>:9: error: value level is not a member of Level
        def toWorker(name:String)(implicit l:Level)=println(name+":"+l.level)
                                                                       ^

// level:Int was declared without val, so it is only a private constructor parameter and not a member; add val to fix this

scala> class Level(val level:Int)
defined class Level

scala> implicit val level = new Level(8)
level: Level = Level@<hash>

scala> def toWorker(name:String)(implicit l:Level)=println(name+":"+l.level)
toWorker: (name: String)(implicit l: Level)Unit

scala> toWorker("Spark")
Spark:8
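The lookup order for implicit values can be sketched as follows. The sketch reuses the Level and toWorker names from the transcript, but the companion object, the default value of 1, and ImplicitParamDemo are assumptions added for illustration; toWorker returns the string instead of printing it so the result is easy to inspect.

```scala
class Level(val level: Int)

object Level {
  // Fallback implicit value: found through the companion object
  // when no implicit Level is available in the local scope.
  implicit val default: Level = new Level(1)
}

object ImplicitParamDemo {
  def toWorker(name: String)(implicit l: Level): String = name + ":" + l.level

  // Nothing in local scope, so the compiler falls back to Level.default.
  def withCompanionDefault: String = toWorker("Spark")

  // A local implicit value is found first and wins over the companion's value.
  def withLocalValue: String = {
    implicit val senior: Level = new Level(8)
    toWorker("Spark")
  }

  // An implicit parameter can always be passed explicitly as well.
  def passedExplicitly: String = toWorker("Spark")(new Level(3))
}
```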

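==========Implicit objects============

Implicit objects are used together with implicit parameters: in a sum function, the second parameter list is an implicit parameter, and an implicit object supplies its value.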
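The code for the sum example is not reproduced in these notes, so the following is only a sketch of what such a combination might look like. Adder, IntAdder, StringAdder, and ImplicitObjectDemo are hypothetical names.

```scala
// Hypothetical abstraction describing how values of type T are combined.
trait Adder[T] {
  def zero: T
  def add(x: T, y: T): T
}

object Adder {
  // Implicit objects: singleton instances the compiler may pass automatically.
  implicit object IntAdder extends Adder[Int] {
    def zero: Int = 0
    def add(x: Int, y: Int): Int = x + y
  }

  implicit object StringAdder extends Adder[String] {
    def zero: String = ""
    def add(x: String, y: String): String = x + y
  }
}

object ImplicitObjectDemo {
  // The second parameter list is implicit; the compiler chooses
  // IntAdder or StringAdder according to the element type T.
  def sum[T](xs: List[T])(implicit adder: Adder[T]): T =
    xs.foldLeft(adder.zero)(adder.add)
}
```

With this in place, ImplicitObjectDemo.sum(List(1, 2, 3, 4)) and ImplicitObjectDemo.sum(List("a", "b", "c")) both compile without an explicit second argument, because the implicit objects live in the companion of Adder.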

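==========Implicit classes============

1 is an Int, which has no addSAP method. RichInt has none either, so the compiler searches the implicits in scope for a class that provides addSAP. The name of that class does not matter: once it is found, its method is called.

"Implicit" means that you do not make the call by hand; the compiler inserts it for you.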
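A minimal sketch of the addSAP case follows. Only the method name addSAP comes from the notes; the enclosing object ContextImplicits, the class name Op, and the method body (plain addition) are assumptions.

```scala
object ContextImplicits {
  // The class name is irrelevant to callers; only the method it adds matters.
  implicit class Op(val x: Int) {
    def addSAP(second: Int): Int = x + second
  }
}

object ImplicitClassDemo {
  import ContextImplicits._ // puts the implicit class in scope

  // 1 is an Int without addSAP, so this is compiled as new Op(1).addSAP(2).
  def demo: Int = 1.addSAP(2)
}
```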

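==========Concurrent programming============

Concurrent programming is needed in virtually every development environment.

No expert can avoid it.

Scala's Actor plays a role similar to Java's Thread; Spark uses Akka.

Java threads typically coordinate through locks on shared global variables, which makes deadlocks a constant risk.

A golden rule for big data: never rely on a global locking mechanism.

Actors avoid locking by doing away with shared global variables: each actor's variables are private, and actors interact only by sending messages.

Note: Master extends Actor, and Worker sends messages to Master.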
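The idea of private state plus message passing can be sketched without any actor library. Because scala.actors is deprecated, the sketch below swaps in a plain thread and a java.util.concurrent.LinkedBlockingQueue as the mailbox; MiniActor and MiniActorDemo are hypothetical names, and this is not how Akka or Spark implement actors.

```scala
import java.util.concurrent.{CountDownLatch, LinkedBlockingQueue}

// Hypothetical mini actor: one mailbox, one worker thread, state private to that thread.
class MiniActor(expectedMessages: Int) {
  private val mailbox = new LinkedBlockingQueue[String]()
  private val done = new CountDownLatch(expectedMessages)
  private var greetings = List.empty[String] // written only by the worker thread

  private val worker = new Thread(new Runnable {
    def run(): Unit = {
      var remaining = expectedMessages
      while (remaining > 0) {
        val name = mailbox.take() // blocks until a message arrives
        greetings = ("Hi," + name) :: greetings
        remaining -= 1
        done.countDown()
      }
    }
  })
  worker.setDaemon(true)
  worker.start()

  // Asynchronous send: enqueue the message and return immediately.
  def !(name: String): Unit = mailbox.put(name)

  // Waits until all expected messages are processed, then returns results in arrival order.
  def awaitGreetings(): List[String] = {
    done.await()
    greetings.reverse
  }
}

object MiniActorDemo {
  def greetAll(names: List[String]): List[String] = {
    val actor = new MiniActor(names.size)
    names.foreach(actor ! _)
    actor.awaitGreetings()
  }
}
```

No caller ever locks the actor's state: senders only touch the queue, and the latch is used solely so the demo can wait for the worker to finish.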

Example 1:

scala> import scala.actors.Actor
import scala.actors.Actor

scala> class HiActor extends Actor{
     | def act(){              // comparable to run() in a Java Thread
     | while(true){
     | receive{
     | case name:String => println("Hi,"+name)
     | }
     | }
     | }
     | }
defined class HiActor

scala> val actor = new HiActor
actor: HiActor = HiActor@<hash>

scala> actor.start()
res3: scala.actors.Actor = HiActor@<hash>

scala> actor!"Spark"
Hi,Spark

Example 2:

scala> case class Basic(name:String,age:Int)
defined class Basic

scala> case class Worker(name:String,age:Int)
defined class Worker

scala> class basicActor extends Actor{
     | def act(){
     | while(true){
     | receive{
     | case Basic(name,age)=>println("Basic Information:"+name+"|"+age)
     | case Worker(name,age)=>println("Worker Infomation:"+name+"|"+age)
     | }
     | }
     | }
     | }
defined class basicActor

scala> val b = new basicActor
b: basicActor = basicActor@<hash>

scala> b.start
res5: scala.actors.Actor = basicActor@<hash>

scala> b ! Basic("Tom",13)
Basic Information:Tom|13

scala> b ! Worker("Jack",17)
Worker Infomation:Jack|17

In Spark's Worker you can see it send a message to Master and receive the reply. A send with ! is asynchronous.

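==========Synchronous messaging============

When the sender needs the result before it can continue, use !?, which sends synchronously and blocks until the receiver replies.

scala> val result = b !? Worker("Jack",17)
Worker Infomation:Jack|17

The REPL then hangs: !? waits for a reply, and basicActor never sends one.

!! sends the message and immediately returns a future; the reply becomes available at some later point in time.

scala> val future = a !! M          // a and M stand for some actor and some message

scala> val result = future()        // applying the future blocks until the reply arrives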
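The difference between waiting for a reply and receiving a future can be sketched with the standard scala.concurrent API, which is used here as a stand-in for the deprecated scala.actors operators. AskDemo, handle, ask, and askAndWait are hypothetical names, and the five-second timeout is an arbitrary choice.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object AskDemo {
  // Stand-in for the work an actor would do when it receives a Worker message.
  private def handle(name: String, age: Int): String =
    "Worker Information:" + name + "|" + age

  // In the spirit of !!: returns immediately with a future for the eventual reply.
  def ask(name: String, age: Int): Future[String] = Future(handle(name, age))

  // In the spirit of !?: blocks the caller until the reply is available.
  def askAndWait(name: String, age: Int): String =
    Await.result(ask(name, age), 5.seconds)
}
```

The imported global execution context is itself an implicit value that fills the implicit parameter of Future.apply, so this sketch also shows implicit parameters at work in the standard library.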

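Homework:

Read the source of DAGScheduler, Master, Worker, and RDD, and analyze the implicit conversions and the concurrent message passing in them.

***********DAGScheduler*************

~~~1. Implicit conversions~~~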

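  /**
   * Return a new RDD containing the distinct elements in this RDD.
   */
  def distinct(numPartitions: Int)(implicit ord: Ordering[T] = null): RDD[T] = withScope {
    map(x => (x, null)).reduceByKey((x, y) => x, numPartitions).map(_._1)
  }

An implicit parameter: ord is an Ordering[T], which defines how elements are ordered (it is not a sequence number), and it defaults to null when no implicit Ordering[T] is available.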
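How an implicit parameter with a null default behaves can be sketched in isolation. DefaultImplicitDemo, Tag, and sortedIfPossible are hypothetical names; the sketch mirrors only the shape of the signature above, not what Spark does with the ordering.

```scala
object DefaultImplicitDemo {
  // Hypothetical element type for which no Ordering is defined.
  case class Tag(label: String)

  // If an Ordering[T] can be found implicitly it is used to sort;
  // otherwise ord takes its default value null and the input order is kept.
  def sortedIfPossible[T](xs: List[T])(implicit ord: Ordering[T] = null): List[T] =
    if (ord == null) xs else xs.sorted(ord)
}
```

For List(3, 1, 2) the compiler finds Ordering[Int] and the list is sorted; for a list of Tag values no ordering exists, so the call still compiles and the default null is passed instead.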

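  implicit def rddToPairRDDFunctions[K, V](rdd: RDD[(K, V)])
    (implicit kt: ClassTag[K], vt: ClassTag[V], ord: Ordering[K] = null): PairRDDFunctions[K, V] = {
    new PairRDDFunctions(rdd)
  }

An implicit function: it converts an RDD[(K, V)] into PairRDDFunctions[K, V], and it takes implicit parameters of its own.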
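The pattern of an implicit conversion that itself takes implicit parameters can be sketched with stand-in types. MiniRDD, PairFunctions, and PairDemo are hypothetical and much simpler than the Spark classes; reduceByKey here works on an in-memory Seq and returns a Map.

```scala
import scala.language.implicitConversions
import scala.reflect.ClassTag

// MiniRDD and PairFunctions are hypothetical stand-ins, not Spark classes.
class MiniRDD[T](val elems: Seq[T])

class PairFunctions[K, V](self: MiniRDD[(K, V)])(implicit kt: ClassTag[K], vt: ClassTag[V]) {
  // Groups the pairs by key and combines the values of each key with f.
  def reduceByKey(f: (V, V) => V): Map[K, V] =
    self.elems.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2).reduce(f) }

  // Shows that the ClassTag resolved at the call site is available at runtime.
  def keyClassName: String = kt.runtimeClass.getName
}

object MiniRDD {
  // Applies only to a MiniRDD of pairs; the ClassTags are supplied by the compiler.
  implicit def toPairFunctions[K, V](rdd: MiniRDD[(K, V)])
      (implicit kt: ClassTag[K], vt: ClassTag[V]): PairFunctions[K, V] =
    new PairFunctions(rdd)
}

object PairDemo {
  def wordCount(words: Seq[String]): Map[String, Int] =
    new MiniRDD(words.map(w => (w, 1))).reduceByKey(_ + _)

  def keyClass(words: Seq[String]): String =
    new MiniRDD(words.map(w => (w, 1))).keyClassName
}
```

Because the conversion sits in the companion object of MiniRDD, reduceByKey is available on any MiniRDD of pairs without an import, while a MiniRDD of plain values does not get it.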

  implicit def rddToSequenceFileRDDFunctions[K, V](rdd: RDD[(K, V)])
      (implicit kt: ClassTag[K], vt: ClassTag[V],
                keyWritableFactory: WritableFactory[K],
                valueWritableFactory: WritableFactory[V])
    : SequenceFileRDDFunctions[K, V] = {
    implicit val keyConverter = keyWritableFactory.convert
    implicit val valueConverter = valueWritableFactory.convert
    new SequenceFileRDDFunctions(rdd,
      keyWritableFactory.writableClass(kt), valueWritableFactory.writableClass(vt))
  }

An implicit method.

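~~~2. Concurrent programming~~~

***********Master*************

~~~1. Implicit conversions~~~

~~~2. Concurrent programming~~~

***********Worker*************

~~~1. Implicit conversions~~~

~~~2. Concurrent programming~~~

***********RDD*************

~~~1. Implicit conversions~~~

~~~2. Concurrent programming~~~

I could not find any Actor-related code in these classes; I will come back to this later.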


This article comes from the blog “一枝花傲寒”; please do not repost it.
