SparkStreaming --- Simple Demo (NetCat)

Posted by Shall潇



This article shows how to use Spark Streaming to read and process data received through Netcat.

I. Add the Dependencies

<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.11</artifactId>
  <version>2.4.7</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.11</artifactId>
  <version>2.4.7</version>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming_2.11</artifactId>
  <version>2.4.7</version>
</dependency>
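
If your build uses sbt instead of Maven, a minimal equivalent (a sketch, assuming the same Scala 2.11 / Spark 2.4.7 versions as the Maven coordinates above) would look like this:

// build.sbt -- sketch; versions assumed to match the Maven block above
scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"      % "2.4.7",
  "org.apache.spark" %% "spark-sql"       % "2.4.7",
  "org.apache.spark" %% "spark-streaming" % "2.4.7"
)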

II. Write the Program

import org.apache.spark.streaming.dstream.{DStream, ReceiverInputDStream}
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.SparkConf

object SparkStreamingDemo1 {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[*]").setAppName("SparkStreaming-1")

    // Create the streaming context with a 3-second batch interval
    val streamingContext = new StreamingContext(conf, Seconds(3))

    // Acquire data --- read a text stream from a Netcat socket
    val rid: ReceiverInputDStream[String] = streamingContext.socketTextStream("192.168.XXX.100", 7777)

    // Process the data --- word count
    val wordStream: DStream[String] = rid.flatMap(_.split("\\s+"))  // flatMap, because each line of the stream may contain several words
    val mapStream: DStream[(String, Int)] = wordStream.map(x => (x, 1))
    val wordcountStream = mapStream.reduceByKey(_ + _)

    // Print the result of each batch
    wordcountStream.print()

    // Start the receiver and wait for the computation to terminate
    streamingContext.start()
    streamingContext.awaitTermination()
  }
}
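
The job above only counts words within each 3-second batch. As a variant sketch (not part of the original demo), the same word stream can be aggregated over a sliding window with reduceByKeyAndWindow; the lines below would replace the map / reduceByKey / print steps inside main, and the window and slide durations must be multiples of the 3-second batch interval:

    // Variant sketch: count words over a 30-second window that slides every 9 seconds
    val windowedCounts: DStream[(String, Int)] = wordStream
      .map((_, 1))
      .reduceByKeyAndWindow((a: Int, b: Int) => a + b, Seconds(30), Seconds(9))

    windowedCounts.print()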

III. Test

1. Start a Netcat server listening on port 7777 (-l puts nc in listen mode, -k keeps it listening after a client disconnects):

nc -lk 7777

2. Run the program.

3. Type data into the Netcat terminal and check that the results are processed.
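
For example, after typing hello spark hello into the Netcat terminal, the program console should print a batch result roughly like this (the timestamp is illustrative):

-------------------------------------------
Time: 1620000000000 ms
-------------------------------------------
(hello,2)
(spark,1)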
