Installing Spark on CentOS 7

Posted by hobinly


Environment:

centos7

hadoop 2.7.3

java 1.8

scala

 

Download:

http://spark.apache.org

Extract the archive to an installation directory.

The location is up to you; I installed Spark under the same parent directory as Hadoop.
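As a sketch, the download and extraction can be done from the command line. The version and mirror here are assumptions (2.1.1 matches the spark-shell banner shown later in this post); adjust both to the release you actually need:

```shell
# Assumed version and mirror; the Hadoop-2.7 build matches this post's Hadoop 2.7.3.
SPARK_VERSION=2.1.1
PKG="spark-${SPARK_VERSION}-bin-hadoop2.7"
wget "https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/${PKG}.tgz"
# Extract next to Hadoop and rename to a version-free path, as this post assumes.
tar -xzf "${PKG}.tgz" -C /home/hadoop
mv "/home/hadoop/${PKG}" /home/hadoop/spark
```

Renaming to `/home/hadoop/spark` keeps the paths used in the configuration below stable across Spark upgrades.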

 

Configuration (cd into the conf directory of the Spark installation):

cp log4j.properties.template log4j.properties
cp spark-env.sh.template spark-env.sh
cp slaves.template slaves

Append the following lines to spark-env.sh to point Spark at the Hadoop, Spark, and Scala installations:

export SPARK_DIST_CLASSPATH=$(/home/hadoop/hadoop-2.7.3/bin/hadoop classpath)
export SPARK_HOME=/home/hadoop/spark
export SCALA_HOME=/home/hadoop/scala
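The three exports can be appended in one step with a heredoc (the paths match this post's layout; adjust them to yours). Run this from the conf directory:

```shell
# Append the environment exports to spark-env.sh; the single-quoted heredoc
# delimiter ('EOF') keeps $(...) from being evaluated while writing the file,
# so `hadoop classpath` runs when Spark starts, not now.
cat >> spark-env.sh <<'EOF'
export SPARK_DIST_CLASSPATH=$(/home/hadoop/hadoop-2.7.3/bin/hadoop classpath)
export SPARK_HOME=/home/hadoop/spark
export SCALA_HOME=/home/hadoop/scala
EOF
```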

Append the slave (worker) machines to the end of the slaves file, one hostname per line.
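For example, with two hypothetical workers named slave1 and slave2 (placeholder hostnames, not from this post; substitute your own):

```shell
# Hypothetical worker hostnames; replace with your own, one per line.
cat >> slaves <<'EOF'
slave1
slave2
EOF
```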

 

Start:

sbin/start-master.sh
sbin/start-slaves.sh
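Spark's standalone scripts also include a combined launcher that runs both steps at once, plus a matching stop script. A sketch (run from the Spark install directory on the master node, assuming passwordless SSH to the workers listed in conf/slaves):

```shell
# Start the master, then every worker listed in conf/slaves.
sbin/start-all.sh
# Later, shut the whole standalone cluster down.
sbin/stop-all.sh
```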

 

Check whether Spark is running:

http://yourIp:8080

 

Running an example application

(the master URL is shown at http://yourIp:8080)

bin/spark-shell --master spark://master:7077

[hadoop@master spark]$ bin/spark-shell --master spark://master:7077
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/spark/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
17/06/06 04:01:17 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/06/06 04:01:29 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
Spark context Web UI available at http://10.12.1.102:4040
Spark context available as 'sc' (master = spark://master:7077, app id = app-20170606040119-0002).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.1.1
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_112)
Type in expressions to have them evaluated.
Type :help for more information.

scala>

 

Official quick-start guide: http://spark.apache.org/docs/latest/quick-start.html

 

scala> var textfile=sc.textFile("hdfs://master:9000/user/lihb/in/*.log")
textfile: org.apache.spark.rdd.RDD[String] = hdfs://master:9000/user/lihb/in/*.log MapPartitionsRDD[1] at textFile at <console>:24

scala> textfile.first()
res5: String = #Software: IIS Advanced Logging Module

scala> textfile.count()
res7: Long = 32583

scala> val wordCounts=textfile.flatMap(line=>line.split(" ")).map(word=>(word,1)).reduceByKey((a,b)=>a+b)
wordCounts: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[4] at reduceByKey at <console>:26

scala> wordCounts.collect()
res8: Array[(String, Int)] = Array((/space/attentionto/99335/,1), (01:41:27.777,1),  (01:45:...
scala>
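The flatMap → map → reduceByKey pipeline above is the classic word count. As a rough analog, the same three stages can be mimicked in plain shell on a small local file (the sample line here is made up), which is handy for sanity-checking Spark's numbers:

```shell
# flatMap: split each line into one word per line (tr),
# map + reduceByKey: group identical words and count them (sort | uniq -c),
# then order by frequency, highest first (sort -rn).
printf 'to be or not to be\n' | tr ' ' '\n' | sort | uniq -c | sort -rn
```

Unlike Spark, this runs on a single machine, but the counts for a given input should agree.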

 
