Installing Spark on a Single Linux Machine

Posted by aston



Before installing Spark, you must first install the JDK and Scala.
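A quick way to confirm both prerequisites are on the PATH before continuing (this check only reports what it finds; it does not install anything):

```shell
# Check that the java and scala commands are available.
for cmd in java scala; do
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "$cmd: found"
  else
    echo "$cmd: NOT found - install it before continuing"
  fi
done
```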

 

1. Create the installation directory

> mkdir  /opt/spark

> cd  /opt/spark

 

2. Extract the archive and create a symlink

> tar zxvf spark-2.3.0-bin-hadoop2.7.tgz

> ln -s spark-2.3.0-bin-hadoop2.7 spark
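The symlink lets the rest of the configuration refer to the version-independent path /opt/spark/spark, so a future upgrade only means re-pointing the link. The pattern can be demonstrated in a throwaway scratch directory (these temp paths are illustrative, not the real install):

```shell
# Demonstrate the versioned-dir + stable-symlink pattern in a scratch directory.
tmp=$(mktemp -d)
mkdir "$tmp/spark-2.3.0-bin-hadoop2.7"
ln -s "$tmp/spark-2.3.0-bin-hadoop2.7" "$tmp/spark"
readlink "$tmp/spark"    # shows the versioned directory the link points at
rm -rf "$tmp"
```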

 

3. Edit /etc/profile

> vi /etc/profile

Add the following lines:

export SPARK_HOME=/opt/spark/spark
export PATH=$PATH:$SPARK_HOME/bin

Then reload the profile:

> source /etc/profile
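You can sanity-check those two lines in isolation by sourcing them from a temporary file before touching the system-wide /etc/profile:

```shell
# Source the two export lines from a temp file and verify the result.
profile=$(mktemp)
cat > "$profile" <<'EOF'
export SPARK_HOME=/opt/spark/spark
export PATH=$PATH:$SPARK_HOME/bin
EOF
. "$profile"
echo "$SPARK_HOME"                 # /opt/spark/spark
echo "$PATH" | grep -o '/opt/spark/spark/bin'   # the bin dir is now on PATH
rm -f "$profile"
```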

 

4. Enter the configuration directory

> cd /opt/spark/spark/conf

 

5. Configure spark-env.sh

> cp spark-env.sh.template spark-env.sh

Add the following to spark-env.sh:

export SCALA_HOME=/opt/scala/scala
export JAVA_HOME=/opt/java/jdk
export SPARK_HOME=/opt/spark/spark
export SPARK_MASTER_IP=hserver1
export SPARK_EXECUTOR_MEMORY=1G

Note: adjust the paths above to match your own installation, and set SPARK_MASTER_IP to your own machine's hostname or IP (hserver1 here is just an example).

 

6. Configure the slaves file

> cp slaves.template slaves

Add the following to slaves (in a single-machine setup, the local host is the only worker):

localhost

 

7. Run a Spark example

> cd /opt/spark/spark

> ./bin/run-example SparkPi 10

You should see output similar to the following:

[aston@localhost spark]$ ./bin/run-example SparkPi 10
2018-06-04 22:37:25 WARN  Utils:66 - Your hostname, localhost.localdomain resolves to a loopback address: 127.0.0.1; using 192.168.199.150 instead (on interface wlp8s0b1)
2018-06-04 22:37:25 WARN  Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
2018-06-04 22:37:25 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-06-04 22:37:25 INFO  SparkContext:54 - Running Spark version 2.3.0
2018-06-04 22:37:25 INFO  SparkContext:54 - Submitted application: Spark Pi
2018-06-04 22:37:26 INFO  SecurityManager:54 - Changing view acls to: aston
2018-06-04 22:37:26 INFO  SecurityManager:54 - Changing modify acls to: aston
2018-06-04 22:37:26 INFO  SecurityManager:54 - Changing view acls groups to: 
2018-06-04 22:37:26 INFO  SecurityManager:54 - Changing modify acls groups to: 
2018-06-04 22:37:26 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(aston); groups with view permissions: Set(); users  with modify permissions: Set(aston); groups with modify permissions: Set()
2018-06-04 22:37:26 INFO  Utils:54 - Successfully started service 'sparkDriver' on port 34729.
2018-06-04 22:37:26 INFO  SparkEnv:54 - Registering MapOutputTracker
2018-06-04 22:37:26 INFO  SparkEnv:54 - Registering BlockManagerMaster
2018-06-04 22:37:26 INFO  BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2018-06-04 22:37:26 INFO  BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
2018-06-04 22:37:26 INFO  DiskBlockManager:54 - Created local directory at /tmp/blockmgr-4d51d515-85db-4a8c-bb45-219fd96be3c6
2018-06-04 22:37:26 INFO  MemoryStore:54 - MemoryStore started with capacity 366.3 MB
2018-06-04 22:37:26 INFO  SparkEnv:54 - Registering OutputCommitCoordinator
2018-06-04 22:37:26 INFO  log:192 - Logging initialized @2296ms
2018-06-04 22:37:26 INFO  Server:346 - jetty-9.3.z-SNAPSHOT
2018-06-04 22:37:26 INFO  Server:414 - Started @2382ms
2018-06-04 22:37:26 INFO  AbstractConnector:278 - Started ServerConnector@779dfe55{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2018-06-04 22:37:26 INFO  Utils:54 - Successfully started service 'SparkUI' on port 4040.
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5f212d84{/jobs,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@27ead29e{/jobs/json,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4c060c8f{/jobs/job,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@383f3558{/jobs/job/json,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@49b07ee3{/stages,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@352e612e{/stages/json,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@65f00478{/stages/stage,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@28486680{/stages/stage/json,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4d7e7435{/stages/pool,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4a1e3ac1{/stages/pool/json,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6e78fcf5{/storage,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@56febdc{/storage/json,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3b8ee898{/storage/rdd,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7d151a{/storage/rdd/json,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@294bdeb4{/environment,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5300f14a{/environment/json,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1f86099a{/executors,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@77bb0ab5{/executors/json,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@f2c488{/executors/threadDump,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@54acff7d{/executors/threadDump/json,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7bc9e6ab{/static,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@37d00a23{/,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@433e536f{/api,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@988246e{/jobs/job/kill,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@62515a47{/stages/stage/kill,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO  SparkUI:54 - Bound SparkUI to 0.0.0.0, and started at http://192.168.199.150:4040
2018-06-04 22:37:26 INFO  SparkContext:54 - Added JAR file:///opt/spark/spark/examples/jars/spark-examples_2.11-2.3.0.jar at spark://192.168.199.150:34729/jars/spark-examples_2.11-2.3.0.jar with timestamp 1528123046779
2018-06-04 22:37:26 INFO  SparkContext:54 - Added JAR file:///opt/spark/spark/examples/jars/scopt_2.11-3.7.0.jar at spark://192.168.199.150:34729/jars/scopt_2.11-3.7.0.jar with timestamp 1528123046780
2018-06-04 22:37:26 INFO  Executor:54 - Starting executor ID driver on host localhost
2018-06-04 22:37:26 INFO  Utils:54 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 45436.
2018-06-04 22:37:26 INFO  NettyBlockTransferService:54 - Server created on 192.168.199.150:45436
2018-06-04 22:37:26 INFO  BlockManager:54 - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
2018-06-04 22:37:26 INFO  BlockManagerMaster:54 - Registering BlockManager BlockManagerId(driver, 192.168.199.150, 45436, None)
2018-06-04 22:37:26 INFO  BlockManagerMasterEndpoint:54 - Registering block manager 192.168.199.150:45436 with 366.3 MB RAM, BlockManagerId(driver, 192.168.199.150, 45436, None)
2018-06-04 22:37:26 INFO  BlockManagerMaster:54 - Registered BlockManager BlockManagerId(driver, 192.168.199.150, 45436, None)
2018-06-04 22:37:26 INFO  BlockManager:54 - Initialized BlockManager: BlockManagerId(driver, 192.168.199.150, 45436, None)
2018-06-04 22:37:27 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@65bcf7c2{/metrics/json,null,AVAILABLE,@Spark}
2018-06-04 22:37:27 INFO  SparkContext:54 - Starting job: reduce at SparkPi.scala:38
2018-06-04 22:37:27 INFO  DAGScheduler:54 - Got job 0 (reduce at SparkPi.scala:38) with 10 output partitions
2018-06-04 22:37:27 INFO  DAGScheduler:54 - Final stage: ResultStage 0 (reduce at SparkPi.scala:38)
2018-06-04 22:37:27 INFO  DAGScheduler:54 - Parents of final stage: List()
2018-06-04 22:37:27 INFO  DAGScheduler:54 - Missing parents: List()
2018-06-04 22:37:27 INFO  DAGScheduler:54 - Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34), which has no missing parents
2018-06-04 22:37:27 INFO  MemoryStore:54 - Block broadcast_0 stored as values in memory (estimated size 1832.0 B, free 366.3 MB)
2018-06-04 22:37:28 INFO  MemoryStore:54 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 1181.0 B, free 366.3 MB)
2018-06-04 22:37:28 INFO  BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on 192.168.199.150:45436 (size: 1181.0 B, free: 366.3 MB)
2018-06-04 22:37:28 INFO  SparkContext:54 - Created broadcast 0 from broadcast at DAGScheduler.scala:1039
2018-06-04 22:37:28 INFO  DAGScheduler:54 - Submitting 10 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9))
2018-06-04 22:37:28 INFO  TaskSchedulerImpl:54 - Adding task set 0.0 with 10 tasks
2018-06-04 22:37:28 INFO  TaskSetManager:54 - Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 7853 bytes)
2018-06-04 22:37:28 INFO  TaskSetManager:54 - Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, PROCESS_LOCAL, 7853 bytes)
2018-06-04 22:37:28 INFO  TaskSetManager:54 - Starting task 2.0 in stage 0.0 (TID 2, localhost, executor driver, partition 2, PROCESS_LOCAL, 7853 bytes)
2018-06-04 22:37:28 INFO  TaskSetManager:54 - Starting task 3.0 in stage 0.0 (TID 3, localhost, executor driver, partition 3, PROCESS_LOCAL, 7853 bytes)
2018-06-04 22:37:28 INFO  Executor:54 - Running task 2.0 in stage 0.0 (TID 2)
2018-06-04 22:37:28 INFO  Executor:54 - Running task 1.0 in stage 0.0 (TID 1)
2018-06-04 22:37:28 INFO  Executor:54 - Running task 3.0 in stage 0.0 (TID 3)
2018-06-04 22:37:28 INFO  Executor:54 - Running task 0.0 in stage 0.0 (TID 0)
2018-06-04 22:37:28 INFO  Executor:54 - Fetching spark://192.168.199.150:34729/jars/scopt_2.11-3.7.0.jar with timestamp 1528123046780
2018-06-04 22:37:28 INFO  TransportClientFactory:267 - Successfully created connection to /192.168.199.150:34729 after 34 ms (0 ms spent in bootstraps)
2018-06-04 22:37:28 INFO  Utils:54 - Fetching spark://192.168.199.150:34729/jars/scopt_2.11-3.7.0.jar to /tmp/spark-a840c54e-7db9-4dfc-a446-1fa10a8d2c3e/userFiles-36ae13de-60e8-42fd-958d-66c3c3832d4a/fetchFileTemp8606784681518533462.tmp
2018-06-04 22:37:28 INFO  Executor:54 - Adding file:/tmp/spark-a840c54e-7db9-4dfc-a446-1fa10a8d2c3e/userFiles-36ae13de-60e8-42fd-958d-66c3c3832d4a/scopt_2.11-3.7.0.jar to class loader
2018-06-04 22:37:28 INFO  Executor:54 - Fetching spark://192.168.199.150:34729/jars/spark-examples_2.11-2.3.0.jar with timestamp 1528123046779
2018-06-04 22:37:28 INFO  Utils:54 - Fetching spark://192.168.199.150:34729/jars/spark-examples_2.11-2.3.0.jar to /tmp/spark-a840c54e-7db9-4dfc-a446-1fa10a8d2c3e/userFiles-36ae13de-60e8-42fd-958d-66c3c3832d4a/fetchFileTemp8435156876449095794.tmp
2018-06-04 22:37:28 INFO  Executor:54 - Adding file:/tmp/spark-a840c54e-7db9-4dfc-a446-1fa10a8d2c3e/userFiles-36ae13de-60e8-42fd-958d-66c3c3832d4a/spark-examples_2.11-2.3.0.jar to class loader
2018-06-04 22:37:28 INFO  Executor:54 - Finished task 0.0 in stage 0.0 (TID 0). 824 bytes result sent to driver
2018-06-04 22:37:28 INFO  Executor:54 - Finished task 1.0 in stage 0.0 (TID 1). 824 bytes result sent to driver
2018-06-04 22:37:28 INFO  TaskSetManager:54 - Starting task 4.0 in stage 0.0 (TID 4, localhost, executor driver, partition 4, PROCESS_LOCAL, 7853 bytes)
2018-06-04 22:37:28 INFO  Executor:54 - Finished task 2.0 in stage 0.0 (TID 2). 867 bytes result sent to driver
2018-06-04 22:37:28 INFO  Executor:54 - Running task 4.0 in stage 0.0 (TID 4)
2018-06-04 22:37:28 INFO  TaskSetManager:54 - Starting task 5.0 in stage 0.0 (TID 5, localhost, executor driver, partition 5, PROCESS_LOCAL, 7853 bytes)
2018-06-04 22:37:28 INFO  Executor:54 - Running task 5.0 in stage 0.0 (TID 5)
2018-06-04 22:37:28 INFO  Executor:54 - Finished task 3.0 in stage 0.0 (TID 3). 824 bytes result sent to driver
2018-06-04 22:37:28 INFO  TaskSetManager:54 - Starting task 6.0 in stage 0.0 (TID 6, localhost, executor driver, partition 6, PROCESS_LOCAL, 7853 bytes)
2018-06-04 22:37:28 INFO  TaskSetManager:54 - Starting task 7.0 in stage 0.0 (TID 7, localhost, executor driver, partition 7, PROCESS_LOCAL, 7853 bytes)
2018-06-04 22:37:28 INFO  Executor:54 - Running task 7.0 in stage 0.0 (TID 7)
2018-06-04 22:37:28 INFO  Executor:54 - Running task 6.0 in stage 0.0 (TID 6)
2018-06-04 22:37:28 INFO  TaskSetManager:54 - Finished task 1.0 in stage 0.0 (TID 1) in 362 ms on localhost (executor driver) (1/10)
2018-06-04 22:37:28 INFO  TaskSetManager:54 - Finished task 3.0 in stage 0.0 (TID 3) in 385 ms on localhost (executor driver) (2/10)
2018-06-04 22:37:28 INFO  TaskSetManager:54 - Finished task 0.0 in stage 0.0 (TID 0) in 418 ms on localhost (executor driver) (3/10)
2018-06-04 22:37:28 INFO  TaskSetManager:54 - Finished task 2.0 in stage 0.0 (TID 2) in 388 ms on localhost (executor driver) (4/10)
2018-06-04 22:37:28 INFO  Executor:54 - Finished task 5.0 in stage 0.0 (TID 5). 824 bytes result sent to driver
2018-06-04 22:37:28 INFO  TaskSetManager:54 - Starting task 8.0 in stage 0.0 (TID 8, localhost, executor driver, partition 8, PROCESS_LOCAL, 7853 bytes)
2018-06-04 22:37:28 INFO  TaskSetManager:54 - Finished task 5.0 in stage 0.0 (TID 5) in 79 ms on localhost (executor driver) (5/10)
2018-06-04 22:37:28 INFO  Executor:54 - Running task 8.0 in stage 0.0 (TID 8)
2018-06-04 22:37:28 INFO  Executor:54 - Finished task 4.0 in stage 0.0 (TID 4). 824 bytes result sent to driver
2018-06-04 22:37:28 INFO  TaskSetManager:54 - Starting task 9.0 in stage 0.0 (TID 9, localhost, executor driver, partition 9, PROCESS_LOCAL, 7853 bytes)
2018-06-04 22:37:28 INFO  Executor:54 - Running task 9.0 in stage 0.0 (TID 9)
2018-06-04 22:37:28 INFO  TaskSetManager:54 - Finished task 4.0 in stage 0.0 (TID 4) in 99 ms on localhost (executor driver) (6/10)
2018-06-04 22:37:28 INFO  Executor:54 - Finished task 7.0 in stage 0.0 (TID 7). 824 bytes result sent to driver
2018-06-04 22:37:28 INFO  TaskSetManager:54 - Finished task 7.0 in stage 0.0 (TID 7) in 98 ms on localhost (executor driver) (7/10)
2018-06-04 22:37:28 INFO  Executor:54 - Finished task 6.0 in stage 0.0 (TID 6). 824 bytes result sent to driver
2018-06-04 22:37:28 INFO  TaskSetManager:54 - Finished task 6.0 in stage 0.0 (TID 6) in 107 ms on localhost (executor driver) (8/10)
2018-06-04 22:37:28 INFO  Executor:54 - Finished task 9.0 in stage 0.0 (TID 9). 824 bytes result sent to driver
2018-06-04 22:37:28 INFO  Executor:54 - Finished task 8.0 in stage 0.0 (TID 8). 867 bytes result sent to driver
2018-06-04 22:37:28 INFO  TaskSetManager:54 - Finished task 9.0 in stage 0.0 (TID 9) in 39 ms on localhost (executor driver) (9/10)
2018-06-04 22:37:28 INFO  TaskSetManager:54 - Finished task 8.0 in stage 0.0 (TID 8) in 57 ms on localhost (executor driver) (10/10)
2018-06-04 22:37:28 INFO  TaskSchedulerImpl:54 - Removed TaskSet 0.0, whose tasks have all completed, from pool 
2018-06-04 22:37:28 INFO  DAGScheduler:54 - ResultStage 0 (reduce at SparkPi.scala:38) finished in 0.800 s
2018-06-04 22:37:28 INFO  DAGScheduler:54 - Job 0 finished: reduce at SparkPi.scala:38, took 0.945853 s
Pi is roughly 3.14023914023914
2018-06-04 22:37:28 INFO  AbstractConnector:318 - Stopped Spark@779dfe55{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2018-06-04 22:37:28 INFO  SparkUI:54 - Stopped Spark web UI at http://192.168.199.150:4040
2018-06-04 22:37:28 INFO  MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
2018-06-04 22:37:28 INFO  MemoryStore:54 - MemoryStore cleared
2018-06-04 22:37:28 INFO  BlockManager:54 - BlockManager stopped
2018-06-04 22:37:28 INFO  BlockManagerMaster:54 - BlockManagerMaster stopped
2018-06-04 22:37:28 INFO  OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
2018-06-04 22:37:28 INFO  SparkContext:54 - Successfully stopped SparkContext
2018-06-04 22:37:28 INFO  ShutdownHookManager:54 - Shutdown hook called
2018-06-04 22:37:28 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-a840c54e-7db9-4dfc-a446-1fa10a8d2c3e
2018-06-04 22:37:28 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-16300765-9872-4542-91ed-1a7a0f8285d9
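The "Pi is roughly 3.14..." line comes from a Monte Carlo estimate: SparkPi samples random points in the square [-1,1]×[-1,1] and multiplies the fraction that lands inside the unit circle by 4, distributing the sampling across the 10 partitions you asked for. The same math can be sketched without Spark in plain awk (a standalone illustration, not part of the install):

```shell
# Monte Carlo estimate of Pi: fraction of random points inside the unit circle, times 4.
awk 'BEGIN {
  srand(1); n = 100000; hits = 0
  for (i = 0; i < n; i++) {
    x = 2 * rand() - 1; y = 2 * rand() - 1   # random point in [-1,1] x [-1,1]
    if (x * x + y * y <= 1) hits++           # inside the unit circle?
  }
  printf "Pi is roughly %f\n", 4 * hits / n
}'
```

With 100000 samples the estimate typically lands within a few hundredths of 3.14159; SparkPi's accuracy improves the same way as you raise the partition count.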

 

8. Run the Spark shell

> cd /opt/spark/spark

> ./bin/spark-shell
