Spark configuration: shuffle-related options
Posted by 流浪在伯纳乌
On the master, set the following in conf/spark-defaults.conf:
spark.shuffle.service.enabled true
spark.shuffle.service.port 7337
On the worker nodes, however, comment out these two options in spark-defaults.conf; otherwise the workers do not show up in the web UI.
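Commenting out the two options by hand on every worker is easy to get wrong, so here is a minimal sketch of how it could be scripted. The function name `disable_shuffle_service_opts` and the `.bak` backup suffix are illustrative choices, not part of Spark; the sketch only assumes a `sed` that accepts `-E` and `-i.bak`.

```shell
#!/usr/bin/env bash
# Comment out the external shuffle service options in a worker's
# spark-defaults.conf. The function name and the .bak backup suffix are
# hypothetical, chosen for this example only.
disable_shuffle_service_opts() {
  local conf_file="$1"
  # Prefix the two matching keys with '#'. Lines that are already
  # commented do not match, because they do not start with the key.
  sed -i.bak -E \
    's/^([[:space:]]*spark\.shuffle\.service\.(enabled|port)[[:space:]=])/#\1/' \
    "$conf_file"
}
```

On a worker it might be called as `disable_shuffle_service_opts "$SPARK_HOME/conf/spark-defaults.conf"`, assuming `SPARK_HOME` points at the Spark installation. The original file is kept next to it with a `.bak` suffix.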
spark-defaults.conf:
# Directories for the temporary files produced during shuffle
spark.local.dir /mnt/diskb/sparklocal,/mnt/diskc/sparklocal,/mnt/diskd/sparklocal,/mnt/diske/sparklocal,/mnt/diskf/sparklocal,/mnt/diskg/sparklocal
# Record the Spark event log
spark.eventLog.enabled true
# Store the event log on HDFS
spark.eventLog.dir hdfs://nameservice1/spark-log
spark.network.timeout 450
spark.dynamicAllocation.enabled true
spark.dynamicAllocation.minExecutors 8
spark.dynamicAllocation.maxExecutors 30
spark.dynamicAllocation.schedulerBacklogTimeout 1s
spark.dynamicAllocation.sustainedSchedulerBacklogTimeout 5s
spark.io.compression.codec snappy
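In the Spark versions this setup appears to target, dynamic allocation depends on the external shuffle service, so the `spark.dynamicAllocation.*` options above only make sense together with `spark.shuffle.service.enabled true`. The following is a rough sketch of a sanity check for that pairing. Both function names are hypothetical helpers, not Spark commands, and the sketch assumes that the file being checked is the one that also carries the shuffle service options shown earlier.

```shell
#!/usr/bin/env bash
# Print the value of one key from a spark-defaults.conf-style file
# (key and value separated by whitespace or '=', '#' starts a comment line).
# get_spark_conf is a hypothetical helper name.
get_spark_conf() {
  local conf_file="$1" key="$2"
  awk -v k="$key" '
    /^[ \t]*#/ { next }                 # skip comment lines
    {
      line = $0
      sub(/^[ \t]+/, "", line)          # trim leading whitespace
      n = match(line, /[ \t=]/)         # position of the first separator
      if (n == 0) next
      if (substr(line, 1, n - 1) != k) next
      val = substr(line, n + 1)
      sub(/^[ \t=]+/, "", val)          # drop remaining separators
      sub(/[ \t]+$/, "", val)           # trim trailing whitespace
      found = val                       # this sketch keeps the last match
    }
    END { if (found != "") print found }
  ' "$conf_file"
}

# Return non-zero when dynamic allocation is enabled but the external
# shuffle service is not. check_dynamic_allocation is also hypothetical.
check_dynamic_allocation() {
  local conf_file="$1"
  if [ "$(get_spark_conf "$conf_file" spark.dynamicAllocation.enabled)" = "true" ] &&
     [ "$(get_spark_conf "$conf_file" spark.shuffle.service.enabled)" != "true" ]; then
    echo "warning: spark.dynamicAllocation.enabled is true but spark.shuffle.service.enabled is not" >&2
    return 1
  fi
  return 0
}
```

The check reads the file only, so it can be run against the master's conf/spark-defaults.conf before restarting anything.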
spark-env.sh:
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
export SPARK_MASTER_IP=10.130.2.20
export SPARK_MASTER_PORT=7077
export SPARK_WORKER_CORES=12
export SPARK_EXECUTOR_INSTANCES=1
export SPARK_WORKER_MEMORY=48g
export SPARK_WORKER_DIR=/mnt/diskb/sparkwork,/mnt/diskc/sparkwork,/mnt/diskd/sparkwork,/mnt/diske/sparkwork,/mnt/diskf/sparkwork,/mnt/diskg/sparkwork
export SPARK_LOCAL_DIRS=/mnt/diske/sparklocal,/mnt/diskb/sparklocal,/mnt/diskc/sparklocal,/mnt/diskd/sparklocal,/mnt/diskf/sparklocal,/mnt/diskg/sparklocal
export HADOOP_HOME=/opt/cloudera/parcels/CDH/lib/hadoop
export HADOOP_CONF_DIR=/etc/hadoop/conf/
export SPARK_DAEMON_MEMORY=12g
#export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=bdc40.hexun.com:2181,bdc41.hexun.com:2181,bdc46.hexun.com:2181,bdc53.hexun.com:2181,bdc54.hexun.com:2181 -Dspark.deploy.zookeeper.dir=/spark"
#export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=FILESYSTEM -Dspark.deploy.recoveryDirectory=/opt/modules/spark/recovery"
export JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:$HADOOP_HOME/lib/native
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native
export SPARK_LIBRARY_PATH=$SPARK_LIBRARY_PATH:$HADOOP_HOME/lib/native
export SPARK_CLASSPATH=$SPARK_CLASSPATH:$HADOOP_HOME/lib/snappy-java-1.0.4.1.jar
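Since SPARK_LOCAL_DIRS lists one directory per disk, it can help to confirm that each of them exists before the workers start. A minimal sketch, assuming it runs as the user that launches the worker; `make_dirs_from_list` is a hypothetical helper name, not something Spark provides.

```shell
#!/usr/bin/env bash
# Create every directory named in a comma-separated list such as
# SPARK_LOCAL_DIRS. make_dirs_from_list is a hypothetical helper name.
make_dirs_from_list() {
  local list="$1" dir
  local IFS=','                 # split the list on commas only
  for dir in $list; do
    if [ -n "$dir" ]; then      # ignore empty entries (e.g. a trailing comma)
      mkdir -p "$dir" || return 1
    fi
  done
}
```

After sourcing spark-env.sh, it could be called as `make_dirs_from_list "$SPARK_LOCAL_DIRS"`.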