[Hadoop][Spark]Cluster and HA
#!/bin/bash
# Hosts file entries for the three cluster nodes
echo '10.211.55.101 spark01' >> /etc/hosts
echo '10.211.55.102 spark02' >> /etc/hosts
echo '10.211.55.103 spark03' >> /etc/hosts
echo '10.211.55.101 linux01' >> /etc/hosts
echo '10.211.55.102 linux02' >> /etc/hosts
echo '10.211.55.103 linux03' >> /etc/hosts
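Re-running the script appends duplicate entries to /etc/hosts. A minimal idempotent variant of the appends above, sketched against a temporary file so it can be tried safely (set HOSTS_FILE to /etc/hosts for real use):

```shell
#!/bin/sh
# Idempotent variant of the repeated 'echo ... >> /etc/hosts' lines above.
# Demonstrated against a temporary file; use HOSTS_FILE=/etc/hosts for real.
HOSTS_FILE=$(mktemp)
add_host() {
  # Append "ip name" only if the hostname is not already present.
  grep -q "[[:space:]]$2\$" "$HOSTS_FILE" || echo "$1 $2" >> "$HOSTS_FILE"
}
add_host 10.211.55.101 spark01
add_host 10.211.55.102 spark02
add_host 10.211.55.103 spark03
add_host 10.211.55.101 spark01   # duplicate: skipped
cat "$HOSTS_FILE"
```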
# Spark tarball download URL (Tsinghua mirror, for mainland China)
SPARK_WEB_FILE=https://mirrors.tuna.tsinghua.edu.cn/apache/spark/spark-2.1.1/spark-2.1.1-bin-hadoop2.7.tgz
# Download the Spark tarball (uncomment to fetch)
# wget -P /tmp $SPARK_WEB_FILE
# Spark install tarball
SPARK_INSTALL_FILE=/tmp/spark-2.1.1-bin-hadoop2.7.tgz
# Spark directories
SPARK_INSTALL_DIR=/opt/spark-2.1.1-bin-hadoop2.7
SPARK_HOME=/opt/spark
# Install Spark
tar -C /opt -xf $SPARK_INSTALL_FILE
ln -s $SPARK_INSTALL_DIR $SPARK_HOME
# Create the spark group and service account
groupadd spark
useradd -g spark -s /sbin/nologin spark
# Data, log, and PID directories
mkdir -p /mnt/spark
mkdir -p /var/log/spark
mkdir -p $SPARK_HOME/run
chown -R spark:spark $SPARK_INSTALL_DIR
chown -R spark:spark $SPARK_HOME
# Append environment variables to /etc/profile
echo 'JAVA_HOME=/usr/java/jdk' >> /etc/profile
echo 'JRE_HOME=$JAVA_HOME/jre' >> /etc/profile
echo 'CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib' >> /etc/profile
echo 'HADOOP_HOME=/opt/hadoop' >> /etc/profile
echo 'HADOOP_PREFIX=$HADOOP_HOME' >> /etc/profile
echo 'HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop' >> /etc/profile
echo 'HADOOP_PID_DIR=$HADOOP_PREFIX/run' >> /etc/profile
echo 'YARN_PID_DIR=$HADOOP_PREFIX/run' >> /etc/profile
echo 'SPARK_HOME=/opt/spark' >> /etc/profile
echo 'SPARK_PID_DIR=$SPARK_HOME/run' >> /etc/profile
echo 'PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$SPARK_HOME/bin:$SPARK_HOME/sbin' >> /etc/profile
echo 'export JAVA_HOME JRE_HOME CLASSPATH HADOOP_HOME HADOOP_PREFIX HADOOP_CONF_DIR HADOOP_PID_DIR YARN_PID_DIR SPARK_HOME SPARK_PID_DIR PATH' >> /etc/profile
# Load the new variables into this shell so the heredoc below expands them
source /etc/profile
# Create the Spark configuration file (spark-env.sh)
cat <<EOF | tee /opt/spark/conf/spark-env.sh
export JAVA_HOME=$JAVA_HOME
export HADOOP_HOME=$HADOOP_HOME
export HADOOP_PREFIX=$HADOOP_PREFIX
export HADOOP_CONF_DIR=$HADOOP_CONF_DIR
export SPARK_HOME=/opt/spark
export SPARK_CONF_DIR=$SPARK_HOME/conf
export SPARK_PID_DIR=$SPARK_PID_DIR
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=zookeeper01:2181,zookeeper02:2181,zookeeper03:2181 -Dspark.deploy.zookeeper.dir=/spark"
EOF
chmod +x /opt/spark/conf/spark-env.sh
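With ZooKeeper-based recovery enabled, workers and clients should be pointed at all masters, not just one; the standalone master URL accepts a comma-separated host list. A small sketch, assuming the three hosts above each run a master on the default RPC port 7077:

```shell
#!/bin/sh
# Build the HA master URL from the host list defined in /etc/hosts above.
# 7077 is the standalone master's default RPC port.
MASTER_HOSTS="spark01 spark02 spark03"
MASTER_URL="spark://$(echo $MASTER_HOSTS | sed 's/ /:7077,/g'):7077"
echo "$MASTER_URL"
# Example use:
#   spark-submit --master "$MASTER_URL" --class com.example.App app.jar
```

On failover, clients connected with this URL try the remaining masters until they find the one that ZooKeeper has elected ALIVE.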
# systemd unit to start the Spark master at boot
cat <<EOF | tee /usr/lib/systemd/system/spark-master.service
[Unit]
Description=Spark Master
After=syslog.target network.target remote-fs.target nss-lookup.target network-online.target
Requires=network-online.target
[Service]
User=spark
Group=spark
Type=forking
ExecStart=/opt/spark/sbin/start-master.sh
ExecStop=/opt/spark/sbin/stop-master.sh
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
# systemd unit to start the Spark worker (slave) at boot
cat <<EOF | tee /usr/lib/systemd/system/spark-slave.service
[Unit]
Description=Spark Slave
After=syslog.target network.target remote-fs.target nss-lookup.target network-online.target
Requires=network-online.target
[Service]
User=spark
Group=spark
Type=forking
# start-slave.sh requires the master URL; with HA, list every master
ExecStart=/opt/spark/sbin/start-slave.sh spark://spark01:7077,spark02:7077,spark03:7077
ExecStop=/opt/spark/sbin/stop-slave.sh
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
firewall-cmd --zone=public --add-port=7077/tcp --permanent
firewall-cmd --zone=public --add-port=6066/tcp --permanent
firewall-cmd --zone=public --add-port=8080/tcp --permanent
firewall-cmd --reload
# Reload systemd, enable the units at boot, then (re)start and check them
systemctl daemon-reload
systemctl enable spark-master spark-slave
systemctl stop spark-master
systemctl start spark-master
systemctl status spark-master
systemctl stop spark-slave
systemctl start spark-slave
systemctl status spark-slave
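To confirm that ZooKeeper election is working, you can ask each master which state it is in: the standalone master's web UI serves its cluster state as JSON (on port 8080 here, the port opened in the firewall rules above), and exactly one master should report ALIVE while the others report STANDBY. A sketch, assuming the `/json` endpoint of the master UI and the hostnames defined earlier:

```shell
#!/bin/sh
# Sketch: after a failover test, query each master's web UI for its state.
# Assumes the master UI exposes cluster state as JSON at /json on port 8080.
master_status() {
  curl -s --max-time 2 "http://$1:8080/json" \
    | grep -o '"status"[^,}]*' | head -n1
}
for h in spark01 spark02 spark03; do
  s=$(master_status "$h")
  echo "$h -> ${s:-unreachable}"
done
```

Killing the ALIVE master (e.g. `systemctl stop spark-master` on that node) and re-running the loop should show one of the former STANDBY masters taking over within the recovery timeout.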