Configuring Hadoop All the Way to Spark

Installation

All four packages are unpacked under /opt:

jdk-8u131-linux-x64.gz
scala-2.11.8.tgz
hadoop-2.7.3.tar.gz
spark-2.1.1-bin-hadoop2.7.tgz
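
A minimal unpacking sketch, assuming all four archives sit in the current directory (the target directory names are the ones referenced by /etc/profile below):

# Unpack each archive into /opt.
tar -zxf jdk-8u131-linux-x64.gz        -C /opt   # -> /opt/jdk1.8.0_131
tar -zxf scala-2.11.8.tgz              -C /opt   # -> /opt/scala-2.11.8
tar -zxf hadoop-2.7.3.tar.gz           -C /opt   # -> /opt/hadoop-2.7.3
tar -zxf spark-2.1.1-bin-hadoop2.7.tgz -C /opt   # -> /opt/spark-2.1.1-bin-hadoop2.7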
 
vim /etc/profile
export ZOOKEEPER_HOME=/opt/zookeeper-3.4.8
export PATH=$ZOOKEEPER_HOME/bin:$PATH
export JAVA_HOME=/opt/jdk1.8.0_131
export CLASSPATH=$JAVA_HOME/lib
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=$ZOOKEEPER_HOME/lib:$CLASSPATH
export JSTORM_HOME=/opt/jstorm-2.2.1
export PATH=$JSTORM_HOME/bin:$PATH
export SCALA_HOME=/opt/scala-2.11.8
export PATH=$SCALA_HOME/bin:$PATH
export HADOOP_HOME=/opt/hadoop-2.7.3
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export SPARK_HOME=/opt/spark-2.1.1-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
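
After saving, reload the profile and sanity-check the tools (a quick check; the version strings should match the archives installed above):

source /etc/profile
java -version        # expect 1.8.0_131
scala -version       # expect 2.11.8
hadoop version       # expect 2.7.3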
 
 
Passwordless SSH login
ssh-keygen -t rsa
cd /root/.ssh
cat id_rsa.pub >> authorized_keys     # merge the id_rsa.pub of all three machines into this file
vim /etc/hosts
192.168.56.101 j001
192.168.56.102 j002
192.168.56.103 j003
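
One way to do the merge is ssh-copy-id, run once per target on every machine (a sketch, assuming root and the hostnames above; it prompts for each host's password):

for host in j001 j002 j003; do
  ssh-copy-id root@$host    # appends the local id_rsa.pub to the host's authorized_keys
done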
 
Hadoop configuration
mkdir -p /opt/data/hadoop/tmp
cd /opt/hadoop-2.7.3/etc/hadoop
 
vim   hadoop-env.sh
export JAVA_HOME=/opt/jdk1.8.0_131
export HADOOP_PREFIX=/opt/hadoop-2.7.3
 
vim  yarn-env.sh
export JAVA_HOME=/opt/jdk1.8.0_131
 
vim core-site.xml
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/data/hadoop/tmp</value>
    </property>
    <property>
        <!-- fs.default.name is the deprecated alias of fs.defaultFS; both work on Hadoop 2.7 -->
        <name>fs.default.name</name>
        <value>hdfs://j001:9000</value>   <!-- master host; any unused port will do -->
    </property>
</configuration>
vim hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>3</value>   <!-- HDFS replica count; 3 suits a three-node cluster -->
    </property>
    <property>
        <name>dfs.name.dir</name>
        <value>/opt/data/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.data.dir</name>
        <value>/opt/data/hadoop/tmp/dfs/data</value>
    </property>
</configuration>
cp mapred-site.xml.template mapred-site.xml
vim mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
 
vim yarn-site.xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>j001</value>
    </property>
</configuration>
 
--- Addendum
Add to mapred-site.xml (replace sjfx with the hostname of your JobHistory server, e.g. j001):
<property>
        <name>mapreduce.jobhistory.address</name>
        <value>sjfx:10020</value>
</property>
Then run on the namenode: mr-jobhistory-daemon.sh start historyserver
This starts the JobHistoryServer service on the namenode; job runs can then be inspected through the history server's logs.
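
To confirm the history server came up (a quick check; JobHistoryServer is its jps name and 19888 is the default port of its web UI):

jps | grep JobHistoryServer                                 # daemon should be listed
curl -s http://j001:19888/ >/dev/null && echo "web UI up"   # assumes it runs on j001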
 
vim  slaves
j001
j002
j003
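
The configured Hadoop tree has to exist on every node. A sketch of pushing it out from j001, assuming the same /opt layout and root SSH access everywhere:

for host in j002 j003; do
  scp -r /opt/hadoop-2.7.3 root@$host:/opt/       # binaries plus etc/hadoop config
  scp /etc/profile root@$host:/etc/profile        # environment variables from above
  ssh root@$host "mkdir -p /opt/data/hadoop/tmp"  # hadoop.tmp.dir must exist there too
done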
 
Startup
hdfs namenode -format
cd /opt/hadoop-2.7.3/sbin
start-dfs.sh
start-yarn.sh
Once everything is up, the NameNode web UI is at http://192.168.56.101:50070/
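
A quick jps sanity check (expected daemons given the config above; exact lists depend on which hosts appear in slaves):

jps
# on j001: NameNode, SecondaryNameNode, ResourceManager,
#          plus DataNode and NodeManager, since j001 is also listed in slaves
# on j002/j003: DataNode, NodeManager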
 
Stopping YARN and HDFS:
stop-yarn.sh
stop-dfs.sh
Spark configuration
cd /opt/spark-2.1.1-bin-hadoop2.7/conf
mv spark-env.sh.template spark-env.sh
vim spark-env.sh
export JAVA_HOME=/opt/jdk1.8.0_131
export SCALA_HOME=/opt/scala-2.11.8
export SPARK_MASTER_HOST=192.168.56.101
export SPARK_MASTER_IP=192.168.56.101   # pre-2.x alias of SPARK_MASTER_HOST
export SPARK_LOCAL_IP=192.168.56.103    # set to each machine's OWN IP, so this line differs per node
export SPARK_MASTER_PORT=7077
export SPARK_MASTER_WEBUI_PORT=8080
export SPARK_WORKER_PORT=7078
export SPARK_WORKER_WEBUI_PORT=8081
export SPARK_WORKER_MEMORY=400m
export HADOOP_HOME=/opt/hadoop-2.7.3
export HADOOP_CONF_DIR=/opt/hadoop-2.7.3/etc/hadoop
export SPARK_HOME=/opt/spark-2.1.1-bin-hadoop2.7
 
mv slaves.template slaves
vim slaves
j002
j003
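
As with Hadoop, the Spark tree must be present on the workers; a sketch, remembering that SPARK_LOCAL_IP then has to be corrected on each node:

for host in j002 j003; do
  scp -r /opt/spark-2.1.1-bin-hadoop2.7 root@$host:/opt/
  # then edit conf/spark-env.sh on $host so SPARK_LOCAL_IP is that host's own IP
done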
 
start-master.sh
Wait until http://192.168.56.101:8080 is reachable, then run start-slaves.sh.
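
Once both workers appear on the 8080 page, the bundled SparkPi example makes a handy smoke test (a sketch; the jar name below is what Spark 2.1.1 ships under examples/jars, adjust if yours differs):

spark-submit \
  --master spark://192.168.56.101:7077 \
  --class org.apache.spark.examples.SparkPi \
  $SPARK_HOME/examples/jars/spark-examples_2.11-2.1.1.jar 100
# ends with a line like: Pi is roughly 3.14...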
 
 
 
 
hdfs dfs -mkdir /input
hdfs dfs -put aa.txt /input
hadoop jar
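
The truncated hadoop jar line was presumably meant to run a MapReduce job over /input; the wordcount example bundled with Hadoop 2.7.3 is the usual choice (a sketch):

hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar \
  wordcount /input /output
hdfs dfs -cat /output/part-r-00000    # inspect the counts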