Hadoop + ZooKeeper + Kafka Cluster Setup

Posted by 韦建国



I. Server Environment

1. Four servers (minimum spec: 1 CPU core, 2 GB RAM)

hadoop01 172.16.192.132

hadoop02 172.16.192.133

hadoop03 172.16.192.134

hadoop04 172.16.192.135

2. Set the hostname on each of the four servers

hostnamectl set-hostname hadoop01

hostnamectl set-hostname hadoop02

hostnamectl set-hostname hadoop03

hostnamectl set-hostname hadoop04

3. Turn off the firewall on all four hosts

Turn everything off completely.

The specific commands are fairly basic, so they are not written out here; a reference sketch follows.
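For reference, on CentOS 7 this typically means the following (a sketch; treating "everything" as the firewalld service plus SELinux is an assumption about what the step covers):

systemctl stop firewalld       # stop the firewall now

systemctl disable firewalld    # keep it off after the reboot in step 4

setenforce 0                   # switch SELinux to permissive for this session (assumption that SELinux is meant)

sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config    # make it stick across reboots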

4. Reboot the system

reboot

5. Edit /etc/hosts on each of the four hosts to map the hostnames

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4

::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

172.16.192.132 hadoop01

172.16.192.133 hadoop02

172.16.192.134 hadoop03

172.16.192.135 hadoop04

6. Upload the required installation packages

Upload the packages to the /opt/ directory.

Either the rz command or Xftp works for the upload.

The packages are on a Baidu Cloud link: https://pan.baidu.com/s/1V7wro2qj3LPFrHWUz-dq4A (passwd: vdgc)

[screenshot in the original post]

7. Create a key pair under /root/ and copy it to the other hosts   !! From this point on, it is recommended to run the commands on all four machines in sync !!

ssh-keygen -t rsa   (run on all four machines; confirmation prompts appear three times, just press Enter each time, see the screenshot below)

[screenshot in the original post: ssh-keygen prompts]

8. Copy the public key to each host (type yes at the prompt, then enter the password):

ssh-copy-id hadoop01

ssh-copy-id hadoop02

ssh-copy-id hadoop03

ssh-copy-id hadoop04

Note: step 8 only needs to be run on hadoop01, hadoop02, and hadoop03; hadoop04 does not need it.

9. Test whether the key-based login works

[screenshot in the original post: passwordless ssh test]
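For example, from hadoop01:

ssh hadoop02 hostname

This should print hadoop02 without asking for a password; the same check works for the other hosts.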

II. Cluster Environment

1. Create a bin directory under /root to hold the scripts

mkdir bin

cd bin

2. Write the scripts

Create it: touch xrsync

vim xrsync

#!/bin/bash

# Sync a file between the virtual machines

#1 Get the number of arguments; exit if none were given

pcount=$#

if((pcount==0));then

echo no args;

exit;

fi

#2 Get the file name

p1=$1

fname=`basename $p1`

echo fname=$fname

#3 Get the absolute path of the parent directory

pdir=`cd -P $(dirname $p1);pwd`

echo pdir=$pdir

#4 Get the current user name

user=`whoami`

#5 Copy the file to each target machine

for host in hadoop01 hadoop02 hadoop03 hadoop04

do

echo ------------- $host ---------------

rsync -av $pdir/$fname $user@$host:$pdir

done
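Usage example; any path works, since the script resolves the parent directory itself:

xrsync /etc/profile.d/env.sh

Note that the file lands in the same path on every node, and that rsync must be installed everywhere first (step 6 below covers this).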


Create it: touch showjps.sh

vim showjps.sh

#!/bin/bash

# View the processes of all machines from a single machine

for host in hadoop01 hadoop02 hadoop03 hadoop04

do

echo ----------$host-------------

ssh $host "$*"

done
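Since "$*" joins all the arguments into one command string for ssh, any command can be passed through. For example:

showjps.sh jps

runs jps on every node, which is how the cluster is checked later on.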

Create it: touch zkop.sh

vi zkop.sh

#!/bin/bash

# Start, stop, or query ZooKeeper on every node

# Usage: zkop.sh start|stop|status

case $1 in

"start")

for i in hadoop01 hadoop02 hadoop03 hadoop04

do

ssh $i "/opt/bigdata/zk345/bin/zkServer.sh start"

done

;;

"stop")

for i in hadoop01 hadoop02 hadoop03 hadoop04

do

ssh $i "/opt/bigdata/zk345/bin/zkServer.sh stop"

done

;;

"status")

for i in hadoop01 hadoop02 hadoop03 hadoop04

do

ssh $i "/opt/bigdata/zk345/bin/zkServer.sh status"

done

;;

esac


Create it: touch kakop.sh

vim kakop.sh

#!/bin/bash

# Start/stop script for Kafka

case $1 in

"start")

for i in hadoop01 hadoop02 hadoop03 hadoop04

do

echo ------------ starting Kafka on $i -----------------

ssh $i "/opt/bigdata/kafka211/bin/kafka-server-start.sh -daemon /opt/bigdata/kafka211/config/server.properties"

done

;;

"stop")

for i in hadoop01 hadoop02 hadoop03 hadoop04

do

echo ------------ stopping Kafka on $i -----------------

ssh $i "/opt/bigdata/kafka211/bin/kafka-server-stop.sh"

done

;;

esac

3. Give the four scripts 777 permissions

[root@hadoop01 bin]# chmod 777 xrsync

[root@hadoop01 bin]# chmod 777 showjps.sh

[root@hadoop01 bin]# chmod 777 zkop.sh

[root@hadoop01 bin]# chmod 777 kakop.sh

or, in one go: chmod -R 777 /root/bin/

4. Create a bigdata directory under /opt/ and extract the JDK package into it

mkdir -p /opt/bigdata

cd /opt/install

[root@hadoop01 install]# tar xvf jdk-8u131-linux-x64.tar.gz -C /opt/bigdata

[root@hadoop01 bigdata]# mv jdk1.8.0_131/ jdk180

5. Configure the JDK environment variables

[root@hadoop01 bin]# cd /etc/profile.d/

[root@hadoop01 profile.d]# vim env.sh

export JAVA_HOME=/opt/bigdata/jdk180

export JRE_HOME=$JAVA_HOME/jre

export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib

export PATH=$JAVA_HOME/bin:$PATH

[root@hadoop01 profile.d]# source ./env.sh

6. Install rsync with yum (same operation on all 4 machines). If you drive all four machines at once with SecureCRT or Xshell and repeat every step on each of them directly, this step can be skipped.

yum install rsync -y

7. Push the JDK and the environment file from hadoop01 to the other three virtual machines

[root@hadoop01 bin]# xrsync /opt/bigdata/jdk180

[root@hadoop01 bin]# xrsync /etc/profile.d/env.sh

Then run on each of the other three machines: source /etc/profile.d/env.sh

Verify that the installation succeeded.
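A quick check (assuming env.sh has been sourced so that java is on the PATH):

java -version

It should report java version "1.8.0_131".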

[screenshot in the original post: java -version output]

Output like that means Java is configured successfully.

8. Extract the Hadoop package

cd /opt/install

[root@hadoop01 install]# tar -xvf hadoop-2.6.0-cdh5.14.2.tar.gz -C /opt/bigdata

[root@hadoop01 bigdata]# mv hadoop-2.6.0-cdh5.14.2/ hadoop260

9. Go into the /opt/bigdata/hadoop260/etc/hadoop directory

vim hadoop-env.sh

Add at the very bottom:

export JAVA_HOME=/opt/bigdata/jdk180/

vim mapred-env.sh

export JAVA_HOME=/opt/bigdata/jdk180/

[screenshot in the original post: mapred-env.sh edit]

vim yarn-env.sh

export JAVA_HOME=/opt/bigdata/jdk180/

[screenshot in the original post: yarn-env.sh edit]

vim slaves   (delete the original localhost entry)

hadoop01

hadoop02

hadoop03

hadoop04

10. Create a hadoop2 directory under hadoop260

Concretely: go into /opt/bigdata/hadoop260/ and create hadoop2 there, as spelled out below.
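The command, written out (the path follows directly from the steps above):

mkdir /opt/bigdata/hadoop260/hadoop2

This is the directory that hadoop.tmp.dir points at in core-site.xml below.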

11. vim core-site.xml

<configuration>

<property>

  <name>fs.defaultFS</name>

  <value>hdfs://hadoop01:9000</value>

</property>

<property>

   <name>hadoop.tmp.dir</name>

   <value>/opt/bigdata/hadoop260/hadoop2</value>

</property>

<property>

   <name>hadoop.proxyuser.root.hosts</name>

   <value>*</value>

 </property>

<property>

   <name>hadoop.proxyuser.root.groups</name>

   <value>*</value>

 </property>

</configuration>

12. vim hdfs-site.xml

<configuration>

<property>

  <name>dfs.replication</name>

  <value>1</value>

</property>

<property>

  <name>dfs.namenode.secondary.http-address</name>

  <value>hadoop03:50090</value>

</property>

</configuration>

13. vim mapred-site.xml   (if the tarball ships only mapred-site.xml.template, copy it to mapred-site.xml first)

<configuration>

<property>

        <name>mapreduce.framework.name</name>

        <value>yarn</value>

</property>

<property>

        <name>mapreduce.jobhistory.address</name>

        <value>hadoop01:10020</value>

</property>

<property>

        <name>mapreduce.jobhistory.webapp.address</name>

        <value>hadoop01:19888</value>

</property>

</configuration>

14. vim yarn-site.xml

<configuration>

<!-- Site specific YARN configuration properties -->

<property>

    <name>yarn.nodemanager.aux-services</name>

    <value>mapreduce_shuffle</value>

</property>

<property>

    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>

    <value>org.apache.hadoop.mapred.ShuffleHandler</value>

</property>

<!-- Hostname of the YARN ResourceManager -->

<property>

    <name>yarn.resourcemanager.hostname</name>

    <value>hadoop01</value>

</property>

<!-- Enable log aggregation -->

<property>

    <name>yarn.log-aggregation-enable</name>

    <value>true</value>

</property>

<!-- Keep aggregated logs for 7 days -->

<property>

    <name>yarn.log-aggregation.retain-seconds</name>

    <value>604800</value>

</property>

</configuration>

15. Add the Hadoop environment variables

vim /etc/profile.d/env.sh

export HADOOP_HOME=/opt/bigdata/hadoop260

export HADOOP_MAPRED_HOME=$HADOOP_HOME

export HADOOP_COMMON_HOME=$HADOOP_HOME

export HADOOP_HDFS_HOME=$HADOOP_HOME

export YARN_HOME=$HADOOP_HOME

export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native

export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"

export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin

[root@hadoop01 profile.d]# source /etc/profile.d/env.sh

16. Push the configuration to the other three virtual machines

[root@hadoop01 hadoop]# xrsync /etc/profile.d/env.sh

[root@hadoop01 bigdata]# xrsync hadoop260/

17. Start Hadoop (run this step on hadoop01 only)

[root@hadoop01 hadoop260]# hadoop namenode -format

[root@hadoop01 hadoop260]# start-dfs.sh

[root@hadoop01 hadoop260]# start-yarn.sh

[root@hadoop01 bin]# showjps.sh jps

[screenshot in the original post: showjps.sh output]

Output like that means all of the nodes started successfully.
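As an extra sanity check (not one of the original steps), the NameNode web UI should be reachable on the Hadoop 2.x default port:

curl -s http://hadoop01:50070 | head -n 5

or simply open http://hadoop01:50070 in a browser.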

III. ZooKeeper Installation

1. Extract the ZooKeeper package

Go into the /opt/install directory

[root@hadoop01 install]# tar -xvf zookeeper-3.4.5-cdh5.14.2.tar.gz -C /opt/bigdata/

[root@hadoop01 install]# cd /opt/bigdata/

[root@hadoop01 bigdata]# mv zookeeper-3.4.5-cdh5.14.2/ zk345

2. Create the myid file

[root@hadoop01 bigdata]# cd zk345/

[root@hadoop01 zk345]# mkdir zkData

[root@hadoop01 zk345]# cd ./zkData/

[root@hadoop01 zkData]# vim myid

1

Enter only the digit 1, then save and exit.

3. Edit the configuration file

[root@hadoop01 zk345]# cd conf/

[root@hadoop01 conf]# cp zoo_sample.cfg zoo.cfg

[root@hadoop01 conf]# vim zoo.cfg

dataDir=/opt/bigdata/zk345/zkData

server.1=hadoop01:2287:3387

server.2=hadoop02:2287:3387

server.3=hadoop03:2287:3387

server.4=hadoop04:2287:3387

[screenshot in the original post: zoo.cfg contents]

4. Send ZooKeeper to the other three machines, then set myid under /opt/bigdata/zk345/zkData on each machine to 1 through 4 in order

[root@hadoop01 bigdata]# cd /opt/bigdata/

[root@hadoop01 bigdata]# xrsync zk345/

Edit myid on the other three machines:

vim /opt/bigdata/zk345/zkData/myid

Change the 1 that was synced over, so that each host has its own id:

hadoop01 keeps 1 in this file;

on hadoop02 change the content to 2;

on hadoop03 change it to 3;

on hadoop04 change it to 4.

Save and exit.
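Equivalently, the three edits can be made from hadoop01 in one go (a sketch relying on the passwordless SSH configured in part I):

ssh hadoop02 'echo 2 > /opt/bigdata/zk345/zkData/myid'

ssh hadoop03 'echo 3 > /opt/bigdata/zk345/zkData/myid'

ssh hadoop04 'echo 4 > /opt/bigdata/zk345/zkData/myid'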

5. Configure the ZooKeeper environment variables

[root@hadoop01 bigdata]# vim /etc/profile.d/env.sh

export ZOOKEEPER_HOME=/opt/bigdata/zk345

export PATH=$PATH:$ZOOKEEPER_HOME/bin

[root@hadoop01 bigdata]# xrsync /etc/profile.d/env.sh

[root@hadoop01 bigdata]# cd /opt/bigdata/

6. Start ZooKeeper

[root@hadoop01 bin]# cd /root/bin/

[root@hadoop01 bin]# zkop.sh start

The output after running it:

[screenshot in the original post: zkop.sh start output]
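A useful follow-up check is the status subcommand; in a four-node ensemble, one node should report Mode: leader and the other three Mode: follower:

zkop.sh status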

IV. Kafka Installation

1. Extract the package; go into the install directory under /opt

[root@hadoop01 install]# tar -xvf kafka_2.11-2.1.1.tgz -C /opt/bigdata/

[root@hadoop01 install]# cd /opt/bigdata/

[root@hadoop01 bigdata]# mv kafka_2.11-2.1.1/ kafka211

2. Create the log directory

[root@hadoop01 bigdata]# cd kafka211/

[root@hadoop01 kafka211]# mkdir logs

[root@hadoop01 kafka211]# cd logs/

[root@hadoop01 logs]# pwd

/opt/bigdata/kafka211/logs


3. Edit the configuration file

[root@hadoop01 logs]# cd ../config/

[root@hadoop01 config]# vim server.properties

broker.id=1

log.dirs=/opt/bigdata/kafka211/logs

zookeeper.connect=hadoop01:2181,hadoop02:2181,hadoop03:2181,hadoop04:2181

[root@hadoop01 bigdata]# xrsync kafka211/

broker.id=1   (on the other machines, set broker.id to 2, 3, and 4 in order; one way to do this is shown below)
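A way to make those three edits from hadoop01 without logging in to each machine (a sketch using the passwordless SSH set up earlier):

ssh hadoop02 "sed -i 's/^broker.id=.*/broker.id=2/' /opt/bigdata/kafka211/config/server.properties"

ssh hadoop03 "sed -i 's/^broker.id=.*/broker.id=3/' /opt/bigdata/kafka211/config/server.properties"

ssh hadoop04 "sed -i 's/^broker.id=.*/broker.id=4/' /opt/bigdata/kafka211/config/server.properties"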

4. Configure the Kafka environment variables

[root@hadoop01 bigdata]# vim /etc/profile.d/env.sh

export KAFKA_HOME=/opt/bigdata/kafka211

export PATH=$PATH:$KAFKA_HOME/bin

[root@hadoop01 bigdata]# xrsync /etc/profile.d/env.sh

[root@hadoop01 bigdata]# source /etc/profile.d/env.sh

5. Start Kafka

[root@hadoop01 bigdata]# cd /root/bin/

[root@hadoop01 bin]# kakop.sh start

The cluster-wide startup looks like this:

[screenshot in the original post: kakop.sh start output]

To start a single node instead (run from inside /opt/bigdata/kafka211/bin):

./kafka-server-start.sh -daemon ../config/server.properties

6. Run showjps.sh jps to check the processes

[screenshot in the original post: showjps.sh jps output]

At this point the cluster setup is complete.

Test Kafka production and consumption:

[root@hadoop01 bin]# kafka-topics.sh --create --topic test --zookeeper hadoop01:2181,hadoop02:2181,hadoop03:2181,hadoop04:2181 --partitions 4 --replication-factor 4

It prints the result: Created topic "test".

[root@hadoop01 bin]# kafka-topics.sh --describe --zookeeper hadoop01:2181,hadoop02:2181,hadoop03:2181,hadoop04:2181 --topic test

Topic:test PartitionCount:4 ReplicationFactor:4 Configs:

Topic: test Partition: 0 Leader: 1 Replicas: 1,2,3,4 Isr: 1,2,3,4

Topic: test Partition: 1 Leader: 2 Replicas: 2,3,4,1 Isr: 2,3,4,1

Topic: test Partition: 2 Leader: 3 Replicas: 3,4,1,2 Isr: 3,4,1,2

Topic: test Partition: 3 Leader: 4 Replicas: 4,1,2,3 Isr: 4,1,2,3

./kafka-producer-perf-test.sh --topic test --record-size 100 --num-records 100000 --throughput 1000 --producer-props bootstrap.servers=hadoop01:9092,hadoop02:9092,hadoop03:9092,hadoop04:9092

[screenshot in the original post: producer perf test output]
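To round out the consumption test, a minimal end-to-end check with the standard Kafka 2.1 console tools (run from /opt/bigdata/kafka211/bin, or directly if the PATH is set):

./kafka-console-producer.sh --broker-list hadoop01:9092 --topic test

Type a few messages, then press Ctrl+C. Then read them back:

./kafka-console-consumer.sh --bootstrap-server hadoop01:9092 --topic test --from-beginning

The messages typed above should be printed back, confirming that producing and consuming both work.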

