Centos7安装zookeeper+kafka集群

Posted longblogs

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了Centos7安装zookeeper+kafka集群相关的知识,希望对你有一定的参考价值。

Centos7安装zookeeper+kafka集群

1  Zookeeper和kafka简介

1)  ZooKeeper

是一个分布式的、分层级的文件系统,能促进客户端间的松耦合,并提供最终一致的,用于管理、协调Kafka代理,zookeeper集群中一台服务器作为Leader,其它作为Follower

2)  Apache Kafka

是分布式发布-订阅消息系统,kafka对消息保存时根据Topic进行归类,每个topic将被分成多个partition(区),每条消息在文件中的位置称为offset(偏移量),offset为一个long型数字,它是唯一标记一条消息,它唯一的标记一条消息。

一个partition中的消息只会被group中的一个consumer消费,每个group中consumer消息消费互相独立,不过一个consumer可以消费多个partitions中的消息。

kafka只能保证一个partition中的消息被某个consumer消费时,消息是顺序的。从Topic角度来说,消息仍不是有序的。

每个partition都有一个server为"leader";leader负责所有的读写操作,如果leader失效,那么将会有其他follower来接管成为新的leader(有zookeeper选举);follower只是单调的和leader跟进,同步消息即可..由此可见作为leader的server承载了全部的请求压力,因此从集群的整体考虑,有多少个partitions就意味着有多少个"leader",kafka会将"leader"均衡的分散在每个实例上,来确保整体的性能稳定.

2  服务器地址

使用3台服务器搭建集群

Server1:192.168.89.11

Server2:192.168.89.12

Server3:192.168.89.13

3  安装jdk

(jdk1.8.0_102)

# yum install -y java-1.8.0-openjdk java-1.8.0-openjdk-devel

4  搭建zookeeper集群

Zookeeper集群的工作是超过半数才能对外提供服务,所以选择主机数可以是1台、3台、5台…。

3台中允许1台挂掉 ,是否可以用偶数,其实没必要。

如果有4台,那么挂掉一台还剩下三台服务器,如果再挂掉一个就不行了,记住是超过半数。

4.1 下载地址

http://mirrors.shu.edu.cn/apache/zookeeper/zookeeper-3.4.13/zookeeper-3.4.13.tar.gz

4.2 所有节点安装zookeeper

1)创建zookeeper安装目录

# mkdir -p /data/zookeeper

2)将zookeeper解压到安装目录

# tar -zxvf zookeeper-3.4.13.tar.gz -C /data/zookeeper/

3)新建保存数据的目录

# mkdir -p /data/zookeeper/zookeeper-3.4.13/data

4)新建日志目录

# mkdir -p /data/zookeeper/zookeeper-3.4.13/dataLog

5)配置环境变量并刷新

# vim /etc/profile

===================================================

 export ZK_HOME=/data/zookeeper/zookeeper-3.4.13

export PATH=$PATH:$ZK_HOME/bin

===================================================

# source /etc/profile

4.3 所有节点配置zookeeper配置文件

4.3.1        各节点中配置

# cd /data/zookeeper/zookeeper-3.4.13/conf/

# cp -f zoo_sample.cfg zoo.cfg

# vim zoo.cfg

====================================================

tickTime=2000

initLimit=10

syncLimit=5

dataDir=/data/zookeeper/zookeeper-3.4.13/data/

dataLogDir=/data/zookeeper/zookeeper-3.4.13/dataLog/

clientPort=2181

server.1=192.168.89.11:2888:3888

server.2=192.168.89.12:2888:3888

server.3=192.168.89.13:2888:3888

#第一个端口是master和slave之间的通信端口,默认是2888,第二个端口是leader选举的端口,集群刚启动的时候选举或者leader挂掉之后进行新的选举的端口默认是3888

====================================================

# echo "1" > /data/zookeeper/zookeeper-3.4.13/data/myid   #server1配置,各节点不同,跟上面配置server.1的号码一样

# echo "2" > /data/zookeeper/zookeeper-3.4.13/data/myid   #server2配置,各节点不同,跟上面配置server.2的号码一样

# echo "3" > /data/zookeeper/zookeeper-3.4.13/data/myid   #server3配置,各节点不同,跟上面配置server.3的号码一样

4.3.2        启动停止zookeeper命令并设置开机启动

1)启动停止zookeeper命令

# zkServer.sh start   #启动

# zkServer.sh stop   #停止

# zkCli.sh   #连接集群

2)设置开机启动

# cd /usr/lib/systemd/system

# vim zookeeper.service

=========================================

[Unit]

Description=zookeeper server daemon

After=zookeeper.target

 

[Service]

Type=forking

ExecStart=/data/zookeeper/zookeeper-3.4.13/bin/zkServer.sh start

ExecReload=/data/zookeeper/zookeeper-3.4.13/bin/zkServer.sh stop && sleep 2 && /data/zookeeper/zookeeper-3.4.13/bin/zkServer.sh start

ExecStop=/data/zookeeper/zookeeper-3.4.13/bin/zkServer.sh stop

Restart=always

 

[Install]

WantedBy=multi-user.target

=======================================================

# systemctl start  zookeeper

# systemctl enable zookeeper

5   搭建kafka集群

5.1 下载地址

http://mirror.bit.edu.cn/apache/kafka/2.1.0/kafka_2.12-2.1.0.tgz

5.2 所有节点上搭建kafka

1)  新建kafka工作目录

# mkdir -p /data/kafka

2)  解压kafka

# tar -zxvf kafka_2.12-2.1.0.tgz -C /data/kafka/

3)  新建kafka日志目录

# mkdir -p /data/kafka/kafkalogs

4)  配置kafka配置文件

# vim /data/kafka/kafka_2.12-2.1.0/config/server.properties

=================================================

broker.id=1   #每一个broker在集群中的唯一标示,要求是正数

listeners=PLAINTEXT://192.168.89.11:9092  # 套接字服务器连接的地址

log.dirs=/data/kafka/kafkalogs/    #kafka数据的存放地址

message.max.byte=5242880     #消息体的最大大小,单位是字节

log.cleaner.enable=true #开启日志清理

log.retention.hours=72    #segment文件保留的最长时间(小时),超时将被删除,也就是说3天之前的数据将被清理掉

log.segment.bytes=1073741824  #日志文件中每个segmeng的大小(字节),默认为1G

log.retention.check.interval.ms=300000  #定期检查segment文件有没有达到1G(单位毫秒)

num.partitions=3  #每个topic的分区个数, 更多的分区允许更大的并行操作default.replication.factor=3  # 一个topic ,默认分区的replication个数 ,不得大于集群中broker的个数

delete.topic.enable=true   # 选择启用删除主题功能,默认false

replica.fetch.max.bytes=5242880  # replicas每次获取数据的最大大小

#以下三个参数设置影响消费者消费分区可以连接的kafka主机,详细请看第6点附录

offsets.topic.replication.factor=3  #Offsets topic的复制因子(备份数)

transaction.state.log.replication.factor=3  #事务主题的复制因子(设置更高以确保可用性)

transaction.state.log.min.isr=3 #覆盖事务主题的min.insync.replicas配置

zookeeper.connect=192.168.89.11:2181,192.168.89.12:2181,192.168.89.13:2181

#zookeeper集群的地址,可以是多个

=================================================

5)   kafka节点默认需要的内存为1G,如果需要修改内存,可以修改kafka-server-start.sh的配置项

# vim /data/kafka/kafka_2.12-2.1.0/bin/kafka-server-start.sh

#找到KAFKA_HEAP_OPTS配置项,例如修改如下:

       export KAFKA_HEAP_OPTS="-Xmx2G -Xms2G"

5.3 启动kafka并设置开机启动

1)启动kafka

# cd /data/kafka/kafka_2.12-2.1.0/

./bin/kafka-server-start.sh -daemon ./config/server.properties

启动后可以执行jps命令查看kafka是否启动,如果启动失败,可以进入logs目录,查看kafkaServer.out日志记录。

2)设置开机启动

# cd /usr/lib/systemd/system

# vim kafka.service

=========================================

[Unit]

Description=kafka server daemon

After=kafka.target

 

[Service]

Type=forking

ExecStart=/data/kafka/kafka_2.12-2.1.0/bin/kafka-server-start.sh -daemon /data/kafka/kafka_2.12-2.1.0/config/server.properties

ExecReload=/data/kafka/kafka_2.12-2.1.0/bin/kafka-server-stop.sh && sleep 2 && /data/kafka/kafka_2.12-2.1.0/bin/kafka-server-start.sh -daemon /

data/kafka/kafka_2.12-2.1.0/config/server.properties

ExecStop=/data/kafka/kafka_2.12-2.1.0/bin/kafka-server-stop.sh

Restart=always

 

[Install]

WantedBy=multi-user.target

=======================================================

# systemctl start kafka

# systemctl enable kafka

5.4 创建topic

创建3分区、3备份

# cd /data/kafka/kafka_2.12-2.1.0/

#./bin/kafka-topics.sh --create --zookeeper 192.168.89.11:2181,192.168.89.12:2181,192.168.89.13:2181 --replication-factor 3 --partitions 3 --topic SyslogTopic

5.5 常用命令

# cd /data/kafka/kafka_2.12-2.1.0/

1)  停止kafka

./bin/kafka-server-stop.sh 

2)  创建topic

./bin/kafka-topics.sh --create --zookeeper 192.168.89.11:2181,192.168.89.12:2181,192.168.89.13:2181 --replication-factor 1 --partitions 1 --topic topic_name

3)  展示topic

./bin/kafka-topics.sh --list --zookeeper 192.168.89.11:2181,192.168.89.12:2181,192.168.89.13:2181

4) 查看描述topic

./bin/kafka-topics.sh --describe --zookeeper 192.168.89.11:2181,192.168.89.12:2181,192.168.89.13:2181 --topic topic_name

5)  生产者发送消息

./bin/kafka-console-producer.sh --broker-list 192.168.89.11:9092 --topic topic_name

6)  消费者消费消息

./bin/kafka-console-consumer.sh --bootstrap-server 192.168.89.11:9092,192.168.89.12:9092,192.168.89.13:9092 --topic topic_name

7)  删除topic

./bin/kafka-topics.sh --delete --topictopic_name --zookeeper 192.168.89.11:2181,192.168.89.12:2181,192.168.89.13:2181

8)  查看每分区consumer_offsets(可以连接到的消费主机)

./bin/kafka-topics.sh --describe --zookeeper 192.168.89.11:2181,192.168.89.12:2181,192.168.89.13:2181 --topic __consumer_offsets

6   附录

1、  消费者消费分区数据,kafka的负载均衡

在/data/kafka/kafka_2.12-2.1.0/config/server.properties文件中修改offsets.topic.replication.factor,这个值为kafka集群的主机数量(__consumer_offest不受server.properties中num.partitions和default.replication.factor参数的制约。相反地,它的分区数和备份因子分别由offsets.topic.num.partitions和offsets.topic.replication.factor参数决定。这两个参数的默认值分别是50和1,表示该topic有50个分区,副本因子是1。),设置正确如下图:

(执行命令:bin/kafka-topics.sh --describe --zookeeper 192.168.89.11:2181,192.168.89.12:2181,192.168.89.13:2181 --topic __consumer_offsets)

技术分享图片

注:如果配置文件offsets.topic.replication.factor设置成1后启动了一次,再设置成3重新启动副本因子不会更改,需要以下方法:

另外一种方法设置:

1)  创建规则json

cat > increase-replication-factor.json <<EOF

{"version":1, "partitions":[

{"topic":"__consumer_offsets","partition":0,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":1,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":2,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":3,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":4,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":5,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":6,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":7,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":8,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":9,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":10,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":11,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":12,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":13,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":14,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":15,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":16,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":17,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":18,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":19,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":20,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":21,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":22,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":23,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":24,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":25,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":26,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":27,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":28,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":29,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":30,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":31,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":32,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":33,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":34,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":35,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":36,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":37,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":38,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":39,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":40,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":41,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":42,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":43,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":44,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":45,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":46,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":47,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":48,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":49,"replicas":[1,2,3]}]

}

EOF

2)  执行

bin/kafka-reassign-partitions.sh --zookeeper 192.168.89.11:2181,192.168.89.12:2181,192.168.89.13:2181 --reassignment-json-file increase-replication-factor.json --execute

3)  验证

bin/kafka-reassign-partitions.sh --zookeeper 192.168.89.11:2181,192.168.89.12:2181,192.168.89.13:2181 --reassignment-json-file increase-replication-factor.json --verify

 

以上是关于Centos7安装zookeeper+kafka集群的主要内容,如果未能解决你的问题,请参考以下文章

centOS7安装kafka和zookeeper

Centos7 安装kafka+zookeeper实践

centos7 docker安装kafka:2.13-2.8.1

kafka生产实例安装

学习笔记Kafka—— Kafka安装配置—— JDK&Zookeeper安装Zookeeper 常用操作

学习笔记Kafka—— Kafka安装配置—— JDK&Zookeeper安装Zookeeper 常用操作