InnoDB Cluster addInstance: Plugin group_replication reported: 'read failed'

Posted: 2020-03-10 19:43:18

Question:

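I'm running an InnoDB Cluster on 5.7.25 (planning to migrate to 8.0 soon). Due to network issues, two of my instances left the cluster, and I'm left with only one healthy node.

I'm following the procedure below to add a node back to the cluster, but it fails with the errors shown further down.

What am I doing wrong?

Note: host1 is the healthy node that remained in the cluster. host2 is the one joining.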

Procedure on host1:

    1. Set super_read_only = ON
    2. Copy the last GTID using: select @@global.gtid_executed;
    3. Set super_read_only = OFF (just before step 3 on host2)

Procedure on host2:

    1. Stop mysql
    2. Sync the mysql data directory from host1 using: rsync -Parvz --exclude="auto.cnf" --exclude="<host1>*" --exclude="binlog.*" <user>@<host1>:/mysql-data/* .
    3. Start mysql
    4. Clear the replication logs and set the GTID using:

        reset master;
        reset slave;
        set SQL_LOG_BIN=0;
        set @@GLOBAL.GTID_PURGED='<gtid from step 2 on host1>';
        set SQL_LOG_BIN=1;

    5. Connect to MySQL Shell and add the new node (host2) to the cluster: cluster.addInstance('root@host2:3306', {ipWhitelist: 'host1, host2'})
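Before running addInstance, it may be worth ruling out plain connectivity problems between the two hosts, since the cluster originally broke because of network issues. The following is a minimal sketch, not part of the original procedure: `can_connect` is a hypothetical helper, and port 33061 is taken from the group_replication_local_address and group_replication_group_seeds values in the logs of this setup. A successful TCP connection does not prove that the group replication IP whitelist accepts the peer; it only shows the port is reachable.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration only. It checks reachability at the
    TCP level and nothing else (no whitelist, SSL, or protocol checks).
    """
    try:
        # create_connection resolves the host name and opens the socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Run from host2, `can_connect("host1", 3306)` and `can_connect("host1", 33061)` would check the SQL port and the group communication port respectively; the same check in the other direction (from host1 to host2) is equally relevant once the join attempt has started the listener on host2.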

Logs from the new instance (host2) that failed to join:

2020-03-09T15:19:33.328996Z 38 [Note] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_recovery' executed'. Previous state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''.
2020-03-09T15:19:33.514003Z 38 [Note] Plugin group_replication reported: 'Group communication SSL configuration: group_replication_ssl_mode: "DISABLED"'
2020-03-09T15:19:33.514154Z 38 [Warning] Plugin group_replication reported: '[GCS] Automatically adding IPv4 localhost address to the whitelist. It is mandatory that it is added.'
2020-03-09T15:19:33.514181Z 38 [Note] Plugin group_replication reported: '[GCS] SSL was not enabled'
2020-03-09T15:19:33.514193Z 38 [Note] Plugin group_replication reported: 'Initialized group communication with configuration: group_replication_group_name: "<uuid1>"; group_replication_local_address: "host2:33061"; group_replication_group_seeds: "host1:33061"; group_replication_bootstrap_group: false; group_replication_poll_spin_loops: 100; group_replication_compression_threshold: 1000; group_replication_ip_whitelist: "host1ip, host2ip"'
2020-03-09T15:19:33.514223Z 38 [Note] Plugin group_replication reported: '[GCS] Configured number of attempts to join: 0'
2020-03-09T15:19:33.514227Z 38 [Note] Plugin group_replication reported: '[GCS] Configured time between attempts to join: 5 seconds'
2020-03-09T15:19:33.514239Z 38 [Note] Plugin group_replication reported: 'Member configuration: member_id: 139923628; member_uuid: "<uuid2>"; single-primary mode: "true"; group_replication_auto_increment_increment: 7; '
2020-03-09T15:19:33.514576Z 40 [Note] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_applier' executed'. Previous state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''.
2020-03-09T15:19:33.613296Z 43 [Note] Slave SQL thread for channel 'group_replication_applier' initialized, starting replication in log 'FIRST' at position 0, relay log './scynbm96-relay-bin-group_replication_applier.000001' position: 4
2020-03-09T15:19:33.613383Z 38 [Note] Plugin group_replication reported: 'Group Replication applier module successfully initialized!'
2020-03-09T15:19:33.613811Z 0 [Note] Plugin group_replication reported: 'XCom protocol version: 3'
2020-03-09T15:19:33.613858Z 0 [Note] Plugin group_replication reported: 'XCom initialized and ready to accept incoming connections on port 33061'
2020-03-09T15:19:33.667118Z 0 [Warning] Plugin group_replication reported: 'read failed'
2020-03-09T15:19:33.685025Z 0 [ERROR] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 33061'
2020-03-09T15:19:34.732938Z 48 [Note] Got an error reading communication packets
2020-03-09T15:20:04.733653Z 52 [Note] Got an error reading communication packets
2020-03-09T15:20:33.613595Z 38 [ERROR] Plugin group_replication reported: 'Timeout on wait for view after joining group'
2020-03-09T15:20:33.613655Z 38 [Note] Plugin group_replication reported: 'Requesting to leave the group despite of not being a member'
2020-03-09T15:20:33.613697Z 38 [ERROR] Plugin group_replication reported: '[GCS] The member is leaving a group without being on one.'
2020-03-09T15:20:33.614136Z 43 [Note] Error reading relay log event for channel 'group_replication_applier': slave SQL thread was killed
2020-03-09T15:20:33.614325Z 43 [Note] Slave SQL thread for channel 'group_replication_applier' exiting, replication stopped in log 'FIRST' at position 0
2020-03-09T15:20:33.614966Z 40 [Note] Plugin group_replication reported: 'The group replication applier thread was killed'
2020-03-09T15:20:34.734155Z 55 [Note] Got an error reading communication packets
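
Most of the lines above are informational notes, so it can help to filter the error log down to the group_replication warnings and errors. The sketch below assumes the 5.7 error log line layout seen in this log (timestamp, thread id, bracketed level, message); `group_replication_problems` and `LINE_RE` are names made up for this example.

```python
import re

# Matches 5.7-style error log lines such as:
# 2020-03-09T15:19:33.667118Z 0 [Warning] Plugin group_replication reported: 'read failed'
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<thread>\d+)\s+\[(?P<level>\w+)\]\s+(?P<msg>.*)$"
)


def group_replication_problems(lines):
    """Yield (timestamp, level, message) for group_replication warnings and errors."""
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if not m:
            continue  # continuation lines or lines in another format
        if m["level"].upper() not in {"WARNING", "ERROR"}:
            continue  # skip [Note] entries
        if "group_replication" not in m["msg"]:
            continue  # skip messages from other components
        yield m["ts"], m["level"], m["msg"]
```

Applied to the log above, this would leave the IPv4 whitelist warning, the 'read failed' warning, and the three [ERROR] entries about failing to join and leaving the group. A file object can be passed directly, for example `group_replication_problems(open(path))`, where `path` is wherever the server writes its error log.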

Answer 1:

The following steps finally got me a healthy 3-node cluster.

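    1. Set the healthy node to super_read_only
    2. Wait a moment so that in-flight transactions can finish
    3. Copy the GTID using: select @@global.gtid_executed;
    4. On host2 and host3, install mysql from scratch
    5. On host2 and host3, stop the mysql server
    6. rsync the data to both hosts using: rsync -Parvz --exclude="auto.cnf" --exclude="<host1>*" --exclude="binlog.*" <user>@<host1>:/mysql-data/* .
    7. Confirm that the GTID on host1 has not changed
    8. Start mysql on host2 and host3, and verify the data is intact by selecting from a few tables
    9. Using mysql shell, dissolve the cluster
    10. Create the cluster again, then add host2 and host3 to it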
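
The check that the GTID on host1 did not change during the copy can be done by eye, but with several source UUIDs the value of @@global.gtid_executed spans multiple lines and is easy to misread. Here is a small sketch of a textual comparison, assuming the two values were saved as strings before and after the rsync. `parse_gtid_set` and `gtid_unchanged` are hypothetical helpers; they compare the interval strings as MySQL printed them and do not merge or interpret the intervals.

```python
def parse_gtid_set(text: str) -> dict:
    """Parse a gtid_executed string into {uuid: frozenset of interval strings}."""
    result = {}
    # MySQL may print a newline after each comma; drop all whitespace first.
    cleaned = "".join(text.split())
    for entry in filter(None, cleaned.split(",")):
        # Each entry looks like uuid:1-5 or uuid:1-3:7 (several intervals).
        uuid, _, intervals = entry.partition(":")
        result[uuid.lower()] = frozenset(filter(None, intervals.split(":")))
    return result


def gtid_unchanged(before: str, after: str) -> bool:
    """Return True if both strings describe the same GTID set, textually."""
    return parse_gtid_set(before) == parse_gtid_set(after)
```

If `gtid_unchanged(before, after)` returns False, writes reached host1 while the data directory was being copied, and the copy would need to be repeated with the node held in super_read_only throughout.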

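Note: after the cluster is dissolved, you need to restart all MySQL Routers.

Note 2: some monitoring information is available here: https://dev.mysql.com/doc/refman/5.7/en/group-replication-monitoring.html (8.x adds more logging and instrumentation)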
