InnoDB Cluster addInstance: Plugin group_replication reported: 'read failed'

Posted 2023-04-19

Asked: 2020-03-10 19:43:18

Question:

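I am running an InnoDB Cluster on 5.7.25 (planning to migrate to 8.0 soon). Due to network issues, two of my instances left the cluster, leaving me with only one healthy node.

I am following the procedure below to add a node back to the cluster, but it fails with the error shown in the log further down.

What am I doing wrong?

Note: host1 is the healthy node that remained in the cluster; host2 is the node that is joining.

Procedure on host1: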

set global super_read_only = ON;

select @@global.gtid_executed;

set global super_read_only = OFF;

Procedure on host2:

mysql

rsync -Parvz --exclude="auto.cnf" --exclude="<host1>*" --exclude="binlog.*" <user>@<host1>:/mysql-data/* .

reset master;
reset slave;
set SQL_LOG_BIN=0; 
set @@GLOBAL.GTID_PURGED='<gtid from step 2 on host1>';
set SQL_LOG_BIN=1;

cluster.addInstance('root@host2:3306', {ipWhitelist: 'host1, host2'})
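
The GTID step of the host2 procedure is easy to get wrong when pasting by hand, because the mysql client typically prints a multi-UUID `gtid_executed` value wrapped over several lines. The following is only a minimal sketch, assuming the value has been copied from host1 as plain text; the helper name `build_gtid_purged_statements` is hypothetical and not part of any MySQL tool. It only builds the statement text and does not connect to a server.

```python
from typing import List


def build_gtid_purged_statements(gtid_executed: str) -> List[str]:
    """Build the statements run on the joining node (hypothetical helper).

    `gtid_executed` is the text returned by
    `select @@global.gtid_executed;` on the healthy node.
    """
    # Remove all whitespace, including the line breaks the client inserts
    # between the UUID ranges of a long GTID set.
    gtid_set = "".join(gtid_executed.split())
    if not gtid_set:
        raise ValueError("gtid_executed is empty")
    if "'" in gtid_set:
        # A quote would break out of the SQL string literal below.
        raise ValueError("unexpected quote character in GTID set")
    return [
        "RESET MASTER;",
        "RESET SLAVE;",
        "SET SQL_LOG_BIN=0;",
        f"SET @@GLOBAL.GTID_PURGED='{gtid_set}';",
        "SET SQL_LOG_BIN=1;",
    ]
```

The returned list mirrors the statements in the procedure above, so it can be printed and reviewed before anything is executed on host2.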

Log from the new instance (host2) that failed to join:

2020-03-09T15:19:33.328996Z 38 [Note] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_recovery' executed'. Previous state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''.
2020-03-09T15:19:33.514003Z 38 [Note] Plugin group_replication reported: 'Group communication SSL configuration: group_replication_ssl_mode: "DISABLED"'
2020-03-09T15:19:33.514154Z 38 [Warning] Plugin group_replication reported: '[GCS] Automatically adding IPv4 localhost address to the whitelist. It is mandatory that it is added.'
2020-03-09T15:19:33.514181Z 38 [Note] Plugin group_replication reported: '[GCS] SSL was not enabled'
2020-03-09T15:19:33.514193Z 38 [Note] Plugin group_replication reported: 'Initialized group communication with configuration: group_replication_group_name: "<uuid1>"; group_replication_local_address: "host2:33061"; group_replication_group_seeds: "host1:33061"; group_replication_bootstrap_group: false; group_replication_poll_spin_loops: 100; group_replication_compression_threshold: 1000; group_replication_ip_whitelist: "host1ip, host2ip"'
2020-03-09T15:19:33.514223Z 38 [Note] Plugin group_replication reported: '[GCS] Configured number of attempts to join: 0'
2020-03-09T15:19:33.514227Z 38 [Note] Plugin group_replication reported: '[GCS] Configured time between attempts to join: 5 seconds'
2020-03-09T15:19:33.514239Z 38 [Note] Plugin group_replication reported: 'Member configuration: member_id: 139923628; member_uuid: "<uuid2>"; single-primary mode: "true"; group_replication_auto_increment_increment: 7; '
2020-03-09T15:19:33.514576Z 40 [Note] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_applier' executed'. Previous state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''.
2020-03-09T15:19:33.613296Z 43 [Note] Slave SQL thread for channel 'group_replication_applier' initialized, starting replication in log 'FIRST' at position 0, relay log './scynbm96-relay-bin-group_replication_applier.000001' position: 4
2020-03-09T15:19:33.613383Z 38 [Note] Plugin group_replication reported: 'Group Replication applier module successfully initialized!'
2020-03-09T15:19:33.613811Z 0 [Note] Plugin group_replication reported: 'XCom protocol version: 3'
2020-03-09T15:19:33.613858Z 0 [Note] Plugin group_replication reported: 'XCom initialized and ready to accept incoming connections on port 33061'
2020-03-09T15:19:33.667118Z 0 [Warning] Plugin group_replication reported: 'read failed'
2020-03-09T15:19:33.685025Z 0 [ERROR] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 33061'
2020-03-09T15:19:34.732938Z 48 [Note] Got an error reading communication packets
2020-03-09T15:20:04.733653Z 52 [Note] Got an error reading communication packets
2020-03-09T15:20:33.613595Z 38 [ERROR] Plugin group_replication reported: 'Timeout on wait for view after joining group'
2020-03-09T15:20:33.613655Z 38 [Note] Plugin group_replication reported: 'Requesting to leave the group despite of not being a member'
2020-03-09T15:20:33.613697Z 38 [ERROR] Plugin group_replication reported: '[GCS] The member is leaving a group without being on one.'
2020-03-09T15:20:33.614136Z 43 [Note] Error reading relay log event for channel 'group_replication_applier': slave SQL thread was killed
2020-03-09T15:20:33.614325Z 43 [Note] Slave SQL thread for channel 'group_replication_applier' exiting, replication stopped in log 'FIRST' at position 0
2020-03-09T15:20:33.614966Z 40 [Note] Plugin group_replication reported: 'The group replication applier thread was killed'
2020-03-09T15:20:34.734155Z 55 [Note] Got an error reading communication packets
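
Most of the log above is `[Note]` output; the lines that matter for the failed join are the `[Warning]` and `[ERROR]` entries from the group_replication plugin. As a rough sketch, assuming the 5.7 error-log layout shown above (timestamp, thread id, bracketed level, message), such entries could be filtered like this. The names `LogEntry` and `group_replication_problems` are illustrative only.

```python
import re
from typing import Iterable, List, NamedTuple

# Layout assumed from the log excerpt: "<timestamp> <thread id> [<Level>] <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<thread>\d+)\s+\[(?P<level>\w+)\]\s+(?P<msg>.*)$"
)


class LogEntry(NamedTuple):
    timestamp: str
    thread_id: int
    level: str
    message: str


def group_replication_problems(lines: Iterable[str]) -> List[LogEntry]:
    """Return warning and error entries that mention group_replication."""
    problems = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # wrapped or otherwise unparseable line
        level = match.group("level").upper()
        message = match.group("msg")
        if level in ("WARNING", "ERROR") and "group_replication" in message:
            problems.append(
                LogEntry(match.group("ts"), int(match.group("thread")), level, message)
            )
    return problems
```

Run against the excerpt above, this would keep entries such as 'read failed', 'The member was unable to join the group' and 'Timeout on wait for view after joining group', and drop the generic 'Got an error reading communication packets' notes.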

Answer 1:

The following steps finally gave me a healthy 3-node cluster.

select @@global.gtid_executed;

rsync -Parvz --exclude="auto.cnf" --exclude="<host1>*" --exclude="binlog.*" <user>@<host1>:/mysql-data/* .

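Dissolve the cluster

Note: after the cluster is dissolved, you need to restart all MySQL Routers.

Note 2: some monitoring information is available here: https://dev.mysql.com/doc/refman/5.7/en/group-replication-monitoring.html (version 8.x adds more logging and instrumentation)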
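
To confirm that the group really is healthy, the monitoring page linked above describes the `performance_schema.replication_group_members` table. The sketch below assumes its rows have already been fetched by whatever client is in use as `(MEMBER_HOST, MEMBER_PORT, MEMBER_STATE)` tuples; the function names and the `host3` value in the usage note are hypothetical.

```python
from typing import Iterable, List, Tuple

Member = Tuple[str, int, str]  # (MEMBER_HOST, MEMBER_PORT, MEMBER_STATE)

# Query to run on any member; executing it is left to the caller's client.
MEMBERS_QUERY = (
    "SELECT MEMBER_HOST, MEMBER_PORT, MEMBER_STATE "
    "FROM performance_schema.replication_group_members"
)


def unhealthy_members(rows: Iterable[Member]) -> List[Member]:
    """Return members whose state is anything other than ONLINE."""
    return [row for row in rows if row[2] != "ONLINE"]


def is_healthy(rows: Iterable[Member], expected_size: int = 3) -> bool:
    """True when the group has the expected size and every member is ONLINE."""
    members = list(rows)
    return len(members) == expected_size and not unhealthy_members(members)
```

For example, `is_healthy([("host1", 3306, "ONLINE"), ("host2", 3306, "RECOVERING"), ("host3", 3306, "ONLINE")])` evaluates to `False` until host2 finishes recovery and reports `ONLINE`.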
