How to Skip Errors in MySQL GTID Replication
Posted by 码城
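GTID replication
For more detailed verification steps, see:
Skipping errors in a MySQL primary/replica replication cluster (GTID) - 码到城攻. Traditional replication and GTID replication work differently, so the way to skip an error differs too. https://www.codecomeon.com/posts/192/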
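Reproducing the problem
This post focuses on how to skip an error in `gtid` replication. Starting from a table that replicates normally under `gtid` replication, drop the table on the replica and then insert a row into it on the source, which makes replication fail:
On the replica (192.168.78.189, port 3306), drop the table test: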
MySQL [gtid_test]> show tables;
+---------------------+
| Tables_in_gtid_test |
+---------------------+
| test |
+---------------------+
1 row in set (0.00 sec)
MySQL [gtid_test]>
MySQL [gtid_test]> drop table test;
Query OK, 0 rows affected (0.01 sec)
MySQL [gtid_test]> show tables;
Empty set (0.00 sec)
MySQL [gtid_test]>
On the source (192.168.78.188, port 3308), insert a row into the replicated table test:
MySQL [(none)]> show databases;
+--------------------+
| Database |
+--------------------+
| gtid_test |
| information_schema |
| mysql |
| ning |
| performance_schema |
| sys |
+--------------------+
6 rows in set (0.01 sec)
MySQL [(none)]> use gtid_test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
MySQL [gtid_test]> show tables;
+---------------------+
| Tables_in_gtid_test |
+---------------------+
| test |
+---------------------+
1 row in set (0.00 sec)
MySQL [gtid_test]> insert into test values (2);
Query OK, 1 row affected (0.00 sec)
MySQL [gtid_test]>
Then check the replication status on the replica:
MySQL [gtid_test]> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for source to send event
Master_Host: 192.168.78.188
Master_User: gtid_copy
Master_Port: 3308
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 667
Relay_Log_File: 67bf0b455536-relay-bin.000004
Relay_Log_Pos: 598
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1146
Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction '1d9b633d-e1ae-11ec-8ece-0242ac110002:15' at master log mysql-bin.000002, end_log_pos 636. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
Skip_Counter: 0
Exec_Master_Log_Pos: 382
Relay_Log_Space: 1140
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1146
Last_SQL_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction '1d9b633d-e1ae-11ec-8ece-0242ac110002:15' at master log mysql-bin.000002, end_log_pos 636. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
Replicate_Ignore_Server_Ids:
Master_Server_Id: 188
Master_UUID: 1d9b633d-e1ae-11ec-8ece-0242ac110002
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 220602 07:47:50
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 1d9b633d-e1ae-11ec-8ece-0242ac110002:1-15
Executed_Gtid_Set: 1d9b633d-e1ae-11ec-8ece-0242ac110002:1-14,
f19b117a-cf6d-11ec-9fd7-0242ac110002:1 // transaction generated by the DROP TABLE run on the replica
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
Master_public_key_path:
Get_master_public_key: 0
Network_Namespace:
1 row in set, 1 warning (0.00 sec)
ERROR: No query specified
MySQL [gtid_test]>
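The GTID of the failing transaction is quoted inside the Last_SQL_Error text above. As a minimal sketch, assuming the message keeps the wording shown in this output, it could be pulled out with a regular expression. The function name `failed_gtid` is hypothetical and is not part of MySQL or any client library.

```python
import re

# Matches the transaction id quoted in the applier error message, e.g.
# "... failed executing transaction '<uuid>:<number>' at master log ..."
# Assumption: the message wording is the same as in the output above.
FAILED_GTID_RE = re.compile(
    r"failed executing transaction '([0-9a-fA-F-]{36}):(\d+)'"
)


def failed_gtid(last_sql_error):
    """Return (source_uuid, transaction_number), or None if no GTID is quoted."""
    match = FAILED_GTID_RE.search(last_sql_error)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


message = (
    "Coordinator stopped because there were error(s) in the worker(s). "
    "The most recent failure being: Worker 1 failed executing transaction "
    "'1d9b633d-e1ae-11ec-8ece-0242ac110002:15' at master log "
    "mysql-bin.000002, end_log_pos 636."
)
print(failed_gtid(message))
```

The same error message also points to performance_schema.replication_applier_status_by_worker, which holds the underlying per-worker error.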
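The status output shows that the SQL thread has stopped with error 1146 (table doesn't exist). Note these two fields: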
Retrieved_Gtid_Set: 1d9b633d-e1ae-11ec-8ece-0242ac110002:1-15
Executed_Gtid_Set: 1d9b633d-e1ae-11ec-8ece-0242ac110002:1-14,
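This means the replica has received transactions 1 through 15 from the source but has applied only 1 through 14; transaction 15 failed. While transaction 15 remains unapplied, replication cannot move past it. To demonstrate, create another table on the source: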
MySQL [gtid_test]> create table next_test (a int);
Query OK, 0 rows affected (0.02 sec)
MySQL [gtid_test]>
MySQL [gtid_test]> show tables;
+---------------------+
| Tables_in_gtid_test |
+---------------------+
| next_test |
| test |
+---------------------+
2 rows in set (0.00 sec)
MySQL [gtid_test]>
As the following output shows, because transaction 15 failed, the later transaction 16 has not been applied on the replica either:
MySQL [gtid_test]> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for source to send event
Master_Host: 192.168.78.188
Master_User: gtid_copy
Master_Port: 3308
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 872
Relay_Log_File: 67bf0b455536-relay-bin.000004
Relay_Log_Pos: 598
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1146
Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction '1d9b633d-e1ae-11ec-8ece-0242ac110002:15' at master log mysql-bin.000002, end_log_pos 636. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
Skip_Counter: 0
Exec_Master_Log_Pos: 382
Relay_Log_Space: 1345
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1146
Last_SQL_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction '1d9b633d-e1ae-11ec-8ece-0242ac110002:15' at master log mysql-bin.000002, end_log_pos 636. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
Replicate_Ignore_Server_Ids:
Master_Server_Id: 188
Master_UUID: 1d9b633d-e1ae-11ec-8ece-0242ac110002
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 220602 07:47:50
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 1d9b633d-e1ae-11ec-8ece-0242ac110002:1-16
Executed_Gtid_Set: 1d9b633d-e1ae-11ec-8ece-0242ac110002:1-14,
f19b117a-cf6d-11ec-9fd7-0242ac110002:1
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
Master_public_key_path:
Get_master_public_key: 0
Network_Namespace:
1 row in set, 1 warning (0.01 sec)
ERROR: No query specified
MySQL [gtid_test]>
MySQL [gtid_test]>
MySQL [gtid_test]> show tables;
Empty set (0.00 sec)
MySQL [gtid_test]>
Transactions 15 and 16 have both been received but not applied:
Retrieved_Gtid_Set: 1d9b633d-e1ae-11ec-8ece-0242ac110002:1-16
Executed_Gtid_Set: 1d9b633d-e1ae-11ec-8ece-0242ac110002:1-14,
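The gap between the two sets can also be computed rather than read by eye. The sketch below is an illustration only: `parse_gtid_set` and `missing_gtids` are hypothetical helper names, and the parser handles just the `uuid:first-last` form seen in this output. It expands every interval into individual numbers, which is fine for small sets like these but not for very large ones. On the server itself, MySQL's built-in GTID_SUBTRACT() function performs the same subtraction.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-14:16,uuid2:1' into {uuid: set of transaction numbers}."""
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing comma
        uuid, *intervals = part.split(":")
        numbers = result.setdefault(uuid.lower(), set())
        for interval in intervals:
            first, _, last = interval.partition("-")
            # A single number such as "15" has no upper bound after the dash.
            numbers.update(range(int(first), int(last or first) + 1))
    return result


def missing_gtids(retrieved, executed):
    """Return {uuid: sorted numbers} that were retrieved but not executed."""
    done = parse_gtid_set(executed)
    gaps = {}
    for uuid, numbers in parse_gtid_set(retrieved).items():
        pending = sorted(numbers - done.get(uuid, set()))
        if pending:
            gaps[uuid] = pending
    return gaps


retrieved = "1d9b633d-e1ae-11ec-8ece-0242ac110002:1-16"
executed = (
    "1d9b633d-e1ae-11ec-8ece-0242ac110002:1-14,\n"
    "f19b117a-cf6d-11ec-9fd7-0242ac110002:1"
)
print(missing_gtids(retrieved, executed))
```

For the values above, the only source UUID with a gap is 1d9b633d-e1ae-11ec-8ece-0242ac110002, with transactions 15 and 16 pending.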
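Fixing the problem
Method 1
A common approach is to skip the failed transaction on the replica by committing an empty transaction under its GTID. The skipped transaction's changes are never applied, so the replica still lacks the row inserted on the source (and, in this test, the table test itself); that difference has to be repaired separately.
1. Stop the replication threads
mysql> STOP SLAVE;
2. In the current session, set GTID_NEXT to the GTID of the failed transaction. It is quoted in Last_SQL_Error, and it is the first GTID in Retrieved_Gtid_Set that is missing from Executed_Gtid_Set
mysql> SET @@SESSION.GTID_NEXT = '1d9b633d-e1ae-11ec-8ece-0242ac110002:15';
3. Commit an empty transaction, which records that GTID as executed
mysql> BEGIN; COMMIT;
4. Restore automatic GTID assignment
mysql> SET SESSION GTID_NEXT = AUTOMATIC;
5. Start the replication threads
mysql> START SLAVE;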
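The five steps are always the same apart from the GTID, so they can be generated from it. This is a rough sketch, not a tool from the original post: `skip_statements` is a hypothetical helper that only builds the statement text and does not connect to a server. It rejects anything that is not a single `uuid:number` GTID, because GTID_NEXT takes one transaction and not a range.

```python
import re

# One GTID: a 36-character UUID, a colon, and a single transaction number.
GTID_RE = re.compile(
    r"[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}:\d+"
)


def skip_statements(gtid):
    """Return the statements from the steps above for one failed GTID."""
    if GTID_RE.fullmatch(gtid) is None:
        raise ValueError(f"not a single GTID: {gtid!r}")
    return [
        "STOP SLAVE;",
        f"SET @@SESSION.GTID_NEXT = '{gtid}';",
        "BEGIN; COMMIT;",  # empty transaction recorded under that GTID
        "SET @@SESSION.GTID_NEXT = AUTOMATIC;",
        "START SLAVE;",
    ]


for statement in skip_statements("1d9b633d-e1ae-11ec-8ece-0242ac110002:15"):
    print(statement)
```

In this test only transaction 15 needs skipping: once it is marked as executed, transaction 16 (the CREATE TABLE) can be applied normally.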
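Method 2
Skip the error by resetting the replica's GTID state. Run these statements on the replica; RESET MASTER there deletes the replica's own binary logs and clears its gtid_executed set:
mysql> STOP SLAVE;
mysql> RESET MASTER;
mysql> SET @@GLOBAL.GTID_PURGED = '1d9b633d-e1ae-11ec-8ece-0242ac110002:1-16';
mysql> START SLAVE;
These commands tell the replica to treat transactions 1d9b633d-e1ae-11ec-8ece-0242ac110002:1-16 as already applied, so replication resumes from GTID 17 and the error above is skipped. Transactions 15 and 16 are never applied on the replica, so the inserted row and the table next_test remain missing there.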