MHA主从不同步处理

Posted feifeizouzou

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了MHA主从不同步处理相关的知识,希望对你有一定的参考价值。

---恢复内容开始---

1.检查MHA状态:

#masterha_check_status --conf=/etc/mha/app1.cnf 

  app1 is stopped(2:NOT_RUNNING).

状态为没有运行。

2.检查MHA主从复制状态

masterha_check_repl --conf=/etc/mha/app1.cnf 

Wed Mar 28 11:38:22 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Wed Mar 28 11:38:22 2018 - [info] Reading application default configuration from /etc/mha/app1.cnf..

Wed Mar 28 11:38:22 2018 - [info] Reading server configuration from /etc/mha/app1.cnf..

Wed Mar 28 11:38:22 2018 - [info] MHA::MasterMonitor version 0.57.

Wed Mar 28 11:38:22 2018 - [error][/usr/local/share/perl5/MHA/Server.pm, ln935] SQL Thread is stopped(error) on 10.1.46.214(10.1.46.214:3306)! Errno:1032, Error:Could not execute Delete_rows event on table shop.t_ncl_agent_openid; Can‘t find record in ‘t_ncl‘, Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event‘s master log mysql-bin.000010, end_log_pos 878001439

Wed Mar 28 11:38:22 2018 - [error][/usr/local/share/perl5/MHA/ServerManager.pm, ln703] Server 10.1.46.214(10.1.46.214:3306) is alive, but does not work as a slave!

Wed Mar 28 11:38:22 2018 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations.  at /usr/local/share/perl5/MHA/MasterMonitor.pm line 329

Wed Mar 28 11:38:22 2018 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.

Wed Mar 28 11:38:22 2018 - [info] Got exit code 1 (Not master dead).

 

MySQL Replication Health is NOT OK!

3.主从复制检查:其中一个从库:

mysql> show slave status\G;

*************************** 1. row ***************************

               Slave_IO_State: Waiting for master to send event

                  Master_Host: 

                  Master_User: rep

                  Master_Port: 3306

                Connect_Retry: 60

              Master_Log_File: mysql-bin.000070

          Read_Master_Log_Pos: 232233295

               Relay_Log_File: weixintbdb02-relay-bin.000024

                Relay_Log_Pos: 877993376

        Relay_Master_Log_File: mysql-bin.000010

             Slave_IO_Running: Yes

            Slave_SQL_Running: No

              Replicate_Do_DB:

          Replicate_Ignore_DB:

           Replicate_Do_Table:

       Replicate_Ignore_Table:

      Replicate_Wild_Do_Table:

  Replicate_Wild_Ignore_Table:

                   Last_Errno: 1032

                   Last_Error: Could not execute Delete_rows event on table shop.t_ncl; Can‘t find record in ‘t_ncl‘, Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event‘s master log mysql-bin.000010, end_log_pos 878001439

                 Skip_Counter: 0

          Exec_Master_Log_Pos: 877993203

              Relay_Log_Space: 64262720902

              Until_Condition: None

               Until_Log_File:

                Until_Log_Pos: 0

           Master_SSL_Allowed: No

           Master_SSL_CA_File:

           Master_SSL_CA_Path:

              Master_SSL_Cert:

            Master_SSL_Cipher:

               Master_SSL_Key:

        Seconds_Behind_Master: NULL

Master_SSL_Verify_Server_Cert: No

                Last_IO_Errno: 0

                Last_IO_Error:

               Last_SQL_Errno: 1032

               Last_SQL_Error: Could not execute Delete_rows event on table shop.t_ncl_agent_openid; Can‘t find record in ‘t_ncl_agent_openid‘, Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event‘s master log mysql-bin.000010, end_log_pos 878001439

  Replicate_Ignore_Server_Ids:

             Master_Server_Id: 462131

                  Master_UUID: 9ab25963-1cf8-11e8-b9b9-0425c58fa8e9

             Master_Info_File: /data/mysql/mysql/master.info

                    SQL_Delay: 0

          SQL_Remaining_Delay: NULL

      Slave_SQL_Running_State:

           Master_Retry_Count: 86400

                  Master_Bind:

      Last_IO_Error_Timestamp:

     Last_SQL_Error_Timestamp: 180327 23:55:13

               Master_SSL_Crl:

           Master_SSL_Crlpath:

           Retrieved_Gtid_Set: 9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:9-123314222

            Executed_Gtid_Set: 98ef4e90-1cff-11e8-ab2a-0425c58fa8d7:1-8,

9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:1-13652530

                Auto_Position: 1

         Replicate_Rewrite_DB:

                 Channel_Name:

           Master_TLS_Version:

1 row in set (0.00 sec)

检查另一个从库复制状态:

    Slave_IO_Running: Yes

            Slave_SQL_Running: Yes

两个线程均正常。

4.解析日志文件

mysqlbinlog -v --base64-output=decode-rows  weixintbdb02-relay-bin.000024 > /tmp/weixintbdb02.sql

SET @@SESSION.GTID_NEXT= ‘9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:13652530‘/*!*/;
# at 877992833
#180320 13:23:38 server id 462131 end_log_pos 877992738 CRC32 0x1673bf15 Query thread_id=3763088 exec_time=0 error_code=0
SET TIMESTAMP=1521523418/*!*/;
BEGIN
/*!*/;
# at 877992911
#180320 13:23:38 server id 462131 end_log_pos 877992834 CRC32 0x9f7675ce Table_map: `shop`.`t_pai_log_apps` mapped to number 265
# at 877993007
#180320 13:23:38 server id 462131 end_log_pos 877993172 CRC32 0xfe923d4e Write_rows: table id 265 flags: STMT_END_F
### INSERT INTO `shop`.`t_pai_log_apps`
### SET
### @1=‘331354b1ae8c40509868342c1170b91f‘
### @2=‘/ncl/agentWD/checkInfoAndBind‘
### @3=‘妫??ュ井搴.??″.杈..?.俊?.苟缁..涓..?.
### @4=‘/weshop‘
### @5=0
### @6=NULL
### @7=‘2018:03:20‘
### @8=‘2018:03:20‘
### @9=724
### @10=‘2018:03:20‘
### @11=‘10.5.96.29‘
### @12=NULL
### @13=‘checkInfoAndBind.step3 com.geong.ncl.agentWD.step.AgentWDInsertStep:0 720ms tojson com.geong.pai.web.opstep.TransferPage:0 1ms ‘
### @14=‘0000‘
### @15=NULL
# at 877993345
#180320 13:23:38 server id 462131 end_log_pos 877993203 CRC32 0x626d11b7 Xid = 48696609
COMMIT/*!*/;
# at 877993376 ---复制应用位置

#180320 13:30:50 server id 462131 end_log_pos 877993268 CRC32 0x8c10dea2 GTID last_committed=1536936 sequence_number=1536937 rbr_only=yes
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
SET @@SESSION.GTID_NEXT= ‘9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:13652531‘/*!*/;
# at 877993441
#180320 13:30:50 server id 462131 end_log_pos 877993344 CRC32 0x89c418bf Query thread_id=3763092 exec_time=0 error_code=0
SET TIMESTAMP=1521523850/*!*/;
/*!\C utf8mb4 *//*!*/;
SET @@session.character_set_client=45,@@session.collation_connection=45,@@session.collation_server=33/*!*/;
BEGIN
/*!*/;
# at 877993517
#180320 13:30:50 server id 462131 end_log_pos 877993443 CRC32 0x3e33c686 Table_map: `shop`.`shop.t_ncl` mapped to number 290
# at 877993616
#180320 13:30:50 server id 462131 end_log_pos 878001439 CRC32 0x2cbaa777 Delete_rows: table id 290
....
....
# at 879424748
#180320 13:30:50 server id 462131 end_log_pos 879432661 CRC32 0x04ca6c87 Delete_rows: table id 290
# at 879432834
#180320 13:30:50 server id 462131 end_log_pos 879433847 CRC32 0xb9e88af6 Delete_rows: table id 290 flags: STMT_END_F
### DELETE FROM `shop`.`shop.t_ncl`
### WHERE
### @1=47
### @2=‘02243518‘
### @3=‘o5lHPw7U0FGEznvO6SawUAFtQjA4‘
### @4=NULL
### @5=NULL
### @6=NULL
### @7=NULL
### @8=NULL
### @9=NULL
### @10=NULL
### @11=NULL
### @12=NULL
### @13=0
....
....
### DELETE FROM `shop`.`shop.t_ncl`
### WHERE
...skipping...
### @1=6416
### @2=‘02275064‘
### @3=‘8886139b-96b9-44be-95dd-1257ae7688aa‘
### @4=‘13784976686‘
### @5=NULL
### @6=NULL
### @7=NULL
### @8=‘http://thirdwx.qlogo.cn/mmopen/bj9JGugn6Uf7P15BxMQAz0q7V8v38ibric2acnsnMiaF3SibSibWPYoGEgBDObRdMImFtvfqFIv8wibvOibJJnYp2tYn2SOlG15uetV/132‘
### @9=NULL
### @10=NULL
### @11=NULL
### @12=‘2018-03-20 13:15:45‘
### @13=0
# at 879434020
#180320 13:30:50 server id 462131 end_log_pos 879433878 CRC32 0x6b705e04 Xid = 48696956
COMMIT/*!*/;
# at 879434051
#180320 13:34:25 server id 462131 end_log_pos 879433943 CRC32 0x2a4685b6 GTID last_committed=1536937 sequence_number=1536938 rbr_only=yes
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
SET @@SESSION.GTID_NEXT= ‘9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:13652532‘/*!*/;

主从复制显示sql线程已经执行完gtid号为9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:1-13652530,日志在GTID号为9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:1-13652531均是对shop`.`shop.t_ncl表的删除操作,查看从库中有这个表,但是没有数据,说明从库可能手动删除过,或者根本就没有数据。

解决跳过GTID号为9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:1-13652531的操作。

解决办法:
mysql> stop slave;
Query OK, 0 rows affected (0.00 sec)

mysql> set gtid_next=‘9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:13652531‘;
Query OK, 0 rows affected (0.00 sec)

mysql> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 10.1.46.213
Master_User: rep
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000070
Read_Master_Log_Pos: 232703867
Relay_Log_File: weixintbdb02-relay-bin.000024
Relay_Log_Pos: 877993376
Relay_Master_Log_File: mysql-bin.000010
Slave_IO_Running: No
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1032
Last_Error: Could not execute Delete_rows event on table shop.t_ncl; Can‘t find record in ‘t_ncl‘, Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event‘s master log mysql-bin.000010, end_log_pos 878001439
Skip_Counter: 0
Exec_Master_Log_Pos: 877993203
Relay_Log_Space: 64263191474
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Delete_rows event on table shop.t_ncl; Can‘t find record in ‘t_ncl‘, Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event‘s master log mysql-bin.000010, end_log_pos 878001439
Replicate_Ignore_Server_Ids:
Master_Server_Id: 462131
Master_UUID: 9ab25963-1cf8-11e8-b9b9-0425c58fa8e9
Master_Info_File: /data/mysql/mysql/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 180327 23:55:13
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:9-123314722
Executed_Gtid_Set: 98ef4e90-1cff-11e8-ab2a-0425c58fa8d7:1-8,
9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:1-13652530
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)

mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> commit;
Query OK, 0 rows affected (10.00 sec)


mysql> show master status\G;
*************************** 1. row ***************************
File: mysql-bin.000012
Position: 436
Binlog_Do_DB:
Binlog_Ignore_DB:
Executed_Gtid_Set: 98ef4e90-1cff-11e8-ab2a-0425c58fa8d7:1-8,
9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:1-13652531

mysql> set gtid_next=‘automatic‘;
Query OK, 0 rows affected (0.00 sec)


mysql> start slave;
Query OK, 0 rows affected (0.00 sec)

通过执行一个空事物跳过31的GTID号,在重启slave。

5.后续操作

因为mha需要在mysql主从同步复制正常的时候才能正常运行。现在从库10.1.46.214还在追日志,等日志追平后,将mha重新启动。

检查从库同步:

mysql> show slave status\G;

查看点:

同步状态:

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

复制同步情况:

Retrieved_Gtid_Set: 9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:9-123315286

            Executed_Gtid_Set: 45cb77c3-1cfa-11e8-a55f-0425c555a5f3:1-3,

98ef4e90-1cff-11e8-ab2a-0425c58fa8d7:1-3,

9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:1-123315286

Retrieved_Gtid_Set 提取的最大GTID号和Executed_Gtid_Set执行的最大GTID相同,则同步完成。

启动MHA

nohup masterha_manager --conf=/etc/mha/app1.cnf --ignore_last_failover > /etc/mha/app1/manager.log < /dev/null 2>&1 &

以上是关于MHA主从不同步处理的主要内容,如果未能解决你的问题,请参考以下文章

MySQL半同步复制和MHA高可用架构设计

mysql主从不同步处理过程分享

MySQL主从说明详解MySQL主从不同步处理方案

MySQL主从说明详解MySQL主从不同步处理方案

mysql高可用之MHA+半同步复制

基于半同步复制的MHA高可用MySql集群架构搭建实战