mysqlbackup 备份失败的分析

Posted 蒋乐兴的技术随笔

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了mysqlbackup 备份失败的分析相关的知识,希望对你有一定的参考价值。

现象
1、从mysqlbackup 的日志上来看是它一直处于state: Waiting for locks;
2、从mysql 层面show processlist 上看它的处于waiting for gloabl read lock

show processlist;
+-----+-------------+---------------------+------+---------+--------+----------------------------------+------------------+
| Id  | User        | Host                | db   | Command | Time   | State                            | Info             |
+-----+-------------+---------------------+------+---------+--------+----------------------------------+------------------+
| 331 | system user |                     | NULL | Connect | 215831 | Waiting for master to send event | NULL             |
| 332 | system user |                     | NULL | Connect | 252054 | Reading event from the relay log | NULL             |
| 336 | ha_op       | 127.0.0.1:36246     | NULL | Sleep   |      1 |                                  | NULL             |
| 337 | ha_op       | 127.0.0.1:36247     | NULL | Sleep   |      1 |                                  | NULL             |
| 339 | ha_op       | 192.168.56.39:44672 | NULL | Sleep   |   8816 |                                  | NULL             |
| 340 | backup      | 127.0.0.1:36267     | NULL | Sleep   |   3954 |                                  | NULL             |
| 344 | root        | 127.0.0.1:36310     | NULL | Query   |      0 | init                             | show processlist |
| 345 | backup      | 127.0.0.1:36337     | NULL | Sleep   |   1134 |                                  | NULL             |
+-----+-------------+---------------------+------+---------+--------+----------------------------------+------------------+

 

分析
  1、是不是有别的事务持有了X锁,使得备份用户的flush tables with read lock 一直等待

        select * from information_schema.innodb_trx \G
        *************************** 1. row ***************************
                            trx_id: 82557697
                         trx_state: RUNNING
                       trx_started: 2016-08-28 13:53:11
             trx_requested_lock_id: NULL
                  trx_wait_started: NULL
                        trx_weight: 153349
               trx_mysql_thread_id: 332
                         trx_query: NULL
               trx_operation_state: starting index read
                 trx_tables_in_use: 296
                 trx_tables_locked: 296
                  trx_lock_structs: 5759
             trx_lock_memory_bytes: 538152
                   trx_rows_locked: 295679
                 trx_rows_modified: 147590
           trx_concurrency_tickets: 0
               trx_isolation_level: REPEATABLE READ
                 trx_unique_checks: 1
            trx_foreign_key_checks: 1
        trx_last_foreign_key_error: NULL
         trx_adaptive_hash_latched: 0
         trx_adaptive_hash_timeout: 10000
                  trx_is_read_only: 0
        trx_autocommit_non_locking: 0
        1 row in set (0.01 sec)    

  2、由(分析.1)可以看出是有一事务阻塞了备份用户下发的flush tables with read lock命令、 而且可以知道这个事务是332这个session发起的;
     再结合(现象.2)可以知道332 是SQL进程

  3、由(分析.2)已经知道是复制阻塞了备份,所以接下来是要去解析当前正slave 正在应用的binlog 用于更进一步的确定问题原因
          show slave status 看到Relay_Master_Log_File: mysql-bin.000314、Exec_Master_Log_Pos: 216304949

  4、解析binlog 后发现slave 在执行delete 操作

        # at 216304997
        #160826 14:49:02 server id 192694598  end_log_pos 216305071 CRC32 0x416f0b3c    Query   thread_id=16381 exec_time=0     error_code=0
        SET TIMESTAMP=1472194142/*!*/;
        BEGIN
        /*!*/;
        # at 216305071
        #160826 14:49:02 server id 192694598  end_log_pos 216305306 CRC32 0x19b5309b    Table_map: `dcsdba`.`pre_merge_cdr_gsm` mapped to number 147
        # at 216305306
        #160826 14:49:02 server id 192694598  end_log_pos 216313252 CRC32 0xd8102d41    Delete_rows: table id 147
        # at 216313252
        #160826 14:49:02 server id 192694598  end_log_pos 216321207 CRC32 0xd73f1e3f    Delete_rows: table id 147
        # at 216321207
        #160826 14:49:02 server id 192694598  end_log_pos 216329247 CRC32 0x6331d53b    Delete_rows: table id 147
        # at 216329247
        #160826 14:49:02 server id 192694598  end_log_pos 216337393 CRC32 0xecd5041e    Delete_rows: table id 147
        # at 216337393
        #160826 14:49:02 server id 192694598  end_log_pos 216345555 CRC32 0x6284a900    Delete_rows: table id 147

  5、进一步确认问题原因是因为pre_merge_cdr_gsm表没有主键,在这种情况下slave 每删除一行都会对应一次全表扫描。

建议
1、给pre_merge_cdr_gsm表加上一个合适的主键



以上是关于mysqlbackup 备份失败的分析的主要内容,如果未能解决你的问题,请参考以下文章

mysqlbackup 备份恢复数据

MySQLBackup 使用说明

MySQLBackup 使用说明

MySQL自动备份

linux 下备份mysql数据库

Bash脚本手动运行,但在crontab上失败