GreatSQL删除分区慢的跟踪

Posted 2023-06-08 GreatSQL

tags:

篇首语：本文由小常识网(cha138.com)小编为大家整理，主要介绍了GreatSQL删除分区慢的跟踪相关的知识，希望对你有一定的参考价值。

GreatSQL删除分区慢的跟踪

背景

某业务系统，每天凌晨会删除分区表的一个分区(按天分区)，耗时较久，从最开始的30秒，慢慢变为1分钟+，影响到交易业务的正常进行。在测试环境进行了模拟，复现了删除分区慢的情况，本次基于GreatSQL8.0.25-17进行测试，官方mysql版本也存在相同问题。

测试环境

$ mysql -h127.0.0.1 -P8025 -uroot -p

mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \\g.
Your MySQL connection id is 10
Server version: 8.0.25-17 GreatSQL, Release 17, Revision 4733775f703

Copyright (c) 2000, 2021, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type \'help;\' or \'\\h\' for help. Type \'\\c\' to clear the current input statement.

greatsql> select version();
+-----------+
| version() |
+-----------+
| 8.0.25-17 |
+-----------+
1 row in set (0.00 sec)
greatsql> show variables like \'autocommit\' ;
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | ON    |
+---------------+-------+
1 row in set (0.01 sec)

greatsql> show variables like \'innodb_buffer_pool_size\';
+-------------------------+------------+
| Variable_name           | Value      |
+-------------------------+------------+
| innodb_buffer_pool_size | 4294967296 |
+-------------------------+------------+
1 row in set (0.01 sec)

greatsql> show variables like \'innodb_flush_log_at_trx_commit\';
+--------------------------------+-------+
| Variable_name                  | Value |
+--------------------------------+-------+
| innodb_flush_log_at_trx_commit | 1     |
+--------------------------------+-------+
1 row in set (0.00 sec)

greatsql> show variables like \'sync_binlog\';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| sync_binlog   | 1     |
+---------------+-------+
1 row in set (0.00 sec)

建表

CREATE TABLE `t_partition` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `ua` varchar(100) DEFAULT NULL,
  `start_time` datetime NOT NULL,
  PRIMARY KEY (`id`,`start_time`)
)  PARTITION BY RANGE (to_days(`start_time`))
(PARTITION P20230129 VALUES LESS THAN (738915) ENGINE = InnoDB,
 PARTITION P20230130 VALUES LESS THAN (738916) ENGINE = InnoDB,
 PARTITION P20230131 VALUES LESS THAN (738917) ENGINE = InnoDB,
 PARTITION P20230201 VALUES LESS THAN (738918) ENGINE = InnoDB,
 PARTITION P20230202 VALUES LESS THAN (738919) ENGINE = InnoDB,
 PARTITION P20230203 VALUES LESS THAN (738920) ENGINE = InnoDB,
 PARTITION P20230204 VALUES LESS THAN (738921) ENGINE = InnoDB,
 PARTITION P20230302 VALUES LESS THAN (738947) ENGINE = InnoDB,
 PARTITION P20230303 VALUES LESS THAN (738948) ENGINE = InnoDB,
 PARTITION P20230304 VALUES LESS THAN (738949) ENGINE = InnoDB,
 PARTITION P20230305 VALUES LESS THAN (738950) ENGINE = InnoDB,
 PARTITION P20230306 VALUES LESS THAN (738951) ENGINE = InnoDB,
 PARTITION P20230307 VALUES LESS THAN (738952) ENGINE = InnoDB, 
 PARTITION P20230308 VALUES LESS THAN (738953) ENGINE = InnoDB,
 PARTITION P20230309 VALUES LESS THAN (738954) ENGINE = InnoDB,
 PARTITION P20230310 VALUES LESS THAN (738955) ENGINE = InnoDB, 
 PARTITION P20230311 VALUES LESS THAN (738956) ENGINE = InnoDB, 

 PARTITION P20230312 VALUES LESS THAN (738957) ENGINE = InnoDB,
 PARTITION P20230313 VALUES LESS THAN (738958) ENGINE = InnoDB,
 PARTITION P20230314 VALUES LESS THAN (738959) ENGINE = InnoDB,
 PARTITION P20230315 VALUES LESS THAN (738960) ENGINE = InnoDB,
 PARTITION P20230316 VALUES LESS THAN (738961) ENGINE = InnoDB,
 PARTITION P20230317 VALUES LESS THAN (738962) ENGINE = InnoDB, 
 PARTITION P20230318 VALUES LESS THAN (738963) ENGINE = InnoDB,
 PARTITION P20230319 VALUES LESS THAN (738964) ENGINE = InnoDB,
 PARTITION P20230320 VALUES LESS THAN (738965) ENGINE = InnoDB, 
 PARTITION P20230321 VALUES LESS THAN (738966) ENGINE = InnoDB, 
 
 PARTITION P20230322 VALUES LESS THAN (738967) ENGINE = InnoDB,
 PARTITION P20230323 VALUES LESS THAN (738968) ENGINE = InnoDB,
 PARTITION P20230324 VALUES LESS THAN (738969) ENGINE = InnoDB,
 PARTITION P20230325 VALUES LESS THAN (738970) ENGINE = InnoDB,
 PARTITION P20230326 VALUES LESS THAN (738971) ENGINE = InnoDB,
 PARTITION P20230327 VALUES LESS THAN (738972) ENGINE = InnoDB, 
 PARTITION P20230328 VALUES LESS THAN (738973) ENGINE = InnoDB,
 PARTITION p20230329 VALUES LESS THAN (738974) ENGINE = InnoDB,
 PARTITION p20230330 VALUES LESS THAN (738975) ENGINE = InnoDB,
 PARTITION p20230331 VALUES LESS THAN (738976) ENGINE = InnoDB,
 PARTITION p20230401 VALUES LESS THAN (738977) ENGINE = InnoDB,
 PARTITION p20230402 VALUES LESS THAN (738978) ENGINE = InnoDB,
 PARTITION p20230403 VALUES LESS THAN (738979) ENGINE = InnoDB);

插入数据

greatsql> insert into t_partition(ua,start_time) select substring(md5(rand()),1,20),\'2023-03-10\' ;
Query OK, 1 row affected (0.01 sec)
Records: 1  Duplicates: 0  Warnings: 0

greatsql> insert into t_partition(ua,start_time) select substring(md5(rand()),1,20),\'2023-03-10\' ;
Query OK, 1 row affected (0.00 sec)
Records: 1  Duplicates: 0  Warnings: 0

greatsql> insert into t_partition(ua,start_time) select substring(md5(rand()),1,20),\'2023-03-10\' ;
Query OK, 1 row affected (0.01 sec)
Records: 1  Duplicates: 0  Warnings: 0



greatsql> insert into t_partition(ua,start_time) select substring(md5(rand()),1,20),start_time from t_partition; 
Query OK, 3 rows affected (0.01 sec)
Records: 3  Duplicates: 0  Warnings: 0

greatsql> insert into t_partition(ua,start_time) select substring(md5(rand()),1,20),start_time from t_partition; 
Query OK, 6 rows affected (0.00 sec)
Records: 6  Duplicates: 0  Warnings: 0

............... 
greatsql> insert into t_partition(ua,start_time) select substring(md5(rand()),1,20),start_time from t_partition; 
Query OK, 3145728 rows affected (35.68 sec)
Records: 3145728  Duplicates: 0  Warnings: 0

greatsql> insert into t_partition(ua,start_time) select substring(md5(rand()),1,20),start_time from t_partition; 
Query OK, 6291456 rows affected (1 min 11.51 sec)
Records: 6291456  Duplicates: 0  Warnings: 0

greatsql> insert into t_partition(ua,start_time) select substring(md5(rand()),1,20),start_time from t_partition; 
Query OK, 12582912 rows affected (2 min 26.74 sec)
Records: 12582912  Duplicates: 0  Warnings: 0

greatsql> select count(*) from t_partition; 
+----------+
| count(*) |
+----------+
| 25165824 |
+----------+
1 row in set (0.50 sec)
greatsql> select count(*) from t_partition partition(P20230310); 
+----------+
| count(*) |
+----------+
| 25165824 |
+----------+

向分区插入数据

insert into t_partition(ua,start_time) select substring(md5(rand()),1,20),\'2023-03-11\' from t_partition partition(P20230310); 
 insert into t_partition(ua,start_time) select substring(md5(rand()),1,20),\'2023-03-12\' from t_partition partition(P20230310); 
 insert into t_partition(ua,start_time) select substring(md5(rand()),1,20),\'2023-03-13\' from t_partition partition(P20230310); 
 insert into t_partition(ua,start_time) select substring(md5(rand()),1,20),\'2023-03-14\' from t_partition partition(P20230310); 
 insert into t_partition(ua,start_time) select substring(md5(rand()),1,20),\'2023-03-15\' from t_partition partition(P20230310); 
 insert into t_partition(ua,start_time) select substring(md5(rand()),1,20),\'2023-03-16\' from t_partition partition(P20230310); 
 insert into t_partition(ua,start_time) select substring(md5(rand()),1,20),\'2023-03-17\' from t_partition partition(P20230310); 
 insert into t_partition(ua,start_time) select substring(md5(rand()),1,20),\'2023-03-18\' from t_partition partition(P20230310); 
 insert into t_partition(ua,start_time) select substring(md5(rand()),1,20),\'2023-03-19\' from t_partition partition(P20230310); 
 insert into t_partition(ua,start_time) select substring(md5(rand()),1,20),\'2023-03-20\' from t_partition partition(P20230310); 
 insert into t_partition(ua,start_time) select substring(md5(rand()),1,20),\'2023-03-21\' from t_partition partition(P20230310); 
 
 。。。。。。。。。。。
greatsql>  insert into t_partition(ua,start_time) select substring(md5(rand()),1,20),\'2023-03-19\' from t_partition partition(P20230310); 
Query OK, 25165824 rows affected (5 min 17.92 sec)
Records: 25165824  Duplicates: 0  Warnings: 0

greatsql>  insert into t_partition(ua,start_time) select substring(md5(rand()),1,20),\'2023-03-20\' from t_partition partition(P20230310); 
Query OK, 25165824 rows affected (5 min 19.56 sec)
Records: 25165824  Duplicates: 0  Warnings: 0

greatsql>  insert into t_partition(ua,start_time) select substring(md5(rand()),1,20),\'2023-03-21\' from t_partition partition(P20230310); 
Query OK, 25165824 rows affected (5 min 27.27 sec)
Records: 25165824  Duplicates: 0  Warnings: 0

更新数据

greatsql> Update t_partition set ua=concat(ua,\'abc\') where start_time=\'2023-03-19\';
Query OK, 25165824 rows affected (12 min 55.53 sec)
Rows matched: 25165824  Changed: 25165824  Warnings: 0

删除分区s

greatsql> alter table t_partition drop partition P20230311;

Query OK, 0 rows affected (13.68 sec)

Records: 0 Duplicates: 0 Warnings: 0

greatsql> alter table t_partition drop partition P20230312;

Query OK, 0 rows affected (0.07 sec)

Records: 0 Duplicates: 0 Warnings: 0

两个分区数据量是一样，但删除第一个分区耗时较长。

$ perf record -ag -p  11222  -o /mysqldb/perf_drop_part_mysql2.data

Warning:
PID/TID switch overriding SYSTEM
^C[ perf record: Woken up 41 times to write data ]
[ perf record: Captured and wrote 10.610 MB /mysqldb/perf_drop_part_mysql2.data (54351 samples) ]
  Children      Self  Command  Shared Object        Symbol                                                   
+   99.17%     0.00%  mysqld   libpthread-2.17.so   [.] start_thread                                         
+   99.17%     0.00%  mysqld   mysqld               [.] pfs_spawn_thread                                     
+   99.17%     0.00%  mysqld   mysqld               [.] handle_connection                                    
+   99.17%     0.00%  mysqld   mysqld               [.] do_command                                           
+   99.17%     0.00%  mysqld   mysqld               [.] dispatch_command                                     
+   99.17%     0.00%  mysqld   mysqld               [.] dispatch_sql_command                                 
+   99.16%     0.00%  mysqld   mysqld               [.] mysql_execute_command                                
+   99.16%     0.00%  mysqld   mysqld               [.] Sql_cmd_alter_table::execute                         
+   99.16%     0.00%  mysqld   mysqld               [.] mysql_alter_table                                    
+   99.09%     0.00%  mysqld   mysqld               [.] mysql_inplace_alter_table                            
+   98.56%     0.00%  mysqld   mysqld               [.] ha_innopart::commit_inplace_alter_partition          
+   98.54%     0.00%  mysqld   mysqld               [.] alter_parts::prepare_or_commit_for_new               
+   98.54%     0.00%  mysqld   mysqld               [.] alter_part_normal::try_commit                        
+   98.53%     0.00%  mysqld   mysqld               [.] btr_drop_ahi_for_table                               
+   98.52%     1.30%  mysqld   mysqld               [.] btr_drop_next_batch                                  
+   97.20%     0.03%  mysqld   mysqld               [.] btr_search_drop_page_hash_when_freed                 
+   96.34%     2.03%  mysqld   mysqld               [.] btr_search_drop_page_hash_index                      
+   86.52%    52.25%  mysqld   mysqld               [.] ha_remove_all_nodes_to_page                          
+   34.21%    34.15%  mysqld   mysqld               [.] ha_delete_hash_node                                  
+    4.27%     2.68%  mysqld   mysqld               [.] rec_get_offsets_func                                 
+    2.11%     2.10%  mysqld   mysqld               [.] ut_fold_binary                                       
+    1.58%     1.58%  mysqld   mysqld               [.] rec_init_offsets                                     
+    1.30%     1.30%  mysqld   mysqld               [.] rec_fold                                             
+    0.57%     0.03%  mysqld   mysqld               [.] buf_page_get_gen                                     
+    0.57%     0.00%  mysqld   mysqld               [.] execute_native_thread_routine                        
+    0.56%     0.01%  mysqld   [kernel.kallsyms]    [k] system_call_fastpath

从系统调用上看有大量的自适应hash相关的调用

重启关闭自适应hash

greatsql> show variables like \'%hash%\';
+----------------------------------+-------+
| Variable_name                    | Value |
+----------------------------------+-------+
| innodb_adaptive_hash_index       | OFF   |
| innodb_adaptive_hash_index_parts | 8     |
+----------------------------------+-------+

修改配置文件，关闭自适应hash，按照上面的流程从新执行

greatsql> alter table t_partition drop partition P20230311;

Query OK, 0 rows affected (0.08 sec)

Records: 0 Duplicates: 0 Warnings: 0

greatsql> alter table t_partition drop partition P20230312;

Query OK, 0 rows affected (0.08 sec)

Records: 0 Duplicates: 0 Warnings: 0

关闭自适应hash后，相同的操作过程，删除第一个分区的时间明显变短，删除每个分区的时间基本上一致。

备注：innodb_adaptive_hash_index是全局变量，可以动态修改，不重启数据库。

测试结果汇总

自适应hash对比	第一次删分区	第二次删分区
innodb_buffer_pool_instances=8& innodb_adaptive_hash_index=on	13.68	0.07
innodb_buffer_pool_instances=8& innodb_adaptive_hash_index=off	0.08	0.08

源码分析

// btr_drop_ahi_for_table
void btr_drop_ahi_for_table(dict_table_t *table) 
  const ulint len = UT_LIST_GET_LEN(table->indexes);

  if (len == 0) 
    return;
  

  const dict_index_t *indexes[MAX_INDEXES];
  const page_size_t page_size(dict_table_page_size(table));

  for (;;) 
    ulint ref_count = 0;
    const dict_index_t **end = indexes;

    for (dict_index_t *index = table->first_index(); index != nullptr;
         index = index->next()) 
      if (ulint n_refs = index->search_info->ref_count) 
        ut_ad(!index->disable_ahi);
        ut_ad(index->is_committed());
        ref_count += n_refs;
        ut_ad(indexes + len > end);
        *end++ = index;
      
    

    ut_ad((indexes == end) == (ref_count == 0));

    if (ref_count == 0) 
      return;
    

    btr_drop_next_batch(page_size, indexes, end);  // breakpoint  

    std::this_thread::yield();
  




// btr_drop_next_batch
static void btr_drop_next_batch(const page_size_t &page_size,
                                const dict_index_t **first,
                                const dict_index_t **last) 
  static constexpr unsigned batch_size = 1024;
  std::vector<page_id_t> to_drop;
  to_drop.reserve(batch_size);

  for (ulint i = 0; i < srv_buf_pool_instances; ++i) 
    to_drop.clear();
    buf_pool_t *buf_pool = buf_pool_from_array(i);
    mutex_enter(&buf_pool->LRU_list_mutex);
    const buf_page_t *prev;

    for (const buf_page_t *bpage = UT_LIST_GET_LAST(buf_pool->LRU);
         bpage != nullptr; bpage = prev) 
      prev = UT_LIST_GET_PREV(LRU, bpage);

      ut_a(buf_page_in_file(bpage));
      if (buf_page_get_state(bpage) != BUF_BLOCK_FILE_PAGE ||
          bpage->buf_fix_count > 0) 
        continue;
      

      const dict_index_t *block_index =
          reinterpret_cast<const buf_block_t *>(bpage)->ahi.index;

      /* index == nullptr means the page is no longer in AHI, so no need to
      attempt freeing it */
      if (block_index == nullptr) 
        continue;
      
      /* pages IO fixed for read have index == nullptr */
      ut_ad(!bpage->was_io_fix_read());

      if (std::find(first, last, block_index) != last) 
        to_drop.emplace_back(bpage->id);
        if (to_drop.size() == batch_size)    // batch_size = 1024
          break;
        
      
    

    mutex_exit(&buf_pool->LRU_list_mutex);

    for (const page_id_t &page_id : to_drop) 
      btr_search_drop_page_hash_when_freed(page_id, page_size); // breakpoint
    
  


// btr_search_drop_page_hash_when_freed
void btr_search_drop_page_hash_when_freed(const page_id_t &page_id,
                                          const page_size_t &page_size) 
  buf_block_t *block;
  mtr_t mtr;

  ut_d(export_vars.innodb_ahi_drop_lookups++);

  mtr_start(&mtr);

  /* If the caller has a latch on the page, then the caller must
  have a x-latch on the page and it must have already dropped
  the hash index for the page. Because of the x-latch that we
  are possibly holding, we cannot s-latch the page, but must
  (recursively) x-latch it, even though we are only reading. */

  block = buf_page_get_gen(page_id, page_size, RW_X_LATCH, nullptr,
                           Page_fetch::PEEK_IF_IN_POOL, UT_LOCATION_HERE, &mtr);

  if (block) 
    /* If AHI is still valid, page can\'t be in free state.
    AHI is dropped when page is freed. */
    ut_ad(!block->page.file_page_was_freed);

    buf_block_dbg_add_level(block, SYNC_TREE_NODE_FROM_HASH);

    dict_index_t *index = block->ahi.index;
    if (index != nullptr) 
      /* In all our callers, the table handle should
      be open, or we should be in the process of
      dropping the table (preventing eviction). */
      ut_ad(index->table->n_ref_count > 0 || dict_sys_mutex_own());
      btr_search_drop_page_hash_index(block);
    
  

  mtr_commit(&mtr);

函数逻辑说明：

drop 分区和add分区都会清空所有分区的AHI信息，最耗时的如下
循环每个分区调用函数
->btr_drop_ahi_for_table
  循环表（或者分区）中的每个索引，如果索引都没有用到AHI，
则退出
  循环innodb buffer中的每个实例，根据LRU链表循环每个page
如果page建立了AHI信息，且是要删除表（或者分区）的相关索引
  则放入drop vector容器中
如果page没有建立AHI信息 
则跳过
如果drop verctor容器中填满1024个page
则清理一次，循环每个page，调用函数
->btr_search_drop_page_hash_index
  计算page所在AHI结构的slot信息，以便找到对应的hash_table_t结构
  循环page中所有的行
    循环行中访问到的索引字段（访问模式），计算出fold信息填入到fold[]数组中
    本循环中会通过函数rec_get_offsets进行字段偏移量的获取，为耗用CPU的函数
      循环fold[]数组，一个fold代表一行数据,调用函数
       ->ha_remove_all_nodes_to_page,为耗用CPU的函数
         ->ha_chain_get_fist
           根据fold信息找到hash结构的cell
         循环本cell中的链表信息
           如果行的地址在本要删除的page上，调用函数
           ->ha_delete_hash_node，为消耗CPU的函数
             进行链表和hash结构的维护
每次处理完1024个page后，yeild线程主动放弃CPU，避免长期占用CPU，醒来后继续处理

drop 分区和add分区都会循环每个分区调用函数btr_drop_ahi_for_table 、btr_search_drop_page_hash_index清空所有分区及索引的的AHI信息，删除第1个分区的时ahi信息被清空，删第2个分区的时候buffer中已经没有ahi信息了，所有删除第2个分区就很快了。

避免方式

针对以上原因，线上执行 drop必须遵守以下原则：

1、关闭AHI功能，不使用AHI带来的查询加速功能，需要先在测试环境进行业务测试，确保业务性能不受影响。

2、删除表的第一个分区时，内部会清理该表在每个buffer pool实例中对应的数据块页面，耗时较久，接着删其他分区耗时很小，建议将每天一次的删除分区的操作改为每周或者每月批量执行删除分区的操作，并且需要在业务低峰期操作。

Enjoy GreatSQL

从零开始学习MySQL调试跟踪

1. 编译GreatSQL
2. 安装gdb
3. 开始调试GreatSQL源码
- 3.1 利用gdb设置断点
- 3.2 使用 Trace 文件调试

有时为了跟踪故障需要调试MySQL/GreatSQL源码，本文介绍如何在Linux下构建MySQL/GreatSQL源码调试环境。

在这之前，我也是一名小白，一起从零开始探索吧。

本文以CentOS 8.x环境下的GreatSQL 8.0.25-16版本为例。

1. 编译GreatSQL

查看系统环境：

$ cat /etc/system-release

CentOS Linux release 8.4.2105

首先，从 https://gitee.com/GreatSQL/GreatSQL/releases/ 下载GreatSQL 8.0.25-16的源码包

Source Code

Packages Size
greatsql-8.0.25-16.tar.gz 503M

Packages	Size
greatsql-8.0.25-16.tar.gz	503M

接下来，参考文章在Linux下源码编译安装GreatSQL 构建好编译环境。然后开始编译GreatSQL源码，编译参数中增加/修改debug相关选项，这样编译后得到的二进制文件才能支持调试模式，例如：

$ cd /opt/greatsql-8.0.25-16
$ mkdir -p build
$ cd build
$ cmake3 .. \\
-DBOOST_INCLUDE_DIR=/opt/boost_73_0 \\
-DLOCAL_BOOST_DIR=/opt/boost_73_0 \\
-DCMAKE_INSTALL_PREFIX=/usr/local/GreatSQL-8.0.25-16-Linux-glibc2.28-x86_64 \\
-DWITH_ZLIB=bundled \\
-DWITH_NUMA=ON \\
-DCMAKE_EXE_LINKER_FLAGS="-ljemalloc" \\
-DBUILD_CONFIG=mysql_release \\
-DWITH_TOKUDB=OFF \\
-DWITH_ROCKSDB=OFF \\
-DMAJOR_VERSION=8 \\
-DMINOR_VERSION=0 \\
-DPATCH_VERSION=25 \\
-DWITH_UNIT_TESTS=OFF \\
-DWITH_NDBCLUSTER=OFF \\
-DWITH_SSL=system \\
-DWITH_SYSTEMD=ON \\
-DWITH_LDAP=OFF \\
-DWITH_AUTHENTICATION_LDAP=OFF \\
-DWITH_DEBUG=1 \\
-DCMAKE_BUILD_TYPE=Debug \\
&& make -j8 VERBOSE=1 && make install

主要是增加两个参数 -DWITH_DEBUG=1 和 -DCMAKE_BUILD_TYPE=Debug，注意不要有参数 -DCMAKE_BUILD_TYPE=RelWithDebInfo。

编译完成后，即可得到包含debug功能的GreatSQL二进制文件，执行下面的命令检查：

$ cd /usr/local/GreatSQL-8.0.25-16-Linux-glibc2.28-x86_64
$ ./bin/mysqld-debug --verbose --version

/usr/local/GreatSQL-8.0.25-16-Linux-glibc2.28-x86_64/bin/mysqld-debug  Ver 8.0.25-16-debug for Linux on x86_64 (Source distribution)

可以看到，输出的结果中包含 debug 关键字，这就表示成功了。

2. 安装gdb

直接执行yum安装gdb即可：

$ yum install -y gdb
$ gdb --version
GNU gdb (GDB) Red Hat Enterprise Linux 9.2-4.el8
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

gdb常用的调试相关指令有以下几个：

命令	缩写	备注
attach		挂接/进入准备调试的进程pid
detach		取消挂接进程（退出进程）
list	l	显示多行源代码
break	b	设置断点，程序运行到断点的位置会停下来
info	i	描述程序的状态
run	r	开始运行程序
display	disp	跟踪查看某个变量,每次停下来都显示它的值
step	s	执行下一条语句,如果该语句为函数调用,则进入函数执行其中的第一条语句
next	n	执行下一条语句,如果该语句为函数调用,不会进入函数内部执行(即不会一步步地调试函数内部语句)
print	p	打印内部变量值
continue	c	继续程序的运行,直到遇到下一个断点
set var name=v		设置变量的值
start	st	开始执行程序,在main函数的第一条语句前面停下来
file		装入需要调试的程序
kill	k	终止正在调试的程序
watch		监视变量值的变化
backtrace	bt	查看函数调用信息(堆栈)
frame	f	查看栈帧
quit	q	退出gdb

3. 开始调试GreatSQL源码

第一次运行gdb准备调试时，可能会提示类似下面的信息

warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.
0x00007ffb358ada41 in poll () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install keyutils-libs-1.5.10-9.el8.x86_64 ...

这表示缺少一些相关的debuginfo包，可以根据提示内容补充安装，例如：

dnf debuginfo-install keyutils-libs-1.5.10-9.el8.x86_64 ...

如果提示找不到这些安装包：

Could not find debuginfo package for the following installed packages: keyutils-libs-1.5.10-9.el8.x86_64 ...

可以检查yum配置文件 /etc/yum.repos.d/CentOS-Linux-Debuginfo.repo，确认是否设置了 enable = 1，例如：

# CentOS-Linux-Debuginfo.repo
#
# All debug packages are merged into a single repo, split by basearch, and are
# not signed.

[debuginfo]
name=CentOS Linux $releasever - Debuginfo
baseurl=http://debuginfo.centos.org/$releasever/$basearch/
gpgcheck=1
enabled=1    #<---这里要设置1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-centosofficial

此外，还要把GreatSQL 8.0.25-16的源码包解压缩到 /opt 目录下：

$ tar zxf PATH/greatsql-8.0.25-16.tar.gz -C /opt/

接下来，演示如何跟踪调试。

先初始化GreatSQL数据文件，然后再启动GreatSQL服务进程：

# 初始化GreatSQL
$ cd /usr/local/GreatSQL-8.0.25-16-Linux-glibc2.28-x86_64
$ ./bin/mysqld-debug --no-defaults --initialize-insecure --user=mysql --datadir=./data

# 启动GreatSQL
$ ./bin/mysqld-debug --no-defaults --user=mysql --datadir=./data1 &

# 查看进程pid
$ ps -ef | grep mysqld
...
mysql    2644322 2542442  3 14:38 pts/7    00:00:01 ./bin/mysqld-debug --no-defaults --user=mysql --datadir=./data1

# 在另一个终端（终端#2），连入GreatSQL
$ mysql -S/tmp/mysql.sock
Welcome to the MySQL monitor.  Commands end with ; or \\g.
Your MySQL connection id is 7
Server version: 8.0.25-16-debug Source distribution
...
mysql>\\s
...
Server version:         8.0.25-16-debug Source distribution
...

启动gdb，准备调试跟踪GreatSQL，我们分别演示几种不同方式。

3.1 利用gdb设置断点

在 终端#1 中启动gdb，并挂接GreatSQL进程，准备跟踪

$ gdb -p 2644322
GNU gdb (GDB) Red Hat Enterprise Linux 9.2-4.el8
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
...
Attaching to process 2644322
[New LWP 2643482]
[New LWP 2643483]
[New LWP 2643484]
...
[New LWP 2643522]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
--Type <RET> for more, q to quit, c to continue without paging--  #<-- 这里按下回车，即可进入
0x00007fb7ae93ba41 in __GI___poll (fds=0x7fb7ae229140, nfds=2, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
29        return SYSCALL_CANCEL (poll, fds, nfds, timeout);
(gdb)
(gdb) p mysql_sysvar_version  #<-- 打印变量，查看GreatSQL版本号
$1 = flags = 68101, name = 0x7f10d1c6cc90 "innodb_version", comment = 0x6c47f92 "InnoDB version", check = 0x37dd9e2
     <check_func_str(THD*, SYS_VAR*, void*, st_mysql_value*)>, update = 0x37ddeb0 <update_func_str(THD*, SYS_VAR*, void*, void const*)>,
  value = 0x7e7c768 <innodb_version_str>, def_val = 0x6c38440 "8.0.25-15"
(gdb) 
(gdb) 
(gdb) b mysql_execute_command  #<--- 输入指令"b dispatch_command"设置断点，意为当GreatSQL程序运行到这个函数时，就会停下来
Breakpoint 3 at 0x379c3f2: file /opt/greatsql-8.0.25-16/sql/sql_parse.cc, line 2875.
(gdb)

切换到 终端#2，随便执行一条SQL命令：

mysql> select 'debug' from dual;

回到 终端#1，继续调试：

(gdb) 
(gdb) bt  #<-- 打印函数调用信息
#0  dispatch_command (thd=0x7f10a3a0b000, com_data=0x7f10d12a7370, command=COM_QUERY) at /opt/greatsql-8.0.25-16/sql/sql_parse.cc:1605
#1  0x0000000003797c48 in do_command (thd=0x7f10a3a0b000) at /opt/greatsql-8.0.25-16/sql/sql_parse.cc:1388
#2  0x0000000003991168 in handle_connection (arg=0x7f10d1f9d120) at /opt/greatsql-8.0.25-16/sql/conn_handler/connection_handler_per_thread.cc:307
#3  0x00000000052e4b22 in pfs_spawn_thread (arg=0x7f10e8a45660) at /opt/greatsql-8.0.25-16/storage/perfschema/pfs.cc:2899
#4  0x00007f10eb1e917a in start_thread (arg=<optimized out>) at pthread_create.c:479
#5  0x00007f10e9128dc3 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb)
(gdb) p thd->m_query_string  #<-- 打印SQL语句
$14 = str = 0x7f10a3a0e828 "select 'debug' from dual", length = 24
(gdb)
(gdb) c  #<-- 继续执行，终端#2里被阻塞的SQL语句就可以执行了
Continuing.

切回 终端#2 查看SQL语句执行结果：

mysql> select 'debug' from dual;
+-------+
| debug |
+-------+
| debug |
+-------+
1 row in set (12 min 11.55 sec)

可以看到，因为一直被阻塞，这条SQL请求耗时超过12分钟。当 终端#2 的连接断开退出后，可以看到gdb端也有相应提示：

Thread 39 "mysqld-debug" hit Breakpoint 1, dispatch_command (thd=0x7f10a3a0b000, com_data=0x7f10d12a7370, command=COM_QUIT)
    at /opt/greatsql-8.0.25-16/sql/sql_parse.cc:1605
1605      bool error = false;
(gdb)

如果不想继续跟踪调试了，只需输入指令 q 或 quit 即可退出gdb。

(gdb) quit
A debugging session is active.

        Inferior 1 [process 2644322] will be detached.

Quit anyway? (y or n) y
Detaching from program: /usr/local/GreatSQL-8.0.25-16-Linux-glibc2.28-x86_64/bin/mysqld-debug, process 2644322
[Inferior 1 (process 2644322) detached]

3.2 使用 Trace 文件调试

还可以在GreatSQL客户端中设置变量 debug 为不同值，就可以输出GreatSQL运行过程中涉及的调用模块、函数、状态信息等全部信息，并记录到本地文件中。用法示例：

mysql> SET SESSION debug = 'debug_options';

变量 debug 支持多种设置模式：

debug_options = field_1:field_2:...:field_N
field = [+|-]flag[,modifier,modifier,...,modifier]

+, - 表示从当前debug值添加或者减少某些选项。

flag相关可选项如下：

flag	说明
d	开启DBUG
f	只跟踪指定的函数
F	跟踪指定的源码文件
i	跟踪指定的线程
L	跟踪指定的源码行数
n	打印函数调用层次序号
N	输出日志从0开始打印行号
o	指定输出到某个文件
O	类似o,每次写文件都会flush，reopen
P	匹配DBUG_PROCESS
p	打印process name
t	打印函数调用和退出

使用案例1（精简模式）

# 设置debug选项
mysql> set session debug='d:t:o,/tmp/mysqld.trace';

# 执行SQL请求
mysql> select 'debug' from dual;
+-------+
| debug |
+-------+
| debug |
+-------+
1 row in set (0.00 sec)

查看生成的trace文件：

$ cat /tmp/mysqld.trace
...
>do_command
| >THD::clear_error
| <THD::clear_error
| >Diagnostics_area::reset_diagnostics_area
| <Diagnostics_area::reset_diagnostics_area
| >my_net_set_read_timeout
| | enter: timeout: 28800
| | >vio_socket_timeout
| | <vio_socket_timeout
| <my_net_set_read_timeout
| >vio_is_blocking
| <vio_is_blocking
| >net_read_raw_loop
| | >vio_read
| | | >vio_is_blocking
| | | <vio_is_blocking
| | | >vio_io_wait
| | | <vio_io_wait
| | <vio_read
| <net_read_raw_loop
| THD::enter_stage: 'starting' /opt/greatsql-8.0.25-16/sql/conn_handler/init_net_server_extension.cc:102
...

使用案例2（复杂模式）增加了打印文件名和行号等信息，更方便定位查找。

mysql> set session debug='d:t:L:F:o,/tmp/mysqld.trace';
mysql> select 'debug' from dual;
...

查看生成的trace文件：

$ cat /tmp/mysqld.trace
...
  sql_parse.cc: <do_command
  sql_parse.cc:  1269: >do_command
   sql_class.h:  3287: | >THD::clear_error
   sql_class.h: | <THD::clear_error
  sql_error.cc:   357: | >Diagnostics_area::reset_diagnostics_area
  sql_error.cc: | <Diagnostics_area::reset_diagnostics_area
   net_serv.cc:  2246: | >my_net_set_read_timeout
   net_serv.cc:  2247: | | enter: timeout: 28800
  viosocket.cc:   380: | | >vio_socket_timeout
  viosocket.cc: | | <vio_socket_timeout
   net_serv.cc: | <my_net_set_read_timeout
  viosocket.cc:   373: | >vio_is_blocking
  viosocket.cc: | <vio_is_blocking
   net_serv.cc:  1341: | >net_read_raw_loop
  viosocket.cc:   169: | | >vio_read
  viosocket.cc:   373: | | | >vio_is_blocking
  viosocket.cc: | | | <vio_is_blocking
  viosocket.cc:  1118: | | | >vio_io_wait
  viosocket.cc: | | | <vio_io_wait
  viosocket.cc: | | <vio_read
   net_serv.cc: | <net_read_raw_loop
  sql_parse.cc:   320: | THD::enter_stage: 'starting' /opt/greatsql-8.0.25-16/sql/conn_handler/init_net_server_extension.cc:102
...

本文简单演示了如何跟踪调试GreatSQL的几种方法，更多有趣实用的方法还有待进一步挖掘，一起探索新世界吧。

P.S，我也在MacOS环境下构建了基于vscode的跟踪调试环境，但还是更喜欢在Linux终端命令行模式下工作，所以本文没介绍如何利用vscode跟踪调试，有兴趣的读者可以根据其他资料自行构建。

Enjoy GreatSQL :)

《深入浅出MGR》视频课程

戳此小程序即可直达B站

https://www.bilibili.com/medialist/play/1363850082?business=space_collection&business_id=343928&desc=0

文章推荐：

想看更多技术好文，点个“在看”吧！

以上是关于GreatSQL删除分区慢的跟踪的主要内容，如果未能解决你的问题，请参考以下文章