GDB Revisited (10): Thread Debugging

Posted by Stoneshen1211




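1 Introduction

The earlier parts covered common commands and breakpoint settings. Real projects often run many threads, each with its own job, exchanging data synchronously or asynchronously, so we also need gdb's thread-related commands to debug them.

The official GDB manual (gdb.pdf) describes these commands in section 4.10, Debugging Programs with Multiple Threads, in section 5.5, Stopping and Starting Multi-thread Programs, and in a few other places. Only the key commands are listed below; see the manual for the full details.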

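Command                                        Description
thread thread-id                               Make thread-id the current thread
info threads [id]                              Show information about the given thread, or about all threads
thread name [name]                             Give the current thread a name
thread find [regexp]                           List the threads whose name or target ID matches regexp
break location thread thread-id [ if cond ]    Set a breakpoint at location that stops only the thread thread-id
thread apply [thread-id-list | all] args       Run args (a command such as next or continue, optionally preceded by flags such as -q or -s) on the listed threads or on all threads
set scheduler-locking mode                     Set the scheduler-locking mode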

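2 Test program

The sessions below debug the following program, saved as test_gdb.c. Line numbers used in the gdb commands count from the first #include line. Build it with debug information and pthread support, for example: gcc -g test_gdb.c -o test_gdb -lpthread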

#include <stdio.h>
#include <string.h>
#include <signal.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
#include <sys/prctl.h>

int j = 0;

int test2()
{
	char* s8Buf = NULL;
	
	strcpy(s8Buf, "8888");	/* writes through a NULL pointer and would crash; main keeps the call commented out */
	
	return 0;
}

void printids(const char* s)
{
    pid_t           pid;
    pthread_t       tid;
 
    pid = getpid();
    tid = pthread_self();
    printf("%s pid %llu tid %llu (0x%llx)\n", s, (long long unsigned int)pid, (long long unsigned int)tid, (long long unsigned int)tid);
}

void* thread_test1(void* arg)
{
	prctl(PR_SET_NAME, "thread_test1");
	printids("thread_test1");

	int i = 0;
	
	while(1)
	{
		printf("-------->thread1 index %d\n", i++);
		sleep(1);
	}
}

void* thread_test2(void* arg)
{
	prctl(PR_SET_NAME, "thread_test2");
	printids("thread_test2");
	
	int i = 0;
	
	while(1)
	{
		printf("-------->thread2 index %d\n", i++);
		sleep(1);
	}
}

int main()
{
	int i = 0;
	pthread_t pthTest1 = 0;
	pthread_t pthTest2 = 0;

	pthread_create(&pthTest1, NULL, thread_test1, NULL);
	pthread_create(&pthTest2, NULL, thread_test2, NULL);

	sleep(2);
	
	while(1)
	{
		j++;
		printf("-------->index %d\n", i++);
		sleep(1);
	}

	//test2();
	
	return 0;
}

3 Commands

3.1 Messages printed when threads start

(gdb) break 73
Breakpoint 1 at 0x40090b: file test_gdb.c, line 73.
(gdb) r
Starting program: /home/test_demo/gdb/test_gdb 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff77ef700 (LWP 22728)]
[New Thread 0x7ffff6fee700 (LWP 22729)]
thread_test1 pid 22724 tid 140737345681152 (0x7ffff77ef700)
-------->thread1 index 0
thread_test2 pid 22724 tid 140737337288448 (0x7ffff6fee700)
-------->thread2 index 0
-------->thread2 index 1
-------->thread1 index 1
-------->index 0

Thread 1 "test_gdb" hit Breakpoint 1, main () at test_gdb.c:73
73			sleep(1);
(gdb) 

As shown above, whenever a thread starts, gdb prints a line such as

[New Thread 0x7ffff77ef700 (LWP 22728)]

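Here 0x7ffff77ef700 is the thread's pthread ID (the pthread_t value that pthread_self() returns), and the number after LWP (light-weight process) is the kernel's ID for that thread. A plain ps lists only the process ID, which is also the LWP of the main thread (22724 here); the IDs of the other threads do not appear there, but ps -L or ls /proc/<pid>/task/ will show them.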

The output above also shows which thread hit the breakpoint:

Thread 1 "test_gdb" hit Breakpoint 1, main () at test_gdb.c:73

3.2 Show one thread or all threads: info threads [id]

Use info threads to list every thread:

(gdb) info threads
  Id   Target Id         Frame 
* 1    Thread 0x7ffff7fd9700 (LWP 22724) "test_gdb" main () at test_gdb.c:73
  2    Thread 0x7ffff77ef700 (LWP 22728) "thread_test1" 0x00007ffff78bc38d in nanosleep () at ../sysdeps/unix/syscall-template.S:84
  3    Thread 0x7ffff6fee700 (LWP 22729) "thread_test2" 0x00007ffff78bc38d in nanosleep () at ../sysdeps/unix/syscall-template.S:84

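The * marks the current thread. Id is the thread number that gdb assigns, Target Id is the system's identifier for the thread (pthread ID and LWP) followed by the thread name, and Frame is the stack frame where the thread is currently stopped.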

3.3 Switch the current thread: thread thread-id

Use thread thread-id to switch threads.

(gdb) info threads
  Id   Target Id         Frame 
* 1    Thread 0x7ffff7fd9700 (LWP 22724) "test_gdb" main () at test_gdb.c:73
  2    Thread 0x7ffff77ef700 (LWP 22728) "thread_test1" 0x00007ffff78bc38d in nanosleep () at ../sysdeps/unix/syscall-template.S:84
  3    Thread 0x7ffff6fee700 (LWP 22729) "thread_test2" 0x00007ffff78bc38d in nanosleep () at ../sysdeps/unix/syscall-template.S:84
(gdb) thread 3
[Switching to thread 3 (Thread 0x7ffff6fee700 (LWP 22729))]
#0  0x00007ffff78bc38d in nanosleep () at ../sysdeps/unix/syscall-template.S:84
84	../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) info threads
  Id   Target Id         Frame 
  1    Thread 0x7ffff7fd9700 (LWP 22724) "test_gdb" main () at test_gdb.c:73
  2    Thread 0x7ffff77ef700 (LWP 22728) "thread_test1" 0x00007ffff78bc38d in nanosleep () at ../sysdeps/unix/syscall-template.S:84
* 3    Thread 0x7ffff6fee700 (LWP 22729) "thread_test2" 0x00007ffff78bc38d in nanosleep () at ../sysdeps/unix/syscall-template.S:84
(gdb) 

Here we switch to thread number 3, and gdb reports which thread it switched to:

[Switching to thread 3 (Thread 0x7ffff6fee700 (LWP 22729))]

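3.4 Name the current thread: thread name [name]

thread name name gives the current thread a name, which gdb then shows in place of the name supplied by the system.
thread name with no argument removes that name again, and gdb goes back to the system-supplied one.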

(gdb) info threads
  Id   Target Id         Frame 
  1    Thread 0x7ffff7fd9700 (LWP 22724) "test_gdb" main () at test_gdb.c:73
  2    Thread 0x7ffff77ef700 (LWP 22728) "thread_test1" 0x00007ffff78bc38d in nanosleep () at ../sysdeps/unix/syscall-template.S:84
* 3    Thread 0x7ffff6fee700 (LWP 22729) "thread_test2" 0x00007ffff78bc38d in nanosleep () at ../sysdeps/unix/syscall-template.S:84
(gdb) thread name test_666
(gdb) info threads
  Id   Target Id         Frame 
  1    Thread 0x7ffff7fd9700 (LWP 22724) "test_gdb" main () at test_gdb.c:73
  2    Thread 0x7ffff77ef700 (LWP 22728) "thread_test1" 0x00007ffff78bc38d in nanosleep () at ../sysdeps/unix/syscall-template.S:84
* 3    Thread 0x7ffff6fee700 (LWP 22729) "test_666" 0x00007ffff78bc38d in nanosleep () at ../sysdeps/unix/syscall-template.S:84

Thread number 3 was originally named "thread_test2"; we renamed it to "test_666".

(gdb) info threads
  Id   Target Id         Frame 
  1    Thread 0x7ffff7fd9700 (LWP 22724) "test_gdb" main () at test_gdb.c:73
  2    Thread 0x7ffff77ef700 (LWP 22728) "thread_test1" 0x00007ffff78bc38d in nanosleep () at ../sysdeps/unix/syscall-template.S:84
* 3    Thread 0x7ffff6fee700 (LWP 22729) "test_666" 0x00007ffff78bc38d in nanosleep () at ../sysdeps/unix/syscall-template.S:84
(gdb) thread name 
(gdb) info threads
  Id   Target Id         Frame 
  1    Thread 0x7ffff7fd9700 (LWP 22724) "test_gdb" main () at test_gdb.c:73
  2    Thread 0x7ffff77ef700 (LWP 22728) "thread_test1" 0x00007ffff78bc38d in nanosleep () at ../sysdeps/unix/syscall-template.S:84
* 3    Thread 0x7ffff6fee700 (LWP 22729) "thread_test2" 0x00007ffff78bc38d in nanosleep () at ../sysdeps/unix/syscall-template.S:84

The system-supplied name is restored.

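3.5 Find threads matching a regexp: thread find [regexp]

regexp is matched against each thread's name and target ID, so it can be a pthread ID, an LWP ID, a thread name, and so on.
For example: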

(gdb) thread find 22729
Thread 3 has target id 'Thread 0x7ffff6fee700 (LWP 22729)'
(gdb) thread find 0x7ffff6fee700
Thread 3 has target id 'Thread 0x7ffff6fee700 (LWP 22729)'
(gdb) thread find thread_test2
Thread 3 has target name 'thread_test2'

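3.6 Set a breakpoint for one thread only: break location thread thread-id [ if cond ]

One caveat: location must be in code that the specified thread actually executes. If it is not, gdb still creates the breakpoint, but it never triggers.
Below we set two breakpoints, both restricted to thread number 3: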

(gdb) break 54 thread 3
Breakpoint 11 at 0x400868: file test_gdb.c, line 54.
(gdb) break 40 thread 3
Breakpoint 12 at 0x400815: file test_gdb.c, line 40.

The two breakpoints are at lines 54 and 40. Line 40 is in thread_test1, which thread number 3 (running thread_test2) never executes, so that breakpoint is never hit.

(gdb) break 54 thread 3
Breakpoint 11 at 0x400868: file test_gdb.c, line 54.
(gdb) break 40 thread 3
Breakpoint 12 at 0x400815: file test_gdb.c, line 40.
(gdb) info breaks
Undefined info command: "breaks".  Try "help info".
(gdb) info break
Num     Type           Disp Enb Address            What
10      breakpoint     keep y   0x00000000004008e2 in main at test_gdb.c:69
	breakpoint already hit 1 time
11      breakpoint     keep y   0x0000000000400868 in thread_test2 at test_gdb.c:54 thread 3
	stop only in thread 3
12      breakpoint     keep y   0x0000000000400815 in thread_test1 at test_gdb.c:40 thread 3
	stop only in thread 3
(gdb) delete 10
(gdb) 
(gdb) c
Continuing.
-------->index 0
-------->thread1 index 2
-------->thread2 index 2
[Switching to Thread 0x7ffff6fee700 (LWP 22769)]

Thread 3 "thread_test2" hit Breakpoint 11, thread_test2 () at test_gdb.c:54
54			sleep(1);
(gdb) info break
Num     Type           Disp Enb Address            What
11      breakpoint     keep y   0x0000000000400868 in thread_test2 at test_gdb.c:54 thread 3
	stop only in thread 3
	breakpoint already hit 1 time
12      breakpoint     keep y   0x0000000000400815 in thread_test1 at test_gdb.c:40 thread 3
	stop only in thread 3
(gdb) info threads
  Id   Target Id         Frame 
  1    Thread 0x7ffff7fd9700 (LWP 22767) "test_gdb" 0x00007ffff78bc38d in nanosleep () at ../sysdeps/unix/syscall-template.S:84
  2    Thread 0x7ffff77ef700 (LWP 22768) "thread_test1" 0x000000000040081a in thread_test1 () at test_gdb.c:40
* 3    Thread 0x7ffff6fee700 (LWP 22769) "thread_test2" thread_test2 () at test_gdb.c:54

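[ if cond ] is the condition under which the breakpoint triggers. It works as described in the earlier parts on breakpoints, so it is not repeated here; for example, break 54 thread 3 if i > 5 stops only when thread 3 reaches line 54 with i greater than 5.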

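3.7 Apply a command to selected threads or to all threads: thread apply [thread-id-list | all] args

args is a gdb command such as next or continue, optionally preceded by flags such as -q (quiet) or -s (silent). Below we continue thread number 2: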

(gdb) info threads
  Id   Target Id         Frame 
  1    Thread 0x7ffff7fd9700 (LWP 22767) "test_gdb" 0x00007ffff78bc38d in nanosleep () at ../sysdeps/unix/syscall-template.S:84
  2    Thread 0x7ffff77ef700 (LWP 22768) "thread_test1" 0x000000000040081a in thread_test1 () at test_gdb.c:40
* 3    Thread 0x7ffff6fee700 (LWP 22769) "thread_test2" thread_test2 () at test_gdb.c:54
(gdb) thread apply 2 continue

Thread 2 (Thread 0x7ffff77ef700 (LWP 22768)):
Continuing.
-------->index 1
-------->index 2
-------->thread1 index 3
-------->thread2 index 3
[Switching to Thread 0x7ffff6fee700 (LWP 22769)]

Thread 3 "thread_test2" hit Breakpoint 11, thread_test2 () at test_gdb.c:54
54			sleep(1);

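The output shows, however, that not only thread 2 ran: every other thread ran as well. What is going on?

Section 5.5, Stopping and Starting Multi-thread Programs, of gdb.pdf explains it. By default gdb works in all-stop mode: whenever any thread of the program stops, gdb stops all the other threads too, and likewise whenever any thread is resumed, all the other threads are resumed with it.
There is also a non-stop mode, in which the other threads keep running normally while we examine a stopped thread.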

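3.8 Scheduler locking: set scheduler-locking mode

In 3.7 the other threads ran along with thread 2 while we were debugging it. To keep the other threads from running, we turn on scheduler locking.
mode takes one of four values:

off     No locking; any thread may run at any time. This is the default in older gdb releases; newer releases default to replay, which acts the same way during normal debugging.
on      Locking is on: while the current thread runs, no other thread runs.
step    Optimized for single-stepping: while we step, other threads cannot preempt the current thread, so the thread being debugged does not change unexpectedly. With commands such as continue, until, and finish the other threads do run, and if one of them hits a breakpoint, gdb switches the current thread to that one.
replay  Behaves like on when replaying a recorded execution and like off otherwise.

Below we set the mode to on; only the thread being resumed (thread 2 here) runs, and the other threads stay stopped.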

(gdb) set scheduler-locking on
(gdb) thread apply 2 continue

Thread 2 (Thread 0x7ffff77ef700 (LWP 22768)):
Continuing.
-------->thread1 index 5
-------->thread1 index 6
-------->thread1 index 7
-------->thread1 index 8
-------->thread1 index 9
-------->thread1 index 10
-------->thread1 index 11
-------->thread1 index 12
-------->thread1 index 13
-------->thread1 index 14
