Tracing the Life of a Process with trace_event

Posted by rtoax


1. Turn off the master switch of the ftrace ring buffer

echo 0 > /sys/kernel/debug/tracing/tracing_on

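2. Enable the trace_events for all system calls, including the enter and exit events of each system call (the relative paths from here on assume the current directory is /sys/kernel/debug/tracing)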

echo 1 > ./events/syscalls/enable

3. Set a kprobe_event on do_sys_open so that, whenever a file is opened, the trace shows which file name is being opened

echo 'p:do_sy_open do_sys_open arg1=+0($arg2):string' > ./kprobe_events
echo 1 > ./events/kprobes/do_sy_open/enable

4. Enable the signal-related trace_events and attach a stacktrace trigger to them so that the corresponding call trace is recorded

echo 1 > ./events/signal/enable
echo stacktrace > events/signal/signal_deliver/trigger
echo stacktrace > events/signal/signal_generate/trigger

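5. Keep only the trace_event records of the parent process and of every child it forks from now on
The PID of the current terminal shell (the parent process) is 4994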

localhost:/home/jeff/project # echo $$
4994
echo 4994 >  /sys/kernel/debug/tracing/set_event_pid
echo 1 > options/event-fork
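
The preparation steps above can be gathered into one shell function. This is only a sketch: the name setup_trace and its two parameters (the tracing directory and the PID to filter on) are introduced here for illustration and are not part of the original steps. The files it writes are the same ones used above, and on a real system it has to run as root.

```shell
# Sketch: replay the preparation steps against a given tracing directory.
# setup_trace is a hypothetical helper name, not an ftrace command.
# Usage: setup_trace <tracing_dir> <parent_pid>
setup_trace() {
    t=$1
    pid=$2

    # 1. Stop recording while the events are being configured.
    echo 0 > "$t/tracing_on" || return 1

    # 2. Enable the enter/exit events of every system call.
    echo 1 > "$t/events/syscalls/enable" || return 1

    # 3. kprobe on do_sys_open that records the file name as a string.
    echo 'p:do_sy_open do_sys_open arg1=+0($arg2):string' > "$t/kprobe_events" || return 1
    echo 1 > "$t/events/kprobes/do_sy_open/enable" || return 1

    # 4. Signal events, each with a stacktrace trigger.
    echo 1 > "$t/events/signal/enable" || return 1
    echo stacktrace > "$t/events/signal/signal_deliver/trigger" || return 1
    echo stacktrace > "$t/events/signal/signal_generate/trigger" || return 1

    # 5. Record only the given PID and the children it forks later.
    echo "$pid" > "$t/set_event_pid" || return 1
    echo 1 > "$t/options/event-fork" || return 1
}
```

From the shell that will later start a.out, it could be called as: setup_trace /sys/kernel/debug/tracing $$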

The process traced in this experiment is named a.out. Its code:

main.c
int main(void)
{
    /* Spin forever so the process stays alive until it is killed. */
    while (1)
        ;
    return 0;
}

Procedure:

1. Turn on the master switch of the ftrace ring buffer

echo 1 > /sys/kernel/debug/tracing/tracing_on

2. Run the a.out program

localhost:/home/jeff/project # ./a.out &

[1] 5441

3. Kill the a.out process

kill -9 5441

4. Turn off the master switch of the ftrace ring buffer and save the trace data

echo 0 > /sys/kernel/debug/tracing/tracing_on

cp /sys/kernel/debug/tracing/trace ./trace

5. Analyze the trace data (cat ./trace)
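
The saved trace mixes events from bash and from a.out, so it can help to look at one task at a time. A minimal sketch, assuming the trace line layout seen in the excerpts that follow, where every event line starts with comm-PID and then the CPU number in brackets. The helper name events_for_pid is made up for this example. Stack trace lines (the ones starting with =>) carry no PID and are therefore not matched.

```shell
# Sketch: print only the event lines that belong to one PID.
# events_for_pid is a hypothetical helper; it relies on the
# "comm-PID  [cpu]" prefix of each event line in the saved trace.
# Usage: events_for_pid <trace_file> <pid>
events_for_pid() {
    # "--" keeps grep from reading the leading "-" of the pattern as an option.
    grep -E -- "-$2 +\[[0-9]+\]" "$1"
}
```

For the child in this experiment the call would be: events_for_pid ./trace 5441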

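The parent process 4994 (bash) forks the child process 5441 (a.out).

In the parent, sys_clone returns 0x1541 (5441 in decimal), which is the PID of the child.

In the child, sys_clone returns 0.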

bash-4994  [005] ....1..  9978.187414: sys_clone(clone_flags: 1200011, newsp: 0, parent_tidptr: 0, child_tidptr: 7ff16d9dce50, tls: 7ff16d9dcb80)
bash-4994  [005] ....1..  9978.187638: sys_clone -> 0x1541
a.out-5441  [006] ....1..  9978.187715: sys_clone -> 0x0
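
The syscall exit events print return values in hexadecimal, so the 0x1541 above has to be converted before it can be compared with a PID. A small sketch using the shell's printf; hex_to_dec is a name made up here, not an existing tool.

```shell
# Sketch: convert a hexadecimal value taken from the trace into decimal.
# hex_to_dec is a hypothetical helper name.
# Usage: hex_to_dec <hex_value>
hex_to_dec() {
    # Accept the value with or without a leading 0x.
    printf '%d\n' "0x${1#0x}"
}
```

Running hex_to_dec 0x1541 gives 5441, the PID of a.out. The same conversion applied to the low byte 0x11 of clone_flags gives 17, the exit signal the child will send to its parent, which is the SIGCHLD number that shows up later in the trace.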

The child process calls execve

a.out-5441  [006] ....1..  9978.187950: sys_execve(filename: 55f326ddedc0, argv: 55f326dd7d00, envp: 55f326de42d0)
a.out-5441  [006] ....1..  9978.188246: sys_execve -> 0x0

The child process starts loading shared libraries

a.out-5441  [006] ....1..  9978.188347: sys_openat(dfd: ffffffffffffff9c, filename: 7fef047cacc0, flags: 80000, mode: 0)
a.out-5441  [006] ....1..  9978.188348: do_sy_open: (do_sys_open+0x0/0x260) arg1="/lib64/libc.so.6"
a.out-5441  [006] ....1..  9978.188352: sys_openat -> 0x3
        .......
a.out-5441  [006] ....1..  9978.188361: sys_mmap(addr: 0, len: 3ba778, prot: 5, flags: 802, fd: 3, off: 0)
a.out-5441  [006] ....1..  9978.188365: sys_mmap -> 0x7fef041e8000

The address returned by mmap for the shared library matches the entry in /proc/5441/maps exactly:

localhost:/home/jeff/project # cat /proc/5441/maps
...
7fef041e8000-7fef04399000 r-xp 00000000 00:2f 8154 /lib64/libc-2.26.so
.....
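
The prot and flags arguments of sys_mmap are printed as raw hexadecimal, while /proc/5441/maps shows a four-character permission string. A sketch that converts one into the other, assuming the Linux x86_64 values of the constants (PROT_READ=0x1, PROT_WRITE=0x2, PROT_EXEC=0x4, MAP_SHARED=0x1). The function name mmap_perms is invented for this example, and all other flag bits are ignored.

```shell
# Sketch: turn mmap's prot and flags (hex, as printed in the trace) into
# the "rwxp" style permission string used by /proc/<pid>/maps.
# mmap_perms is a hypothetical helper; constants are Linux x86_64 values.
# Usage: mmap_perms <prot_hex> <flags_hex>
mmap_perms() {
    prot=$((0x${1#0x}))
    flags=$((0x${2#0x}))
    perms=
    # PROT_READ
    if [ $((prot & 0x1)) -ne 0 ]; then perms="${perms}r"; else perms="${perms}-"; fi
    # PROT_WRITE
    if [ $((prot & 0x2)) -ne 0 ]; then perms="${perms}w"; else perms="${perms}-"; fi
    # PROT_EXEC
    if [ $((prot & 0x4)) -ne 0 ]; then perms="${perms}x"; else perms="${perms}-"; fi
    # MAP_SHARED set means "s", otherwise the mapping is private ("p").
    if [ $((flags & 0x1)) -ne 0 ]; then perms="${perms}s"; else perms="${perms}p"; fi
    echo "$perms"
}
```

With the values from the trace, mmap_perms 5 802 yields r-xp, the same permissions that the maps entry shows for the mapping at 7fef041e8000.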

The parent process kills the child with kill -9

bash-4994 ...signal_generate: sig=9 comm=a.out pid=5441

The corresponding call trace output:

            bash-4994  [007] ....111  9992.171042: <stack trace>
 => trace_event_raw_event_signal_generate
 => __send_signal
 => do_send_sig_info
 => kill_pid_info
 => kill_something_info
 => __x64_sys_kill
 => do_syscall_64
 => entry_SYSCALL_64_after_hwframe
            bash-4994  [007] ....1..  9992.171044: sys_kill -> 0x0

After the kill -9 signal is delivered to it, the child starts sending SIGCHLD (sig=17) to its parent

#kill -l | grep SIGCHLD
17(SIGCHLD)
          a.out-5441  [002] .....11  9992.171045: signal_deliver: sig=9 errno=0 code=0 sa_handler=0 sa_flags=0
           a.out-5441  [002] ....111  9992.171049: <stack trace>
 => trace_event_raw_event_signal_deliver
 => get_signal
 => do_signal
 => exit_to_usermode_loop
 => prepare_exit_to_usermode
 => swapgs_restore_regs_and_return_to_usermode

           a.out-5441  [002] .....12  9992.171118: signal_generate: sig=17 errno=0 code=2 comm=bash pid=4994 grp=1 res=0
           a.out-5441  [002] ....112  9992.171120: <stack trace>
 => trace_event_raw_event_signal_generate
 => __send_signal
 => do_notify_parent
 => do_exit
 => do_group_exit
 => get_signal
 => do_signal
 => exit_to_usermode_loop
 => prepare_exit_to_usermode
 => swapgs_restore_regs_and_return_to_usermode

The parent process 4994 handles the SIGCHLD signal and reaps child 5441 in wait4 (so that the child is no longer a zombie process)

  bash-4994  [007] .....11  9992.171125: signal_deliver: sig=17 errno=0 code=2 sa_handler=55f324e07ad0 sa_flags=14000000
            bash-4994  [007] ....111  9992.171127: <stack trace>
 => trace_event_raw_event_signal_deliver
 => get_signal
 => do_signal
 => exit_to_usermode_loop
 => do_syscall_64
 => entry_SYSCALL_64_after_hwframe
            bash-4994  [007] ....1..  9992.171130: sys_wait4(upid: ffffffffffffffff, stat_addr: 7ffca9253250, options: b, ru: 0)
            bash-4994  [007] ....1..  9992.171197: sys_wait4 -> 0x1541

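This experiment is only a small taste of what trace_event can do, and trace_event is itself just one feature of the ftrace subsystem.

The linux traces course I recently published on 阅码场 covers the usage and the underlying implementation of almost every feature of the ftrace subsystem.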

 
