Linux Kernel Oops: Decoding the Output and Locating the Fault

Posted by 为了维护世界和平_


Contents

I. Decoding the Oops output

II. Tools

1. objdump

2. gdb

3. addr2line

4. decodecode

5. faddr2line


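The complete Oops output is reproduced at the end of this article. Here it is split into small chunks that are analyzed one at a time.

I. Decoding the Oops output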

[ 2620.950912] oops_tryv1:try_oops_init():37: Lets Oops!
               Now attempting to write something to the NULL address 0x0000000000000000
[ 2620.950919] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 2620.950922] #PF: supervisor write access in kernel mode
[ 2620.950923] #PF: error_code(0x0002) - not-present page
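  • Faulting address: 0000000000000000, a NULL pointer dereferenced inside the kernel
  • Access type: a write performed in supervisor (kernel) mode
  • Page-fault error code: 0x0002, meaning the page was not present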
[ 2620.950925] PGD 0 P4D 0 
[ 2620.950927] Oops: 0002 [#1] SMP PTI
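  • 0002 is the architecture-specific Oops bitmask and is very useful; the architecture here is x86

On x86 the MMU sets the page-fault error code to a specific bit pattern:

  • 0002 -> binary 0010. Bit 2 is 0: kernel mode. Bit 1 is 1: write access. Bit 0 is 0: page not present
  • [#1]: the Oops counter; this is the first Oops since boot
  • SMP: Symmetric Multi-Processing; the kernel supports multiple cores
  • PTI: Page Table Isolation, a protection mechanism in the kernel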
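The bit-by-bit reading above can be sketched as a small decoder. This is a minimal illustration in Python, not a kernel tool: the function name decode_pf_error_code and the wording of the descriptions are made up for this example, while the bit positions follow the x86 page-fault error code layout.

```python
# Low bits of the x86 page-fault error code.
# Each entry: (bit number, meaning when set, meaning when clear or None).
PF_BITS = [
    (0, "protection violation", "not-present page"),
    (1, "write access", "read access"),
    (2, "user mode", "kernel (supervisor) mode"),
    (3, "reserved bit set in a page-table entry", None),
    (4, "instruction fetch", None),
]


def decode_pf_error_code(code):
    """Return human-readable facts for a page-fault error code (hypothetical helper)."""
    facts = []
    for bit, if_set, if_clear in PF_BITS:
        if code & (1 << bit):
            facts.append(if_set)
        elif if_clear is not None:
            facts.append(if_clear)
    return facts


# The value printed in "Oops: 0002" above.
print(decode_pf_error_code(0x0002))
```

For 0x0002 this reports a not-present page, a write access, and kernel mode, which matches the two #PF lines in the log.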
[ 2620.950929] CPU: 7 PID: 4959 Comm: insmod Tainted: G           OE     5.13.0-52-generic #59-Ubuntu
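  • CPU: the CPU on which the code was running when the Oops occurred, here CPU 7
  • PID: the PID of the process or thread
  • Comm: the name of the process or thread
  • Tainted: the taint flags, here G, O and E
  • G: only GPL-compatible modules have been loaded (P appears instead once a proprietary module has been loaded)
  • O: an out-of-tree module has been loaded
  • E: an unsigned module has been loaded
  • Kernel version (the output of uname -r): 5.13.0-52-generic; the number after # is the build number of this kernel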

See https://www.kernel.org/doc/html/latest/admin-guide/taintedkernels.html.
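As a rough illustration of reading the Tainted field, the sketch below maps a few taint letters to their meanings. The table is deliberately partial and the helper name decode_taint is hypothetical; the kernel documentation linked above remains the authoritative list.

```python
# Partial table of taint letters; see the kernel's tainted-kernels
# documentation for the full, authoritative list.
TAINT_LETTERS = {
    "G": "only GPL-compatible modules loaded",
    "P": "proprietary module loaded",
    "F": "module was force-loaded",
    "O": "out-of-tree module loaded",
    "E": "unsigned module loaded",
    "W": "a kernel warning was issued earlier",
    "D": "the kernel died recently (an earlier Oops or BUG)",
}


def decode_taint(flags):
    """Explain each non-blank letter of a Tainted: field (hypothetical helper)."""
    return [
        (ch, TAINT_LETTERS.get(ch, "not in this partial table"))
        for ch in flags
        if not ch.isspace()
    ]


# The field exactly as it appears in the CPU: line above.
for letter, meaning in decode_taint("G           OE"):
    print(letter, "->", meaning)
```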

[ 2620.950931] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/22/2020
[ 2620.950932] RIP: 0010:try_oops_init+0x88/0x1000 [oops_tryv1]
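  • RIP: the CPU's instruction pointer register; it holds the address of the instruction being executed and is the usual starting point for locating the fault
  • 0010: the code segment selector
  • try_oops_init+0x88/0x1000

The format is:
        function_name+off_from_func/size_of_func [module-name]
        0x88: offset from the start of the function, in bytes
        0x1000: size of the function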
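The function+offset/size notation is easy to pick apart mechanically. Below is a minimal Python sketch; the regular expression and the helper name parse_symbol are assumptions made for this example, not part of any kernel script.

```python
import re

# function_name+offset/size, optionally followed by [module-name]
SYMBOL_RE = re.compile(
    r"(?P<func>[\w.]+)\+(?P<offset>0x[0-9a-fA-F]+)/(?P<size>0x[0-9a-fA-F]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_symbol(text):
    """Split 'func+0xOFF/0xSIZE [module]' into its parts (hypothetical helper)."""
    match = SYMBOL_RE.search(text)
    if match is None:
        return None
    return {
        "func": match.group("func"),
        "offset": int(match.group("offset"), 16),
        "size": int(match.group("size"), 16),
        "module": match.group("module"),  # None for built-in kernel symbols
    }


print(parse_symbol("RIP: 0010:try_oops_init+0x88/0x1000 [oops_tryv1]"))
```

The same helper also works on call-trace entries such as do_one_initcall+0x48/0x1d0, where the module field comes back as None.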

[ 2620.950937] Code: 29 4c 63 04 25 00 00 00 00 b9 32 00 00 00 48 c7 c2 30 c1 7e c0 48 c7 c6 67 c0 7e c0 48 c7 c7 72 c0 7e c0 e8 df 9f 85 f7 eb 0b <c7> 04 25 00 00 00 00 78 00 00 00 c9 31 c0 c3 00 00 00 00 00 00 00

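These machine-code bytes can be decoded with helper scripts from the kernel source tree, scripts/decodecode and scripts/decode_stacktrace.sh, both introduced below. The byte in angle brackets, <c7>, is the first byte of the faulting instruction.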

[ 2620.950938] RSP: 0018:ffffad3c8578bc78 EFLAGS: 00010246
[ 2620.950940] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 2620.950941] RDX: 0000000000000000 RSI: ffff912c741a0980 RDI: ffff912c741a0980
[ 2620.950942] RBP: ffffad3c8578bc78 R08: 0000000000000000 R09: ffffad3c8578ba68
[ 2620.950943] R10: ffffad3c8578ba60 R11: ffff912c7fec83e8 R12: ffffffffc0781000
[ 2620.950943] R13: ffff912b450a6110 R14: 0000000000000000 R15: ffffffffc07ed000
[ 2620.950945] FS:  00007f526a3a1b80(0000) GS:ffff912c74180000(0000) knlGS:0000000000000000
[ 2620.950946] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2620.950947] CR2: 0000000000000000 CR3: 0000000111048003 CR4: 00000000003706e0

Register information

  • RSP: the stack pointer register, pointing at address ffffad3c8578bc78
  • EFLAGS: the flags register, holding the status flags
  • CS: the code segment register, 0x0010
     

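Control registers

  • x86_64 defines 16 control registers, CR0 to CR15; 11 of them are reserved (CR1, CR5-CR7, CR9-CR15)
  • CR0: programmable; holds control bits such as protected-mode enable, emulation, write protect, alignment mask, cache disable and paging
  • CR2: holds the kernel virtual address whose access made the MMU raise the page fault that led to the Oops. Its value here, 0, is the same address reported in the BUG: line above
  • CR3: holds the physical address of the page tables (plus a few flag bits); in effect it tells the MMU where to find the page tables of the running context
  • CR4: assorted control bits, such as PAE, PCE, SMEP and SMAP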
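To make the control-register values less opaque, the sketch below lists which named bits are set in the CR0 and CR4 values from this Oops. It is an illustration only: the helper name set_flags is made up, and the CR4 table covers just the four bits named above rather than the whole register.

```python
# CR0 flag bits (x86 architecture).
CR0_BITS = {
    0: "PE", 1: "MP", 2: "EM", 3: "TS", 4: "ET", 5: "NE",
    16: "WP", 18: "AM", 29: "NW", 30: "CD", 31: "PG",
}

# Only the CR4 bits mentioned in the text; the register has many more.
CR4_BITS = {5: "PAE", 8: "PCE", 20: "SMEP", 21: "SMAP"}


def set_flags(value, table):
    """Return the names of the bits from 'table' that are set in 'value'."""
    return [name for bit, name in sorted(table.items()) if value & (1 << bit)]


print(set_flags(0x80050033, CR0_BITS))  # CR0 from the register dump above
print(set_flags(0x003706E0, CR4_BITS))  # CR4 from the register dump above
```

For this dump, CR0 has paging (PG), write protect (WP) and protected mode (PE) enabled, and CR4 has PAE, SMEP and SMAP set while PCE is clear.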

[ 2620.950969] Call Trace:
[ 2620.950971]  <TASK>
[ 2620.950974]  do_one_initcall+0x48/0x1d0
[ 2620.950978]  ? kmem_cache_alloc_trace+0xfb/0x240
[ 2620.950984]  do_init_module+0x52/0x270
[ 2620.950987]  load_module+0xa8f/0xb10
[ 2620.950989]  __do_sys_finit_module+0xc2/0x120
[ 2620.950992]  __x64_sys_finit_module+0x18/0x20
[ 2620.950994]  do_syscall_64+0x61/0xb0
[ 2620.950997]  ? fput+0x13/0x20
[ 2620.950999]  ? ksys_mmap_pgoff+0x135/0x260
[ 2620.951000]  ? exit_to_user_mode_prepare+0x37/0xb0
[ 2620.951002]  ? syscall_exit_to_user_mode+0x27/0x50
[ 2620.951004]  ? __x64_sys_mmap+0x33/0x40
[ 2620.951005]  ? do_syscall_64+0x6e/0xb0

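This is the call stack; read it from bottom to top to follow the call chain. Entries prefixed with ? are addresses found on the stack that the unwinder could not confirm, so they may be stale.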

[ 2620.951007]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 2620.951009] RIP: 0033:0x7f526a4c470d
[ 2620.951012] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d f3 66 0f 00 f7 d8 64 89 01 48
[ 2620.951013] RSP: 002b:00007ffd87e48ef8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 2620.951014] RAX: ffffffffffffffda RBX: 0000558a1da0b7c0 RCX: 00007f526a4c470d
[ 2620.951015] RDX: 0000000000000000 RSI: 0000558a1c698c02 RDI: 0000000000000003
[ 2620.951016] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f526a5bbc60
[ 2620.951017] R10: 0000000000000003 R11: 0000000000000246 R12: 0000558a1c698c02
[ 2620.951018] R13: 0000558a1da0b760 R14: 00007ffd87e49148 R15: 0000558a1da0b8d0
[ 2620.951019]  </TASK>
[ 2620.951020] Modules linked in: oops_tryv1(OE+) kcsan_datarace(OE) xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo nft_counter xt_addrtype nft_compat nf_tables libcrc32c nfnetlink br_netfilter bridge stp llc overlay vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock binfmt_misc nls_iso8859_1 intel_rapl_msr intel_rapl_common snd_ens1371 snd_ac97_codec gameport ac97_bus snd_pcm crct10dif_pclmul ghash_clmulni_intel snd_seq_midi snd_seq_midi_event snd_rawmidi aesni_intel snd_seq crypto_simd cryptd snd_seq_device snd_timer snd rapl vmw_balloon soundcore joydev input_leds serio_raw vmw_vmci mac_hid sch_fq_codel vmwgfx ttm drm_kms_helper cec rc_core fb_sys_fops syscopyarea sysfillrect sysimgblt msr parport_pc ppdev lp parport drm ip_tables x_tables autofs4 hid_generic mptspi usbhid mptscsih mptbase ahci crc32_pclmul psmouse hid e1000 libahci scsi_transport_spi pata_acpi i2c_piix4
[ 2620.951056] CR2: 0000000000000000
[ 2620.951058] ---[ end trace eabb70a32207bb48 ]---
[ 2620.951059] RIP: 0010:try_oops_init+0x88/0x1000 [oops_tryv1]
[ 2620.951061] Code: 29 4c 63 04 25 00 00 00 00 b9 32 00 00 00 48 c7 c2 30 c1 7e c0 48 c7 c6 67 c0 7e c0 48 c7 c7 72 c0 7e c0 e8 df 9f 85 f7 eb 0b <c7> 04 25 00 00 00 00 78 00 00 00 c9 31 c0 c3 00 00 00 00 00 00 00
[ 2620.951062] RSP: 0018:ffffad3c8578bc78 EFLAGS: 00010246
[ 2620.951063] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 2620.951064] RDX: 0000000000000000 RSI: ffff912c741a0980 RDI: ffff912c741a0980
[ 2620.951065] RBP: ffffad3c8578bc78 R08: 0000000000000000 R09: ffffad3c8578ba68
[ 2620.951066] R10: ffffad3c8578ba60 R11: ffff912c7fec83e8 R12: ffffffffc0781000
[ 2620.951067] R13: ffff912b450a6110 R14: 0000000000000000 R15: ffffffffc07ed000
[ 2620.951068] FS:  00007f526a3a1b80(0000) GS:ffff912c74180000(0000) knlGS:0000000000000000
[ 2620.951069] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2620.951070] CR2: 0000000000000000 CR3: 0000000111048003 CR4: 00000000003706e0

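Modules linked in: lists every module loaded in the kernel at the time of the Oops. Kernel modules are frequently third-party code, so when the kernel hits an error they are the prime suspects. Here oops_tryv1 is the module at fault. The OE flags were explained above; the trailing + means the module was still being loaded.

The final CR2 line is a summary: it repeats the (virtual) address whose access caused the Oops.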


II. Tools

1. objdump

// Find the module's load address
#grep oops_tryv1 /proc/modules
oops_tryv1 24576 1 - Loading 0xffffffffc0223000 (OE+)

// Disassemble the module using that address
#objdump -dS --adjust-vma=0xffffffffc0223000 ./oops_tryv1.ko > oops_tryv1.disas

In an embedded (cross-compile) environment:  ${CROSS_COMPILE}objdump -dS <path/to/kernel-src/>/vmlinux > vmlinux.disas

Inspect the disassembly in oops_tryv1.disas:

static int __init try_oops_init(void)

ffffffffc071b000:       e8 00 00 00 00          call   ffffffffc071b005 <init_module+0x5>
...

                *(int *)val = 'x';
ffffffffc071b088:       c7 04 25 00 00 00 00    movl   $0x78,0x0
ffffffffc071b08f:       78 00 00 00

  • The offset reported in RIP is 0x88
  • ffffffffc071b000 + 0x88 = ffffffffc071b088
  • That address is the offending statement, *(int *)val = 'x';
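The address arithmetic in the list above is simple enough to check by hand, but a short Python sketch makes the two directions explicit. Both helper names are hypothetical and exist only for this example.

```python
def faulting_address(func_start, offset):
    """Absolute address of the faulting instruction (hypothetical helper)."""
    return func_start + offset


def offset_in_function(address, func_start):
    """Inverse operation: offset of an absolute address within its function."""
    return address - func_start


# Start of try_oops_init in the listing, plus the +0x88 offset from the RIP line.
addr = faulting_address(0xFFFFFFFFC071B000, 0x88)
print(f"{addr:#x}")
print(f"{offset_in_function(addr, 0xFFFFFFFFC071B000):#x}")
```

The first value is the address to search for in oops_tryv1.disas; the second recovers the +0x88 offset from an address taken out of the listing.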

2. gdb

# gdb -q ./oops_tryv1.ko
Reading symbols from ./oops_tryv1.ko...
(gdb) list *try_oops_init+0x88
0x12f is in try_oops_init (/home/wy/misc/kernel/Linux-Kernel-Debugging/ch7/oops_tryv1/oops_tryv1.c:60).
55			 * https://www.kernel.org/doc/Documentation/printk-formats.txt
56			 */
57		 else                     // try writing to NULL
58			*(int *)val = 'x';
59	
60		return 0;		/* success */
61	
62	
63	static void __exit try_oops_exit(void)
64	

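3. addr2line

        Translates an address into the corresponding source file and line number. It can be used for faults in modules as well as for faults that occur during system boot.

Syntax: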

        addr2line -e </path/to/>vmlinux -p -f <faulting_kernel_address>

        -e  Set the input file name (default is a.out)

        -p  Make the output easier to read for humans

        -f   Show function names

# addr2line -e ./oops_tryv1.o -p -f 0x88
try_oops_init at <>/oops_tryv1/oops_tryv1.c:58

// Look at line 58 of the source
# vim oops_tryv1.c
...
 57          else                     // try writing to NULL
 58                 *(int *)val = 'x';
 59 
 60         return 0; 
...

Kernel helper scripts

$ objdump -d <...>/linux-5.10.60/vmlinux | <...>/linux-5.10.60/scripts/checkstack.pl


$ </path/to/>/linux-5.10.60/scripts/decode_stacktrace.sh 
Usage: 
<...>/linux-5.10.60/scripts/decode_stacktrace.sh -r <release> | <vmlinux> [base path] [modules path]

4. decodecode

# /kernel/linux-5.15/scripts/decodecode < dmesg_oops_buginworkq.txt 
[ 380.853996] Code: 29 4c 63 04 25 00 00 00 00 b9 32 00 00 00 48 c7 c2 78 41 22 c0 48 c7 c6 67 40 22 c0 48 c7 c7 72 40 22 c0 e8 28 90 fd cd eb 0b <c7> 04 25 00 00 00 00 78 00 00 00 48 8d 65 f0 31 c0 5b 41 5c 5d c3
All code
========
   0:	29 4c 63 04          	sub    %ecx,0x4(%rbx,%riz,2)
   4:	25 00 00 00 00       	and    $0x0,%eax
   9:	b9 32 00 00 00       	mov    $0x32,%ecx
   e:	48 c7 c2 78 41 22 c0 	mov    $0xffffffffc0224178,%rdx
  15:	48 c7 c6 67 40 22 c0 	mov    $0xffffffffc0224067,%rsi
  1c:	48 c7 c7 72 40 22 c0 	mov    $0xffffffffc0224072,%rdi
  23:	e8 28 90 fd cd       	call   0xffffffffcdfd9050
  28:	eb 0b                	jmp    0x35
  2a:*	c7 04 25 00 00 00 00 	movl   $0x78,0x0		<-- trapping instruction
  31:	78 00 00 00 
  35:	48 8d 65 f0          	lea    -0x10(%rbp),%rsp
  39:	31 c0                	xor    %eax,%eax
  3b:	5b                   	pop    %rbx
  3c:	41 5c                	pop    %r12
  3e:	5d                   	pop    %rbp
  3f:	c3                   	ret    

Code starting with the faulting instruction
===========================================
   0:	c7 04 25 00 00 00 00 	movl   $0x78,0x0
   7:	78 00 00 00 
   b:	48 8d 65 f0          	lea    -0x10(%rbp),%rsp
   f:	31 c0                	xor    %eax,%eax
  11:	5b                   	pop    %rbx
  12:	41 5c                	pop    %r12
  14:	5d                   	pop    %rbp
  15:	c3                   	ret    

The disassembly shows exactly where the trap occurred:
*    c7 04 25 00 00 00 00     movl   $0x78,0x0        <-- trapping instruction
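The offset 2a that decodecode marks with * is simply the position of the bracketed byte within the Code: line. The Python sketch below recomputes it; it is an illustration of that relationship, not a replacement for scripts/decodecode, and the helper name trapping_offset is made up.

```python
def trapping_offset(code_line):
    """Return (byte index, byte value) of the <xx> marker in an Oops Code: line.

    Hypothetical helper: it only locates the marker and does no disassembly.
    """
    tokens = code_line.split("Code:", 1)[-1].split()
    for index, token in enumerate(tokens):
        if token.startswith("<") and token.endswith(">"):
            return index, int(token[1:-1], 16)
    return None


line = (
    "[ 380.853996] Code: 29 4c 63 04 25 00 00 00 00 b9 32 00 00 00 48 c7 c2 "
    "78 41 22 c0 48 c7 c6 67 40 22 c0 48 c7 c7 72 40 22 c0 e8 28 90 fd cd "
    "eb 0b <c7> 04 25 00 00 00 00 78 00 00 00 48 8d 65 f0 31 c0 5b 41 5c 5d c3"
)
index, byte = trapping_offset(line)
print(f"trapping byte {byte:#04x} at offset {index:#x}")
```

The offset it reports agrees with the 2a:* line in the decodecode output above.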

5. faddr2line

usage: faddr2line [--list] <object file> <func+offset>

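First check whether KASLR (CONFIG_RANDOMIZE_BASE) is enabled in the kernel configuration. faddr2line takes a function+offset instead of an absolute address, so it still works when addresses are randomized: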

#grep  CONFIG_RANDOMIZE_BASE /boot/config-5.15.0-47-generic

CONFIG_RANDOMIZE_BASE=y
# <>/scripts/faddr2line  ./oops_tryv1.ko try_oops_init+0x88
try_oops_init+0x88/0x100:
try_oops_init at <...>/oops_tryv1/oops_tryv1.c:58

This pinpoints line 58.

If you run into an error like this:

# ./oops_tryv1.ko try_oops_init+0xdb
bad symbol size: base: 0x0000000000000000 end: 0x0000000000000000

see: [PATCH] scripts/faddr2line: Fix overlapping text section failures - Josh Poimboeuf

faddr2line needs to be updated; the problem is already fixed in the 5.19 kernel.


Complete Oops output

[ 2620.950912] oops_tryv1:try_oops_init():37: Lets Oops!
               Now attempting to write something to the NULL address 0x0000000000000000
[ 2620.950919] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 2620.950922] #PF: supervisor write access in kernel mode
[ 2620.950923] #PF: error_code(0x0002) - not-present page
[ 2620.950925] PGD 0 P4D 0 
[ 2620.950927] Oops: 0002 [#1] SMP PTI
[ 2620.950929] CPU: 7 PID: 4959 Comm: insmod Tainted: G           OE     5.13.0-52-generic #59-Ubuntu
[ 2620.950931] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/22/2020
[ 2620.950932] RIP: 0010:try_oops_init+0x88/0x1000 [oops_tryv1]
[ 2620.950937] Code: 29 4c 63 04 25 00 00 00 00 b9 32 00 00 00 48 c7 c2 30 c1 7e c0 48 c7 c6 67 c0 7e c0 48 c7 c7 72 c0 7e c0 e8 df 9f 85 f7 eb 0b <c7> 04 25 00 00 00 00 78 00 00 00 c9 31 c0 c3 00 00 00 00 00 00 00
[ 2620.950938] RSP: 0018:ffffad3c8578bc78 EFLAGS: 00010246
[ 2620.950940] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 2620.950941] RDX: 0000000000000000 RSI: ffff912c741a0980 RDI: ffff912c741a0980
[ 2620.950942] RBP: ffffad3c8578bc78 R08: 0000000000000000 R09: ffffad3c8578ba68
[ 2620.950943] R10: ffffad3c8578ba60 R11: ffff912c7fec83e8 R12: ffffffffc0781000
[ 2620.950943] R13: ffff912b450a6110 R14: 0000000000000000 R15: ffffffffc07ed000
[ 2620.950945] FS:  00007f526a3a1b80(0000) GS:ffff912c74180000(0000) knlGS:0000000000000000
[ 2620.950946] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2620.950947] CR2: 0000000000000000 CR3: 0000000111048003 CR4: 00000000003706e0
[ 2620.950969] Call Trace:
[ 2620.950971]  <TASK>
[ 2620.950974]  do_one_initcall+0x48/0x1d0
[ 2620.950978]  ? kmem_cache_alloc_trace+0xfb/0x240
[ 2620.950984]  do_init_module+0x52/0x270
[ 2620.950987]  load_module+0xa8f/0xb10
[ 2620.950989]  __do_sys_finit_module+0xc2/0x120
[ 2620.950992]  __x64_sys_finit_module+0x18/0x20
[ 2620.950994]  do_syscall_64+0x61/0xb0
[ 2620.950997]  ? fput+0x13/0x20
[ 2620.950999]  ? ksys_mmap_pgoff+0x135/0x260
[ 2620.951000]  ? exit_to_user_mode_prepare+0x37/0xb0
[ 2620.951002]  ? syscall_exit_to_user_mode+0x27/0x50
[ 2620.951004]  ? __x64_sys_mmap+0x33/0x40
[ 2620.951005]  ? do_syscall_64+0x6e/0xb0
[ 2620.951007]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 2620.951009] RIP: 0033:0x7f526a4c470d
[ 2620.951012] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d f3 66 0f 00 f7 d8 64 89 01 48
[ 2620.951013] RSP: 002b:00007ffd87e48ef8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 2620.951014] RAX: ffffffffffffffda RBX: 0000558a1da0b7c0 RCX: 00007f526a4c470d
[ 2620.951015] RDX: 0000000000000000 RSI: 0000558a1c698c02 RDI: 0000000000000003
[ 2620.951016] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f526a5bbc60
[ 2620.951017] R10: 0000000000000003 R11: 0000000000000246 R12: 0000558a1c698c02
[ 2620.951018] R13: 0000558a1da0b760 R14: 00007ffd87e49148 R15: 0000558a1da0b8d0
[ 2620.951019]  </TASK>
[ 2620.951020] Modules linked in: oops_tryv1(OE+) kcsan_datarace(OE) xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo nft_counter xt_addrtype nft_compat nf_tables libcrc32c nfnetlink br_netfilter bridge stp llc overlay vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock binfmt_misc nls_iso8859_1 intel_rapl_msr intel_rapl_common snd_ens1371 snd_ac97_codec gameport ac97_bus snd_pcm crct10dif_pclmul ghash_clmulni_intel snd_seq_midi snd_seq_midi_event snd_rawmidi aesni_intel snd_seq crypto_simd cryptd snd_seq_device snd_timer snd rapl vmw_balloon soundcore joydev input_leds serio_raw vmw_vmci mac_hid sch_fq_codel vmwgfx ttm drm_kms_helper cec rc_core fb_sys_fops syscopyarea sysfillrect sysimgblt msr parport_pc ppdev lp parport drm ip_tables x_tables autofs4 hid_generic mptspi usbhid mptscsih mptbase ahci crc32_pclmul psmouse hid e1000 libahci scsi_transport_spi pata_acpi i2c_piix4
[ 2620.951056] CR2: 0000000000000000
[ 2620.951058] ---[ end trace eabb70a32207bb48 ]---
[ 2620.951059] RIP: 0010:try_oops_init+0x88/0x1000 [oops_tryv1]
[ 2620.951061] Code: 29 4c 63 04 25 00 00 00 00 b9 32 00 00 00 48 c7 c2 30 c1 7e c0 48 c7 c6 67 c0 7e c0 48 c7 c7 72 c0 7e c0 e8 df 9f 85 f7 eb 0b <c7> 04 25 00 00 00 00 78 00 00 00 c9 31 c0 c3 00 00 00 00 00 00 00
[ 2620.951062] RSP: 0018:ffffad3c8578bc78 EFLAGS: 00010246
[ 2620.951063] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 2620.951064] RDX: 0000000000000000 RSI: ffff912c741a0980 RDI: ffff912c741a0980
[ 2620.951065] RBP: ffffad3c8578bc78 R08: 0000000000000000 R09: ffffad3c8578ba68
[ 2620.951066] R10: ffffad3c8578ba60 R11: ffff912c7fec83e8 R12: ffffffffc0781000
[ 2620.951067] R13: ffff912b450a6110 R14: 0000000000000000 R15: ffffffffc07ed000
[ 2620.951068] FS:  00007f526a3a1b80(0000) GS:ffff912c74180000(0000) knlGS:0000000000000000
[ 2620.951069] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2620.951070] CR2: 0000000000000000 CR3: 0000000111048003 CR4: 00000000003706e0

Reference

https://course.0voice.com/v1/course/intro?courseId=2&agentId=0

