linux的crash之hardlock排查记录

Posted 安庆

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了linux的crash之hardlock排查记录相关的知识,希望对你有一定的参考价值。

3.10.0-327的内核,crash记录如下:

KERNEL: vmlinux
DUMPFILE: vmcore [PARTIAL DUMP]
CPUS: 48
DATE: Wed Oct 18 20:37:18 2017
UPTIME: 1 days, 09:43:06
LOAD AVERAGE: 13.42, 10.66, 9.48
TASKS: 7329
NODENAME: host-10-229-143-10
RELEASE: 3.10.0-327.22.2.el7.x86_64
VERSION: #1 SMP Fri Sep 29 15:13:08 CST 2017
MACHINE: x86_64 (2199 Mhz)
MEMORY: 383.6 GB
PANIC: "Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 10"
PID: 24023
COMMAND: "fas_readwriter"
TASK: ffff882f460a2e00 [THREAD_INFO: ffff882f10c44000]
CPU: 10
STATE: TASK_RUNNING (PANIC)-----------------------------------------R状态死锁,进程长时间处于TASK_RUNNING 状态抢占CPU而不发生切换,一般是,进程关抢占后一直执行任务,或者进程关抢占后处于死循环或者睡眠,此时往往会导致多个CPU互锁,整个系统异常。

crash> bt
PID: 24023 TASK: ffff882f460a2e00 CPU: 10 COMMAND: "fas_readwriter"
#0 [ffff882fbfd459c8] machine_kexec at ffffffff81051c5b
#1 [ffff882fbfd45a28] crash_kexec at ffffffff810f3ec2
#2 [ffff882fbfd45af8] panic at ffffffff816326d1
#3 [ffff882fbfd45b78] watchdog_overflow_callback at ffffffff8111d0e2
#4 [ffff882fbfd45b88] __perf_event_overflow at ffffffff811608d1
#5 [ffff882fbfd45c00] perf_event_overflow at ffffffff811613a4
#6 [ffff882fbfd45c10] intel_pmu_handle_irq at ffffffff81032628
#7 [ffff882fbfd45e60] perf_event_nmi_handler at ffffffff81642bcb
#8 [ffff882fbfd45e80] nmi_handle at ffffffff81642319
#9 [ffff882fbfd45ec8] do_nmi at ffffffff81642430
#10 [ffff882fbfd45ef0] end_repeat_nmi at ffffffff81641753
[exception RIP: put_compound_page+336]
RIP: ffffffff81178b60 RSP: ffff882f10c47d80 RFLAGS: 00000006
RAX: 006016c60138402c RBX: ffffea0123302a40 RCX: 0000000000000022
RDX: 0000000000000246 RSI: 000000000a6a9000 RDI: ffffea0123300000
RBP: ffff882f10c47d98 R8: ffff882f10c47dc8 R9: ffff882f10c47d74
R10: ffff880000000298 R11: 000000000a6aa000 R12: ffffea0123300000
R13: 0000000000000246 R14: 0000000000000000 R15: ffffea0123302a40
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
--- <NMI exception stack> ---
#11 [ffff882f10c47d80] put_compound_page at ffffffff81178b60
#12 [ffff882f10c47da0] put_page at ffffffff81178bac
#13 [ffff882f10c47db0] get_futex_key at ffffffff810e3c86
#14 [ffff882f10c47e08] futex_wake at ffffffff810e3f1a
#15 [ffff882f10c47e70] do_futex at ffffffff810e6a12
#16 [ffff882f10c47f08] sys_futex at ffffffff810e6f20
#17 [ffff882f10c47f80] system_call_fastpath at ffffffff81649909

 

首先,一般hardlock的触发原因是因为关中断时间太长,那么需要查找对应的堆栈中是否有如此处理,而关中断的常见函数,如spinlock,irq_disable等。

根据堆栈,get_futex_key有一段代码如下:

#ifdef CONFIG_TRANSPARENT_HUGEPAGE
page_head = page;
if (unlikely(PageTail(page))) {
put_page(page);
/* serialize against __split_huge_page_splitting() */
local_irq_disable();-------------------------------------------------------------------------关中断
if (likely(__get_user_pages_fast(address, 1, !ro, &page) == 1)) {------------------调用了__get_user_pages_fast
page_head = compound_head(page);
/*
* page_head is valid pointer but we must pin
* it before taking the PG_lock and/or
* PG_compound_lock. The moment we re-enable
* irqs __split_huge_page_splitting() can
* return and the head page can be freed from
* under us. We can‘t take the PG_lock and/or
* PG_compound_lock on a page that could be
* freed from under us.
*/
if (page != page_head) {
get_page(page_head);
put_page(page);
}
local_irq_enable();
} else {
local_irq_enable();
goto again;
}
}
#else
page_head = compound_head(page);
if (page != page_head) {
get_page(page_head);
put_page(page);
}
#endif

确定下CONFIG_TRANSPARENT_HUGEPAGE 是否配置了:

grep CONFIG_TRANSPARENT_HUGEPAGE /boot/config-3.10.0-327.22.2.el7.x86_64
CONFIG_TRANSPARENT_HUGEPAGE=y

说明已经配置了,反汇编get_futex_key确认下,通过简单搜索__get_user_pages_fast是否编译确认了确实开启了透明巨页。

下一步,需要分析,为什么put_page调用put_compound_page会长时间不返回。

void put_page(struct page *page)
{
if (unlikely(PageCompound(page)))
put_compound_page(page);
else if (put_page_testzero(page))
__put_single_page(page);
}

 

系统配置的/proc/sys/vm/nr_hugepages为0。























































































以上是关于linux的crash之hardlock排查记录的主要内容,如果未能解决你的问题,请参考以下文章

一次内核 crash 的排查记录

一次内核 crash 的排查记录

龙叔运维问题排查记录java线程泄露引起OOM导致JVM Crash

日志快速筛选 之 linux命令grep|uniq|wc|awk

问题记录记一次ConnectionTimeout问题排查

记一些linux安全应急排查思路和命令