Crashing program does not generate core dump
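【Title】: Crashing program does not generate core dump 【Posted】: 2019-01-16 15:16:40
【Question】:
I wrote a C program that sometimes dies after running for several days. It runs on an embedded device, so the problem is hard to debug properly (no native gdb, no valgrind, but I do have strace).
When it dies, no core file is generated, even with ulimit -c unlimited set.
When it dies, all that appears on the console is "killed". The program's own log is no help.
I suspect a buffer overflow, a memory leak (a missing free), or a multithreading problem.
I do not install any signal handlers in my code (would that help?). Where does this kill -9 come from?
I tried the following: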
$ ./MyProg
killed
$ time -v ./MyProg
Command terminated by signal 9
Command being timed: "./MyProg"
User time (seconds): 762.04
System time (seconds): 1360.74
Percent of CPU this job got: 2%
Elapsed (wall clock) time (h:mm:ss or m:ss): 23h 4m 23s
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 0
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 40
Minor (reclaiming a frame) page faults: 29567
Voluntary context switches: 4742276
Involuntary context switches: 187702
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
$ cat /proc/PID/smaps # Shortly before the crash
10000000-10042000 r-xp 00000000 00:0b 203162716 /root/MyProg
Size: 264 kB
Rss: 224 kB
Pss: 224 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 224 kB
Private_Dirty: 0 kB
Referenced: 120 kB
10052000-10055000 rwxp 00042000 00:0b 203162716 /root/MyProg
Size: 12 kB
Rss: 12 kB
Pss: 12 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 12 kB
Referenced: 12 kB
10055000-1706a000 rwxp 10055000 00:00 0 [heap]
Size: 114772 kB
Rss: 114716 kB
Pss: 114716 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 114716 kB
Referenced: 114716 kB
30000000-30005000 r-xp 00000000 00:0b 135513112 /lib/ld-uClibc-0.9.29.so
Size: 20 kB
Rss: 20 kB
Pss: 1 kB
Shared_Clean: 20 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 20 kB
30005000-30006000 rw-p 30005000 00:00 0
Size: 4 kB
Rss: 4 kB
Pss: 4 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 4 kB
Referenced: 4 kB
30014000-30015000 r--p 00004000 00:0b 135513112 /lib/ld-uClibc-0.9.29.so
Size: 4 kB
Rss: 4 kB
Pss: 4 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 4 kB
Referenced: 4 kB
30015000-30016000 rwxp 00005000 00:0b 135513112 /lib/ld-uClibc-0.9.29.so
Size: 4 kB
Rss: 4 kB
Pss: 4 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 4 kB
Referenced: 4 kB
30016000-30027000 r-xp 00000000 00:0b 135513124 /lib/libm-0.9.29.so
Size: 68 kB
Rss: 12 kB
Pss: 1 kB
Shared_Clean: 12 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 12 kB
30027000-30036000 ---p 30027000 00:00 0
Size: 60 kB
Rss: 0 kB
Pss: 0 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 0 kB
30036000-30037000 r--p 00010000 00:0b 135513124 /lib/libm-0.9.29.so
Size: 4 kB
Rss: 4 kB
Pss: 0 kB
Shared_Clean: 4 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 4 kB
30037000-30038000 rwxp 00011000 00:0b 135513124 /lib/libm-0.9.29.so
Size: 4 kB
Rss: 4 kB
Pss: 4 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 4 kB
Referenced: 4 kB
30038000-30043000 r-xp 00000000 00:0b 135513129 /lib/libpthread-0.9.29.so
Size: 44 kB
Rss: 44 kB
Pss: 44 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 44 kB
Private_Dirty: 0 kB
Referenced: 20 kB
30043000-30052000 ---p 30043000 00:00 0
Size: 60 kB
Rss: 0 kB
Pss: 0 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 0 kB
30052000-30053000 r--p 0000a000 00:0b 135513129 /lib/libpthread-0.9.29.so
Size: 4 kB
Rss: 4 kB
Pss: 4 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 4 kB
Referenced: 4 kB
30053000-30058000 rwxp 0000b000 00:0b 135513129 /lib/libpthread-0.9.29.so
Size: 20 kB
Rss: 8 kB
Pss: 8 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 8 kB
Referenced: 8 kB
30058000-3005a000 rwxp 30058000 00:00 0
Size: 8 kB
Rss: 0 kB
Pss: 0 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 0 kB
3005a000-3005b000 r-xp 00000000 00:0b 135513131 /lib/librt-0.9.29.so
Size: 4 kB
Rss: 4 kB
Pss: 2 kB
Shared_Clean: 4 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 0 kB
3005b000-3006a000 ---p 3005b000 00:00 0
Size: 60 kB
Rss: 0 kB
Pss: 0 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 0 kB
3006a000-3006b000 r--p 00000000 00:0b 135513131 /lib/librt-0.9.29.so
Size: 4 kB
Rss: 4 kB
Pss: 2 kB
Shared_Clean: 4 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 0 kB
3006b000-3006c000 rwxp 00001000 00:0b 135513131 /lib/librt-0.9.29.so
Size: 4 kB
Rss: 4 kB
Pss: 4 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 4 kB
Referenced: 4 kB
3006c000-30079000 r-xp 00000000 00:0b 135513120 /lib/libgcc_s.so.1
Size: 52 kB
Rss: 28 kB
Pss: 21 kB
Shared_Clean: 8 kB
Shared_Dirty: 0 kB
Private_Clean: 20 kB
Private_Dirty: 0 kB
Referenced: 20 kB
30079000-30088000 ---p 30079000 00:00 0
Size: 60 kB
Rss: 0 kB
Pss: 0 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 0 kB
30088000-30089000 rwxp 0000c000 00:0b 135513120 /lib/libgcc_s.so.1
Size: 4 kB
Rss: 4 kB
Pss: 4 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 4 kB
Referenced: 4 kB
30089000-300d0000 r-xp 00000000 00:0b 135513132 /lib/libuClibc-0.9.29.so
Size: 284 kB
Rss: 188 kB
Pss: 22 kB
Shared_Clean: 180 kB
Shared_Dirty: 0 kB
Private_Clean: 8 kB
Private_Dirty: 0 kB
Referenced: 164 kB
300d0000-300df000 ---p 300d0000 00:00 0
Size: 60 kB
Rss: 0 kB
Pss: 0 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 0 kB
300df000-300e0000 r--p 00046000 00:0b 135513132 /lib/libuClibc-0.9.29.so
Size: 4 kB
Rss: 4 kB
Pss: 4 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 4 kB
Referenced: 4 kB
300e0000-300e1000 rwxp 00047000 00:0b 135513132 /lib/libuClibc-0.9.29.so
Size: 4 kB
Rss: 4 kB
Pss: 4 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 4 kB
Referenced: 4 kB
300e1000-300e6000 rwxp 300e1000 00:00 0
Size: 20 kB
Rss: 16 kB
Pss: 16 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 16 kB
Referenced: 16 kB
7f3fc000-7f400000 rwxp 7f3fc000 00:00 0
Size: 16 kB
Rss: 16 kB
Pss: 16 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 16 kB
Referenced: 16 kB
7faf8000-7fb0d000 rwxp 7ffeb000 00:00 0 [stack]
Size: 84 kB
Rss: 12 kB
Pss: 12 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 12 kB
Referenced: 12 kB
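A dump this long is easier to read once it is reduced to one number per mapping. The following is a minimal sketch, assuming the smaps text has been copied from the device to a host that has Python; the function name `rss_by_mapping` and the `SAMPLE` excerpt (three mappings taken from the dump above) are illustrative only.

```python
import re

# Header line of a mapping: "start-end perms offset dev inode [path]"
HEADER = re.compile(
    r"^([0-9a-f]+)-([0-9a-f]+)\s+(\S{4})\s+\S+\s+\S+\s+\d+\s*(.*)$"
)
# Field line such as "Rss: 224 kB"
FIELD = re.compile(r"^(\w+):\s+(\d+)\s+kB$")


def rss_by_mapping(smaps_text):
    """Return {mapping name: total Rss in kB} for smaps-formatted text."""
    totals = {}
    name = None
    for line in smaps_text.splitlines():
        line = line.strip()
        header = HEADER.match(line)
        if header:
            # Mappings without a path are grouped under "[anon]".
            name = header.group(4) or "[anon]"
            totals.setdefault(name, 0)
            continue
        field = FIELD.match(line)
        if field and name is not None and field.group(1) == "Rss":
            totals[name] += int(field.group(2))
    return totals


# Three mappings copied from the dump above (hypothetical excerpt choice).
SAMPLE = """\
10000000-10042000 r-xp 00000000 00:0b 203162716 /root/MyProg
Size: 264 kB
Rss: 224 kB
10055000-1706a000 rwxp 10055000 00:00 0 [heap]
Size: 114772 kB
Rss: 114716 kB
7faf8000-7fb0d000 rwxp 7ffeb000 00:00 0 [stack]
Size: 84 kB
Rss: 12 kB
"""

if __name__ == "__main__":
    totals = rss_by_mapping(SAMPLE)
    # Largest resident mappings first.
    for name, kb in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{kb:>8} kB  {name}")
```

Run against the full dump, the [heap] mapping dominates everything else by several orders of magnitude, which points at heap growth rather than at the stack or the libraries.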
From /var/log/messages:
user.warn kernel: dropbear invoked oom-killer: gfp_mask=0x1201d2, order=0, oomkilladj=0
user.warn kernel: Call Trace:
user.warn kernel: show_stack+0x50/0x184 (unreliable)
user.warn kernel: oom_kill_process+0x54/0x1ac
user.warn kernel: out_of_memory+0x1a8/0x1dc
user.warn kernel: __alloc_pages+0x24c/0x2dc
user.warn kernel: __do_page_cache_readahead+0xc4/0x220
user.warn kernel: filemap_fault+0x150/0x37c
user.warn kernel: __do_fault+0x6c/0x40c
user.warn kernel: do_page_fault+0x274/0x3ec
user.warn kernel: handle_page_fault+0xc/0x80
user.warn kernel: Mem-info:
user.warn kernel: DMA per-cpu:
user.warn kernel: CPU 0: hi: 42, btch: 7 usd: 31
user.warn kernel: Active:29872 inactive:194 dirty:0 writeback:0 unstable:0
user.warn kernel: free:356 slab:1392 mapped:84 pagetables:65 bounce:0
user.warn kernel: DMA free:1424kB min:1440kB low:1800kB high:2160kB active:119488kB inactive:776kB present:130048kB pages_scanned:194475 all_unreclaimable? yes
user.warn kernel: lowmem_reserve[]: 0 0 0
user.warn kernel: DMA: 0*4kB 0*8kB 1*16kB 0*32kB 0*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 1424kB
user.warn kernel: 204 total pagecache pages
user.warn kernel: Free swap: 0kB
user.warn kernel: 32768 pages of RAM
user.warn kernel: 0 pages of HIGHMEM
user.warn kernel: 1236 free pages
user.warn kernel: 770 reserved pages
user.warn kernel: 112 pages shared
user.warn kernel: 0 pages swap cached
user.err kernel: Out of memory: kill process 206 (MyProg) score 623 or a child
user.err kernel: Killed process 206 (MyProg)
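The report can be cross-checked against the smaps dump with a little arithmetic. This is a rough sketch, assuming the log has been copied to a host with Python; `summarize_oom`, `SAMPLE_LOG`, and `HEAP_RSS_KB` are names made up for this illustration, and the 4096-byte page size is the one reported by `time -v` in the question.

```python
import re

KILLED = re.compile(r"Killed process (\d+) \(([^)]+)\)")
RAM_PAGES = re.compile(r"(\d+) pages of RAM")


def summarize_oom(log_text, page_size=4096):
    """Return (total RAM in kB, [(pid, name), ...]) from an OOM report."""
    ram_kb = None
    victims = []
    for line in log_text.splitlines():
        pages = RAM_PAGES.search(line)
        if pages:
            ram_kb = int(pages.group(1)) * page_size // 1024
        killed = KILLED.search(line)
        if killed:
            victims.append((int(killed.group(1)), killed.group(2)))
    return ram_kb, victims


# Four lines copied from the log above.
SAMPLE_LOG = """\
user.warn kernel: 32768 pages of RAM
user.warn kernel: 0 pages of HIGHMEM
user.err kernel: Out of memory: kill process 206 (MyProg) score 623 or a child
user.err kernel: Killed process 206 (MyProg)
"""

HEAP_RSS_KB = 114716  # Rss of the [heap] mapping in the smaps dump

if __name__ == "__main__":
    ram_kb, victims = summarize_oom(SAMPLE_LOG)
    print(f"total RAM: {ram_kb} kB")
    print(f"killed: {victims}")
    print(f"heap share of RAM: {HEAP_RSS_KB / ram_kb:.1%}")
```

32768 pages of 4096 bytes is 131072 kB (128 MiB), so the 114716 kB heap alone accounts for about 87.5% of physical RAM on a system with no swap.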
What else can I try? Thanks.
【Comments】:
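Signal 9 is SIGKILL, which means the process was killed deliberately rather than crashing. How much memory was the program using when it was killed? And the system as a whole? This typically happens when a Linux system runs out of memory.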
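This also explains the missing core file: only signals whose default action is "terminate and dump core" produce one, and SIGKILL is not among them. SIGKILL cannot be caught or ignored either, so installing a signal handler would not help. The sketch below encodes the core-dumping signals listed in the Linux signal(7) manual page; the helper name `describe` is invented for this example.

```python
import signal

# Signals whose default action is "terminate and dump core"
# according to signal(7) on Linux.
CORE_DUMPING = {
    "SIGQUIT", "SIGILL", "SIGABRT", "SIGFPE", "SIGSEGV",
    "SIGBUS", "SIGSYS", "SIGTRAP", "SIGXCPU", "SIGXFSZ",
}


def describe(signum):
    """Return (signal name, whether its default action dumps core)."""
    name = signal.Signals(signum).name
    return name, name in CORE_DUMPING


if __name__ == "__main__":
    for signum in (signal.SIGKILL, signal.SIGSEGV):
        name, dumps = describe(signum)
        print(f"{int(signum):>2} {name}: core dump by default = {dumps}")
```

A buffer overflow would more likely end in SIGSEGV or SIGABRT, both of which do dump core when ulimit -c allows it.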
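The SIGKILL probably comes from something like the OOM killer. Does your system have /var/log/messages or a similarly named file? grep -i oom /var/log/messages might be useful. In any case, you should check files such as /var/log/messages for entries around the time your process was killed.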
Since the log mentions oom-killer, your program is being stopped because it uses too much memory. You probably have a memory leak.
This seems relevant: lwn.net/Articles/104185
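@Jonathan - Linux's design makes unexplained OOM kills a normal part of operation. It routinely overcommits memory; that is, it returns success on an allocation request when it should return failure. No leak is needed. I see it all the time on a GoDaddy VM with 1 GB of RAM that I use as a web server. The OOM killer sometimes hits the mysql process and corrupts our database. To get away from the broken allocator, we then switched to Solaris, which does not oversubscribe memory.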
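【Answer 1】:
What else can I try?
I suggest trying remote gdb debugging. Ideally the debugging host (i.e. your development laptop) should run Linux as well.
(You could even cross-build your program with its DWARF debug information in a separate file; I know this is possible, but I have forgotten the details.)
If your embedded system runs Linux, be sure to disable memory over-commitment.
See also this.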
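On Linux the over-commit policy is exposed through /proc/sys/vm/overcommit_memory, which holds 0, 1, or 2. The following sketch only reads and interprets that value; the helper names are made up for this illustration, and whether the file exists on a given embedded kernel is an assumption to verify on the device.

```python
# Meaning of each value, per the kernel's overcommit accounting documentation.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "strict accounting: no overcommit beyond the commit limit",
}


def describe_overcommit(raw):
    """Interpret the raw contents of /proc/sys/vm/overcommit_memory."""
    mode = int(raw.strip())
    return mode, OVERCOMMIT_MODES.get(mode, "unknown mode")


def read_overcommit(path="/proc/sys/vm/overcommit_memory"):
    """Return (mode, description), or None if the file cannot be read."""
    try:
        with open(path) as handle:
            return describe_overcommit(handle.read())
    except OSError:
        return None


if __name__ == "__main__":
    print(read_overcommit())
```

Writing 2 to the same file as root selects strict accounting. In that mode an allocation that cannot be backed should fail instead of succeeding and triggering the OOM killer later, so the C program would need to check every malloc return value to benefit from it.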
【Comments】:
Given the problem, disabling overcommit should be the first thing to try.