Analysis of an Abnormal Test Database Shutdown (abnormal instance termination)

Posted by lvcha001


A customer's test database went down. The alert log points to an abnormal instance termination but contains no other useful information.

Terminating the Instance Due to Error 471 Out-Of-Memory(OOM) Killer Crashes Oracle Database (Doc ID 1622379.1)    

SYMPTOMS

Instance terminated due to death of background process. In this case, it was DBWR.

The alert log and trace files contain no further information about why the DBWR process died.

Tue Feb 04 13:00:03 2014
LNS: Standby redo logfile selected for thread 1 sequence 6206 for destination LOG_ARCHIVE_DEST_2
Archived Log entry 10547 added for thread 1 sequence 6205 ID 0x77e10623 dest 1:
Tue Feb 04 13:05:09 2014
LGWR waiting for instance termination
Tue Feb 04 13:05:15 2014
System state dump requested by (instance=1, osid=13406 (PMON)), summary=[abnormal instance termination].
System State dumped to trace file /opt/oracle/diag/rdbms/<SID>/<SID>/trace/<SID>_diag_13429.trc
Tue Feb 04 13:05:21 2014
PMON (ospid: 13406): terminating the instance due to error 471
Tue Feb 04 13:05:21 2014

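Checking the operating system log appropriate to the OS version clearly shows the kill record quoted below, and the killed process ID matches an Oracle process. The output of free -m on the host was also consistent with the sga_max_target parameter.

The OOM killer, which kills processes to reclaim memory, is a Linux feature that is enabled by default. It is a self-protection mechanism that the Linux kernel uses when memory pressure is severe.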

CAUSE

According to the OS logs, the OOM killer killed an Oracle background process to free up memory.

Feb 4 13:05:15 <HOST> kernel: Out of memory: Kill process 13439 (oracle) score 239 or sacrifice child
Feb 4 13:05:15 <HOST> kernel: Killed process 13439 (oracle) total-vm:52681184kB, anon-rss:12404kB, file-rss:22732576kB
Feb 4 13:05:20 <HOST> kernel: zabbix_agentd invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
Feb 4 13:05:20 <HOST> kernel: zabbix_agentd cpuset=/ mems_allowed=0
Feb 4 13:05:20 <HOST> kernel: Pid: 1750, comm: zabbix_agentd Not tainted 2.6.39-400.209.1.el6uek.x86_64 #1
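When the alert log gives no reason for the termination, the kernel messages above are the evidence to look for. The following is a minimal Python sketch of how such lines could be picked out of an OS log. The function name `find_oom_kills` and the regular expression are illustrative assumptions, not part of any Oracle or Linux tool, and the sample lines are copied from the excerpt above.

```python
import re

# Assumed line shape, taken from the excerpt above:
#   "... kernel: Killed process 13439 (oracle) total-vm:...kB, ..."
KILLED_RE = re.compile(r"kernel: Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(log_lines):
    """Return a (pid, command) tuple for every 'Killed process' kernel line."""
    kills = []
    for line in log_lines:
        match = KILLED_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills


sample = [
    "Feb 4 13:05:15 <HOST> kernel: Out of memory: Kill process 13439 (oracle) score 239 or sacrifice child",
    "Feb 4 13:05:15 <HOST> kernel: Killed process 13439 (oracle) total-vm:52681184kB, anon-rss:12404kB, file-rss:22732576kB",
    "Feb 4 13:05:20 <HOST> kernel: zabbix_agentd invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0",
]

print(find_oom_kills(sample))  # [(13439, 'oracle')]
```

The returned process IDs can then be compared with the Oracle background process IDs and the timestamps recorded in the alert log. Which file holds the kernel messages depends on the distribution and its logging configuration, so the path to read from is left to the caller.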

 

SOLUTION

The OOM killer is a Linux feature that is enabled by default. It is a self-protection mechanism employed by the Linux kernel when under severe memory pressure.

Please check below note for more information:

Linux: Out-of-Memory (OOM) Killer (Doc ID 452000.1)


The solution is to add more RAM or swap to the server to avoid this issue, or to engage your OS administrator to address the memory shortage.
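To judge how much of the host's RAM the SGA claims, in the spirit of the free -m comparison mentioned earlier, one could compute the ratio from text in `/proc/meminfo` format. This is a rough sketch only: the helper names and the sample numbers (32 GiB of RAM, a 24 GiB SGA) are hypothetical and are not taken from the system in this case.

```python
def parse_meminfo_kb(text):
    """Parse '/proc/meminfo'-style text into a {name: kilobytes} dict."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def sga_share_of_ram(meminfo_text, sga_bytes):
    """Return the fraction of physical RAM that an SGA of sga_bytes would occupy."""
    mem_total_kb = parse_meminfo_kb(meminfo_text)["MemTotal"]
    return sga_bytes / (mem_total_kb * 1024)


# Hypothetical values for illustration only.
sample_meminfo = "MemTotal:       33554432 kB\nMemFree:         1048576 kB\n"
sample_sga_bytes = 24 * 1024**3

print(sga_share_of_ram(sample_meminfo, sample_sga_bytes))  # 0.75
```

A high ratio leaves little memory for PGA, other processes, and the OS itself, which is the kind of shortage the OS administrator would need to review.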

 
