11g RAC: node 2 host down (two-node RAC)

Posted ss-33



--Node 2 database alert log

Mon Jul 01 06:38:22 2019
SUCCESS: diskgroup SAS_ARCH was dismounted
Mon Jul 01 06:38:22 2019
Shutting down instance (abort)
License high water mark = 1923
USER (ospid: 82381): terminating the instance
Mon Jul 01 06:38:22 2019
opiodr aborting process unknown ospid (12589) as a result of ORA-1092
Mon Jul 01 06:38:22 2019
opiodr aborting process unknown ospid (45276) as a result of ORA-1092
Mon Jul 01 06:38:22 2019
opiodr aborting process unknown ospid (107399) as a result of ORA-1092
Instance terminated by USER, pid = 82381
Mon Jul 01 06:38:24 2019
Instance shutdown complete

 

--Host log

Jul 1 06:35:01 test2 auditd[16253]: Audit daemon rotating log files
Jul 1 06:38:19 test2 init: oracle-ohasd main process (15639) killed by TERM signal
Jul 1 06:38:19 test2 init: oracle-tfa main process (15638) killed by TERM signal
Jul 1 06:38:19 test2 init: tty (/dev/tty2) main process (16997) killed by TERM signal
Jul 1 06:38:19 test2 init: tty (/dev/tty3) main process (16999) killed by TERM signal
Jul 1 06:38:19 test2 init: tty (/dev/tty4) main process (17004) killed by TERM signal
Jul 1 06:38:19 test2 init: tty (/dev/tty5) main process (17006) killed by TERM signal
Jul 1 06:38:19 test2 init: tty (/dev/tty6) main process (17008) killed by TERM signal
Jul 1 06:38:19 test2 gnome-session[17110]: WARNING: Failed to send buffer
Jul 1 06:38:19 test2 gnome-session[17110]: WARNING: Failed to send buffer
Jul 1 06:38:23 test2 ntpd[90741]: Deleting interface #15 bond0:1, 10.1.11.103#123, interface stats: received=1410, sent=0, dropped=0, active_time=56169415 secs
Jul 1 06:38:39 test2 pulseaudio[17164]: pid.c: Failed to open PID file ‘/var/lib/gdm/.pulse/45593399e441b14e2757581a00000028-runtime/pid‘: No such file or directory
Jul 1 06:38:39 test2 pulseaudio[17164]: pid.c: Failed to open PID file ‘/var/lib/gdm/.pulse/45593399e441b14e2757581a00000028-runtime/pid‘: No such file or directory
Jul 1 06:38:46 test2 ntpd[90741]: Deleting interface #14 bond1:1, 169.254.7.117#123, interface stats: received=0, sent=0, dropped=0, active_time=56169467 secs
Jul 1 06:38:51 test2 abrtd: Got signal 15, exiting
Jul 1 06:38:51 test2 xinetd[45495]: Exiting...
Jul 1 06:38:51 test2 acpid: exiting
Jul 1 06:38:51 test2 ntpd[90741]: ntpd exiting on signal 15
Jul 1 06:38:53 test2 init: Disconnected from system bus
Jul 1 06:38:53 test2 rtkit-daemon[17166]: Demoting known real-time threads.
Jul 1 06:38:53 test2 rtkit-daemon[17166]: Demoted 0 threads.
Jul 1 06:38:53 test2 auditd[16253]: The audit daemon is exiting.
Jul 1 06:38:53 test2 kernel: type=1305 audit(1561934333.370:37053744): audit_pid=0 old=16253 auid=4294967295 ses=4294967295 res=1
Jul 1 06:38:53 test2 kernel: type=1305 audit(1561934333.475:37053745): audit_enabled=0 old=1 auid=4294967295 ses=4294967295 res=1
Jul 1 06:38:53 test2 kernel: Kernel logging (proc) stopped.
Jul 1 06:38:53 test2 rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="16275" x-info="http://www.rsyslog.com"] exiting on signal 15.

 

---Node 2 Grid log: alertbapdb2.log under /u01/11.2.0/grid/log/test2
2019-07-01 06:34:49.606:
[client(75150)]CRS-0009:log file "/u01/11.2.0/grid/log/test2/client/olsnodes.log" reopened
2019-07-01 06:34:49.606:
[client(75150)]CRS-0019:file rotation terminated. log file: "/u01/11.2.0/grid/log/test2/client/olsnodes.log"
2019-07-01 06:38:33.151:
[/u01/11.2.0/grid/bin/orarootagent.bin(106660)]CRS-5822:Agent '/u01/11.2.0/grid/bin/orarootagent_root' disconnected from server. Details at (:CRSAGF00117:) 0:5:52057 in /u01/11.2.0/grid/log/test2/agent/crsd/orarootagent_root//orarootagent_root.log.
LFI-01523: rename() failed.

2019-07-01 06:38:33.887:
[ctssd(104917)]CRS-2405:The Cluster Time Synchronization Service on host test2 is shutdown by user
2019-07-01 06:38:33.892:
[mdnsd(103640)]CRS-5602:mDNS service stopping by request.
2019-07-01 06:38:45.860:
[cssd(103758)]CRS-1603:CSSD on node test2 shutdown by user.
2019-07-01 06:38:45.970:
[ohasd(103446)]CRS-2767:Resource state recovery not attempted for 'ora.cssdmonitor' as its target state is OFFLINE
2019-07-01 06:38:46.064:
[cssd(103758)]CRS-1660:The CSS daemon shutdown has completed
2019-07-01 06:38:49.592:
[gpnpd(103651)]CRS-2329:GPNPD on node test2 shutdown.
2019-07-01 09:28:04.022:
[ohasd(17090)]CRS-2112:The OLR service started on node test2.
2019-07-01 09:28:04.069:
[ohasd(17090)]CRS-1301:Oracle High Availability Service started on node test2.

 

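RAC nodes rely on several prerequisites to stay in the cluster: synchronized time, the disk heartbeat (voting disk), and the network heartbeat (private interconnect). None of them can be missing.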

---Node 1 database alert log

Mon Jul 01 06:38:24 2019
Reconfiguration started (old inc 16, new inc 18)
List of instances:
1 (myinst: 1)
Global Resource Directory frozen
* dead instance detected - domain 0 invalid = TRUE
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Mon Jul 01 06:38:25 2019
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Mon Jul 01 06:38:25 2019
LMS 3: 2 GCS shadows cancelled, 1 closed, 0 Xw survived
Mon Jul 01 06:38:25 2019
Mon Jul 01 06:38:25 2019
LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Mon Jul 01 06:38:36 2019
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Post SMON to start 1st pass IR
Mon Jul 01 06:38:36 2019
Instance recovery: looking for dead threads
Beginning instance recovery of 1 threads
Mon Jul 01 06:38:52 2019
parallel recovery started with 32 processes
Started redo scan
Completed redo scan
read 12123 KB redo, 6138 data blocks need recovery
Mon Jul 01 06:38:55 2019
Submitted all GCS remote-cache requests
Post SMON to start 1st pass IR
Fix write in gcs resources
Mon Jul 01 06:39:07 2019
Reconfiguration complete
Mon Jul 01 06:39:32 2019
Started redo application at
Thread 2: logseq 218275, block 1708335
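
The three logs above use different timestamp formats, so the order of events is easier to see once they are merged. The following is a minimal sketch, not part of the original troubleshooting: it takes a handful of lines copied from the logs above, pulls out the HH:MM:SS part, and sorts them. The source labels, the helper names `seconds_of_day` and `timeline`, and the joining of each timestamp line with its message line are assumptions made for illustration; it also assumes every event falls on the same day, which holds here (2019-07-01).

```python
import re

# Lines copied from the logs above. Where the log puts the timestamp and the
# message on separate lines, they are joined into one string here.
EVENTS = [
    ("node2 db", "Mon Jul 01 06:38:22 2019 Shutting down instance (abort)"),
    ("node2 host", "Jul 1 06:38:19 test2 init: oracle-ohasd main process (15639) killed by TERM signal"),
    ("node2 host", "Jul 1 06:38:53 test2 kernel: Kernel logging (proc) stopped."),
    ("node2 grid", "2019-07-01 06:38:33.887: [ctssd(104917)]CRS-2405:The Cluster Time Synchronization Service on host test2 is shutdown by user"),
    ("node2 grid", "2019-07-01 06:38:45.860: [cssd(103758)]CRS-1603:CSSD on node test2 shutdown by user."),
    ("node1 db", "Mon Jul 01 06:38:24 2019 Reconfiguration started (old inc 16, new inc 18)"),
    ("node1 db", "Mon Jul 01 06:39:07 2019 Reconfiguration complete"),
]

# First HH:MM:SS in the line, with optional fractional seconds.
TIME_RE = re.compile(r"(\d{2}):(\d{2}):(\d{2})(?:\.(\d+))?")


def seconds_of_day(line):
    """Return the first HH:MM:SS[.fff] found in the line as seconds since midnight."""
    match = TIME_RE.search(line)
    if match is None:
        raise ValueError("no timestamp found in: " + line)
    hours, minutes, seconds, fraction = match.groups()
    total = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
    return total + (float("0." + fraction) if fraction else 0.0)


def timeline(events):
    """Sort (source, line) pairs by time of day; assumes all share one date."""
    return sorted(events, key=lambda event: seconds_of_day(event[1]))


for source, line in timeline(EVENTS):
    print(f"{source:<11} {line}")
```

In this ordering the earliest entry is the host log's TERM signal to oracle-ohasd at 06:38:19, followed by the node 2 instance abort at 06:38:22 and the node 1 reconfiguration at 06:38:24; the CTSS and CSSD shutdown messages come after that.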

 

---Cause:
2019-07-01 06:38:33.887:
[ctssd(104917)]CRS-2405:The Cluster Time Synchronization Service on host test2 is shutdown by user

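That is, the Cluster Time Synchronization Service on host test2 was shut down by a user.

The hosts' BIOS (hardware clock) times are inconsistent: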

[[email protected] ~]$ su - root
Password:
[[email protected] ~]# hwclock
Mon 01 Jul 2019 11:27:27 AM CST -0.485777 seconds
[[email protected] ~]# date
Mon Jul 1 10:44:03 CST 2019

[[email protected] ~]# hwclock
Mon 01 Jul 2019 10:42:33 AM CST -0.219479 seconds
[[email protected] ~]# date
Mon Jul 1 10:42:36 CST 2019
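
To put a number on the mismatch, the `hwclock` and `date` outputs above can be subtracted pair by pair. This is a minimal sketch under some assumptions: the function names are made up for illustration, the parsing only handles the two output formats shown above, and the few seconds that pass between typing the two commands are ignored. The shell prompts in the captured session do not show which host each pair came from, so the sketch only reports the difference per pair.

```python
from datetime import datetime

MONTHS = {
    name: number
    for number, name in enumerate(
        ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
         "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"],
        start=1,
    )
}


def parse_hwclock(line):
    """Parse e.g. 'Mon 01 Jul 2019 11:27:27 AM CST -0.485777 seconds'."""
    _weekday, day, month, year, clock, ampm = line.split()[:6]
    hour, minute, second = (int(part) for part in clock.split(":"))
    hour = hour % 12 + (12 if ampm == "PM" else 0)  # 12-hour to 24-hour
    return datetime(int(year), MONTHS[month], int(day), hour, minute, second)


def parse_date(line):
    """Parse e.g. 'Mon Jul 1 10:44:03 CST 2019' (default `date` output)."""
    _weekday, month, day, clock, _zone, year = line.split()
    hour, minute, second = (int(part) for part in clock.split(":"))
    return datetime(int(year), MONTHS[month], int(day), hour, minute, second)


def drift_seconds(hwclock_line, date_line):
    """Hardware clock minus system clock, in seconds."""
    return (parse_hwclock(hwclock_line) - parse_date(date_line)).total_seconds()


first = drift_seconds(
    "Mon 01 Jul 2019 11:27:27 AM CST -0.485777 seconds",
    "Mon Jul 1 10:44:03 CST 2019",
)
second = drift_seconds(
    "Mon 01 Jul 2019 10:42:33 AM CST -0.219479 seconds",
    "Mon Jul 1 10:42:36 CST 2019",
)
print(first, second)  # 2604.0 -3.0
```

For the first pair the hardware clock is 2604 seconds (43 min 24 s) ahead of the system clock; for the second pair the two agree to within 3 seconds.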


--Time synchronization configuration

--Node 1: cat /etc/ntp.conf

server pbsntp01.sx.com iburst
server pbsntp02.sx.com iburst


--Node 2, after the change: cat /etc/ntp.conf
server 10.0.10.2 iburst
#server pbsntp02.sx.com iburst
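
Since both nodes need to agree on time, it can help to compare which NTP sources each one actually uses. The following is a minimal sketch, assuming the two `ntp.conf` excerpts above are the complete set of `server` lines on each node; the function name `active_ntp_servers` is hypothetical. It only lists uncommented `server` entries and does not query NTP itself.

```python
def active_ntp_servers(conf_text):
    """Return the hosts named on uncommented `server` lines of an ntp.conf."""
    servers = []
    for raw_line in conf_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "server":
            servers.append(fields[1])
    return servers


# Excerpts copied from the two nodes as shown above.
NODE1_CONF = """\
server pbsntp01.sx.com iburst
server pbsntp02.sx.com iburst
"""

NODE2_CONF = """\
server 10.0.10.2 iburst
#server pbsntp02.sx.com iburst
"""

node1 = active_ntp_servers(NODE1_CONF)
node2 = active_ntp_servers(NODE2_CONF)
print("node 1:", node1)
print("node 2:", node2)
print("shared sources:", sorted(set(node1) & set(node2)))
```

As written, the two lists share no entry by name. Whether 10.0.10.2 is the address of one of the pbsntp hosts is not stated in the source, so a name-based comparison like this cannot tell whether the nodes end up on the same time source.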

 
