RHCS Service Startup Failure Caused by an LVM Fault

1. Fault symptoms

The resource manager (rgmanager) log shows the following errors:

tail -f /var/log/cluster/rgmanager.log

May  6 18:21:24 yktdb1 rgmanager[17425]: State change: Local UP

May  6 18:21:24 yktdb1 rgmanager[17425]: Starting stopped service service:yktoracle

May  6 18:21:24 yktdb1 rgmanager[18533]: [lvm] HA LVM:  Improper setup detected

May  6 18:21:24 yktdb1 rgmanager[18555]: [lvm] * "volume_list" not specified in lvm.conf.

May  6 18:21:24 yktdb1 rgmanager[17425]: start on lvm "yktoracledb" returned 1 (generic error)

May  6 18:21:24 yktdb1 rgmanager[17425]: #68: Failed to start service:yktoracle; return value: 1

May  6 18:21:24 yktdb1 rgmanager[17425]: Stopping service service:yktoracle

May  6 18:21:25 yktdb1 rgmanager[18586]: [script] Executing /etc/init.d/dbora stop

May  6 18:21:25 yktdb1 rgmanager[18682]: [fs] stop: Could not match /dev/yktoracledb/oracledblv with a real device

May  6 18:21:25 yktdb1 rgmanager[18720]: [lvm] HA LVM:  Improper setup detected

May  6 18:21:25 yktdb1 rgmanager[18742]: [lvm] * "volume_list" not specified in lvm.conf.

May  6 18:21:25 yktdb1 rgmanager[18778]: [lvm] Deactivating yktoracledb/oracledblv

May  6 18:21:25 yktdb1 rgmanager[18800]: [lvm] Making resilient : lvchange -an yktoracledb/oracledblv

May  6 18:21:25 yktdb1 rgmanager[18825]: [lvm] Resilient command: lvchange -an yktoracledb/oracledblv --config devices{filter=["a|/dev/mapper/LUN-1800G|","a|/dev/mappe

May  6 18:21:26 yktdb1 rgmanager[17425]: Service service:yktoracle is recovering

May  6 18:21:26 yktdb1 rgmanager[17425]: #71: Relocating failed service service:yktoracle

May  6 18:21:26 yktdb1 rgmanager[17425]: Service service:yktoracle is stopped

May  6 18:21:35 yktdb1 rgmanager[17425]: State change: 192.168.10.2 UP

May  6 18:21:35 yktdb1 rgmanager[17425]: Starting stopped service service:yktoracle

May  6 18:21:36 yktdb1 rgmanager[18886]: [lvm] HA LVM:  Improper setup detected

May  6 18:21:36 yktdb1 rgmanager[18908]: [lvm] * "volume_list" not specified in lvm.conf.

May  6 18:21:36 yktdb1 rgmanager[17425]: start on lvm "yktoracledb" returned 1 (generic error)

May  6 18:21:36 yktdb1 rgmanager[17425]: #68: Failed to start service:yktoracle; return value: 1

May  6 18:21:36 yktdb1 rgmanager[17425]: Stopping service service:yktoracle

May  6 18:21:36 yktdb1 rgmanager[18939]: [script] Executing /etc/init.d/dbora stop

May  6 18:21:36 yktdb1 rgmanager[19035]: [fs] stop: Could not match /dev/yktoracledb/oracledblv with a real device

May  6 18:21:36 yktdb1 rgmanager[19073]: [lvm] HA LVM:  Improper setup detected

May  6 18:21:37 yktdb1 rgmanager[19095]: [lvm] * "volume_list" not specified in lvm.conf.

May  6 18:21:37 yktdb1 rgmanager[19131]: [lvm] Deactivating yktoracledb/oracledblv

May  6 18:21:37 yktdb1 rgmanager[19153]: [lvm] Making resilient : lvchange -an yktoracledb/oracledblv

May  6 18:21:37 yktdb1 rgmanager[19178]: [lvm] Resilient command: lvchange -an yktoracledb/oracledblv --config devices{filter=["a|/dev/mapper/LUN-1800G|","a|/dev/mappe

May  6 18:21:37 yktdb1 rgmanager[17425]: Service service:yktoracle is recovering

May  6 18:21:37 yktdb1 rgmanager[17425]: #71: Relocating failed service service:yktoracle

May  6 18:21:39 yktdb1 rgmanager[17425]: Service service:yktoracle is stopped
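The log above repeats the same start/stop cycle, so it can help to filter it down to the lines that report the failure. The following is a minimal sketch, not part of the original troubleshooting; the function name ha_lvm_errors and the choice of patterns are assumptions based only on the messages shown above.

```shell
#!/bin/sh
# Hypothetical helper: read rgmanager log text on stdin and print only the
# lines that report an HA LVM setup problem or a failed service start.
# The patterns are taken from the messages in the log excerpt above.
ha_lvm_errors() {
    grep -E 'HA LVM:|volume_list|returned 1|Failed to start'
}

# Example (on a cluster node):
#   ha_lvm_errors < /var/log/cluster/rgmanager.log
```

Reading from stdin keeps the helper usable on a live log, a rotated copy, or a saved excerpt.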

Running lvdisplay shows the status of oracledblv as "Not available".

Surprisingly, oracledblv does not exist under /dev/yktoracledb/ at all.

It only shows up under /dev/yktoracledb/ after clvmd has been stopped.

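I went through a lot of material without finding the cause.

Eventually I ran service clvmd status and found that both the clustered VGs and the clustered LVs were listed as none.

That finally pointed me to the problem: the volume group was not marked as clustered.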

Set the clustered flag on the volume group directly with the command vgchange -cy yktoracledb
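Before and after running vgchange, the clustered flag can also be checked from the VG attribute string reported by vgs. The sketch below is an illustration under stated assumptions: the function name vg_attr_is_clustered is hypothetical, and it assumes that the sixth character of vg_attr is "c" for a clustered volume group, as described in the vgs man page.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if a vg_attr string such as
# "wz--nc" has the clustered bit set.
# Assumption: the sixth attribute character is "c" for a clustered VG.
vg_attr_is_clustered() {
    # vgs pads its output with spaces, so strip them first.
    attr=$(printf '%s' "$1" | tr -d ' ')
    [ "$(printf '%s' "$attr" | cut -c6)" = "c" ]
}

# Usage on a cluster node (needs the LVM tools and root):
#   attr=$(vgs --noheadings -o vg_attr yktoracledb)
#   if vg_attr_is_clustered "$attr"; then
#       echo "yktoracledb is clustered"
#   else
#       echo "yktoracledb is NOT clustered"
#   fi
```

Keeping the parsing separate from the vgs call means the check can be tried on a saved attribute string without touching the cluster.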

Then check service clvmd status again:

# service clvmd status

clvmd (pid  7550) is running...

Clustered Volume Groups: yktoracledb

Active clustered Logical Volumes: oracledblv ysbaklv test

The clustered VG and its LVs are now visible.

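Checking the cluster status again shows that the cluster is healthy and the service has started. The next step is to test that the service can fail over between the two nodes.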

# clustat

Cluster Status for ytkcluter @ Sun May  7 11:53:49 2017

Member Status: Quorate


 Member Name                                                 ID   Status

 ------ ----                                                 ---- ------

 192.168.10.1                                                    1 Online, Local, rgmanager

 192.168.10.2                                                    2 Online, rgmanager


 Service Name                                       Owner (Last)                                       State         

 ------- ----                                       ----- ------                                       -----         

 service:yktoracle                                  192.168.10.1                                       started   
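For the failover test, the state column of the clustat output above can be read by a small script before and after relocating the service. This is only a sketch: the function name service_state is hypothetical, and the commented clusvcadm call assumes the service and member names shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read clustat output on stdin and print the last
# column (the state) of the line whose first column matches the service.
service_state() {
    awk -v svc="$1" '$1 == svc { print $NF }'
}

# Usage on a cluster node:
#   clustat | service_state service:yktoracle
#
# Relocation test (assumption: run as root, names taken from the output above):
#   clusvcadm -r service:yktoracle -m 192.168.10.2
#   clustat | service_state service:yktoracle
```

If the relocation works, the state should return to started with 192.168.10.2 listed as the owner.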


This article comes from the "itgg1982" blog. Please contact the author before reposting.
