ceph 2 pgs inconsistent failure

[root@node141 ~]# ceph health detail
HEALTH_ERR 2 scrub errors; Possible data damage: 2 pgs inconsistent
OSD_SCRUB_ERRORS 2 scrub errors
PG_DAMAGED Possible data damage: 2 pgs inconsistent
pg 3.3e is active+clean+inconsistent, acting [11,17,4]
pg 3.42 is active+clean+inconsistent, acting [17,6,0]
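Before repairing anything, it can help to see which objects the scrub actually flagged. A diagnostic sketch (not part of the original procedure), assuming the rados CLI and an admin keyring are available:

## list the objects that scrub flagged as inconsistent in each damaged PG
rados list-inconsistent-obj 3.3e --format=json-pretty
rados list-inconsistent-obj 3.42 --format=json-pretty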

Fix procedure from the Ceph website:
https://ceph.com/geen-categorie/ceph-manually-repair-object/

The steps are as follows:
(1) Identify the damaged PGs and the OSDs in their acting sets (shown in the health detail output above), then locate the hosts that own those OSDs so the repair can be run there:
[root@node140 /]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 8.71826 root default
-2 3.26935 host node140
0 hdd 0.54489 osd.0 up 1.00000 1.00000
1 hdd 0.54489 osd.1 up 1.00000 1.00000
2 hdd 0.54489 osd.2 up 1.00000 1.00000
3 hdd 0.54489 osd.3 up 1.00000 1.00000
4 hdd 0.54489 osd.4 up 1.00000 1.00000
5 hdd 0.54489 osd.5 up 1.00000 1.00000
-3 3.26935 host node141
12 hdd 0.54489 osd.12 up 1.00000 1.00000
13 hdd 0.54489 osd.13 up 1.00000 1.00000
14 hdd 0.54489 osd.14 up 1.00000 1.00000
15 hdd 0.54489 osd.15 down 1.00000 1.00000
16 hdd 0.54489 osd.16 up 1.00000 1.00000
17 hdd 0.54489 osd.17 up 1.00000 1.00000
-4 2.17957 host node142
6 hdd 0.54489 osd.6 up 1.00000 1.00000
9 hdd 0.54489 osd.9 up 1.00000 1.00000
10 hdd 0.54489 osd.10 up 1.00000 1.00000
11 hdd 0.54489 osd.11 up 1.00000 1.00000

## This command also works for locating the host an OSD lives on
[root@node140 /]# ceph osd find 11

"osd": 11,
"addrs":
"addrvec": [

"type": "v2",
"addr": "10.10.202.142:6820",
"nonce": 24423
,

"type": "v1",
"addr": "10.10.202.142:6821",
"nonce": 24423

]
,
"osd_fsid": "1e977e5f-f514-4eef-bd88-c3632d03b2c3",
"host": "node142",
"crush_location":
"host": "node142",
"root": "default"

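If you only need the PG-to-OSD mapping, `ceph pg map` prints the up and acting sets directly (a quick alternative, shown as a sketch; for pg 3.3e the acting set should match [11,17,4] from the health output):

## print the up and acting OSD sets for the damaged PGs
ceph pg map 3.3e
ceph pg map 3.42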
(2) The OSDs involved are osd.11 and osd.17. Log in to the host that owns the OSD (node142 for osd.11, as shown above) and stop it:

[root@node142 ~]# systemctl stop ceph-osd@11
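Optionally, setting the noout flag before stopping the OSD keeps the cluster from starting recovery while it is briefly down; clear it once the OSD is back up. This is a common precaution, not part of the original write-up:

## optional: prevent rebalancing while the OSD is down
ceph osd set noout
## ... stop, flush and restart the OSD as in steps (2)-(4) ...
ceph osd unset noout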

(3) Flush the journal to disk (applies to FileStore OSDs, which keep a separate journal):
[root@node142 ~]# ceph-osd -i 11 --flush-journal

(4) Start the OSD again
[root@node142 ~]# systemctl start ceph-osd@11
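Before issuing the repair, it is worth confirming that the OSD has rejoined the cluster (a quick sanity check, assuming an admin keyring on the host):

## confirm osd.11 is back up before repairing
ceph osd tree | grep osd.11
ceph -s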

(5) Repair the PG
[root@node142 ~]# ceph pg repair 3.3e

### Repair the PG on osd.17 (pg 3.42) the same way; see the sketch below ####
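Concretely, the same repair command applies to the second damaged PG; pg 3.42 has acting set [17,6,0], so osd.17 is its primary. You can follow the cluster log to watch the scrub/repair complete (sketch):

ceph pg repair 3.42
## watch the cluster log for the repair result
ceph -w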
(6) Check the cluster status
[root@node141 ~]# ceph health detail
HEALTH_OK
