Failure cases caused by missing dependencies when deploying a k8s cluster with Ceph storage (minimal-install system, ARM architecture)

Posted by kiroct


Environment: a minimally installed system (a Red Hat family distribution is used as the example).


### Incident case 1

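Symptom: the k8s cluster came up and Ceph came up as well, but when we installed the Docker image registry, the registry stayed in the Pending state, and restarting it did not help.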


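Troubleshooting: first we ran kubectl describe pod -n namespace and found that the PV and PVC had failed.
Next we looked at the PV and PVC themselves and found that Ceph was reporting errors.
Finally we checked the logs of the Ceph container and found that the Ceph configuration file inside the container was wrong.

Solution: Ceph runs inside a container here. The default configuration shipped with cephmount supports the XFS filesystem format, but our operating system was set up with ext4. The only option was to change the filesystem format for this container's Ceph in the ceph.conf configuration file and then restart Ceph. After that, redeploying the Docker registry succeeded.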
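The first troubleshooting step, finding which pods are stuck and which claims they depend on, can be sketched in Python. This is only an illustration under stated assumptions: it expects the JSON printed by kubectl get pods -n namespace -o json, and the pod and claim names in the sample (registry-0, registry-data, web-0) are hypothetical, not taken from the incident.

```python
import json


def pending_pods_with_claims(pod_list):
    """Map each Pending pod name to the PVC names it mounts."""
    pending = {}
    for pod in pod_list.get("items", []):
        if pod.get("status", {}).get("phase") != "Pending":
            continue
        claims = [
            volume["persistentVolumeClaim"]["claimName"]
            for volume in pod.get("spec", {}).get("volumes", [])
            if "persistentVolumeClaim" in volume
        ]
        pending[pod["metadata"]["name"]] = claims
    return pending


# Hypothetical, trimmed-down stand-in for the JSON that kubectl prints.
SAMPLE = json.loads("""
{
  "items": [
    {
      "metadata": {"name": "registry-0"},
      "spec": {"volumes": [{"name": "data",
                            "persistentVolumeClaim": {"claimName": "registry-data"}}]},
      "status": {"phase": "Pending"}
    },
    {
      "metadata": {"name": "web-0"},
      "spec": {"volumes": []},
      "status": {"phase": "Running"}
    }
  ]
}
""")

print(pending_pods_with_claims(SAMPLE))
```

Each claim name it reports is a candidate for kubectl describe pvc, which is where the Ceph errors showed up in this case.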


----
### Incident log 2
```text
2022-07-04 10:02:18 [INFO] - fatal: [master01]: FAILED! => "changed": false, "failures": [], "msg": "Depsolve Error occured: \n Problem: conflicting requests\n  - nothing provides libcrypto.so.10()(64bit) needed by cephmount-1.0.0-1.aarch64\n  - nothing provides libcrypto.so.10(libcrypto.so.10)(64bit) needed by cephmount-1.0.0-1.aarch64", "rc": 1, "results": []
```

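Note: the immediate cause is that nothing on the system provides libcrypto.so.10, a library that belongs to the OpenSSL 1.0.x series. Downloading the latest openssl-1:1.1.1m-5 from the public network did not help and the same error still appeared, because OpenSSL 1.1.1 provides libcrypto.so.1.1 rather than libcrypto.so.10.

Solution: first, in our self-built software repository, we ran rpm -ivh cephmount-1.0.0-1.aarch64.rpm. It failed with an error saying that the older openssl-1:1.0.2k is required. All we need to do is rebuild an up-to-date cephmount rpm package. Alternatively, if you are using the open-source version, download the latest cephmount from the public network.

Summary: the cephmount version is too old and is not compatible with the latest openssl.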
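When the depsolve message is buried inside a long Ansible result, it can help to pull out exactly which capabilities are missing and which package asks for them. The sketch below is a minimal illustration, assuming the message has already been unescaped so that it contains real newlines; the function name missing_capabilities is our own and not part of any tool.

```python
import re

# Matches the "nothing provides X needed by Y" lines of a depsolve error.
MISSING = re.compile(r"nothing provides (\S+) needed by (\S+)")


def missing_capabilities(message):
    """Group the missing capabilities by the package that requires them."""
    needed = {}
    for capability, package in MISSING.findall(message):
        needed.setdefault(package, []).append(capability)
    return needed


# The "msg" field from the log above, with real newlines.
MESSAGE = (
    "Depsolve Error occured: \n Problem: conflicting requests\n"
    "  - nothing provides libcrypto.so.10()(64bit) needed by cephmount-1.0.0-1.aarch64\n"
    "  - nothing provides libcrypto.so.10(libcrypto.so.10)(64bit) needed by cephmount-1.0.0-1.aarch64"
)

for package, capabilities in missing_capabilities(MESSAGE).items():
    print(package, "needs", capabilities)
```

Each capability it reports can then be checked with yum provides to see whether any repository actually offers it.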


### Incident log 3

fatal: [master01]: FAILED! => "changed": true, "cmd": "kubeadm init --config /etc/kubernetes/kubeadm-config.yaml -v=5", "delta": "0:00:00.380079", "end": "2022-07-04 09:30:56.707434", "msg": "non-zero return code", "rc": 1, "start": "2022-07-04 09:30:56.327355", "stderr": "I0704 09:30:56.427082 207693 initconfiguration.go:246] loading configuration from \\"/etc/kubernetes/kubeadm-config.yaml\\"\\nI0704 09:30:56.440094 207693 interface.go:431] Looking for default routes with IPv4 addresses\\nI0704 09:30:56.440134 207693 interface.go:436] Default route transits interface \\"enp0s18\\"\\nI0704 09:30:56.440506 207693 interface.go:208] Interface enp0s18 is up\\nI0704 09:30:56.440690 207693 interface.go:256] Interface \\"enp0s18\\" has 2 addresses :[10.165.141.79/24 fe80::9c6d:aaa3:9e6b:2d60/64].\\nI0704 09:30:56.440742 207693 interface.go:223] Checking addr 10.165.141.79/24.\\nI0704 09:30:56.440759 207693 interface.go:230] IP found 10.165.141.79\\nI0704 09:30:56.440775 207693 interface.go:262] Found valid IPv4 address 10.165.141.79 for interface \\"enp0s18\\".\\nI0704 09:30:56.440789 207693 interface.go:442] Found active IP 10.165.141.79 \\nI0704 09:30:56.452720 207693 checks.go:582] validating Kubernetes and kubeadm version\\nI0704 09:30:56.452789 207693 checks.go:167] validating if the firewall is enabled and active\\nI0704 09:30:56.486520 207693 checks.go:202] validating availability of port 6443\\nI0704 09:30:56.487059 207693 checks.go:202] validating availability of port 10259\\nI0704 09:30:56.487141 207693 checks.go:202] validating availability of port 10257\\nI0704 09:30:56.487196 207693 checks.go:287] validating the existence of file /etc/kubernetes/manifests/kube-apiserver.yaml\\nI0704 09:30:56.487248 207693 checks.go:287] validating the existence of file /etc/kubernetes/manifests/kube-controller-manager.yaml\\nI0704 09:30:56.487270 207693 checks.go:287] validating the existence of file /etc/kubernetes/manifests/kube-scheduler.yaml\\nI0704 09:30:56.487286 
207693 checks.go:287] validating the existence of file /etc/kubernetes/manifests/etcd.yaml\\nI0704 09:30:56.487302 207693 checks.go:437] validating if the connectivity type is via proxy or direct\\nI0704 09:30:56.487371 207693 checks.go:476] validating http connectivity to first IP address in the CIDR\\nI0704 09:30:56.487421 207693 checks.go:476] validating http connectivity to first IP address in the CIDR\\nI0704 09:30:56.487446 207693 checks.go:103] validating the container runtime\\nI0704 09:30:56.513452 207693 checks.go:377] validating the presence of executable crictl\\nI0704 09:30:56.513568 207693 checks.go:336] validating the contents of file /proc/sys/net/bridge/bridge-nf-call-iptables\\nI0704 09:30:56.513709 207693 checks.go:336] validating the contents of file /proc/sys/net/ipv4/ip_forward\\nI0704 09:30:56.513791 207693 checks.go:654] validating whether swap is enabled or not\\nI0704 09:30:56.513882 207693 checks.go:377] validating the presence of executable conntrack\\nI0704 09:30:56.513934 207693 checks.go:377] validating the presence of executable ip\\nI0704 09:30:56.513964 207693 checks.go:377] validating the presence of executable iptables\\nI0704 09:30:56.514102 207693 checks.go:377] validating the presence of executable mount\\nI0704 09:30:56.514147 207693 checks.go:377] validating the presence of executable nsenter\\nI0704 09:30:56.514176 207693 checks.go:377] validating the presence of executable ebtables\\nI0704 09:30:56.514272 207693 checks.go:377] validating the presence of executable ethtool\\nI0704 09:30:56.514315 207693 checks.go:377] validating the presence of executable socat\\nI0704 09:30:56.514354 207693 checks.go:377] validating the presence of executable tc\\nI0704 09:30:56.514400 207693 checks.go:377] validating the presence of executable touch\\nI0704 09:30:56.514435 207693 checks.go:525] running all checks\\nI0704 09:30:56.531998 207693 checks.go:408] checking whether the given node name is valid and reachable using 
net.LookupHost\\nI0704 09:30:56.532455 207693 checks.go:623] validating kubelet version\\nI0704 09:30:56.668051 207693 checks.go:129] validating if the \\"kubelet\\" service is enabled and active\\n\\t[WARNING Service-Kubelet]: kubelet service is not enabled, please run systemctl enable kubelet.service\\nI0704 09:30:56.702480 207693 checks.go:202] validating availability of port 10250\\nI0704 09:30:56.702604 207693 checks.go:202] validating availability of port 2379\\nI0704 09:30:56.702683 207693 checks.go:202] validating availability of port 2380\\nI0704 09:30:56.702748 207693 checks.go:250] validating the existence and emptiness of directory /u01/local/kube-system/etcd/\\n[preflight] Some fatal errors occurred:\\n\\t[ERROR KubeletVersion]: the kubelet version is higher than the control plane version. This is not a supported version skew and may lead to a malfunctional cluster. Kubelet version: \\"1.24.2\\" Control plane version: \\"1.21.8\\"\\n[preflight] If you know what you are doing, you can make a check non-fatal with --ignore-preflight-errors=...\\nerror execution phase 
preflight\\nk8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(Runner).Run.func1\\n\\t/root/kubernetes-1.21.8/_output/local/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:235\\nk8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(Runner).visitAll\\n\\t/root/kubernetes-1.21.8/_output/local/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:421\\nk8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(Runner).Run\\n\\t/root/kubernetes-1.21.8/_output/local/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:207\\nk8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdInit.func1\\n\\t/root/kubernetes-1.21.8/_output/local/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/init.go:152\\nk8s.io/kubernetes/vendor/github.com/spf13/cobra.(Command).execute\\n\\t/root/kubernetes-1.21.8/_output/local/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:850\\nk8s.io/kubernetes/vendor/github.com/spf13/cobra.(Command).ExecuteC\\n\\t/root/kubernetes-1.21.8/_output/local/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:958\\nk8s.io/kubernetes/vendor/github.com/spf13/cobra.(Command).Execute\\n\\t/root/kubernetes-1.21.8/_output/local/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:895\\nk8s.io/kubernetes/cmd/kubeadm/app.Run\\n\\t/root/kubernetes-1.21.8/_output/local/go/src/k8s.io/kubernetes/cmd/kubeadm/app/kubeadm.go:50\\nmain.main\\n\\t_output/local/go/src/k8s.io/kubernetes/cmd/kubeadm/kubeadm.go:25\\nruntime.main\\n\\t/usr/local/go/src/runtime/proc.go:255\\nruntime.goexit\\n\\t/usr/local/go/src/runtime/asm_arm64.s:1133", "stderr_lines": ["I0704 09:30:56.427082 207693 initconfiguration.go:246] loading configuration from \\"/etc/kubernetes/kubeadm-config.yaml\\"", "I0704 09:30:56.440094 207693 interface.go:431] Looking for default routes with IPv4 addresses", "I0704 09:30:56.440134 207693 interface.go:436] Default route transits interface \\"enp0s18\\"", "I0704 09:30:56.440506 207693 
interface.go:208] Interface enp0s18 is up", "I0704 09:30:56.440690 207693 interface.go:256] Interface \\"enp0s18\\" has 2 addresses :[10.165.141.79/24 fe80::9c6d:aaa3:9e6b:2d60/64].", "I0704 09:30:56.440742 207693 interface.go:223] Checking addr 10.165.141.79/24.", "I0704 09:30:56.440759 207693 interface.go:230] IP found 10.165.141.79", "I0704 09:30:56.440775 207693 interface.go:262] Found valid IPv4 address 10.165.141.79 for interface \\"enp0s18\\".", "I0704 09:30:56.440789 207693 interface.go:442] Found active IP 10.165.141.79 ", "I0704 09:30:56.452720 207693 checks.go:582] validating Kubernetes and kubeadm version", "I0704 09:30:56.452789 207693 checks.go:167] validating if the firewall is enabled and active", "I0704 09:30:56.486520 207693 checks.go:202] validating availability of port 6443", "I0704 09:30:56.487059 207693 checks.go:202] validating availability of port 10259", "I0704 09:30:56.487141 207693 checks.go:202] validating availability of port 10257", "I0704 09:30:56.487196 207693 checks.go:287] validating the existence of file /etc/kubernetes/manifests/kube-apiserver.yaml", "I0704 09:30:56.487248 207693 checks.go:287] validating the existence of file /etc/kubernetes/manifests/kube-controller-manager.yaml", "I0704 09:30:56.487270 207693 checks.go:287] validating the existence of file /etc/kubernetes/manifests/kube-scheduler.yaml", "I0704 09:30:56.487286 207693 checks.go:287] validating the existence of file /etc/kubernetes/manifests/etcd.yaml", "I0704 09:30:56.487302 207693 checks.go:437] validating if the connectivity type is via proxy or direct", "I0704 09:30:56.487371 207693 checks.go:476] validating http connectivity to first IP address in the CIDR", "I0704 09:30:56.487421 207693 checks.go:476] validating http connectivity to first IP address in the CIDR", "I0704 09:30:56.487446 207693 checks.go:103] validating the container runtime", "I0704 09:30:56.513452 207693 checks.go:377] validating the presence of executable crictl", "I0704 09:30:56.513568 
207693 checks.go:336] validating the contents of file /proc/sys/net/bridge/bridge-nf-call-iptables", "I0704 09:30:56.513709 207693 checks.go:336] validating the contents of file /proc/sys/net/ipv4/ip_forward", "I0704 09:30:56.513791 207693 checks.go:654] validating whether swap is enabled or not", "I0704 09:30:56.513882 207693 checks.go:377] validating the presence of executable conntrack", "I0704 09:30:56.513934 207693 checks.go:377] validating the presence of executable ip", "I0704 09:30:56.513964 207693 checks.go:377] validating the presence of executable iptables", "I0704 09:30:56.514102 207693 checks.go:377] validating the presence of executable mount", "I0704 09:30:56.514147 207693 checks.go:377] validating the presence of executable nsenter", "I0704 09:30:56.514176 207693 checks.go:377] validating the presence of executable ebtables", "I0704 09:30:56.514272 207693 checks.go:377] validating the presence of executable ethtool", "I0704 09:30:56.514315 207693 checks.go:377] validating the presence of executable socat", "I0704 09:30:56.514354 207693 checks.go:377] validating the presence of executable tc", "I0704 09:30:56.514400 207693 checks.go:377] validating the presence of executable touch", "I0704 09:30:56.514435 207693 checks.go:525] running all checks", "I0704 09:30:56.531998 207693 checks.go:408] checking whether the given node name is valid and reachable using net.LookupHost", "I0704 09:30:56.532455 207693 checks.go:623] validating kubelet version", "I0704 09:30:56.668051 207693 checks.go:129] validating if the \\"kubelet\\" service is enabled and active", "\\t[WARNING Service-Kubelet]: kubelet service is not enabled, please run systemctl enable kubelet.service", "I0704 09:30:56.702480 207693 checks.go:202] validating availability of port 10250", "I0704 09:30:56.702604 207693 checks.go:202] validating availability of port 2379", "I0704 09:30:56.702683 207693 checks.go:202] validating availability of port 2380", "I0704 09:30:56.702748 207693 
checks.go:250] validating the existence and emptiness of directory /u01/local/kube-system/etcd/", "[preflight] Some fatal errors occurred:", "\\t[ERROR KubeletVersion]: the kubelet version is higher than the control plane version. This is not a supported version skew and may lead to a malfunctional cluster. Kubelet version: \\"1.24.2\\" Control plane version: \\"1.21.8\\"", "[preflight] If you know what you are doing, you can make a check non-fatal with --ignore-preflight-errors=...", "error execution phase preflight", "k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(Runner).Run.func1", "\\t/root/kubernetes-1.21.8/_output/local/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:235", "k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(Runner).visitAll", "\\t/root/kubernetes-1.21.8/_output/local/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:421", "k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(Runner).Run", "\\t/root/kubernetes-1.21.8/_output/local/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:207", "k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdInit.func1", "\\t/root/kubernetes-1.21.8/_output/local/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/init.go:152", "k8s.io/kubernetes/vendor/github.com/spf13/cobra.(Command).execute", "\\t/root/kubernetes-1.21.8/_output/local/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:850", "k8s.io/kubernetes/vendor/github.com/spf13/cobra.(Command).ExecuteC", "\\t/root/kubernetes-1.21.8/_output/local/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:958", "k8s.io/kubernetes/vendor/github.com/spf13/cobra.(Command).Execute", "\\t/root/kubernetes-1.21.8/_output/local/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:895", "k8s.io/kubernetes/cmd/kubeadm/app.Run", "\\t/root/kubernetes-1.21.8/_output/local/go/src/k8s.io/kubernetes/cmd/kubeadm/app/kubeadm.go:50", "main.main", 
"\\t_output/local/go/src/k8s.io/kubernetes/cmd/kubeadm/kubeadm.go:25", "runtime.main", "\\t/usr/local/go/src/runtime/proc.go:255", "runtime.goexit", "\\t/usr/local/go/src/runtime/asm_arm64.s:1133"], "stdout": "[config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1beta2, Kind=JoinConfiguration\\n[init] Using Kubernetes version: v1.21.8\\n[preflight] Running pre-flight checks", "stdout_lines": ["[config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1beta2, Kind=JoinConfiguration", "[init] Using Kubernetes version: v1.21.8", "[preflight] Running pre-flight checks"]
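Cause: a newer version of k8s was already on the machine at install time. My product requires version 1.21.8, but this machine, probably because it was connected to a public yum repository, had pulled in the latest version, 1.24.2, when k8s was installed. As a result the 1.21.8 packages in our self-built software repository could not be installed, and kubeadm init failed its preflight check because the kubelet (1.24.2) was newer than the control plane (1.21.8).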
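The preflight error comes down to a version comparison, which can be checked before running kubeadm init at all. The following is a simplified sketch of that idea, not kubeadm's actual implementation: it assumes that comparing only the major and minor numbers is enough, and the function names are hypothetical.

```python
def major_minor(version):
    """Turn '1.24.2' or 'v1.21.8' into a (major, minor) tuple of ints."""
    parts = version.lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def kubelet_too_new(kubelet_version, control_plane_version):
    """True when the kubelet is a newer minor release than the control plane."""
    return major_minor(kubelet_version) > major_minor(control_plane_version)


# Versions taken from the error message in the log above.
if kubelet_too_new("1.24.2", "1.21.8"):
    print("kubelet is newer than the control plane; fix the packages first")
```

Running a check like this from the deployment playbook would surface the mismatch before the long kubeadm output does.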

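Solution: before deploying my product, I first uninstall the 1.24.2 k8s packages and delete the corresponding cache, then reinstall the 1.21.8 version of k8s that I need.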
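The uninstall, clean, and reinstall sequence can be scripted so that it runs the same way on every node. The sketch below is an assumption-laden example: it presumes the packages use the usual upstream names kubelet, kubeadm, and kubectl, which may not match the names in a self-built repository, and it defaults to a dry run that only prints the commands.

```python
import subprocess


def reinstall_commands(version, packages=("kubelet", "kubeadm", "kubectl")):
    """Build the yum commands: remove, clear the cache, install the pinned version."""
    pinned = [f"{name}-{version}" for name in packages]
    return [
        ["yum", "remove", "-y", *packages],
        ["yum", "clean", "all"],
        ["yum", "install", "-y", *pinned],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


# Dry run: shows what would be executed for the version the product needs.
run(reinstall_commands("1.21.8"))
```

To keep the public repository from supplying a newer build again, it may also be worth disabling it for the install step, for example with yum's --disablerepo option.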
