Deployed Pod Stuck in CrashLoopBackOff Status
Posted by 翟海飞
1 Problem Description
The pod was created with the command kubectl create -f myubuntu_deploy.yaml --record, and the result shows it stuck in CrashLoopBackOff status.
CrashLoopBackOff tells us that Kubernetes keeps trying to start this Pod, but one or more of its containers exits right after starting, so Kubernetes restarts it with an increasing back-off delay.
This is what I keep getting:
[root@centos-master ~]# kubectl get pods
NAME READY STATUS RESTARTS AGE
nfs-server-h6nw8 1/1 Running 0 1h
nfs-web-07rxz 0/1 CrashLoopBackOff 8 16m
nfs-web-fdr9h 0/1 CrashLoopBackOff 8 16m
Below is the output from "kubectl describe pods":
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
16m 16m 1 default-scheduler Normal Scheduled Successfully assigned nfs-web-fdr9h to centos-minion-2
16m 16m 1 kubelet centos-minion-2 spec.containersweb Normal Created Created container with docker id 495fcbb06836
16m 16m 1 kubelet centos-minion-2 spec.containersweb Normal Started Started container with docker id 495fcbb06836
16m 16m 1 kubelet centos-minion-2 spec.containersweb Normal Started Started container with docker id d56f34ae4e8f
16m 16m 1 kubelet centos-minion-2 spec.containersweb Normal Created Created container with docker id d56f34ae4e8f
16m 16m 2 kubelet centos-minion-2 Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "web" with CrashLoopBackOff: "Back-off 10s restarting failed container=web pod=nfs-web-fdr9h_default(461c937d-d870-11e6-98de-005056040cc2)"
I have two pods, nfs-web-07rxz and nfs-web-fdr9h, but "kubectl logs nfs-web-07rxz" (with or without the "-p" option) shows no logs for either pod:
[root@centos-master ~]# kubectl logs nfs-web-07rxz -p
[root@centos-master ~]# kubectl logs nfs-web-07rxz
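When the logs are empty like this, the container state recorded on the pod object is usually the next clue. A hedged sketch of how to pull it out (pod name taken from above; jsonpath output and event sorting are standard kubectl features, though flag support varies by version):

kubectl get pod nfs-web-07rxz -o jsonpath='{.status.containerStatuses[0].lastState.terminated}'   # exit code and reason of the last crash
kubectl describe pod nfs-web-07rxz        # "Last State" and "Events" sections show the restart history
kubectl get events --sort-by=.metadata.creationTimestamp                                          # cluster-wide event stream

The exit code narrows things down quickly: 0 means the process simply ran to completion and ended (the case here), while a non-zero code points at an application error.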
This is my ReplicationController yaml file:
apiVersion: v1
kind: ReplicationController
metadata:
  name: nfs-web
spec:
  replicas: 2
  selector:
    role: web-frontend
  template:
    metadata:
      labels:
        role: web-frontend
    spec:
      containers:
      - name: web
        image: eso-cmbu-docker.artifactory.eng.vmware.com/demo-container:demo-version3.0
        ports:
        - name: web
          containerPort: 80
        securityContext:
          privileged: true
My Docker image was made from this simple docker file:
FROM ubuntu
RUN apt-get update
RUN apt-get install -y nginx
RUN apt-get install -y nfs-common
I am running my Kubernetes cluster on CentOS-1611, kube version:
[root@centos-master ~]# kubectl version
Client Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.0", GitCommit:"86dc49aa137175378ac7fba7751c3d3e7f18e5fc", GitTreeState:"clean", BuildDate:"2016-12-15T16:57:18Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.0", GitCommit:"86dc49aa137175378ac7fba7751c3d3e7f18e5fc", GitTreeState:"clean", BuildDate:"2016-12-15T16:57:18Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
If I run the Docker image with "docker run" it works without any issue; only through Kubernetes do I get the crash.
Can someone help me out, how can I debug without seeing any log?
Even when the entire Dockerfile is just the one line "FROM ubuntu", it still crashes.
2 Solution
You need to have your Dockerfile contain a command to run, or have your ReplicationController specify one.
The pod is crashing because the container starts up and then immediately exits, so Kubernetes restarts it and the cycle continues. (This is also the likely reason "docker run" seemed fine: an interactive run keeps a TTY attached so the image's default shell stays alive, whereas under Kubernetes it exits at once.)
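One way to supply that command on the Kubernetes side, without rebuilding the image, is to set command on the container. A minimal sketch against the ReplicationController above (running nginx in the foreground is an assumption about what this image should do):

spec:
  containers:
  - name: web
    image: eso-cmbu-docker.artifactory.eng.vmware.com/demo-container:demo-version3.0
    # command overrides the image's ENTRYPOINT/CMD; any long-running
    # foreground process keeps the container (and pod) alive
    command: ["nginx", "-g", "daemon off;"]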
Checking the Dockerfile used to build my image showed that the final CMD instruction was indeed the problem.
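The corrected Dockerfile isn't reproduced here, but a minimal working sketch would end with a CMD that keeps a foreground process running (the nginx line is the assumed fix):

FROM ubuntu
RUN apt-get update
RUN apt-get install -y nginx
# Without a long-running foreground process the container exits immediately
# and Kubernetes goes into CrashLoopBackOff
CMD ["nginx", "-g", "daemon off;"]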
After fixing it, rebuild the image:
docker build -t mynginx:1.13.9 .
Then create the pod with the command kubectl create -f nginx_deploy.yaml --record.
The deployment file:
root@master:~/deployment# cat nginx_deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: mynginx:1.13.9
        ports:
        - containerPort: 80
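To confirm the fix took effect, a couple of standard checks (a sketch: the deployment and label names come from the file above, and "rollout status" requires a reasonably recent kubectl):

kubectl rollout status deployment/nginx-deployment   # waits until the rollout completes
kubectl get pods -l app=nginx                        # both replicas should now show Running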
3 Common Operations
The following commands show the current state of the cluster:
kubectl get po                           # list all pods
kubectl get rs                           # list all replica sets
kubectl get deployment                   # list all deployments
kubectl describe po my-nginx             # detailed status of the my-nginx pod
kubectl describe rs my-nginx             # detailed status of the my-nginx replica set
kubectl describe deployment my-nginx     # detailed status of the my-nginx deployment
kubectl delete deployment my-nginx       # delete the my-nginx deployment
References:
1. https://stackoverflow.com/questions/41604499/my-kubernetes-pods-keep-crashing-with-crashloopbackoff-but-i-cant-find-any-lo