Deployed pod stuck in the CrashLoopBackOff state

Posted by 翟海飞


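1 Problem description

I created the pods with the command kubectl create -f myubuntu_deploy.yaml --record, and they ended up in the CrashLoopBackOff state.

CrashLoopBackOff means that Kubernetes keeps trying to start the pod, but one or more of its containers keeps exiting shortly after starting, so the kubelet waits for a growing back-off interval before each new restart attempt.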
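
To get a feel for how the back-off grows, here is a rough sketch of the restart delay schedule. It assumes the behaviour described in the Kubernetes pod lifecycle documentation (a 10 second initial delay that doubles on each failure and is capped at five minutes); the function name backoff_delays is made up for this illustration, and the exact values may differ between Kubernetes versions.

```shell
#!/bin/sh
# Sketch only: approximate delay (in seconds) before each restart attempt,
# ASSUMING a 10s initial delay that doubles up to a 300s cap.
# backoff_delays is a hypothetical helper, not part of kubectl.
backoff_delays() {
  attempts=$1
  delay=10
  cap=300
  out=""
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Append the current delay, separated by a single space.
    out="${out:+$out }$delay"
    delay=$((delay * 2))
    if [ "$delay" -gt "$cap" ]; then
      delay=$cap
    fi
    i=$((i + 1))
  done
  printf '%s\n' "$out"
}

# Delays for the first 8 restarts.
backoff_delays 8
```

This is consistent with the "Back-off 10s restarting failed container" message in the events shown in the question: the first retry waits about 10 seconds, and later retries wait longer.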

This is what I keep getting:

[root@centos-master ~]# kubectl get pods
NAME               READY     STATUS             RESTARTS   AGE
nfs-server-h6nw8   1/1       Running            0          1h
nfs-web-07rxz      0/1       CrashLoopBackOff   8          16m
nfs-web-fdr9h      0/1       CrashLoopBackOff   8          16m

Below is the output of kubectl describe pods:

Events:
  FirstSeen LastSeen    Count   From                SubobjectPath       Type        Reason      Message
  --------- --------    -----   ----                -------------       --------    ------      -------
  16m       16m     1   default-scheduler                     Normal      Scheduled   Successfully assigned nfs-web-fdr9h to centos-minion-2
  16m       16m     1   kubelet centos-minion-2   spec.containers{web}    Normal      Created     Created container with docker id 495fcbb06836
  16m       16m     1   kubelet centos-minion-2   spec.containers{web}    Normal      Started     Started container with docker id 495fcbb06836
  16m       16m     1   kubelet centos-minion-2   spec.containers{web}    Normal      Started     Started container with docker id d56f34ae4e8f
  16m       16m     1   kubelet centos-minion-2   spec.containers{web}    Normal      Created     Created container with docker id d56f34ae4e8f
  16m       16m     2   kubelet centos-minion-2               Warning     FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "web" with CrashLoopBackOff: "Back-off 10s restarting failed container=web pod=nfs-web-fdr9h_default(461c937d-d870-11e6-98de-005056040cc2)"

I have two pods, nfs-web-07rxz and nfs-web-fdr9h, but "kubectl logs nfs-web-07rxz" prints nothing, with or without the "-p" option, and the same is true for the other pod.

[root@centos-master ~]# kubectl logs nfs-web-07rxz -p
[root@centos-master ~]# kubectl logs nfs-web-07rxz

This is my ReplicationController YAML file:

apiVersion: v1
kind: ReplicationController
metadata:
  name: nfs-web
spec:
  replicas: 2
  selector:
    role: web-frontend
  template:
    metadata:
      labels:
        role: web-frontend
    spec:
      containers:
      - name: web
        image: eso-cmbu-docker.artifactory.eng.vmware.com/demo-container:demo-version3.0
        ports:
          - name: web
            containerPort: 80
        securityContext:
          privileged: true

My Docker image was built from this simple Dockerfile:

FROM ubuntu
RUN apt-get update
RUN apt-get install -y nginx
RUN apt-get install -y nfs-common

I am running my Kubernetes cluster on CentOS-1611. Kubernetes version:

[root@centos-master ~]# kubectl version
Client Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.0", GitCommit:"86dc49aa137175378ac7fba7751c3d3e7f18e5fc", GitTreeState:"clean", BuildDate:"2016-12-15T16:57:18Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.0", GitCommit:"86dc49aa137175378ac7fba7751c3d3e7f18e5fc", GitTreeState:"clean", BuildDate:"2016-12-15T16:57:18Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}

If I run the image with "docker run", it runs without any issue; it only crashes under Kubernetes.

Can someone help me out? How can I debug this without any logs?

Even when the entire Dockerfile is just the single line "FROM ubuntu", the pod still crashes.

2 Solution

Your Dockerfile needs to specify a command to run (CMD or ENTRYPOINT), or your ReplicationController needs to specify a command.

The pod is crashing because its container starts up and then immediately exits, so Kubernetes restarts it and the cycle continues.
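
When kubectl logs shows nothing, one way to confirm this "starts, then exits" pattern is to look at how the previous container run ended. The sketch below wraps two kubectl get calls that read the last termination state through a jsonpath expression. The helper names last_exit_code and last_exit_reason are made up for this post, and the field paths assume the pod has a single container and a reasonably recent kubectl; treat it as a starting point, not a definitive recipe.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers around kubectl, assuming the pod has
# one container (index 0) and has already been restarted at least once.

# Print the exit code of the previous run of the pod's first container.
# Exit code 0 together with empty logs suggests the main process simply
# finished instead of crashing.
last_exit_code() {
  kubectl get pod "$1" \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'
}

# Print the reason recorded for the previous termination
# (for example Completed or Error).
last_exit_reason() {
  kubectl get pod "$1" \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
}

# Usage against a live cluster (pod name taken from the question):
#   last_exit_code nfs-web-07rxz
#   last_exit_reason nfs-web-07rxz
```

The same information also appears in the "Last State" part of the kubectl describe pod output, so the helpers are mainly useful for scripting.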

I checked the Dockerfile I had used to build my image: the CMD instruction at the end of the Dockerfile was wrong.
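
My actual Dockerfile is not reproduced here, so the following is only an illustration of what a working CMD can look like for an nginx image. It writes a small Dockerfile into a temporary directory; the helper name write_dockerfile, the base image, and the package list are assumptions for the example. The important part is that nginx runs in the foreground ("daemon off;"), so the container's main process does not exit right after starting.

```shell
#!/bin/sh
# Sketch only: write an example Dockerfile whose CMD keeps nginx in the
# foreground. write_dockerfile is a hypothetical helper, and the contents
# are an assumption, not the Dockerfile used for mynginx.
write_dockerfile() {
  dir=$1
  cat > "$dir/Dockerfile" <<'EOF'
FROM ubuntu
RUN apt-get update && apt-get install -y nginx
EXPOSE 80
# Keep nginx in the foreground so the container's main process stays alive.
CMD ["nginx", "-g", "daemon off;"]
EOF
}

workdir=$(mktemp -d)
write_dockerfile "$workdir"
cat "$workdir/Dockerfile"
```

Without "daemon off;", nginx forks into the background and the foreground process exits, which from Kubernetes' point of view looks the same as a container with no long-running command at all.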

After fixing it, rebuild the image:

docker build -t mynginx:1.13.9 .

Then run kubectl create -f nginx_deploy.yaml --record to create the pods.

The Deployment file:

root@master:~/deployment# cat nginx_deploy.yaml 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: mynginx:1.13.9
        ports:
        - containerPort: 80
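
After creating the Deployment, it helps to check that the rollout actually completes instead of falling back into CrashLoopBackOff. The sketch below combines kubectl rollout status with a label-filtered kubectl get pods. The helper name wait_for_rollout is made up for this post; the Deployment name and label come from the file above.

```shell
#!/bin/sh
# Sketch only: wait_for_rollout is a hypothetical helper.
# $1 = Deployment name, $2 = value of the "app" label on its pods.
wait_for_rollout() {
  # Blocks until the rollout finishes (or fails), then lists the pods.
  kubectl rollout status "deployment/$1" && kubectl get pods -l "app=$2"
}

# Usage against a live cluster:
#   wait_for_rollout nginx-deployment nginx
```

If the pods are healthy, the STATUS column should show Running and the RESTARTS count should stop increasing.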

3 Common operations

The following commands show the current state of the cluster:

On the master node:

    kubectl get po                          # list all pods
    kubectl get rs                          # list all replica sets
    kubectl get deployment                  # list all deployments
    kubectl describe po my-nginx            # show the detailed status of the my-nginx pod
    kubectl describe rs my-nginx            # show the detailed status of the my-nginx replica set
    kubectl describe deployment my-nginx    # show the detailed status of the my-nginx deployment
    kubectl get events                      # show related events
    kubectl delete deployment my-nginx      # delete the my-nginx deployment

References:

1 https://stackoverflow.com/questions/41604499/my-kubernetes-pods-keep-crashing-with-crashloopbackoff-but-i-cant-find-any-lo

