Kubernetes - Container image already present on machine

Posted: 2019-05-01 07:27:42

Question:

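I have 2 similar deployments on k8s that pull the same image from GitLab. Apparently this caused my second deployment to go into a CrashLoopBackOff error, and I can't seem to connect to the port to check the /healthz endpoint of my pod. The pod's logs show that it received an interrupt signal, and describing the pod shows the following events: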

  FirstSeen  LastSeen  Count  From                    SubObjectPath                                 Type     Reason                 Message
  ---------  --------  -----  ----                    -------------                                 -------- ------                 -------
  29m        29m       1      default-scheduler                                                     Normal   Scheduled              Successfully assigned java-kafka-rest-kafka-data-2-development-5c6f7f597-5t2mr to 172.18.14.110
  29m        29m       1      kubelet, 172.18.14.110                                                Normal   SuccessfulMountVolume  MountVolume.SetUp succeeded for volume "default-token-m4m55"
  29m        29m       1      kubelet, 172.18.14.110  spec.containers{consul}                       Normal   Pulled                 Container image "..../consul-image:0.0.10" already present on machine
  29m        29m       1      kubelet, 172.18.14.110  spec.containers{consul}                       Normal   Created                Created container
  29m        29m       1      kubelet, 172.18.14.110  spec.containers{consul}                       Normal   Started                Started container
  28m        28m       1      kubelet, 172.18.14.110  spec.containers{java-kafka-rest-development}  Normal   Killing                Killing container with id docker://java-kafka-rest-development:Container failed liveness probe.. Container will be killed and recreated.
  29m        28m       2      kubelet, 172.18.14.110  spec.containers{java-kafka-rest-development}  Normal   Created                Created container
  29m        28m       2      kubelet, 172.18.14.110  spec.containers{java-kafka-rest-development}  Normal   Started                Started container
  29m        27m       10     kubelet, 172.18.14.110  spec.containers{java-kafka-rest-development}  Warning  Unhealthy              Readiness probe failed: Get http://10.5.59.35:7533/healthz: dial tcp 10.5.59.35:7533: getsockopt: connection refused
  28m        24m       13     kubelet, 172.18.14.110  spec.containers{java-kafka-rest-development}  Warning  Unhealthy              Liveness probe failed: Get http://10.5.59.35:7533/healthz: dial tcp 10.5.59.35:7533: getsockopt: connection refused
  29m        19m       8      kubelet, 172.18.14.110  spec.containers{java-kafka-rest-development}  Normal   Pulled                 Container image "r..../java-kafka-rest:0.3.2-dev" already present on machine
  24m        4m        73     kubelet, 172.18.14.110  spec.containers{java-kafka-rest-development}  Warning  BackOff                Back-off restarting failed container

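I tried redeploying the deployment under a different image, and it seems to work just fine. However, I don't think that is an efficient approach, since the image is the same throughout. How should I go about this?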

Here is what my deployment file looks like:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: "java-kafka-rest-kafka-data-2-development"
  labels:
    repository: "java-kafka-rest"
    project: "java-kafka-rest"
    service: "java-kafka-rest-kafka-data-2"
    env: "development"
spec:
  replicas: 1
  selector:
    matchLabels:
      repository: "java-kafka-rest"
      project: "java-kafka-rest"
      service: "java-kafka-rest-kafka-data-2"
      env: "development"
  template:
    metadata:
      labels:
        repository: "java-kafka-rest"
        project: "java-kafka-rest"
        service: "java-kafka-rest-kafka-data-2"
        env: "development"
        release: "0.3.2-dev"
    spec:
      imagePullSecrets:
      - name: ...
      containers:
      - name: java-kafka-rest-development
        image: registry...../java-kafka-rest:0.3.2-dev
        env:
        - name: DEPLOYMENT_COMMIT_HASH
          value: "0.3.2-dev"
        - name: DEPLOYMENT_PORT
          value: "7533"
        livenessProbe:
          httpGet:
            path: /healthz
            port: 7533
          initialDelaySeconds: 30
          timeoutSeconds: 1
        readinessProbe:
          httpGet:
            path: /healthz
            port: 7533
          timeoutSeconds: 1
        ports:
        - containerPort: 7533
        resources:
          requests:
            cpu: 0.5
            memory: 6Gi
          limits:
            cpu: 3
            memory: 10Gi
        command:
          - /envconsul
          - -consul=127.0.0.1:8500
          - -sanitize
          - -upcase
          - -prefix=java-kafka-rest/
          - -prefix=java-kafka-rest/kafka-data-2
          - java
          - -jar
          - /build/libs/java-kafka-rest-0.3.2-dev.jar
        securityContext:
          readOnlyRootFilesystem: true
      - name: consul
        image: registry.../consul-image:0.0.10
        env:
        - name: SERVICE_NAME
          value: java-kafka-rest-kafka-data-2
        - name: SERVICE_ENVIRONMENT
          value: development
        - name: SERVICE_PORT
          value: "7533"
        - name: CONSUL1
          valueFrom:
            configMapKeyRef:
              name: consul-config-...
              key: node1
        - name: CONSUL2
          valueFrom:
            configMapKeyRef:
              name: consul-config-...
              key: node2
        - name: CONSUL3
          valueFrom:
            configMapKeyRef:
              name: consul-config-...
              key: node3
        - name: CONSUL_ENCRYPT
          valueFrom:
            configMapKeyRef:
              name: consul-config-...
              key: encrypt
        ports:
        - containerPort: 8300
        - containerPort: 8301
        - containerPort: 8302
        - containerPort: 8400
        - containerPort: 8500
        - containerPort: 8600
        command: [ entrypoint, agent, -config-dir=/config, -join=$(CONSUL1), -join=$(CONSUL2), -join=$(CONSUL3), -encrypt=$(CONSUL_ENCRYPT) ]
      terminationGracePeriodSeconds: 30
      nodeSelector:
        env: ...

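Comments:

It could be that your readinessProbe is killing your containers. Is this a Kafka broker image or...?

@Urosh T. Yes, that is what I assumed as well. It is indeed a Kafka image, used for producing Kafka messages. But I'm confused about what causes the readinessProbe to trigger this way; to my understanding, the image pulled from GitLab should be placed on the k8s pod independently of the images pulled by other pods.

Yes, but the readinessProbe is defined in your k8s deployment file, so you might need to increase the values (if Kafka needs a lot of time to start up), or even remove the probes to see whether that is what is killing your pod.

Actually, as far as I know, Kafka doesn't even have any health check endpoints. Have you implemented any custom health checks or...?

@UroshT. I have indeed implemented custom health checks; I've pasted them on pastebin and added my deployment file for clarity. However, even if the readinessProbe is indeed the cause, why does it affect my deployments when they pull the same image, but not when only a single deployment pulls it?

Answer 1: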

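For those running into this, I found the problem and solved my issue. Apparently the problem was in my service.yml: my targetPort pointed to a port different from the one I opened in my Docker image. Make sure the Service's targetPort matches the port that the Docker image actually opens.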
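As an illustration of that fix, here is a minimal Service sketch. The original service.yml was never posted, so the Service name, the port name, `port: 80`, and the choice of selector label are all assumptions made for this example; only the `service` label value and port 7533 are taken from the deployment file in the question.

```yaml
apiVersion: v1
kind: Service
metadata:
  # Hypothetical name; the original service.yml is not shown in the thread.
  name: java-kafka-rest-kafka-data-2
spec:
  selector:
    # Assumed selector: matches the "service" label on the pod template above.
    service: "java-kafka-rest-kafka-data-2"
  ports:
  - name: http          # hypothetical port name
    protocol: TCP
    port: 80            # hypothetical port exposed by the Service
    # Must equal the port the container listens on
    # (containerPort / DEPLOYMENT_PORT 7533 in the deployment file).
    targetPort: 7533
```

If the two deployments share one image but listen on different ports, each needs its own Service whose targetPort matches that deployment's container port.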

Hope this helps.