Cannot deploy mongodb StatefulSet with volumes for replicas greater than one

Posted: 2019-12-22 14:25:00

Context

I am sharing the /data/db directory, which is mounted as a network file system (NFS) volume in all pods controlled by the StatefulSet.

Problem

When I set replicas: 1, the StatefulSet deploys mongodb correctly. The problem starts when I scale up (to a replica count greater than one, e.g. replicas: 2): all subsequent pods end up in CrashLoopBackOff status.
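For reference, the scale-up can be done either by changing replicas in the manifest and re-applying it, or imperatively. The exact command is not shown in the post, so the following is only an assumption:

$ kubectl scale statefulset mongo --replicas=2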

Question

I understand the error message - see the Debugging section below. What I don't understand is why it happens. Basically, what I am trying to achieve is a stateful deployment of mongodb, so that even after the pods are deleted they keep their data. Somehow, mongo prevents me from doing that, because Another mongod instance is already running on the /data/db directory. My questions are: What am I doing wrong? How can I deploy mongodb so that it is stateful and persists data, while still being able to scale the StatefulSet up?

Debugging

Cluster state

$ kubectl get svc,sts,po,pv,pvc --output=wide
NAME            TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)     AGE   SELECTOR
service/mongo   ClusterIP   None         <none>        27017/TCP   10h   run=mongo

NAME                     READY   AGE     CONTAINERS   IMAGES
statefulset.apps/mongo   1/2     8m50s   mongo        mongo:4.2.0-bionic

NAME          READY   STATUS             RESTARTS   AGE     IP          NODE        NOMINATED NODE   READINESS GATES
pod/mongo-0   1/1     Running            0          8m50s   10.44.0.2   web01       <none>           <none>
pod/mongo-1   0/1     CrashLoopBackOff   6          8m48s   10.36.0.3   compute01   <none>           <none>

NAME                                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                     STORAGECLASS   REASON   AGE   VOLUMEMODE
persistentvolume/phenex-nfs-mongo   1Gi        RWX            Retain           Bound    phenex-nfs-mongo                           22m   Filesystem

NAME                                     STATUS   VOLUME             CAPACITY   ACCESS MODES   STORAGECLASS   AGE   VOLUMEMODE
persistentvolumeclaim/phenex-nfs-mongo   Bound    phenex-nfs-mongo   1Gi        RWX                           22m   Filesystem

Logs

$ kubectl logs -f mongo-1
2019-08-14T23:52:30.632+0000 I  CONTROL  [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'
2019-08-14T23:52:30.635+0000 I  CONTROL  [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/data/db 64-bit host=mongo-1
2019-08-14T23:52:30.635+0000 I  CONTROL  [initandlisten] db version v4.2.0
2019-08-14T23:52:30.635+0000 I  CONTROL  [initandlisten] git version: a4b751dcf51dd249c5865812b390cfd1c0129c30
2019-08-14T23:52:30.635+0000 I  CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.1.1  11 Sep 2018
2019-08-14T23:52:30.635+0000 I  CONTROL  [initandlisten] allocator: tcmalloc
2019-08-14T23:52:30.635+0000 I  CONTROL  [initandlisten] modules: none
2019-08-14T23:52:30.635+0000 I  CONTROL  [initandlisten] build environment:
2019-08-14T23:52:30.635+0000 I  CONTROL  [initandlisten]     distmod: ubuntu1804
2019-08-14T23:52:30.635+0000 I  CONTROL  [initandlisten]     distarch: x86_64
2019-08-14T23:52:30.635+0000 I  CONTROL  [initandlisten]     target_arch: x86_64
2019-08-14T23:52:30.635+0000 I  CONTROL  [initandlisten] options:  net:  bindIp: "0.0.0.0" , replication:  replSet: "rs0"  
2019-08-14T23:52:30.642+0000 I  STORAGE  [initandlisten] exception in initAndListen: DBPathInUse: Unable to lock the lock file: /data/db/mongod.lock (Resource temporarily unavailable). Another mongod instance is already running on the /data/db directory, terminating
2019-08-14T23:52:30.643+0000 I  NETWORK  [initandlisten] shutdown: going to close listening sockets...
2019-08-14T23:52:30.643+0000 I  NETWORK  [initandlisten] removing socket file: /tmp/mongodb-27017.sock
2019-08-14T23:52:30.643+0000 I  -        [initandlisten] Stopping further Flow Control ticket acquisitions.
2019-08-14T23:52:30.643+0000 I  CONTROL  [initandlisten] now exiting
2019-08-14T23:52:30.643+0000 I  CONTROL  [initandlisten] shutting down with code:100

Error

Unable to lock the lock file: /data/db/mongod.lock (Resource temporarily unavailable). 
Another mongod instance is already running on the /data/db directory, terminating
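The lock is held by the mongod already running in mongo-0. Because every replica mounts the same NFS export, this can be checked directly on the share; a minimal sketch, assuming shell access to the NFS server and using the export path from the PersistentVolume below:

$ ls /nfs/data/phenex/production/permastore/mongo/
# Expectation: a mongod.lock created by mongo-0 (alongside the rest of its dbpath),
# which is exactly what keeps mongo-1 from starting.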

YAML files

# StatefulSet
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: mongo
  replicas: 2
  selector:
    matchLabels:
      run: mongo
      tier: backend
  template:
    metadata:
      labels:
        run: mongo
        tier: backend
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - name: mongo
          image: mongo:4.2.0-bionic
          command:
            - mongod
          args:
            - "--replSet=rs0"
            - "--bind_ip=0.0.0.0"
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: phenex-nfs-mongo
              mountPath: /data/db
      volumes:
      - name: phenex-nfs-mongo
        persistentVolumeClaim:
          claimName: phenex-nfs-mongo

# PersistentVolume
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: phenex-nfs-mongo
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 1Gi
  nfs:
    server: master
    path: /nfs/data/phenex/production/permastore/mongo
  claimRef:
    name: phenex-nfs-mongo
  persistentVolumeReclaimPolicy: Retain

# PersistentVolumeClaim
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: phenex-nfs-mongo
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Mi


Answer 1:

Problem:

You are deploying more than one pod using the same PVC and PV.

Solution:

Use volumeClaimTemplates; see the example below.

Example:

# StatefulSet
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: mongo
  replicas: 2
  selector:
    matchLabels:
      run: mongo
      tier: backend
  template:
    metadata:
      labels:
        run: mongo
        tier: backend
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - name: mongo
          image: mongo:4.2.0-bionic
          command:
            - mongod
          args:
            - "--replSet=rs0"
            - "--bind_ip=0.0.0.0"
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: phenex-nfs-mongo
              mountPath: /data/db
  volumeClaimTemplates:
  - metadata:
      name: phenex-nfs-mongo
    spec:
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 100Mi
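
Note: with volumeClaimTemplates the StatefulSet controller creates one PVC per pod, named <template-name>-<pod-name> (here phenex-nfs-mongo-mongo-0 and phenex-nfs-mongo-mongo-1). If no StorageClass with a dynamic provisioner is available, a matching PV must exist for each claim. A minimal sketch, reusing the NFS server from the question but with per-replica export paths, which are assumptions:

# Hypothetical pre-created PVs, one per replica; server and paths are assumed.
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: phenex-nfs-mongo-0
spec:
  accessModes:
    - ReadWriteMany
  capacity:
    storage: 1Gi
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: master
    path: /nfs/data/phenex/production/permastore/mongo-0
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: phenex-nfs-mongo-1
spec:
  accessModes:
    - ReadWriteMany
  capacity:
    storage: 1Gi
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: master
    path: /nfs/data/phenex/production/permastore/mongo-1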

Discussion:

- "You are deploying more than one pod using the same pvc and pv." Don't get me wrong, but that is exactly what I am trying to achieve. I want to share the db directory between all pods, so all pods in the StatefulSet will have the same content - that way my pods will be stateful. Why is this a problem? It is the same use case as sharing index.html between all pods of an nginx deployment. Much obliged for an explanation :)
- This cannot be achieved with MongoDB, because a single PV mounted by multiple pods would have to be read-only, and MongoDB needs write access to the storage.
- Correct me if I am wrong, but I specified write access for my nodes by setting the accessModes option to ReadWriteMany on the PV and PVC. So in theory I should be able to read from and write to the shared volume.
- In theory yes, but MongoDB creates some files in the /data/db directory and is able to detect that another instance is already using the same storage.
- Could you point me to a resource on how to properly deploy mongo as a StatefulSet with persistent data? Thanks!
