服务器必须由拥有数据目录的用户启动
Posted
技术标签:
【中文标题】服务器必须由拥有数据目录的用户启动【英文标题】:The server must be started by the user that owns the data directory 【发布时间】:2019-07-25 07:41:58 【问题描述】:我正在尝试为在 Kubernetes 上运行的 PostgreSQL docker 实例获取一些持久存储。但是,吊舱失败了
FATAL: data directory "/var/lib/postgresql/data" has wrong ownership
HINT: The server must be started by the user that owns the data directory.
这是 NFS 配置:
% exportfs -v
/srv/nfs/postgresql/postgres-registry
kubehost*.example.com(rw,wdelay,insecure,no_root_squash,no_subtree_check,sec=sys,rw,no_root_squash,no_all_squash)
$ ls -ldn /srv/nfs/postgresql/postgres-registry
drwxrwxrwx. 3 999 999 4096 Jul 24 15:02 /srv/nfs/postgresql/postgres-registry
$ ls -ln /srv/nfs/postgresql/postgres-registry
total 4
drwx------. 2 999 999 4096 Jul 25 08:36 pgdata
来自 pod 的完整日志:
2019-07-25T07:32:50.617532000Z The files belonging to this database system will be owned by user "postgres".
2019-07-25T07:32:50.618113000Z This user must also own the server process.
2019-07-25T07:32:50.619048000Z The database cluster will be initialized with locale "en_US.utf8".
2019-07-25T07:32:50.619496000Z The default database encoding has accordingly been set to "UTF8".
2019-07-25T07:32:50.619943000Z The default text search configuration will be set to "english".
2019-07-25T07:32:50.620826000Z Data page checksums are disabled.
2019-07-25T07:32:50.621697000Z fixing permissions on existing directory /var/lib/postgresql/data ... ok
2019-07-25T07:32:50.647445000Z creating subdirectories ... ok
2019-07-25T07:32:50.765065000Z selecting default max_connections ... 20
2019-07-25T07:32:51.035710000Z selecting default shared_buffers ... 400kB
2019-07-25T07:32:51.062039000Z selecting default timezone ... Etc/UTC
2019-07-25T07:32:51.062828000Z selecting dynamic shared memory implementation ... posix
2019-07-25T07:32:51.218995000Z creating configuration files ... ok
2019-07-25T07:32:51.252788000Z 2019-07-25 07:32:51.251 UTC [79] FATAL: data directory "/var/lib/postgresql/data" has wrong ownership
2019-07-25T07:32:51.253339000Z 2019-07-25 07:32:51.251 UTC [79] HINT: The server must be started by the user that owns the data directory.
2019-07-25T07:32:51.262238000Z child process exited with exit code 1
2019-07-25T07:32:51.263194000Z initdb: removing contents of data directory "/var/lib/postgresql/data"
2019-07-25T07:32:51.380205000Z running bootstrap script ...
部署有以下内容:
securityContext:
runAsUser: 999
supplementalGroups: [999,1000]
fsGroup: 999
我做错了什么?
编辑:添加 storage.yaml 文件:
kind: PersistentVolume
apiVersion: v1
metadata:
name: postgres-registry-pv-volume
spec:
capacity:
storage: 5Gi
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
nfs:
server: 192.168.3.7
path: /srv/nfs/postgresql/postgres-registry
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: postgres-registry-pv-claim
labels:
app: postgres-registry
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 5Gi
编辑:以及完整部署:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: postgres-registry
spec:
replicas: 1
template:
metadata:
labels:
app: postgres-registry
spec:
securityContext:
runAsUser: 999
supplementalGroups: [999,1000]
fsGroup: 999
containers:
- name: postgres-registry
image: postgres:latest
imagePullPolicy: "IfNotPresent"
ports:
- containerPort: 5432
env:
- name: POSTGRES_DB
value: postgresdb
- name: POSTGRES_USER
value: postgres
- name: POSTGRES_PASSWORD
value: Sekret
volumeMounts:
- mountPath: /var/lib/postgresql/data
subPath: "pgdata"
name: postgredb-registry-persistent-storage
volumes:
- name: postgredb-registry-persistent-storage
persistentVolumeClaim:
claimName: postgres-registry-pv-claim
更多调试添加:
command: ["/bin/bash", "-c"]
args:["id -u; ls -ldn /var/lib/postgresql/data"]
返回:
999
drwx------. 2 99 99 4096 Jul 25 09:11 /var/lib/postgresql/data
显然,UID/GID 是错误的。为什么?
即使采用 Jakub Bujny 建议的解决方法,我还是明白了:
2019-07-25T09:32:08.734807000Z The files belonging to this database system will be owned by user "postgres".
2019-07-25T09:32:08.735335000Z This user must also own the server process.
2019-07-25T09:32:08.736976000Z The database cluster will be initialized with locale "en_US.utf8".
2019-07-25T09:32:08.737416000Z The default database encoding has accordingly been set to "UTF8".
2019-07-25T09:32:08.737882000Z The default text search configuration will be set to "english".
2019-07-25T09:32:08.738754000Z Data page checksums are disabled.
2019-07-25T09:32:08.739648000Z fixing permissions on existing directory /var/lib/postgresql/data ... ok
2019-07-25T09:32:08.766606000Z creating subdirectories ... ok
2019-07-25T09:32:08.852381000Z selecting default max_connections ... 20
2019-07-25T09:32:09.119031000Z selecting default shared_buffers ... 400kB
2019-07-25T09:32:09.145069000Z selecting default timezone ... Etc/UTC
2019-07-25T09:32:09.145730000Z selecting dynamic shared memory implementation ... posix
2019-07-25T09:32:09.168161000Z creating configuration files ... ok
2019-07-25T09:32:09.200134000Z 2019-07-25 09:32:09.199 UTC [70] FATAL: data directory "/var/lib/postgresql/data" has wrong ownership
2019-07-25T09:32:09.200715000Z 2019-07-25 09:32:09.199 UTC [70] HINT: The server must be started by the user that owns the data directory.
2019-07-25T09:32:09.208849000Z child process exited with exit code 1
2019-07-25T09:32:09.209316000Z initdb: removing contents of data directory "/var/lib/postgresql/data"
2019-07-25T09:32:09.274741000Z running bootstrap script ... 999
2019-07-25T09:32:09.278124000Z drwx------. 2 99 99 4096 Jul 25 09:32 /var/lib/postgresql/data
【问题讨论】:
能否提供您的 StatefulSet .yml 文件? 如何在 Kubernetes 上进行部署比?你是通过 PersistentVolume 挂载你的 NFS 吗? 很抱歉,您的帖子中仍然缺少 pod 的定义 @JakubBujny 这行得通吗? 能否将以下调试命令添加到您的 pod 定义中?command: ["/bin/bash", "-c"]
args:["id -u; ls -ln /var/lib/postgresql/data"]
【参考方案1】:
使用您的设置并确保 nfs 挂载由 999:999 拥有,它工作得很好。
您的name: postgredb-registry-persistent-storage
中也缺少一个“s”
您的subPath: "pgdata"
是否需要更改$PGDATA?我没有为此添加子路径。
$ sudo mount 172.29.0.218:/test/nfs ./nfs
$ sudo su -c "ls -al ./nfs" postgres
total 8
drwx------ 2 postgres postgres 4096 Jul 25 14:44 .
drwxrwxr-x 3 rei rei 4096 Jul 25 14:44 ..
$ kubectl apply -f nfspv.yaml
persistentvolume/postgres-registry-pv-volume created
persistentvolumeclaim/postgres-registry-pv-claim created
$ kubectl apply -f postgres.yaml
deployment.extensions/postgres-registry created
$ sudo su -c "ls -al ./nfs" postgres
total 124
drwx------ 19 postgres postgres 4096 Jul 25 14:46 .
drwxrwxr-x 3 rei rei 4096 Jul 25 14:44 ..
drwx------ 3 postgres postgres 4096 Jul 25 14:46 base
drwx------ 2 postgres postgres 4096 Jul 25 14:46 global
drwx------ 2 postgres postgres 4096 Jul 25 14:46 pg_commit_ts
. . .
我注意到直接在持久化卷中使用nfs:
初始化数据库需要更长的时间,而对挂载的 nfs 卷使用hostPath:
表现正常。
所以几分钟后:
$ kubectl logs postgres-registry-675869694-9fp52 | tail -n 3
2019-07-25 21:50:57.181 UTC [30] LOG: database system is ready to accept connections
done
server started
$ kubectl exec -it postgres-registry-675869694-9fp52 psql
psql (11.4 (Debian 11.4-1.pgdg90+1))
Type "help" for help.
postgres=#
检查 uid/gid
$ kubectl exec -it postgres-registry-675869694-9fp52 bash
postgres@postgres-registry-675869694-9fp52:/$ whoami && id -u && id -g
postgres
999
999
nfspv.yaml
:
kind: PersistentVolume
apiVersion: v1
metadata:
name: postgres-registry-pv-volume
spec:
capacity:
storage: 5Gi
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
nfs:
server: 172.29.0.218
path: /test/nfs
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: postgres-registry-pv-claim
labels:
app: postgres-registry
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 5Gi
postgres.yaml
:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: postgres-registry
spec:
replicas: 1
template:
metadata:
labels:
app: postgres-registry
spec:
securityContext:
runAsUser: 999
supplementalGroups: [999,1000]
fsGroup: 999
containers:
- name: postgres-registry
image: postgres:latest
imagePullPolicy: "IfNotPresent"
ports:
- containerPort: 5432
env:
- name: POSTGRES_DB
value: postgresdb
- name: POSTGRES_USER
value: postgres
- name: POSTGRES_PASSWORD
value: Sekret
volumeMounts:
- mountPath: /var/lib/postgresql/data
name: postgresdb-registry-persistent-storage
volumes:
- name: postgresdb-registry-persistent-storage
persistentVolumeClaim:
claimName: postgres-registry-pv-claim
【讨论】:
谢谢。似乎我的 kubernetes 主机出于某种奇怪的未知原因将所有 NFS 挂载映射为 99:99 (nobody:nogroup)……【参考方案2】:我无法解释为什么这两个 ID 不同,但作为解决方法,我会尝试用
覆盖 postgres 的入口点command: ["/bin/bash", "-c"]
args: ["chown -R 999:999 /var/lib/postgresql/data && ./docker-entrypoint.sh postgres"]
【讨论】:
我添加了更多调试。解决方法不起作用。 ☹ 如何更改安全上下文以作为用户 UID 99 运行?initdb: could not access directory "/var/lib/postgresql/data": Permission denied
...这令人沮丧。
UID 99 的用户在您的 kubernetes 节点上有效吗?我的意思是在主机上使用 postgres 的 docker 容器
您可以通过运行 sudo cat /etc/passwd | 来检查它。 grep 99【参考方案3】:
当您将 NTFS 目录链接到 docker 容器时,这种类型的错误很常见。 NTFS 目录不支持 ext3 文件和目录访问控制。使其工作的唯一方法是将目录从 ext3 驱动器链接到您的容器中。
当我使用链接 www 文件夹的 Apache / php 容器时,我有点绝望。在我将文件链接到 ext3 文件系统后,问题就消失了。
我在 youtube 上发布了一个简短的 Docker 教程,可能有助于理解这个问题:https://www.youtube.com/watch?v=eS9O05TTFjM
【讨论】:
以上是关于服务器必须由拥有数据目录的用户启动的主要内容,如果未能解决你的问题,请参考以下文章
linux在做NIS服务器时启动 chkconfig time on时显示在 time 服务中读取信息时出错:没有那个文件或目录