can't execute nvidia runtime on docker

Posted: 2018-10-26 02:29:26

Question:

I am trying to get nvidia-docker running on my CentOS 7 system:

$ cat /etc/systemd/system/docker.service.d/override.conf
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd-current --add-runtime docker-runc=/usr/libexec/docker/docker-runc-current --add-runtime=nvidia=/usr/bin/nvidia-container-runtime --default-runtime=docker-runc --exec-opt native.cgroupdriver=systemd --userland-proxy-path=/usr/libexec/docker/docker-proxy-current --seccomp-profile=/etc/docker/seccomp.json $OPTIONS $DOCKER_STORAGE_OPTIONS $DOCKER_NETWORK_OPTIONS $ADD_REGISTRY $BLOCK_REGISTRY $INSECURE_REGISTRY $REGISTRIES

$ cat /etc/docker/daemon.json



$ docker --version
Docker version 1.13.1, build 774336d/1.13.1

$ sudo systemctl daemon-reload
$ sudo systemctl restart docker

So far so good:

$ sudo docker run hello-world

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (amd64)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/engine/userguide/

Now, let's try the nvidia runtime:

$ sudo docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
standard_init_linux.go:178: exec user process caused "permission denied"

But strangely, the same command works when it is run through sh -c:

$ sudo docker run --runtime=nvidia --rm nvidia/cuda sh -c nvidia-smi
Wed May 16 06:41:17 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.42                 Driver Version: 390.42                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:06:00.0 Off |                    0 |
| N/A   28C    P8    26W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+


Answer 1:

So... in the end I disabled SELinux completely and rebooted, and that fixed it.
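
Before disabling SELinux outright, it may be worth confirming that it is actually the cause. The following is a minimal sketch, not a verified fix for this setup. It assumes the standard SELinux utilities (getenforce, setenforce, ausearch) are installed. The helper name selinux_mode is hypothetical and exists only for illustration. The script changes nothing itself and only prints suggested commands.

```shell
#!/bin/sh
# Sketch: report the SELinux mode and suggest a less drastic test than
# disabling SELinux permanently. selinux_mode is a hypothetical helper name.

selinux_mode() {
    # Print the current SELinux mode (Enforcing, Permissive or Disabled),
    # or "unavailable" when the getenforce tool is not installed.
    if command -v getenforce >/dev/null 2>&1; then
        getenforce
    else
        echo "unavailable"
    fi
}

mode=$(selinux_mode)
echo "SELinux mode: $mode"

if [ "$mode" = "Enforcing" ]; then
    # These commands are only printed; run them by hand if appropriate.
    echo "Switch to permissive until the next reboot:"
    echo "  sudo setenforce 0"
    echo "Retry the failing container:"
    echo "  sudo docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi"
    echo "Look for recent SELinux denials:"
    echo "  sudo ausearch -m avc -ts recent"
fi
```

If the container starts once the mode is Permissive, SELinux was very likely the blocker. The setting in /etc/selinux/config (SELINUX=permissive or SELINUX=disabled) is what persists across reboots, which matches the "disable and reboot" approach described above.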
