"'devices' properties is not allowed" while creating docker-compose with nvidia gpu
Posted: 2021-05-23 16:31:02
Description of the issue
Context information (for bug reports)
Output of docker-compose version
docker-compose version 1.17.1, build unknown
docker-py version: 2.5.1
CPython version: 2.7.17
OpenSSL version: OpenSSL 1.1.1 11 Sep 2018
Output of docker version
Client:
 Version: 19.03.6
 API version: 1.40
 Go version: go1.12.17
 Git commit: 369ce74a3c
 Built: Fri Dec 18 12:21:44 2020
 OS/Arch: linux/amd64
 Experimental: false
Server:
 Engine:
  Version: 19.03.6
  API version: 1.40 (minimum version 1.12)
  Go version: go1.12.17
  Git commit: 369ce74a3c
  Built: Thu Dec 10 13:23:49 2020
  OS/Arch: linux/amd64
  Experimental: false
 containerd:
  Version: 1.3.3-0ubuntu1~18.04.4
  GitCommit:
 runc:
  Version: spec: 1.0.1-dev
  GitCommit:
 docker-init:
  Version: 0.18.0
  GitCommit:
Output of docker-compose config
(Make sure to add the relevant -f and other flags)
ERROR: The Compose file './docker-compose.yml' is invalid because:
services.testserver.deploy.resources.reservations value Additional properties are not allowed ('devices' was unexpected)
Steps to reproduce the issue
1. Create a Dockerfile that pulls a simple nvidia cuda image and runs a command to check for the nvidia GPU
FROM nvidia/cuda:10.2-base
CMD nvidia-smi
2. When we build the image and run it without docker compose, it works like a charm
docker image build testserver/ -t testserverimage
docker run --gpus all -exec -it testserverimage
This shows the nvidia-gpu device:
Sat Feb 20 13:10:46 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00001918:00:00.0 Off |                    0 |
| N/A   52C    P0    71W / 149W |   7897MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
3. Now create docker-compose.yml
version: "3.5"
services:
  testserver:
    image: nvidia/cuda:10.2-base
    build: './modelserver'
    deploy:
      resources:
        reservations:
          devices:
          - capabilities: [gpu]
            driver: nvidia
Observed result
ERROR: The Compose file './docker-compose.yml' is invalid because:
services.testserver.deploy.resources.reservations value Additional properties are not allowed ('devices' was unexpected)
Expected result
Sat Feb 20 13:10:46 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00001918:00:00.0 Off |                    0 |
| N/A   52C    P0    71W / 149W |   7897MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
Stacktrace / full error message
ERROR: The Compose file './docker-compose.yml' is invalid because:
services.testserver.deploy.resources.reservations value Additional properties are not allowed ('devices' was unexpected)
Additional information
OS version / distribution, docker-compose install method, etc.
OS information:
NAME="Ubuntu"
VERSION="18.04.5 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.5 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
Docker compose installation:
sudo apt install docker-compose
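Answer 1:
From the documentation at https://docs.docker.com/compose/gpu-support/#enabling-gpu-access-to-service-containers:
Docker Compose v1.28.0+ allows defining GPU reservations using the device structure defined in the Compose Specification.
Your docker-compose version is 1.17.1, so the schema it validates against does not know the 'devices' key under deploy.resources.reservations. You need to upgrade docker-compose to at least 1.28.0.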
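The version comparison behind this answer can be sketched in shell. This is only an illustration: version_ge and compose_gpu_check are hypothetical helper names, not part of docker-compose, and the sketch assumes a sort that supports -V (version sort), as GNU coreutils does. The version string is passed in as an argument so the check does not depend on docker-compose being installed.

```shell
#!/bin/sh
# Minimum docker-compose version quoted from the documentation above.
REQUIRED="1.28.0"

# version_ge A B: succeeds when version A >= version B.
# Assumption: sort(1) supports -V (version sort), e.g. GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# compose_gpu_check VERSION: hypothetical helper that reports whether
# VERSION is new enough for the 'devices' reservation syntax.
compose_gpu_check() {
    if version_ge "$1" "$REQUIRED"; then
        echo "ok: $1 >= $REQUIRED"
    else
        echo "upgrade needed: $1 < $REQUIRED"
    fi
}

# Version string copied from the question's `docker-compose version` output.
compose_gpu_check "1.17.1"
```

On a real machine the argument could instead come from the installed tool, for example the output of docker-compose version --short, assuming that flag is available in the installed release.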
Comments:
Thank you very much. I did overlook the version, because I assumed that installing from 'apt' would always give me the latest release. It won't happen again, thanks once more.
Ubuntu 18.04 is too old to have docker-compose 1.28 in its repositories.
Yes, @zigarn. Thanks again for your help.