Docker: Namespace Isolation

Posted DevOperater


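1. Introduction to Docker

Docker builds on Linux kernel technologies such as cgroups, namespaces, and union file systems to encapsulate and isolate processes. It is a form of operating-system-level virtualization. Because an isolated process is independent of the host and of the other isolated processes, it is called a container.

Advantages:

A container is just a process in the operating system, so it is lightweight, and all containers share one OS kernel. With traditional KVM virtualization, every virtual machine runs its own operating system; without any tuning, the virtual machine itself consumes 100-200 MB of memory.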

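Disadvantages:

1. Because containers are just processes that share one OS kernel, a container with special kernel requirements needs a dedicated node, and containers of that type must be restricted to run only on that node.

2. In the Linux kernel the wall-clock time cannot be namespaced (the time namespace added in Linux 5.6 only virtualizes the monotonic and boot-time clocks). If a program in a container changes the time with the settimeofday(2) system call, the time of the entire host changes with it.

3. Namespaces do isolate containers from one another, but the isolation is not complete. Traditional KVM virtualization does not have this problem.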

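2. A container is really a process

2.1 Environment setup

Before going further we need an Ubuntu virtual machine. I work on a Mac, so for convenience I use vagrant to create it.

2.1.1 Install VirtualBox

After installing it, configure the "Host Network Manager".

[Screenshot: VirtualBox Host Network Manager]

2.1.2 Create the virtual machine with vagrant

1. Install vagrant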

brew install vagrant

2. mkdir ebpf && cd ebpf

(I picked up vagrant from an instructor teaching eBPF, hence the directory name.)

3. Create and start an Ubuntu 21.10 virtual machine

vagrant init ubuntu/impish64

Other Ubuntu releases are listed at https://app.vagrantup.com/ubuntu

A `Vagrantfile` has been placed in this directory. You are now
ready to `vagrant up` your first virtual environment! Please read
the comments in the Vagrantfile as well as documentation on
`vagrantup.com` for more information on using Vagrant.

4. Run vagrant up to start the virtual machine
Bringing machine default up with virtualbox provider...
==> default: Box ubuntu/impish64 could not be found. Attempting to find and install...
default: Box Provider: virtualbox
default: Box Version: >= 0
==> default: Loading metadata for box ubuntu/impish64
default: URL: https://vagrantcloud.com/ubuntu/impish64
==> default: Adding box ubuntu/impish64 (v20220121.0.0) for provider: virtualbox
default: Downloading: https://vagrantcloud.com/ubuntu/boxes/impish64/versions/20220121.0.0/providers/virtualbox.box
Download redirected to host: cloud-images.ubuntu.com
==> default: Successfully added box ubuntu/impish64 (v20220121.0.0) for virtualbox!
==> default: Importing base box ubuntu/impish64...
==> default: Matching MAC address for NAT networking...
==> default: Checking if box ubuntu/impish64 version 20220121.0.0 is up to date...
==> default: Setting the name of the VM: ebpf_default_1642853654001_35017
Vagrant is currently configured to create VirtualBox synced folders with
the `SharedFoldersEnableSymlinksCreate` option enabled. If the Vagrant
guest is not trusted, you may want to disable this option. For more
information on this option, please refer to the VirtualBox manual:

https://www.virtualbox.org/manual/ch04.html#sharedfolders

This option can be disabled globally with an environment variable:

VAGRANT_DISABLE_VBOXSYMLINKCREATE=1

or on a per folder basis within the Vagrantfile:

config.vm.synced_folder '/host/path', '/guest/path', SharedFoldersEnableSymlinksCreate: false
==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
default: Adapter 1: nat
==> default: Forwarding ports...
default: 22 (guest) => 2222 (host) (adapter 1)
==> default: Running pre-boot VM customizations...
==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
default: SSH address: 127.0.0.1:2222
default: SSH username: vagrant
default: SSH auth method: private key
default: Warning: Connection reset. Retrying...
default: Warning: Remote connection disconnect. Retrying...
default:
default: Vagrant insecure key detected. Vagrant will automatically replace
default: this with a newly generated keypair for better security.
default:
default: Inserting generated public key within guest...
default: Removing insecure key from the guest if its present...
default: Key inserted! Disconnecting and reconnecting using new SSH key...
==> default: Machine booted and ready!
==> default: Checking for guest additions in VM...
default: The guest additions on this VM do not match the installed version of
default: VirtualBox! In most cases this is fine, but in rare cases it can
default: prevent things such as shared folders from working properly. If you see
default: shared folder errors, please make sure the guest additions within the
default: virtual machine match the version of VirtualBox you have installed on
default: your host and reload your VM.
default:
default: Guest Additions Version: 6.0.0 r127566
default: VirtualBox Version: 6.1
==> default: Mounting shared folders...
default: /vagrant => /Users/dz0400819/Desktop/ebpf
dz0400819@MacBook-Pro  ~/Desktop/ebpf  vagrant ssh
Welcome to Ubuntu 21.10 (GNU/Linux 5.13.0-27-generic x86_64)

* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage

System information as of Sat Jan 22 12:17:33 UTC 2022

System load: 0.09 Processes: 110
Usage of /: 3.2% of 38.71GB Users logged in: 0
Memory usage: 17% IPv4 address for enp0s3: 10.0.2.15
Swap usage: 0%


0 updates can be applied immediately.

The virtual machine now shows up in VirtualBox.

[Screenshot: the virtual machine listed in VirtualBox]

5. Run vagrant ssh to log in to the virtual machine
Welcome to Ubuntu 21.10 (GNU/Linux 5.13.0-27-generic x86_64)

* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage

System information as of Sat Jan 22 13:52:50 UTC 2022

System load: 0.1 Processes: 109
Usage of /: 6.1% of 39.86GB Users logged in: 0
Memory usage: 16% IPv4 address for enp0s3: 10.0.2.15
Swap usage: 0%


0 updates can be applied immediately.


Last login: Sat Jan 22 12:18:17 2022 from 10.0.2.2

2.1.3 Install and start Docker inside the virtual machine

1. Install Docker

vagrant@ubuntu-impish:~$ curl -fsSL https://get.docker.com | bash -s docker --mirror Aliyun

# Executing docker install script, commit: 93d2499759296ac1f9c510605fef85052a2c32be
+ sudo -E sh -c apt-get update -qq >/dev/null
+ sudo -E sh -c DEBIAN_FRONTEND=noninteractive apt-get install -y -qq apt-transport-https ca-certificates curl >/dev/null
+ sudo -E sh -c curl -fsSL "https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg" | gpg --dearmor --yes -o /usr/share/keyrings/docker-archive-keyring.gpg
+ sudo -E sh -c echo "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://mirrors.aliyun.com/docker-ce/linux/ubuntu impish stable" > /etc/apt/sources.list.d/docker.list
+ sudo -E sh -c apt-get update -qq >/dev/null
+ sudo -E sh -c DEBIAN_FRONTEND=noninteractive apt-get install -y -qq --no-install-recommends docker-ce-cli docker-scan-plugin docker-ce >/dev/null
+ version_gte 20.10
+ [ -z ]
+ return 0
+ sudo -E sh -c DEBIAN_FRONTEND=noninteractive apt-get install -y -qq docker-ce-rootless-extras >/dev/null
+ sudo -E sh -c docker version
Client: Docker Engine - Community
Version: 20.10.12
API version: 1.41
Go version: go1.16.12
Git commit: e91ed57
Built: Mon Dec 13 11:45:33 2021
OS/Arch: linux/amd64
Context: default
Experimental: true

Server: Docker Engine - Community
Engine:
Version: 20.10.12
API version: 1.41 (minimum version 1.12)
Go version: go1.16.12
Git commit: 459d0df
Built: Mon Dec 13 11:43:41 2021
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.4.12
GitCommit: 7b11cfaabd73bb80907dd23182b9347b4245eb5d
runc:
Version: 1.0.2
GitCommit: v1.0.2-0-g52b36a2
docker-init:
Version: 0.19.0
GitCommit: de40ad0

================================================================================

To run Docker as a non-privileged user, consider setting up the
Docker daemon in rootless mode for your user:

dockerd-rootless-setuptool.sh install

Visit https://docs.docker.com/go/rootless/ to learn about rootless mode.


To run the Docker daemon as a fully privileged service, but granting non-root
users access, refer to https://docs.docker.com/go/daemon-access/

WARNING: Access to the remote API on a privileged Docker daemon is equivalent
to root access on the host. Refer to the Docker daemon attack surface
documentation for details: https://docs.docker.com/go/attack-surface/

================================================================================

2. Start Docker and check its status

sudo service docker start

sudo service docker status

docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Active: active (running) since Sat 2022-01-22 13:56:00 UTC; 4min 54s ago
TriggeredBy: ● docker.socket
Docs: https://docs.docker.com
Main PID: 2502 (dockerd)
Tasks: 8
Memory: 39.3M
CPU: 342ms
CGroup: /system.slice/docker.service
└─2502 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock

Jan 22 13:55:59 ubuntu-impish dockerd[2502]: time="2022-01-22T13:55:59.821521689Z" level=info msg="scheme \"unix\" not registered, fallback>
Jan 22 13:55:59 ubuntu-impish dockerd[2502]: time="2022-01-22T13:55:59.821737767Z" level=info msg="ccResolverWrapper: sending update to cc:>
Jan 22 13:55:59 ubuntu-impish dockerd[2502]: time="2022-01-22T13:55:59.821904849Z" level=info msg="ClientConn switching balancer to \"pick_>
Jan 22 13:55:59 ubuntu-impish dockerd[2502]: time="2022-01-22T13:55:59.874804710Z" level=info msg="Loading containers: start."
Jan 22 13:56:00 ubuntu-impish dockerd[2502]: time="2022-01-22T13:56:00.060803788Z" level=info msg="Default bridge (docker0) is assigned wit>
Jan 22 13:56:00 ubuntu-impish dockerd[2502]: time="2022-01-22T13:56:00.140157834Z" level=info msg="Loading containers: done."
Jan 22 13:56:00 ubuntu-impish dockerd[2502]: time="2022-01-22T13:56:00.153742664Z" level=info msg="Docker daemon" commit=459d0df graphdrive>
Jan 22 13:56:00 ubuntu-impish dockerd[2502]: time="2022-01-22T13:56:00.153853677Z" level=info msg="Daemon has completed initialization"
Jan 22 13:56:00 ubuntu-impish systemd[1]: Started Docker Application Container Engine.
Jan 22 13:56:00 ubuntu-impish dockerd[2502]: time="2022-01-22T13:56:00.179098792Z" level=info msg="API listen on /run/docker.sock"

2.2 Create a container and look at its process

2.2.1 Switch to root and create a container

Inside the container, the process with PID 1 is /bin/sh.

vagrant@ubuntu-impish:~$ sudo -i
root@ubuntu-impish:~# docker run -it busybox /bin/sh
Unable to find image busybox:latest locally
latest: Pulling from library/busybox
5cc84ad355aa: Pull complete
Digest: sha256:5acba83a746c7608ed544dc1533b87c737a0b0fb730301639a0179f9344b1678
Status: Downloaded newer image for busybox:latest
/ # ps
PID USER TIME COMMAND
1 root 0:00 /bin/sh
7 root 0:00 ps

2.2.2 Find the container's process ID on the host

vagrant@ubuntu-impish:~$ sudo -i
List the containers to get the container ID:
root@ubuntu-impish:~# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ecfc45bcf970 busybox "/bin/sh" 18 seconds ago Up 17 seconds brave_haibt
Look up the container's process ID on the host:
root@ubuntu-impish:~# docker inspect ecfc45bcf970 --format "{{.State.Pid}}"
5786
Look at that process; its command is "/bin/sh":
root@ubuntu-impish:~# ps -ef|grep 5786
root 5786 5765 0 14:33 pts/0 00:00:00 /bin/sh
root 6304 5833 0 14:42 pts/1 00:00:00 grep --color=auto 5786
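
The lookup above can also be scripted. The following is a minimal sketch, assuming the `docker` CLI is on the PATH and the caller is allowed to talk to the daemon. The helper names `inspect_pid_command` and `container_host_pid` are made up for illustration and are not part of any Docker library.

```python
import subprocess


def inspect_pid_command(container_id: str) -> list:
    """Build the `docker inspect` command that prints the container's host PID."""
    # {{.State.Pid}} is a Go template that the docker CLI evaluates.
    return ["docker", "inspect", "--format", "{{.State.Pid}}", container_id]


def container_host_pid(container_id: str) -> int:
    """Run the command and return the PID as seen from the host."""
    result = subprocess.run(
        inspect_pid_command(container_id),
        check=True,            # raise CalledProcessError if docker fails
        capture_output=True,
        text=True,
    )
    return int(result.stdout.strip())


# Example call (needs a running container; use your own container ID):
#   container_host_pid("ecfc45bcf970")
```

Passing the command as a list avoids shell quoting problems with the braces in the template.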

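2.3 Correcting a common misconception

A container is a process on the host, and Docker itself is also a process on the host. Docker Engine therefore belongs at the same level as the applications, not beneath them as some architecture diagrams suggest.

[Figure: Docker Engine drawn at the same level as the application processes]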

3. Process isolation with namespaces

Let us first look at which namespaces a container process belongs to, using the container created above:
root@ubuntu-impish:~# ls -la /proc/5786/ns
total 0
dr-x--x--x 2 root root 0 Jan 22 14:33 .
dr-xr-xr-x 9 root root 0 Jan 22 14:33 ..
lrwxrwxrwx 1 root root 0 Jan 22 14:53 cgroup -> cgroup:[4026532258]
lrwxrwxrwx 1 root root 0 Jan 22 14:53 ipc -> ipc:[4026532198]
lrwxrwxrwx 1 root root 0 Jan 22 14:53 mnt -> mnt:[4026532196]
lrwxrwxrwx 1 root root 0 Jan 22 14:33 net -> net:[4026532201]
lrwxrwxrwx 1 root root 0 Jan 22 14:53 pid -> pid:[4026532199]
lrwxrwxrwx 1 root root 0 Jan 22 14:53 pid_for_children -> pid:[4026532199]
lrwxrwxrwx 1 root root 0 Jan 22 14:53 time -> time:[4026531834]
lrwxrwxrwx 1 root root 0 Jan 22 14:53 time_for_children -> time:[4026531834]
lrwxrwxrwx 1 root root 0 Jan 22 14:53 user -> user:[4026531837]
lrwxrwxrwx 1 root root 0 Jan 22 14:53 uts -> uts:[4026532197]

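Now look at the namespaces of an ordinary system process. Here we use containerd-shim (PID 5765), the parent of the container process: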

root        5765       1  0 14:33 ?        00:00:00 /usr/bin/containerd-shim-runc-v2 -namespace moby -id ecfc45bcf97085c1f566c9c2f5738b14bdf
root@ubuntu-impish:~# ls -la /proc/5765/ns
total 0
dr-x--x--x 2 root root 0 Jan 22 14:59 .
dr-xr-xr-x 9 root root 0 Jan 22 14:33 ..
lrwxrwxrwx 1 root root 0 Jan 22 14:59 cgroup -> cgroup:[4026531835]
lrwxrwxrwx 1 root root 0 Jan 22 14:59 ipc -> ipc:[4026531839]
lrwxrwxrwx 1 root root 0 Jan 22 14:59 mnt -> mnt:[4026531840]
lrwxrwxrwx 1 root root 0 Jan 22 14:59 net -> net:[4026531992]
lrwxrwxrwx 1 root root 0 Jan 22 14:59 pid -> pid:[4026531836]
lrwxrwxrwx 1 root root 0 Jan 22 14:59 pid_for_children -> pid:[4026531836]
lrwxrwxrwx 1 root root 0 Jan 22 14:59 time -> time:[4026531834]
lrwxrwxrwx 1 root root 0 Jan 22 14:59 time_for_children -> time:[4026531834]
lrwxrwxrwx 1 root root 0 Jan 22 14:59 user -> user:[4026531837]
lrwxrwxrwx 1 root root 0 Jan 22 14:59 uts -> uts:[4026531838]

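As the two listings show, the container process has the same set of namespace types as any other process on the system. What differs is the namespace IDs: cgroup, ipc, mnt, net, pid, and uts point to different namespaces, while time and user are shared with the host.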
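
The comparison of the two listings can be automated. This is a small sketch rather than an established tool: it reads the symlinks under /proc/<pid>/ns and splits the namespace names into those two processes share and those in which they are isolated. The function names are invented for this example, and reading another user's /proc/<pid>/ns generally needs root.

```python
import os


def read_namespaces(pid) -> dict:
    """Map namespace name -> link target, e.g. {'pid': 'pid:[4026532199]'}."""
    ns_dir = f"/proc/{pid}/ns"
    return {
        name: os.readlink(os.path.join(ns_dir, name))
        for name in sorted(os.listdir(ns_dir))
    }


def diff_namespaces(a: dict, b: dict):
    """Split the namespace names both processes have into (shared, isolated)."""
    common = sorted(set(a) & set(b))
    shared = [name for name in common if a[name] == b[name]]
    isolated = [name for name in common if a[name] != b[name]]
    return shared, isolated


# Example call, run as root on the VM from this article:
#   diff_namespaces(read_namespaces(5786), read_namespaces(5765))
```

Fed the two listings above, `diff_namespaces` reports time, time_for_children, and user as shared, and cgroup, ipc, mnt, net, pid, pid_for_children, and uts as isolated.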

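3.1 Introduction to namespaces

Linux namespaces are a resource isolation mechanism provided by the Linux kernel.

The system can assign processes to different namespaces and guarantees that resources in different namespaces are allocated independently and that the processes are isolated from each other. In other words, processes in different namespaces do not interfere with one another.

The namespace types are listed below:

[Figure: table of Linux namespace types]

Because this VM runs a recent release (Ubuntu 21.10 with kernel 5.13), /proc/<pid>/ns shows three more entries than the table:

pid_for_children, time_for_children, and time. Of these, only time is an additional namespace type (added in Linux 5.6); the two *_for_children entries point to the PID and time namespaces in which the process's future children will be created.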

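3.2 The individual namespaces

3.2.1 pid namespace

Processes of different users are isolated by PID namespaces, and the same PID can exist in different namespaces.

With PID namespaces, the PIDs in each namespace are independent of those in every other. The /bin/sh above, for example, is PID 1 inside the container and PID 5786 on the host.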
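
One way to see both PIDs of a process at once is the `NSpid` line in /proc/<pid>/status (available since Linux 4.1), which lists the PID in each nested PID namespace, outermost first. A minimal sketch of reading it; the helper names are made up for this example.

```python
def parse_nspid(status_text: str) -> list:
    """Return the PIDs from the NSpid line, outermost namespace first."""
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            # Fields after the label are whitespace-separated PIDs.
            return [int(field) for field in line.split()[1:]]
    raise ValueError("no NSpid line found (kernel older than 4.1?)")


def read_nspid(pid) -> list:
    """Read and parse /proc/<pid>/status for the given process."""
    with open(f"/proc/{pid}/status") as fh:
        return parse_nspid(fh.read())


# Example call on the host, as root:
#   read_nspid(5786)
```

For the container's /bin/sh in the session above, one would expect `[5786, 1]`: PID 5786 on the host and PID 1 inside the container.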

3.2.2 net namespace

Network isolation is implemented with net namespaces. Each net namespace has its own network devices, IP addresses, IP routing tables, and /proc/net directory.
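
Because every net namespace has its own /proc/net, the interfaces a given process sees can be read from /proc/<pid>/net/dev without entering its namespace. The sketch below assumes the usual layout of that file (two header lines, then one `name: counters...` line per interface); the function names are illustrative only.

```python
def parse_net_dev(text: str) -> list:
    """Return the interface names found in the contents of a net/dev file."""
    names = []
    for line in text.splitlines()[2:]:        # skip the two header lines
        if ":" in line:
            names.append(line.split(":", 1)[0].strip())
    return names


def interfaces_seen_by(pid) -> list:
    """List the network interfaces visible in the net namespace of `pid`."""
    with open(f"/proc/{pid}/net/dev") as fh:
        return parse_net_dev(fh.read())


# Example calls on the host, as root:
#   interfaces_seen_by(1)      # interfaces in the host's net namespace
#   interfaces_seen_by(5786)   # interfaces in the container's net namespace
```

Comparing the two results is a quick check that the container has its own set of devices.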

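3.2.3 ipc namespace

Processes in a container still communicate through the usual Linux inter-process communication (IPC) mechanisms, including semaphores, message queues, and shared memory.

That communication really takes place between host processes that belong to the same IPC namespace, so namespace information has to be attached when an IPC resource is requested. Each IPC resource has a unique 32-bit ID.

3.2.4 mnt namespace

The mnt namespace lets processes in different namespaces see different file-system hierarchies, so the directory tree visible to the processes in each namespace is isolated.

3.2.5 uts namespace

The UTS ("UNIX Time-sharing System") namespace gives each container its own hostname and domain name, so that on the network it can be treated as an independent node rather than as a process on the host.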

3.2.6 user namespace

Each container can have its own user and group IDs, so programs inside the container run as container-internal users rather than as users of the host.
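
How IDs inside a user namespace map to IDs outside it is recorded in /proc/<pid>/uid_map and /proc/<pid>/gid_map, one mapping per line as three numbers: first ID inside, first ID outside, and length of the range. A minimal sketch of reading that format; the helper names are hypothetical.

```python
def parse_id_map(text: str) -> list:
    """Parse uid_map/gid_map contents into (inside, outside, count) tuples."""
    mappings = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            inside, outside, count = (int(field) for field in fields)
            mappings.append((inside, outside, count))
    return mappings


def is_identity_map(mappings: list) -> bool:
    """True when at least one range exists and every ID maps to itself."""
    return bool(mappings) and all(
        inside == outside for inside, outside, _ in mappings
    )


def read_uid_map(pid) -> list:
    """Read and parse /proc/<pid>/uid_map for the given process."""
    with open(f"/proc/{pid}/uid_map") as fh:
        return parse_id_map(fh.read())
```

In the listings above the container shares the host's user namespace (user:[4026531837]), so its map should be the identity mapping and root inside the container is root on the host.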

3.3 The namespace structures in the Linux kernel

[Figure: namespace-related structures in the Linux kernel source]

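3.4 Operating on namespaces in Linux

3.4.1 clone

When a new process is created with the clone() system call, the flags argument selects which new namespaces are created for it:

// CLONE_NEWCGROUP / CLONE_NEWIPC / CLONE_NEWNET / CLONE_NEWNS / CLONE_NEWPID / CLONE_NEWUSER / CLONE_NEWUTS

For example, CLONE_NEWPID can be passed when creating a new process with clone():

// stack_top points to the top of the memory reserved for the child's stack
int pid = clone(main_function, stack_top, CLONE_NEWPID | SIGCHLD, NULL);


The new process then sees a brand-new process space in which its own PID is 1.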
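
The CLONE_NEW* names are bit masks, so several namespaces can be requested in one call by OR-ing them together, with the exit signal in the low byte. Python's standard library has no clone() wrapper, so the sketch below only illustrates the flag arithmetic. The numeric values are the ones defined in the Linux headers (`<linux/sched.h>`); the dictionary and the helper `flags_for` are invented for this example.

```python
import signal

# Bit masks as defined in <linux/sched.h>.
CLONE_FLAGS = {
    "mnt":    0x00020000,  # CLONE_NEWNS
    "cgroup": 0x02000000,  # CLONE_NEWCGROUP
    "uts":    0x04000000,  # CLONE_NEWUTS
    "ipc":    0x08000000,  # CLONE_NEWIPC
    "user":   0x10000000,  # CLONE_NEWUSER
    "pid":    0x20000000,  # CLONE_NEWPID
    "net":    0x40000000,  # CLONE_NEWNET
}


def flags_for(*namespaces, exit_signal=signal.SIGCHLD) -> int:
    """Combine namespace flags with the signal sent to the parent on exit."""
    flags = int(exit_signal)      # the low byte of the flags holds the signal
    for name in namespaces:
        flags |= CLONE_FLAGS[name]
    return flags


if __name__ == "__main__":
    # Same combination as the C call above: CLONE_NEWPID | SIGCHLD.
    print(hex(flags_for("pid")))
```

A container runtime typically requests several of these at once, for example `flags_for("pid", "net", "mnt", "ipc", "uts")`.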

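3.4.2 setns

This system call lets the calling process join an existing namespace. fd refers to a namespace file such as /proc/<pid>/ns/net, and nstype restricts the namespace type that is accepted (0 allows any type):

int setns(int fd, int nstype);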
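
A sketch of calling setns() from a script, assuming a glibc-style libc reachable through `ctypes` and enough privilege (normally root). The helpers `ns_path` and `join_namespace` are hypothetical names for this example. Newer Python versions (3.12 and later) also provide `os.setns`, which may be preferable where available.

```python
import ctypes
import os

CLONE_NEWNET = 0x40000000  # value from <linux/sched.h>


def ns_path(pid, ns_name: str) -> str:
    """Path of the namespace file for process `pid`, e.g. /proc/5786/ns/net."""
    return f"/proc/{pid}/ns/{ns_name}"


def join_namespace(pid, ns_name: str, nstype: int = 0) -> None:
    """Move the calling thread into the `ns_name` namespace of `pid`."""
    fd = os.open(ns_path(pid, ns_name), os.O_RDONLY)
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        if libc.setns(fd, nstype) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
    finally:
        os.close(fd)   # the namespace stays joined after the fd is closed


# Example call, as root on the VM from this article:
#   join_namespace(5786, "net", CLONE_NEWNET)
```

After a successful call, network operations made by the script see the container's interfaces instead of the host's. This is essentially what tools such as nsenter do.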

3.4.3 unshare

This system call moves the calling process into newly created namespaces. flags takes the same CLONE_NEW* constants as clone():

int unshare(int flags);
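
The effect of unshare() is easiest to see with the UTS namespace. The sketch below forks a child, lets the child unshare its UTS namespace and change the hostname there, and leaves the parent's hostname untouched. It assumes libc is reachable through `ctypes`; without CAP_SYS_ADMIN the kernel refuses the call and the function simply returns False. The function name is made up for this example.

```python
import ctypes
import os
import socket

CLONE_NEWUTS = 0x04000000  # value from <linux/sched.h>


def hostname_in_new_uts(new_name: str) -> bool:
    """Rename the host inside a fresh UTS namespace in a child process.

    Returns True if the child managed to unshare and rename, and False
    if the kernel refused (typically EPERM when not running as root).
    """
    pid = os.fork()
    if pid == 0:                       # child process
        status = 1
        try:
            libc = ctypes.CDLL(None, use_errno=True)
            if libc.unshare(CLONE_NEWUTS) == 0:
                socket.sethostname(new_name)
                if socket.gethostname() == new_name:
                    status = 0
        finally:
            os._exit(status)           # never fall back into the parent's code
    _, wait_status = os.waitpid(pid, 0)
    return os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0


if __name__ == "__main__":
    before = socket.gethostname()
    ok = hostname_in_new_uts("ns-demo")
    print("child renamed its namespace:", ok)
    print("parent hostname unchanged:", socket.gethostname() == before)
```

Doing the work in a child keeps the experiment from altering the namespace membership of the calling process.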


3.5 Common namespace operations

3.5.1 List the namespaces on the current system

lsns -t <type>

List the network namespaces:
root@ubuntu-impish:~# lsns -t net
NS TYPE NPROCS PID USER NETNSID NSFS COMMAND
4026531992 net 113 1 root unassigned /sbin/init
4026532201 net 1 5786 root 0 /run/docker/netns/fdccb211a1f0 /bin/sh

List the PID namespaces:
root@ubuntu-impish:~# lsns -t pid
NS TYPE NPROCS PID USER COMMAND
4026531836 pid 113 1 root /sbin/init
4026532199 pid 1 5786 root /bin/sh

List the mount namespaces:
root@ubuntu-impish:~# lsns -t mnt
NS TYPE NPROCS PID USER COMMAND
4026531840 mnt 106 1 root /sbin/init
4026531860 mnt 1 23 root kdevtmpfs
4026532176 mnt 1 382 root /lib/systemd/systemd-udevd
4026532179 mnt 1 5
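
The information lsns prints comes from /proc. As a rough sketch of the idea (not a reimplementation of lsns), the following groups all visible processes by the namespace their /proc/<pid>/ns/<type> link points to. The function name is illustrative; without root, processes of other users are silently skipped.

```python
import os
from collections import defaultdict


def group_by_namespace(ns_name: str) -> dict:
    """Map namespace link target -> sorted list of PIDs that belong to it."""
    groups = defaultdict(list)
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue                   # not a process directory
        try:
            target = os.readlink(f"/proc/{entry}/ns/{ns_name}")
        except OSError:
            continue                   # process exited or access denied
        groups[target].append(int(entry))
    return {ns: sorted(pids) for ns, pids in groups.items()}


if __name__ == "__main__":
    # Roughly the NS, NPROCS, and PID columns of `lsns -t pid`.
    for ns, pids in sorted(group_by_namespace("pid").items()):
        print(f"{ns}  nprocs={len(pids)}  lowest_pid={pids[0]}")
```

The lowest PID in each group corresponds to the PID column in the lsns output above.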
