docker - container and lightweight virtualization

Docker - Container and Lightweight virtualization Paul Sim Technical Account Manager [email protected]

Upload: janghoon-sim

Post on 15-Jan-2015


Page 1: Docker - container and lightweight virtualization

Docker - Container and Lightweight virtualization

Paul Sim
Technical Account Manager
[email protected]

Page 2: Docker - container and lightweight virtualization

Virtualization

[Diagram: two stacks compared.
"Type 1, Type 2": Hardware, then a Hypervisor, then three Virtual Machines; each virtual machine has its own Kernel (para-virtualization) and runs its own Applications.
"Lightweight virtualization": Hardware, then one Operating System, then three Containers; each container runs its own Applications.]

Page 3: Docker - container and lightweight virtualization

Linux Container - aka LXC

[Diagram: Hardware at the bottom. Above it, the Linux Kernel provides Namespace (UTS, IPC, PID, Network, User) and Control group. On top sit three namespaces (Ubuntu Precise, Ubuntu Trusty, CentOS), each with its own running env, hosting applications such as MySQL, apache, Rails, tomcat, Nginx, and MongoDB.]

Page 4: Docker - container and lightweight virtualization

Linux Container - performance

Less than 1% degradation

Realizing Linux Containers (LXC) - IBM

Page 5: Docker - container and lightweight virtualization

Linux Container - performance

Realizing Linux Containers (LXC) - IBM

Page 6: Docker - container and lightweight virtualization

Docker

janghoon@ubuntu:~$ sudo docker run -i -t centos:latest /bin/bash
bash-4.1# ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 15:51 ?        00:00:00 /bin/bash
root        75     1  0 15:54 ?        00:00:00 /usr/sbin/httpd
apache      77    75  0 15:54 ?        00:00:00 /usr/sbin/httpd
apache      78    75  0 15:54 ?        00:00:00 /usr/sbin/httpd
apache      79    75  0 15:54 ?        00:00:00 /usr/sbin/httpd
apache      80    75  0 15:54 ?        00:00:00 /usr/sbin/httpd
apache      81    75  0 15:54 ?        00:00:00 /usr/sbin/httpd
apache      82    75  0 15:54 ?        00:00:00 /usr/sbin/httpd
apache      83    75  0 15:54 ?        00:00:00 /usr/sbin/httpd
apache      84    75  0 15:54 ?        00:00:00 /usr/sbin/httpd
bash-4.1# ls
bin  boot  dev  etc  home  lib  lib64  lost+found  media  mnt  opt  proc  root  sbin  selinux  srv  sys  tmp  usr  var
bash-4.1# uname -a
Linux 7c6702b13a48 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
bash-4.1# cat /etc/redhat-release
CentOS release 6.5 (Final)

janghoon@ubuntu:~$ ps -ef | grep httpd
root      7605  7256  0 Jun12 ?        00:00:00 /usr/sbin/httpd
48        7607  7605  0 Jun12 ?        00:00:00 /usr/sbin/httpd
48        7608  7605  0 Jun12 ?        00:00:00 /usr/sbin/httpd
48        7609  7605  0 Jun12 ?        00:00:00 /usr/sbin/httpd
48        7610  7605  0 Jun12 ?        00:00:00 /usr/sbin/httpd
48        7611  7605  0 Jun12 ?        00:00:00 /usr/sbin/httpd
48        7612  7605  0 Jun12 ?        00:00:00 /usr/sbin/httpd
48        7613  7605  0 Jun12 ?        00:00:00 /usr/sbin/httpd
48        7614  7605  0 Jun12 ?        00:00:00 /usr/sbin/httpd

Page 7: Docker - container and lightweight virtualization

Docker

Docker is an open platform for developers and sysadmins to build, ship, and run distributed applications. Consisting of Docker Engine, a portable, lightweight runtime and packaging tool, and Docker Hub, a cloud service for sharing applications and automating workflows, Docker enables apps to be quickly assembled from components and eliminates the friction between development, QA, and production environments. As a result, IT can ship faster and run the same app, unchanged, on laptops, data center VMs, and any cloud.

Page 8: Docker - container and lightweight virtualization

Docker

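[Diagram: Docker, linked to a Docker Hub/Repository, sits on top of Linux and reaches the kernel features Namespace, Control group, Capabilities, AppArmor, and netfilter through one of the drivers shown: libcontainer, lxc, libvirt, or systemd-nspawn.]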

Page 9: Docker - container and lightweight virtualization

Docker

Docker images

A Docker image is a read-only template. For example, an image could contain an Ubuntu operating system with Apache and your web application installed. Images are used to create Docker containers. Docker provides a simple way to build new images or update existing images, or you can download Docker images that other people have already created. Docker images are the build component of Docker.

Docker registries

Docker registries hold images. These are public or private stores from which you upload or download images. The public Docker registry is called Docker Hub. It provides a huge collection of existing images for your use. These can be images you create yourself or you can use images that others have previously created. Docker registries are the distribution component of Docker.

Docker containers

Docker containers are similar to a directory. A Docker container holds everything that is needed for an application to run. Each container is created from a Docker image. Docker containers can be run, started, stopped, moved, and deleted. Each container is an isolated and secure application platform. Docker containers are the run component of Docker.

Page 10: Docker - container and lightweight virtualization

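Docker

[Diagram: a Docker Image is built from a base distribution (Debian, Ubuntu, CentOS), a running environment (libraries, binaries...), and software such as Rails, MongoDB, Apache, MySQL, Memcached, or node.js. Images are pulled from the Docker Hub/Registry ("pull"), started as a Container ("Run"), and saved back as a Custom Image ("commit").]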

Page 11: Docker - container and lightweight virtualization

Docker - networking

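[Diagram: inside the Host machine, two Containers (NameSpace-1 and NameSpace-2) each have a vNIC. Each vNIC is one end of a veth pair whose veth peer is attached to a Bridge on the host, and the Bridge connects to the physical NIC.]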

Page 12: Docker - container and lightweight virtualization

Docker - networking

On host machine:

janghoon@ubuntu:~$ ifconfig
docker0   Link encap:Ethernet  HWaddr 4e:da:3e:50:cb:ef
          inet addr:172.17.42.1  Bcast:0.0.0.0  Mask:255.255.0.0
          inet6 addr: fe80::18df:90ff:fe07:45b7/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:10569 errors:0 dropped:0 overruns:0 frame:0
          TX packets:18750 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:553677 (553.6 KB)  TX bytes:28304983 (28.3 MB)

em1       Link encap:Ethernet  HWaddr c0:3f:d5:62:44:05
          inet addr:172.30.1.51  Bcast:172.30.1.255  Mask:255.255.255.0
          inet6 addr: fe80::c23f:d5ff:fe62:4405/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:29764 errors:0 dropped:0 overruns:0 frame:0
          TX packets:20619 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:30366129 (30.3 MB)  TX bytes:2240284 (2.2 MB)
          Interrupt:20 Memory:f7c00000-f7c20000

veth3adc  Link encap:Ethernet  HWaddr 4e:da:3e:50:cb:ef
          inet6 addr: fe80::4cda:3eff:fe50:cbef/64 Scope:Link
          UP BROADCAST RUNNING  MTU:1500  Metric:1
          RX packets:10569 errors:0 dropped:0 overruns:0 frame:0
          TX packets:18739 errors:0 dropped:1 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:701643 (701.6 KB)  TX bytes:28302881 (28.3 MB)

Within a container:

bash-4.1# ifconfig
eth0      Link encap:Ethernet  HWaddr FA:25:A0:77:55:C9
          inet addr:172.17.0.2  Bcast:0.0.0.0  Mask:255.255.0.0
          inet6 addr: fe80::f825:a0ff:fe77:55c9/64 Scope:Link
          UP BROADCAST RUNNING  MTU:1500  Metric:1
          RX packets:18739 errors:0 dropped:2 overruns:0 frame:0
          TX packets:10569 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:28302881 (26.9 MiB)  TX bytes:701643 (685.1 KiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

bash-4.1# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         172.17.42.1     0.0.0.0         UG    0      0        0 eth0
172.17.0.0      *               255.255.0.0     U     0      0        0 eth0
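The addressing in the two listings can be checked with a few lines of Python. This is only a sketch using the standard `ipaddress` module and the addresses printed above; the variable names are illustrative and not part of Docker.

```python
import ipaddress

# Values copied from the ifconfig and route output above.
bridge = ipaddress.ip_interface("172.17.42.1/255.255.0.0")    # docker0 on the host
container = ipaddress.ip_interface("172.17.0.2/255.255.0.0")  # eth0 inside the container
default_gateway = ipaddress.ip_address("172.17.42.1")         # container's default route

# Both interfaces sit on the same /16, so the container reaches the bridge directly.
same_subnet = bridge.network == container.network
# The container's default gateway is the bridge address itself.
gateway_is_bridge = default_gateway == bridge.ip

print(bridge.network)      # 172.17.0.0/16
print(same_subnet)         # True
print(gateway_is_bridge)   # True
```

In other words, traffic leaving the container goes to docker0 first, and the host forwards it from there.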

Page 13: Docker - container and lightweight virtualization

Docker - storage

root@ubuntu:~# ls -l /var/lib/docker/containers/7c6702b13a48a9b5ba9c70537e8c82ec95a5a98ac8fb23fa70669f39d80b1d07/root
total 80
dr-xr-xr-x  2 root root 4096 Jun 12 23:52 bin
drwxr-xr-x  3 root root 4096 Jun 12 23:52 boot
drwxr-xr-x  4 root root 4096 Jun 12 23:51 dev
drwxr-xr-x 57 root root 4096 Jun 12 23:52 etc
drwxr-xr-x  2 root root 4096 Sep 23  2011 home
dr-xr-xr-x  8 root root 4096 Jun 12 23:52 lib
dr-xr-xr-x  6 root root 4096 Jun 12 23:52 lib64
drwx------  2 root root 4096 Jun 10 00:10 lost+found
drwxr-xr-x  2 root root 4096 Sep 23  2011 media
drwxr-xr-x  2 root root 4096 Sep 23  2011 mnt
drwxr-xr-x  2 root root 4096 Sep 23  2011 opt
drwxr-xr-x  2 root root 4096 Jun 10 00:10 proc
dr-xr-x---  2 root root 4096 Jun 10 00:14 root
dr-xr-xr-x  2 root root 4096 Jun 12 23:52 sbin
drwxr-xr-x  3 root root 4096 Jun 10 00:14 selinux
drwxr-xr-x  2 root root 4096 Sep 23  2011 srv
drwxr-xr-x  2 root root 4096 Jun 10 00:10 sys
drwxrwxrwt  2 root root 4096 Jun 12 23:54 tmp
drwxr-xr-x 18 root root 4096 Jun 12 23:52 usr
drwxr-xr-x 24 root root 4096 Jun 12 23:54 var

Page 14: Docker - container and lightweight virtualization

Namespace

root@ubuntu:~# docker ps -a
CONTAINER ID   IMAGE            COMMAND     CREATED             STATUS             PORTS   NAMES
7c6702b13a48   centos:centos6   /bin/bash   About an hour ago   Up About an hour           berserk_ptolemy

root@ubuntu:~# docker inspect 7c6702b13a48 | grep Pid
        "Pid": 7256,

root@ubuntu:~# ls -l /proc/7256/ns
total 0
lrwxrwxrwx 1 root root 0 Jun 13 01:15 ipc -> ipc:[4026532245]
lrwxrwxrwx 1 root root 0 Jun 13 01:15 mnt -> mnt:[4026532243]
lrwxrwxrwx 1 root root 0 Jun 13 01:15 net -> net:[4026532248]
lrwxrwxrwx 1 root root 0 Jun 13 01:15 pid -> pid:[4026532246]
lrwxrwxrwx 1 root root 0 Jun 13 01:15 user -> user:[4026531837]
lrwxrwxrwx 1 root root 0 Jun 13 01:15 uts -> uts:[4026532244]

root@ubuntu:~# ps faux
...
root   976  0.0  0.1 439596 14768 ?     Sl   Jun12 0:00  \_ /usr/bin/docker.io -d
root  7256  0.0  0.0  11484  1660 pts/3 Ss+  Jun12 0:00      \_ /bin/bash
root  7605  0.0  0.0 175764  3748 ?     Ss   Jun12 0:00          \_ /usr/sbin/httpd
48    7607  0.0  0.0 175764  2216 ?     S    Jun12 0:00              \_ /usr/sbin/httpd
48    7608  0.0  0.0 175764  2216 ?     S    Jun12 0:00              \_ /usr/sbin/httpd
48    7609  0.0  0.0 175764  2216 ?     S    Jun12 0:00              \_ /usr/sbin/httpd
48    7610  0.0  0.0 175764  2216 ?     S    Jun12 0:00              \_ /usr/sbin/httpd
48    7611  0.0  0.0 175764  2216 ?     S    Jun12 0:00              \_ /usr/sbin/httpd
48    7612  0.0  0.0 175764  2216 ?     S    Jun12 0:00              \_ /usr/sbin/httpd
48    7613  0.0  0.0 175764  2216 ?     S    Jun12 0:00              \_ /usr/sbin/httpd
48    7614  0.0  0.0 175764  2216 ?     S    Jun12 0:00              \_ /usr/sbin/httpd
...

Page 15: Docker - container and lightweight virtualization

Namespace

Currently, Linux implements six different types of namespaces. The purpose of each namespace is to wrap a particular global system resource in an abstraction that makes it appear to the processes within the namespace that they have their own isolated instance of the global resource. One of the overall goals of namespaces is to support the implementation of containers, a tool for lightweight virtualization (as well as other purposes) that provides a group of processes with the illusion that they are the only processes on the system.

1. Mount namespaces (CLONE_NEWNS, Linux 2.4.19) isolate the set of filesystem mount points seen by a group of processes. Thus, processes in different mount namespaces can have different views of the filesystem hierarchy. One use of mount namespaces is to create environments that are similar to chroot jails. However, by contrast with the use of the chroot() system call, mount namespaces are a more secure and flexible tool for this task. Other more sophisticated uses of mount namespaces are also possible. For example, separate mount namespaces can be set up in a master-slave relationship, so that the mount events are automatically propagated from one namespace to another; this allows, for example, an optical disk device that is mounted in one namespace to automatically appear in other namespaces.

2. UTS namespaces (CLONE_NEWUTS, Linux 2.6.19) isolate two system identifiers—nodename and domainname—returned by the uname() system call; the names are set using the sethostname() and setdomainname() system calls. In the context of containers, the UTS namespaces feature allows each container to have its own hostname and NIS domain name.

Page 16: Docker - container and lightweight virtualization

Namespace

3. IPC namespaces (CLONE_NEWIPC, Linux 2.6.19) isolate certain interprocess communication (IPC) resources, namely, System V IPC objects and (since Linux 2.6.30) POSIX message queues. The common characteristic of these IPC mechanisms is that IPC objects are identified by mechanisms other than filesystem pathnames. Each IPC namespace has its own set of System V IPC identifiers and its own POSIX message queue filesystem.

4. PID namespaces (CLONE_NEWPID, Linux 2.6.24) isolate the process ID number space. In other words, processes in different PID namespaces can have the same PID.

5. Network namespaces (CLONE_NEWNET, started in Linux 2.6.24 and largely completed by about Linux 2.6.29) provide isolation of the system resources associated with networking. Thus, each network namespace has its own network devices, IP addresses, IP routing tables, /proc/net directory, port numbers, and so on.

6. User namespaces (CLONE_NEWUSER, started in Linux 2.6.23 and completed in Linux 3.8) isolate the user and group ID number spaces. In other words, a process's user and group IDs can be different inside and outside a user namespace.
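The six flags named in the list are bit values that are OR-ed together when passed to clone() or unshare(). The sketch below pairs each flag with the matching entry name under /proc/<pid>/ns. The numeric values are the ones defined in <linux/sched.h>; the table layout and the helper `namespaces_for` are illustrative names, not a kernel or Docker API.

```python
# Flag values as defined in <linux/sched.h>, in the order used in the list above.
NAMESPACES = [
    ("mnt",  "CLONE_NEWNS",   0x00020000),
    ("uts",  "CLONE_NEWUTS",  0x04000000),
    ("ipc",  "CLONE_NEWIPC",  0x08000000),
    ("pid",  "CLONE_NEWPID",  0x20000000),
    ("net",  "CLONE_NEWNET",  0x40000000),
    ("user", "CLONE_NEWUSER", 0x10000000),
]


def namespaces_for(flags):
    """Return the /proc/<pid>/ns entry names selected by a flag mask."""
    return [entry for entry, _flag_name, bit in NAMESPACES if flags & bit]


# CLONE_NEWPID | CLONE_NEWNET: a new PID namespace and a new network namespace.
flags = 0x20000000 | 0x40000000
print(namespaces_for(flags))  # ['pid', 'net']
```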

Page 17: Docker - container and lightweight virtualization

Namespace

root@ubuntu:~# ps faux
...
root   976  0.0  0.1 439596 14768 ?     Sl   Jun12 0:00  \_ /usr/bin/docker.io -d
root  7256  0.0  0.0  11484  1660 pts/3 Ss+  Jun12 0:00      \_ /bin/bash
root  7605  0.0  0.0 175764  3748 ?     Ss   Jun12 0:00          \_ /usr/sbin/httpd
48    7607  0.0  0.0 175764  2216 ?     S    Jun12 0:00              \_ /usr/sbin/httpd
48    7608  0.0  0.0 175764  2216 ?     S    Jun12 0:00              \_ /usr/sbin/httpd
...

root@ubuntu:~# ls -l /proc/7256/ns
lrwxrwxrwx 1 root root 0 Jun 13 01:15 ipc -> ipc:[4026532245]
lrwxrwxrwx 1 root root 0 Jun 13 01:15 mnt -> mnt:[4026532243]
lrwxrwxrwx 1 root root 0 Jun 13 01:15 net -> net:[4026532248]
lrwxrwxrwx 1 root root 0 Jun 13 01:15 pid -> pid:[4026532246]
lrwxrwxrwx 1 root root 0 Jun 13 01:15 user -> user:[4026531837]
lrwxrwxrwx 1 root root 0 Jun 13 01:15 uts -> uts:[4026532244]

root@ubuntu:~# ls -l /proc/7605/ns
lrwxrwxrwx 1 root root 0 Jun 13 01:48 ipc -> ipc:[4026532245]
lrwxrwxrwx 1 root root 0 Jun 13 01:48 mnt -> mnt:[4026532243]
lrwxrwxrwx 1 root root 0 Jun 13 01:48 net -> net:[4026532248]
lrwxrwxrwx 1 root root 0 Jun 13 01:48 pid -> pid:[4026532246]
lrwxrwxrwx 1 root root 0 Jun 13 01:48 user -> user:[4026531837]
lrwxrwxrwx 1 root root 0 Jun 13 01:48 uts -> uts:[4026532244]

root@ubuntu:~# ls -l /proc/976/ns
lrwxrwxrwx 1 root root 0 Jun 13 01:47 ipc -> ipc:[4026531839]
lrwxrwxrwx 1 root root 0 Jun 13 01:47 mnt -> mnt:[4026531840]
lrwxrwxrwx 1 root root 0 Jun 13 01:47 net -> net:[4026531968]
lrwxrwxrwx 1 root root 0 Jun 13 01:47 pid -> pid:[4026531836]
lrwxrwxrwx 1 root root 0 Jun 13 01:47 user -> user:[4026531837]
lrwxrwxrwx 1 root root 0 Jun 13 01:47 uts -> uts:[4026531838]
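The point of the three listings is easier to see when the namespace identifiers are compared mechanically. The sketch below copies the link targets from the output above into strings and reports which namespaces differ between two processes; `parse_ns` and `differing` are hypothetical helper names used only for this illustration.

```python
import re

# Link targets copied from the `ls -l /proc/<pid>/ns` listings above.
LISTINGS = {
    7256: "ipc:[4026532245] mnt:[4026532243] net:[4026532248] "
          "pid:[4026532246] user:[4026531837] uts:[4026532244]",
    7605: "ipc:[4026532245] mnt:[4026532243] net:[4026532248] "
          "pid:[4026532246] user:[4026531837] uts:[4026532244]",
    976:  "ipc:[4026531839] mnt:[4026531840] net:[4026531968] "
          "pid:[4026531836] user:[4026531837] uts:[4026531838]",
}


def parse_ns(text):
    """Map each namespace type to its identifier."""
    return {name: int(ident) for name, ident in re.findall(r"(\w+):\[(\d+)\]", text)}


def differing(pid_a, pid_b):
    """Return the namespace types in which the two processes differ."""
    a, b = parse_ns(LISTINGS[pid_a]), parse_ns(LISTINGS[pid_b])
    return sorted(name for name in a if a[name] != b[name])


print(differing(7256, 7605))  # []
print(differing(7256, 976))   # ['ipc', 'mnt', 'net', 'pid', 'uts']
```

The container's bash (7256) and httpd (7605) share every namespace, while the Docker daemon (976) differs from them in all but the user namespace, which this setup does not isolate.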

Page 18: Docker - container and lightweight virtualization

Security - POSIX Capabilities

For the purpose of performing permission checks, traditional UNIX implementations distinguish two categories of processes: privileged processes (whose effective user ID is 0, referred to as superuser or root), and unprivileged processes (whose effective UID is nonzero). Privileged processes bypass all kernel permission checks, while unprivileged processes are subject to full permission checking based on the process's credentials (usually: effective UID, effective GID, and supplementary group list).

Starting with kernel 2.2, Linux divides the privileges traditionally associated with superuser into distinct units, known as capabilities, which can be independently enabled and disabled. Capabilities are a per-thread attribute.

- man capabilities

Capability sets: Permitted, Inheritable, Effective

Example capabilities: CAP_CHOWN, CAP_SETPCAP, CAP_NET_ADMIN, CAP_SYS_BOOT, ……
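Each capability set is stored as a bit mask, one bit per capability, which is what allows capabilities to be enabled and disabled independently. The sketch below covers only the four capabilities named on this slide, using their bit numbers from <linux/capability.h>; `decode` and `drop` are illustrative helpers, not a libcap API.

```python
# Bit numbers from <linux/capability.h> for the four capabilities named above.
CAP_BITS = {
    "CAP_CHOWN": 0,
    "CAP_SETPCAP": 8,
    "CAP_NET_ADMIN": 12,
    "CAP_SYS_BOOT": 22,
}


def decode(mask):
    """List the known capabilities whose bit is set in the mask."""
    return [name for name, bit in CAP_BITS.items() if mask & (1 << bit)]


def drop(mask, name):
    """Return the mask with one capability cleared."""
    return mask & ~(1 << CAP_BITS[name])


# Start with all four enabled, then disable CAP_SYS_BOOT on its own.
effective = sum(1 << bit for bit in CAP_BITS.values())
effective = drop(effective, "CAP_SYS_BOOT")
print(decode(effective))  # ['CAP_CHOWN', 'CAP_SETPCAP', 'CAP_NET_ADMIN']
```

The kernel exposes masks of this kind as hexadecimal values in the CapInh, CapPrm, and CapEff fields of /proc/<pid>/status.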

Page 19: Docker - container and lightweight virtualization

Security - POSIX Capabilities

janghoon@ubuntu:~# sudo docker ps -a
CONTAINER ID   IMAGE          COMMAND     CREATED        STATUS        PORTS   NAMES
c33faf7c3537   ubuntu:12.04   /bin/bash   17 hours ago   Up 16 hours           berserk_wright
758e5b50d54e   ubuntu:12.04   /bin/bash   18 hours ago   Up 16 hours           prickly_fermat

* --privileged=true
janghoon@ubuntu:~# sudo getpcaps 1951
Capabilities for `1951': =ep

* --privileged=false
janghoon@ubuntu:~# sudo getpcaps 2174
Capabilities for `2174': =ep cap_setpcap,cap_net_admin,cap_sys_module,cap_sys_rawio,cap_sys_pacct,cap_sys_admin,cap_sys_nice,cap_sys_resource,cap_sys_time,cap_sys_tty_config,cap_mknod,cap_audit_write,cap_audit_control,cap_mac_override,cap_mac_admin-ep

root@758e5b50d54e:/# id
uid=0(root) gid=0(root) groups=0(root)

root@758e5b50d54e:/# rmmod bridge
ERROR: Removing 'bridge': Operation not permitted

root@758e5b50d54e:/# iptables -L -n
iptables v1.4.12: can't initialize iptables table `filter': Permission denied (you must be root)
Perhaps iptables or your kernel needs to be upgraded.
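The two failures follow from the getpcaps output: the `-ep` suffix removes the listed capabilities from the effective and permitted sets. The sketch below parses that clause and checks the capability each failing command depends on (CAP_SYS_MODULE for unloading a kernel module, CAP_NET_ADMIN for iptables). It assumes the shell shown belongs to the container started with --privileged=false, and `dropped_caps` is a simplified, hypothetical parser, not the full libcap text format.

```python
# Capability clause printed by getpcaps for the non-privileged container above.
CLAUSE = (
    "=ep cap_setpcap,cap_net_admin,cap_sys_module,cap_sys_rawio,cap_sys_pacct,"
    "cap_sys_admin,cap_sys_nice,cap_sys_resource,cap_sys_time,cap_sys_tty_config,"
    "cap_mknod,cap_audit_write,cap_audit_control,cap_mac_override,cap_mac_admin-ep"
)


def dropped_caps(clause):
    """Return the capabilities removed by '<list>-ep' clauses."""
    dropped = set()
    for part in clause.split():
        if part.endswith("-ep"):
            dropped.update(part[:-3].split(","))
    return dropped


# Capability that each failing command from the session relies on.
NEEDS = {"rmmod bridge": "cap_sys_module", "iptables -L -n": "cap_net_admin"}

dropped = dropped_caps(CLAUSE)
for command, cap in NEEDS.items():
    print(command, "->", "blocked" if cap in dropped else "allowed")
```

Both commands come out as blocked, even though `id` reports uid 0 inside the container.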

Page 20: Docker - container and lightweight virtualization

Control Group - aka cgroup

cgroups (control groups) is a Linux kernel feature to limit, account for, and isolate the resource usage (CPU, memory, disk I/O, etc.) of process groups. In late 2007 it was merged into kernel version 2.6.24.

By using cgroups, system administrators gain fine-grained control over allocating, prioritizing, denying, managing, and monitoring system resources. Hardware resources can be smartly divided up among tasks and users, increasing overall efficiency.

Cgroups are organized hierarchically, like processes, and child cgroups inherit some of the attributes of their parents.

● blkio - this subsystem sets limits on input/output access to and from block devices such as physical drives (disk, solid state, USB, etc.).
● cpu - this subsystem uses the scheduler to provide cgroup tasks access to the CPU.
● cpuacct - this subsystem generates automatic reports on CPU resources used by tasks in a cgroup.
● cpuset - this subsystem assigns individual CPUs (on a multicore system) and memory nodes to tasks in a cgroup.
● devices - this subsystem allows or denies access to devices by tasks in a cgroup.
● freezer - this subsystem suspends or resumes tasks in a cgroup.
● memory - this subsystem sets limits on memory use by tasks in a cgroup, and generates automatic reports on memory resources used by those tasks.
● net_cls - this subsystem tags network packets with a class identifier (classid) that allows the Linux traffic controller (tc) to identify packets originating from a particular cgroup task.
● net_prio - this subsystem provides a way to dynamically set the priority of network traffic per network interface.
● ns - the namespace subsystem.
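A process's membership in these subsystems can be read from /proc/<pid>/cgroup, where each line has the form hierarchy-ID:subsystems:path. The sketch below parses that format. The sample text is hypothetical (it is not output from the slides), and `parse_cgroup` is an illustrative helper name.

```python
# Hypothetical sample in the /proc/<pid>/cgroup format: hierarchy-ID:subsystems:path
SAMPLE = """\
4:memory:/docker/7c6702b13a48
3:cpuacct,cpu:/docker/7c6702b13a48
2:cpuset:/
"""


def parse_cgroup(text):
    """Map each subsystem name to the cgroup path the process belongs to."""
    membership = {}
    for line in text.splitlines():
        _hierarchy_id, subsystems, path = line.split(":", 2)
        for name in subsystems.split(","):
            membership[name] = path
    return membership


groups = parse_cgroup(SAMPLE)
print(groups["memory"])  # /docker/7c6702b13a48
print(groups["cpuset"])  # /
```

On a real host the same function could be fed the contents of /proc/<pid>/cgroup for a container process to see which cgroup each subsystem places it in.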