containerd Summit: Deep Dive into containerd
Agenda
09:00 - 11:00 - containerd Deep Dive / What’s new / Roadmap (Michael Crosby & team)
• Container execution and supervision
• Image distribution & Local storage
• Network Interfaces Management
• Integrating containerd with other systems, Native plumbing level API, etc.
11:00 - 11:30 - Talk #1 - Use of the gRPC API for “driving” containerd by Phil Estes (IBM)
11:30 - 12:00 - Talk #2 - containerd and Kubernetes CRI by Tim Hockin (Google)
12:30 - 13:00 - Lunch & networking
13:00 - 15:00 - Hacking & Open-source-a-thon
• Container execution and supervision by Michael (video game room)
• Image distribution & Local storage by Stephen and Derek (main room)
15:30 - 16:30 - Feedback on governance - Integrating containerd with other systems (Native plumbing level API, CRI, Networking) by Phil and Tim, Michael (main room)
16:30 - BOFs recap + AMA / panel
17:30 - Happy hour
Donations going to Girl Develop It
$1.5K going to Girl Develop It thanks to your donations!
Girl Develop It, a national nonprofit, provides women with low-cost, judgment-free opportunities to learn software development through in-person programs. In 50 cities throughout the US, they cultivate thriving tech communities built around education and support.
Docker Internals Summit @ DockerCon
• containerd only in the AM
• Other Docker Internals in the PM (Libnetwork, Notary, SwarmKit, InfraKit, VPNKit, DataKit, HyperKit, etc.)
You don’t have to attend the whole conference to attend this summit on 4/20
containerd: What is a Core Container Runtime?
Component that provides core primitives to manage containers on a host
• Container execution and supervision
• Image distribution
• Network Interfaces & Mgmt
• Local storage
• Native plumbing level API
containerd’s role in Container Ecosystem
containerd 1.0 planned for Q2 2017
Architecture & Flow
• Distribution
• Content Store
• Snapshots
  – RO Image Data
  – RW Container Data
• Bundle Creation
  – Configuration
  – Root Filesystem
• Execution
Evolution
• containerd is an evolution, not a rewrite
containerd
containerd report
What do runtimes need?
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
  "manifests": [
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v1+json",
      "size": 2094,
      "digest": "sha256:7820f9a86d4ad15a2c4f0c0e5479298df2aa7c2f6871288e2ef8546f3e7b6783",
      "platform": { "architecture": "ppc64le", "os": "linux" }
    },
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v1+json",
      "size": 1922,
      "digest": "sha256:ae1b0e06e8ade3a11267564a26e750585ba2259c0ecab59ab165ad1af41d1bdd",
      "platform": { "architecture": "amd64", "os": "linux", "features": [ "sse" ] }
    },
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v1+json",
      "size": 2084,
      "digest": "sha256:e4c0df75810b953d6717b8f8f28298d73870e8aa2a0d5e77b8391f16fdfbbbe2",
      "platform": { "architecture": "s390x", "os": "linux" }
    },
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v1+json",
      "size": 2084,
      "digest": "sha256:07ebe243465ef4a667b78154ae6c3ea46fdb1582936aac3ac899ea311a701b40",
      "platform": { "architecture": "arm", "os": "linux", "variant": "armv7" }
    },
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v1+json",
      "size": 2090,
      "digest": "sha256:fb2fc0707b86dafa9959fe3d29e66af8787aee4d9a23581714be65db4265ad8a",
      "platform": { "architecture": "arm64", "os": "linux", "variant": "armv8" }
    }
  ]
}
Image Formats
Docker and OCI
Index (Manifest List)
  Manifests:
  • linux amd64
  • linux ppc64le
  • windows amd64
  • linux arm64
Manifest
  Layers: L0, L1, … Ln
  Config: C
Root Filesystem
  /usr /bin /dev /etc /home /lib
OCI Spec
  process: args, env, cwd, …
  root
  mounts
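Choosing a manifest from an index comes down to matching the platform fields. The following is a minimal Go sketch, not containerd's implementation: the struct names and the `selectManifest` helper are hypothetical, and the sample document is a trimmed copy of the manifest list from the slide.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// platform and descriptor mirror only the fields used in the slide's JSON.
type platform struct {
	Architecture string `json:"architecture"`
	OS           string `json:"os"`
	Variant      string `json:"variant,omitempty"`
}

type descriptor struct {
	MediaType string   `json:"mediaType"`
	Size      int64    `json:"size"`
	Digest    string   `json:"digest"`
	Platform  platform `json:"platform"`
}

type index struct {
	SchemaVersion int          `json:"schemaVersion"`
	Manifests     []descriptor `json:"manifests"`
}

// sample is a two-entry excerpt of the manifest list shown on the slide.
const sample = `{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
  "manifests": [
    {"mediaType": "application/vnd.docker.distribution.manifest.v1+json", "size": 2094,
     "digest": "sha256:7820f9a86d4ad15a2c4f0c0e5479298df2aa7c2f6871288e2ef8546f3e7b6783",
     "platform": {"architecture": "ppc64le", "os": "linux"}},
    {"mediaType": "application/vnd.docker.distribution.manifest.v1+json", "size": 1922,
     "digest": "sha256:ae1b0e06e8ade3a11267564a26e750585ba2259c0ecab59ab165ad1af41d1bdd",
     "platform": {"architecture": "amd64", "os": "linux"}}
  ]
}`

// selectManifest (hypothetical helper) returns the digest of the first
// entry whose platform matches the requested OS and architecture.
func selectManifest(raw []byte, osName, arch string) (string, error) {
	var idx index
	if err := json.Unmarshal(raw, &idx); err != nil {
		return "", err
	}
	for _, m := range idx.Manifests {
		if m.Platform.OS == osName && m.Platform.Architecture == arch {
			return m.Digest, nil
		}
	}
	return "", fmt.Errorf("no manifest for %s/%s", osName, arch)
}

func main() {
	dgst, err := selectManifest([]byte(sample), "linux", "amd64")
	fmt.Println(dgst, err)
}
```

A real selector would probably also compare `variant` and `features`; this sketch ignores them to stay short.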
Content Addressability
digest.FromString("foo") -> "sha256:2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae"
digest.FromString("foo tampered") -> "sha256:51f7f1d1f6bebed72b936c8ea257896cb221b91d303c5b5c44073fce33ab8dd8"
digest.FromString("bar sha256:2c...") -> "sha256:2e94890c66fbcccca9ad680e1b1c933cc323a5b4bcb14cc8a4bc78bb88d41055"
[Diagram: “foo” next to “bar sha256:2c…”, and “foo tampered” next to the same “bar sha256:2c…”]
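The digest idea can be tried with the Go standard library alone. In this small sketch `fromString` is a hypothetical stand-in for the `digest.FromString` call on the slide, assuming it is a SHA-256 over the string's bytes, formatted as `sha256:<hex>`.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// fromString is a hypothetical stand-in for digest.FromString:
// the SHA-256 of s, rendered as "sha256:<hex>".
func fromString(s string) string {
	sum := sha256.Sum256([]byte(s))
	return "sha256:" + hex.EncodeToString(sum[:])
}

func main() {
	foo := fromString("foo")
	fmt.Println(foo)

	// A parent object that embeds the child's digest is itself addressed
	// by a digest, so the reference chain is verifiable end to end.
	parent := "bar " + foo
	fmt.Println(fromString(parent))

	// Any change to the child yields a different digest, so the parent's
	// embedded reference no longer matches the tampered content.
	fmt.Println(fromString("foo tampered") == foo) // prints false
}
```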
Image Formats
Docker and OCI
Index (Manifest List)
  Manifests (each referenced by Digest):
  • linux amd64
  • linux ppc64le
  • windows amd64
  • linux arm64
Manifest
  Layers: L0, L1, … Ln (each a Digest of a Layer File)
  Config: C (Digest)
Pulling an Image
Data Flow
[Diagram labels: Remote; Pull (Fetch, Unpack); Content, Metadata, Snapshots; Events; Mounts]
Content Service
// Content provides access to a content addressable storage system.
service Content {
  // Info returns information about a committed object.
  rpc Info(InfoRequest) returns (InfoResponse);

  // Read allows one to read an object based on the offset into the content.
  rpc Read(ReadRequest) returns (stream ReadResponse);

  // Status returns the status of ongoing object ingestions, started via
  // Write.
  rpc Status(StatusRequest) returns (stream StatusResponse);

  // Write begins or resumes writes to a resource identified by a unique ref.
  // Only one active stream may exist at a time for each ref.
  rpc Write(stream WriteRequest) returns (stream WriteResponse);
}
Content Service
[Diagram labels: Write; Content; Digested; Read]
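To make the Write and Read halves concrete, here is a minimal in-memory sketch in Go. It is not the gRPC service: the `store` type and its methods are hypothetical, and it assumes that an ingest identified by a ref becomes readable only after it is committed against an expected digest.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// store is a hypothetical in-memory content store.
type store struct {
	ingests map[string]*bytes.Buffer // active writes, keyed by ref
	blobs   map[string][]byte        // committed content, keyed by digest
}

func newStore() *store {
	return &store{ingests: map[string]*bytes.Buffer{}, blobs: map[string][]byte{}}
}

// write begins or resumes the ingest identified by ref.
func (s *store) write(ref string, p []byte) {
	buf, ok := s.ingests[ref]
	if !ok {
		buf = &bytes.Buffer{}
		s.ingests[ref] = buf
	}
	buf.Write(p)
}

// commit verifies the ingested bytes against the expected digest
// (an assumption of this sketch) and makes them readable by digest.
func (s *store) commit(ref, expected string) error {
	buf, ok := s.ingests[ref]
	if !ok {
		return fmt.Errorf("no active ingest %q", ref)
	}
	sum := sha256.Sum256(buf.Bytes())
	actual := "sha256:" + hex.EncodeToString(sum[:])
	if actual != expected {
		return fmt.Errorf("digest mismatch: got %s, want %s", actual, expected)
	}
	s.blobs[actual] = append([]byte(nil), buf.Bytes()...)
	delete(s.ingests, ref)
	return nil
}

// read returns committed content addressed by its digest.
func (s *store) read(dgst string) ([]byte, bool) {
	b, ok := s.blobs[dgst]
	return b, ok
}

func main() {
	s := newStore()
	s.write("layer-0", []byte("fo")) // first chunk
	s.write("layer-0", []byte("o"))  // resumed write on the same ref
	dgst := "sha256:2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae"
	fmt.Println(s.commit("layer-0", dgst))
	b, ok := s.read(dgst)
	fmt.Println(string(b), ok)
}
```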
How does one get a manifest?
Resolution
Getting a digest from a name
ubuntu -> sha256:71cd81252a3563a03ad8daee81047b62ab5d892ebbfbf71cf53415f29c130950
Names in docker
Reference Type           CLI                          Canonical
Repository               ubuntu                       docker.io/library/ubuntu
Untagged                 ubuntu                       docker.io/library/ubuntu:latest
Tagged                   ubuntu:16.04                 docker.io/library/ubuntu:16.04
Content Trust            ubuntu:latest                docker.io/library/ubuntu@sha256:...
By digest                ubuntu@sha256:....           docker.io/library/ubuntu@sha256:...
Unofficial tagged        stevvooe/ubuntu:latest       docker.io/stevvooe/ubuntu:latest
Private registry tagged  myregistry.com/repo:latest   myregistry.com/repo:latest
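The CLI-to-canonical mapping in the table can be approximated in a few lines of Go. This is a simplified sketch, not Docker's reference parser: `canonical` is a hypothetical helper, it assumes a first path component containing `.` or `:` (or equal to `localhost`) is a registry host, and it always appends `:latest` when no tag or digest is given (the "Untagged" row).

```go
package main

import (
	"fmt"
	"strings"
)

// canonical (hypothetical helper) expands a CLI-style image name into
// a fully qualified name following the rules visible in the table.
func canonical(name string) string {
	domain, rest := "docker.io", name
	if i := strings.Index(name, "/"); i >= 0 {
		first := name[:i]
		// Assumption: a host has a dot or a port, or is "localhost".
		if strings.ContainsAny(first, ".:") || first == "localhost" {
			domain, rest = first, name[i+1:]
		}
	}
	// Official images live under the "library" namespace on docker.io.
	if domain == "docker.io" && !strings.Contains(rest, "/") {
		rest = "library/" + rest
	}
	// Add the default tag only when neither a tag nor a digest is present.
	last := rest[strings.LastIndex(rest, "/")+1:]
	if !strings.Contains(rest, "@") && !strings.Contains(last, ":") {
		rest += ":latest"
	}
	return domain + "/" + rest
}

func main() {
	for _, n := range []string{
		"ubuntu",
		"ubuntu:16.04",
		"stevvooe/ubuntu:latest",
		"myregistry.com/repo:latest",
	} {
		fmt.Printf("%-28s %s\n", n, canonical(n))
	}
}
```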
Other approaches
- Self Describing
  - Massive collisions
  - Complex trust scenarios
- URI Schemes: docker://docker.io/library/ubuntu
  - Redundant
  - Confuses protocols and formats
  - Operationally Limiting
- Let configuration choose protocol and format
Naming
Locators: Scheme-less URIs
• ubuntu (docker name)
• docker.io/library/ubuntu:latest (docker canonical)
• (docker.io/library/ubuntu, latest) (locator object)
Remotes
Locators and Resolution
type Fetcher interface {
	Fetch(ctx context.Context, id string, hints ...string) (io.ReadCloser, error)
}

type Resolver interface {
	Resolve(ctx context.Context, locator string) (Fetcher, error)
}

fetcher, err := resolver.Resolve(ctx, "docker.io/library/ubuntu")

Endlessly Configurable!
(hint: think git remotes)
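One way to picture the "git remotes" hint is a resolver that is nothing more than a configured table from locator to fetcher. The sketch below repeats the two interfaces from the slide; `memResolver`, `memFetcher`, and `fetchString` are hypothetical names used for illustration only.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"strings"
)

type Fetcher interface {
	Fetch(ctx context.Context, id string, hints ...string) (io.ReadCloser, error)
}

type Resolver interface {
	Resolve(ctx context.Context, locator string) (Fetcher, error)
}

// memFetcher (hypothetical) serves objects from an in-memory map.
type memFetcher map[string]string

func (f memFetcher) Fetch(ctx context.Context, id string, hints ...string) (io.ReadCloser, error) {
	data, ok := f[id]
	if !ok {
		return nil, fmt.Errorf("object %q not found", id)
	}
	return io.NopCloser(strings.NewReader(data)), nil
}

// memResolver (hypothetical) maps locators to fetchers, much like a
// table of configured remotes.
type memResolver map[string]Fetcher

func (r memResolver) Resolve(ctx context.Context, locator string) (Fetcher, error) {
	f, ok := r[locator]
	if !ok {
		return nil, fmt.Errorf("no remote configured for %q", locator)
	}
	return f, nil
}

// fetchString resolves the locator, fetches id, and returns its content.
func fetchString(r Resolver, locator, id string) (string, error) {
	ctx := context.Background()
	f, err := r.Resolve(ctx, locator)
	if err != nil {
		return "", err
	}
	rc, err := f.Fetch(ctx, id)
	if err != nil {
		return "", err
	}
	defer rc.Close()
	b, err := io.ReadAll(rc)
	return string(b), err
}

func main() {
	r := memResolver{
		"docker.io/library/ubuntu": memFetcher{"latest": "manifest bytes"},
	}
	fmt.Println(fetchString(r, "docker.io/library/ubuntu", "latest"))
}
```

Because callers only see `Resolver` and `Fetcher`, the protocol and format behind a locator can be swapped through configuration without touching the calling code.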
Pulling an Image
1. Resolve manifest or index (manifest list)
2. Download all the resources referenced by the manifest
3. Unpack layers into snapshots
4. Register the mappings between manifests and constituent resources
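Step 2 depends on knowing what a manifest references. The sketch below, a minimal illustration rather than containerd code, parses a schema 2 style manifest and lists the config and layer digests to download; the `references` helper is hypothetical and the digests in the sample are shortened placeholders, not real content addresses.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// descriptor and manifest cover only the fields needed to find references.
type descriptor struct {
	MediaType string `json:"mediaType"`
	Size      int64  `json:"size"`
	Digest    string `json:"digest"`
}

type manifest struct {
	Config descriptor   `json:"config"`
	Layers []descriptor `json:"layers"`
}

// sample uses placeholder digests for readability.
const sample = `{
  "schemaVersion": 2,
  "config": {"digest": "sha256:cfg"},
  "layers": [{"digest": "sha256:l0"}, {"digest": "sha256:l1"}]
}`

// references (hypothetical helper) returns every digest the manifest
// points at: the config first, then the layers in order.
func references(raw []byte) ([]string, error) {
	var m manifest
	if err := json.Unmarshal(raw, &m); err != nil {
		return nil, err
	}
	refs := []string{m.Config.Digest}
	for _, l := range m.Layers {
		refs = append(refs, l.Digest)
	}
	return refs, nil
}

func main() {
	refs, err := references([]byte(sample))
	fmt.Println(refs, err)
}
```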
The Dist Tool
Experimental Toolkit for Image Distribution
$ ./bin/dist
USAGE:
   dist [global options] command [command options] [arguments...]

VERSION:
   a463ba3.m

COMMANDS:
   pull         pull an image into containerd
   fetch        retrieve objects from a remote
   ingest       accept content into the store
   active       display active transfers.
   get          get the data for an object
   delete, del  permanently delete one or more blobs.
   list, ls     list all blobs in the store.
   apply        apply layer from stdin to dir
   help, h      Shows a list of commands or help for one command
Docker Graph Driver
• History
  – AUFS - union filesystem model for layers
  – Graph Driver interface
    • Block level snapshots (devicemapper, btrfs, zfs)
    • Union filesystems (aufs, overlay)
  – Content Addressability (1.10.0)
    • No changes to graph driver
    • Layerstore - content addressability over layers
    • ImageStore - content addressability over images
    • ReferenceStore - name to image content address
Docker Storage Architecture
• Graph Driver: “layers”, “mounts”
• Layer Store: “content addressable layers”
• Image Store: “image configs”
• Containers: “container configs”
• Reference Store: “names to image”
• Daemon
Containerd Storage Architecture
• Snapshotter: “layer snapshots”
• Content Store: “content addressed blobs”
• Metadata Store: “references”
[Diagram labels: dist, ctr; Config; Rootfs (mounts)]
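Snapshots
type Snapshotter interface {
	// Stat returns information about the snapshot identified by key.
	Stat(key string) (Info, error)
	// Mounts returns the mounts for the snapshot identified by key.
	Mounts(key string) ([]containerd.Mount, error)
	// Prepare creates an active (rw) snapshot on top of parent.
	Prepare(key, parent string) ([]containerd.Mount, error)
	// View creates a read-only view of parent.
	View(key, parent string) ([]containerd.Mount, error)
	// Commit turns the active snapshot key into the committed snapshot name.
	Commit(name, key string) error
	// Remove deletes the snapshot identified by key.
	Remove(key string) error
	// Walk calls fn for each snapshot.
	Walk(fn func(Info) error) error
}

type Info struct {
	Name     string // name or key of snapshot
	Parent   string
	Kind     Kind
	Readonly bool
}

type Kind int

const (
	KindActive Kind = iota
	KindCommitted
)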
● No mounting, just returns mounts!
● Explicit active (rw) and committed (ro)
● Commands represent lifecycle
● Reference key chosen by caller (allows using content addresses)
● No tars and no diffs
Evolved from Graph Drivers
● Simple layer relationships
● Small and focused interface
● Non-opinionated string keys
Snapshot Model
[Diagram: Active and Committed columns. Committed snapshots P0, P1, P2; active snapshots a, a′, a′′. Operations shown: Prepare(a, P0), Commit(P1, a′), Commit(P2, a′′), Remove(c).]
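The active/committed lifecycle can be modeled without any filesystem work. This Go sketch tracks only snapshot metadata; `memSnapshotter` is hypothetical, it returns no mounts, and it assumes a parent must be a committed snapshot, which the slides imply but do not state.

```go
package main

import (
	"errors"
	"fmt"
)

type Kind int

const (
	KindActive Kind = iota
	KindCommitted
)

type Info struct {
	Name   string
	Parent string
	Kind   Kind
}

// memSnapshotter (hypothetical) records snapshot relationships only.
type memSnapshotter struct{ snaps map[string]Info }

func newMemSnapshotter() *memSnapshotter {
	return &memSnapshotter{snaps: map[string]Info{}}
}

// Prepare creates an active snapshot on top of parent ("" means no parent).
func (s *memSnapshotter) Prepare(key, parent string) error {
	if _, ok := s.snaps[key]; ok {
		return fmt.Errorf("snapshot %q already exists", key)
	}
	if parent != "" {
		// Assumption of this sketch: only committed snapshots can be parents.
		p, ok := s.snaps[parent]
		if !ok || p.Kind != KindCommitted {
			return errors.New("parent must be a committed snapshot")
		}
	}
	s.snaps[key] = Info{Name: key, Parent: parent, Kind: KindActive}
	return nil
}

// Commit turns the active snapshot key into the committed snapshot name.
func (s *memSnapshotter) Commit(name, key string) error {
	a, ok := s.snaps[key]
	if !ok || a.Kind != KindActive {
		return fmt.Errorf("%q is not an active snapshot", key)
	}
	if _, ok := s.snaps[name]; ok {
		return fmt.Errorf("snapshot %q already exists", name)
	}
	delete(s.snaps, key)
	s.snaps[name] = Info{Name: name, Parent: a.Parent, Kind: KindCommitted}
	return nil
}

// Stat returns the recorded info for key.
func (s *memSnapshotter) Stat(key string) (Info, error) {
	info, ok := s.snaps[key]
	if !ok {
		return Info{}, fmt.Errorf("snapshot %q not found", key)
	}
	return info, nil
}

func main() {
	s := newMemSnapshotter()
	fmt.Println(s.Prepare("base", ""))   // active snapshot with no parent
	fmt.Println(s.Commit("P0", "base"))  // becomes committed P0
	fmt.Println(s.Prepare("a", "P0"))    // active snapshot on top of P0
	fmt.Println(s.Commit("P1", "a"))     // becomes committed P1
	fmt.Println(s.Stat("P1"))
}
```

Since the caller chooses the keys, a committed name could just as well be a content address.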
Networking in Containerd...
• No networking in containerd
• https://github.com/docker/containerd/issues/362
Networking in Containerd...
• Provide a network namespace
  – Join a pre-populated network namespace
• Use OCI Hooks to initialize namespace
  – Exec a command with the container’s state to initialize network
• Setup networking between create and start
  – Create container
  – Setup network interfaces
  – Start user’s process
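The third option relies on the split between create and start. The Go sketch below shows only that ordering with a fake container; the `container` interface, `fake`, `run`, and `demo` are all hypothetical names, and no real namespaces or interfaces are touched.

```go
package main

import (
	"errors"
	"fmt"
)

// container (hypothetical) exposes the two lifecycle steps that
// network setup has to fit between.
type container interface {
	Create() error
	Start() error
}

// fake records which lifecycle steps were invoked.
type fake struct{ log *[]string }

func (f fake) Create() error { *f.log = append(*f.log, "create"); return nil }
func (f fake) Start() error  { *f.log = append(*f.log, "start"); return nil }

// run creates the container, lets the caller configure networking,
// and only then starts the user's process.
func run(c container, setupNetwork func() error) error {
	if err := c.Create(); err != nil {
		return err
	}
	if err := setupNetwork(); err != nil {
		// The user's process is never started without a network.
		return fmt.Errorf("network setup: %w", err)
	}
	return c.Start()
}

// demo runs the sequence and returns the steps that were executed.
func demo(failNetwork bool) ([]string, error) {
	var log []string
	err := run(fake{&log}, func() error {
		if failNetwork {
			return errors.New("no interface available")
		}
		log = append(log, "network")
		return nil
	})
	return log, err
}

func main() {
	fmt.Println(demo(false))
	fmt.Println(demo(true))
}
```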
Runtime
• Manage Containers Lifecycle
• Mount Root Filesystems
  – No container mounting in the daemon
• Resilient to daemon death (e.g. Restore Containers)
• Multi-Platform Support
  – Differences in functionality
Runtimes
type Runtime interface {
	Create(ctx context.Context, id string, opts CreateOpts) (Container, error)
	Containers(context.Context) ([]Container, error)
	Delete(context.Context, Container) error
	Events(context.Context) <-chan *Event
}

type Container interface {
	Info() ContainerInfo
	Start(context.Context) error
	State(context.Context) (State, error)
}
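To see how a plugin might satisfy these interfaces, here is a minimal in-memory sketch in Go. The interfaces are copied from the slide; `CreateOpts`, `ContainerInfo`, `State`, and `Event` are not defined there, so the versions below are hypothetical stand-ins, as are `memRuntime`, `memContainer`, and `lifecycle`. The sketch is not safe for concurrent use.

```go
package main

import (
	"context"
	"fmt"
)

// Hypothetical stand-ins for types the slide leaves undefined.
type CreateOpts struct{ Bundle string }
type ContainerInfo struct{ ID string }
type State string
type Event struct{ ID, Type string }

type Container interface {
	Info() ContainerInfo
	Start(context.Context) error
	State(context.Context) (State, error)
}

type Runtime interface {
	Create(ctx context.Context, id string, opts CreateOpts) (Container, error)
	Containers(context.Context) ([]Container, error)
	Delete(context.Context, Container) error
	Events(context.Context) <-chan *Event
}

type memContainer struct {
	id     string
	state  State
	events chan<- *Event
}

func (c *memContainer) Info() ContainerInfo { return ContainerInfo{ID: c.id} }

func (c *memContainer) Start(context.Context) error {
	c.state = "running"
	c.events <- &Event{ID: c.id, Type: "start"}
	return nil
}

func (c *memContainer) State(context.Context) (State, error) { return c.state, nil }

// memRuntime keeps containers in a map and buffers events; a send
// blocks once 16 events are pending and nobody is reading.
type memRuntime struct {
	containers map[string]*memContainer
	events     chan *Event
}

func newMemRuntime() *memRuntime {
	return &memRuntime{containers: map[string]*memContainer{}, events: make(chan *Event, 16)}
}

func (r *memRuntime) Create(ctx context.Context, id string, opts CreateOpts) (Container, error) {
	if _, ok := r.containers[id]; ok {
		return nil, fmt.Errorf("container %q already exists", id)
	}
	c := &memContainer{id: id, state: "created", events: r.events}
	r.containers[id] = c
	r.events <- &Event{ID: id, Type: "create"}
	return c, nil
}

func (r *memRuntime) Containers(context.Context) ([]Container, error) {
	var out []Container
	for _, c := range r.containers {
		out = append(out, c)
	}
	return out, nil
}

func (r *memRuntime) Delete(ctx context.Context, c Container) error {
	delete(r.containers, c.Info().ID)
	return nil
}

func (r *memRuntime) Events(context.Context) <-chan *Event { return r.events }

// lifecycle creates and starts one container, then reports its state
// and the type of the first event emitted.
func lifecycle() (State, string, error) {
	ctx := context.Background()
	var rt Runtime = newMemRuntime()
	c, err := rt.Create(ctx, "c1", CreateOpts{Bundle: "/tmp/bundle"})
	if err != nil {
		return "", "", err
	}
	if err := c.Start(ctx); err != nil {
		return "", "", err
	}
	st, err := c.State(ctx)
	first := <-rt.Events(ctx)
	return st, first.Type, err
}

func main() {
	fmt.Println(lifecycle())
}
```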
Integration
• Extensible via plugins
  – runtimes
  – grpc services
  – snapshotters
• Lazy Porting Over
• Streamlined client experience
  – magic lies within containerd
  – concentrate on added value
Roadmap
• End2End PoC
  – Fetch
  – Store
  – Overlay
  – Execution
• Metadata Store
• Windows Support
Meeting Notes
Meeting notes from the various sessions will be sent as a PR to the containerd GitHub repo
THANK YOU