linux-cr: transparent application checkpoint-restart in linuxorenl/talks/ksummit-2010.pdf ·...

Post on 30-May-2020

4 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Linux Kernel Summit, November 2010 1 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 1 orenl@cs.columbia.edu

Linux-CR:

Transparent Application Checkpoint-Restart in Linux

Linux-CR:

Transparent Application Checkpoint-Restart in Linux

Oren LaadanColumbia University

orenl@cs.columbia.edu

Linux Kernel Summit, November 2010 2 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 2 orenl@cs.columbia.edu

Application C/RApplication C/R

◆ Application Checkpoint/Restart

a mechanism to save the state ofrunning application(s) so that they can later resume execution from that point

Linux Kernel Summit, November 2010 3 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 3 orenl@cs.columbia.edu

checkpointimage

Application C/RApplication C/R

original restoredhierarchy hierarchy

restartcheckpoint

Linux Kernel Summit, November 2010 4 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 4 orenl@cs.columbia.edu

What is it good for ?What is it good for ?

◆ Application roll back to the past◆ Application suspend and resume◆ Application migration

Linux Kernel Summit, November 2010 5 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 5 orenl@cs.columbia.edu

Application rollbackApplication rollback

◆ Fault tolerance◆ Effective debugging◆ Fast application start-up◆ Software testing◆ Generic time-machine

Linux Kernel Summit, November 2010 6 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 6 orenl@cs.columbia.edu

Application rollbackApplication rollback

◆ Fault tolerance◆ long running applications◆ cloud, HPC, at work, at home

◆ Effective debugging◆ Fast application start-up◆ Software testing◆ Generic time-machine

Linux Kernel Summit, November 2010 7 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 7 orenl@cs.columbia.edu

Application rollbackApplication rollback

◆ Fault tolerance◆ Effective debugging

◆ Super-core-dump● more details, multiple tasks

◆ re-run from checkpoint● trace, profile, and instrument

◆ Fast application start-up◆ Software testing◆ Generic time-machine

Linux Kernel Summit, November 2010 8 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 8 orenl@cs.columbia.edu

Application rollbackApplication rollback

◆ Fault tolerance◆ Effective debugging◆ Fast application start-up

◆ from default/previous state (ccache...)◆ improve desktop boot time

◆ Software testing◆ Generic time-machine

Linux Kernel Summit, November 2010 9 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 9 orenl@cs.columbia.edu

Application rollbackApplication rollback

◆ Fault tolerance◆ Effective debugging◆ Fast application start-up◆ Software testing

◆ repeat from specific point(s)◆ distribute on multiple hosts

◆ Generic time-machine

Linux Kernel Summit, November 2010 10 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 10 orenl@cs.columbia.edu

Application rollbackApplication rollback

◆ Fault tolerance◆ Effective debugging◆ Fast application start-up◆ Software testing◆ Generic time-machine

◆ revive old server/desktop state◆ retry a move in a game

Linux Kernel Summit, November 2010 11 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 11 orenl@cs.columbia.edu

Application suspend/resumeApplication suspend/resume

◆ Improved OOM handling◆ Better system utilization◆ Suspend/resume a user's session

Linux Kernel Summit, November 2010 12 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 12 orenl@cs.columbia.edu

Application suspend/resumeApplication suspend/resume

◆ Improved OOM handling◆ suspend applications, don't kill◆ smart “swap” on embedded

◆ Better system utilization◆ Suspend/resume a user's session

Linux Kernel Summit, November 2010 13 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 13 orenl@cs.columbia.edu

Application suspend/resumeApplication suspend/resume

◆ Improved OOM handling◆ Better system utilization

◆ suspend application to reduce load

◆ Suspend/resume a user's session

Linux Kernel Summit, November 2010 14 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 14 orenl@cs.columbia.edu

Application suspend/resumeApplication suspend/resume

◆ Improved OOM handling◆ Better system utilization◆ Suspend/resume a user's session

◆ mobile desktop on USB key◆ linux based VPS/VDI systems

Linux Kernel Summit, November 2010 15 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 15 orenl@cs.columbia.edu

Application MigrationApplication Migration

◆ Load balancing / resource sharing◆ Zero-downtime maintenance◆ High availability

Linux Kernel Summit, November 2010 16 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 16 orenl@cs.columbia.edu

Application MigrationApplication Migration

◆ Load balancing / resource sharing◆ HPC (e.g. BlueWaters project)◆ cloud environments◆ linux-based VPS/VDI

◆ Zero-downtime maintenance◆ High availability

Linux Kernel Summit, November 2010 17 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 17 orenl@cs.columbia.edu

Application MigrationApplication Migration

◆ Load balancing / resource sharing◆ Zero-downtime maintenance

◆ live migration of applications

◆ High availability

Linux Kernel Summit, November 2010 18 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 18 orenl@cs.columbia.edu

Application MigrationApplication Migration

◆ Load balancing / resource sharing◆ Zero-downtime maintenance◆ High availability

◆ primary/backup in lock-step◆ frequent incremental checkpoints

Linux Kernel Summit, November 2010 19 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 19 orenl@cs.columbia.edu

Application vs Virtual-MachineApplication vs Virtual-Machine

Application Virtual C/R Machine

granularity specific operating systemapplications as a whole unit

saved state application entire operatingstate only system state

overhead none visible

flexibility application operating systemawareness is black box

deployment linux only same arch family

Linux Kernel Summit, November 2010 20 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 20 orenl@cs.columbia.edu

Some examplesSome examples

◆ HPC environments◆ can extend linux-cr → distributed-cr

◆ Cloud deployments◆ using linux containers

◆ Light-weight clusters of ARMs◆ combine LXC and linux-cr

Linux Kernel Summit, November 2010 21 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 21 orenl@cs.columbia.edu

Some concrete examplesSome concrete examples

◆ BlueWaters◆ NCSA's most powerful supercomputer◆ checkpointing based on linux-cr

◆ OpenVZ◆ VPS hosting with migration capabilities

◆ Canonical / Ubuntu◆ add LXC & linux-cr in UEC cluster stack

Linux Kernel Summit, November 2010 22 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 22 orenl@cs.columbia.edu

Who and WhoWho and Who

◆ Who is doing ?◆ Matt Helsley, Dan Smith, Serge Hallyn,

Nathan Lynch, Sukadev Bhattiprolu, me

◆ Who is interested ?◆ IBM, Canonical, OpenVZ, HPC industry,

Kerrighed, Google (?), ...

◆ Who else does/did ?◆ OS: AIX, OpenVZ, IRIX, Cray...◆ Systems: Moab, BLCR/Beowolf, Condor...

Linux Kernel Summit, November 2010 23 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 23 orenl@cs.columbia.edu

Linux-C/R design goalsLinux-C/R design goals

◆ Transparency◆ Reliability◆ Security/safety◆ Performance◆ Maintainability

Linux Kernel Summit, November 2010 24 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 24 orenl@cs.columbia.edu

Linux-C/R design goalsLinux-C/R design goals

◆ Transparency◆ applications oblivious to operation◆ allow notify of checkpoint or restart◆ allow application awareness

◆ Reliability◆ Security/safety◆ Performance◆ Maintainability

Linux Kernel Summit, November 2010 25 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 25 orenl@cs.columbia.edu

Linux-C/R design goalsLinux-C/R design goals

◆ Transparency◆ Reliability

◆ checkpoint succeeds → restart succeeds◆ report non-checkpoint-able reasons◆ checkpoint is non-intrusive

◆ Security/safety◆ Performance◆ Maintainability

Linux Kernel Summit, November 2010 26 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 26 orenl@cs.columbia.edu

Linux-C/R design goalsLinux-C/R design goals

◆ Transparency◆ Reliability◆ Security/safety

◆ ptrace capabilities to checkpoint◆ reuse kernel code to reconstruct state

◆ Performance◆ Maintainability

Linux Kernel Summit, November 2010 27 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 27 orenl@cs.columbia.edu

Linux-C/R design goalsLinux-C/R design goals

◆ Transparency◆ Reliability◆ Security/safety◆ Performance

◆ zero impact on performance◆ reasonable code footprint

◆ Maintainability

Linux Kernel Summit, November 2010 28 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 28 orenl@cs.columbia.edu

Linux-C/R design goalsLinux-C/R design goals

◆ Transparency◆ Reliability◆ Security/safety◆ Performance◆ Maintainability

◆ next slide ...

Linux Kernel Summit, November 2010 29 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 29 orenl@cs.columbia.edu

MaintainabilityMaintainability

◆ Placement of C/R code◆ Extensive test-suite◆ Positive experience so far◆ Impact on developers

Linux Kernel Summit, November 2010 30 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 30 orenl@cs.columbia.edu

MaintainabilityMaintainability

◆ Placement of C/R code◆ generic code in kernel/checkpoint/...◆ most c/r code with or near subsystem

code so subsytem maintainers sees it◆ c/r is well documented

◆ Extensive test-suite◆ Positive experience so far◆ Impact on developers

Linux Kernel Summit, November 2010 31 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 31 orenl@cs.columbia.edu

MaintainabilityMaintainability

◆ Placement of C/R code◆ Extensive test-suite

◆ test large list (>120) of scenarios◆ test before/during/after behavior

◆ Positive experience so far◆ Impact on developers

Linux Kernel Summit, November 2010 32 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 32 orenl@cs.columbia.edu

MaintainabilityMaintainability

◆ Placement of C/R code◆ Extensive test-suite◆ Positive experience (2.6.27 → today)

◆ can ignore most kernel changes◆ mainly need to add features◆ minor changes to prior c/r code

● e.g. splice/pipe, syscalls #s, mm helpers

◆ Impact on developers

Linux Kernel Summit, November 2010 33 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 33 orenl@cs.columbia.edu

MaintainabilityMaintainability

◆ Placement of C/R code◆ Extensive test-suite◆ Positive experience so far◆ Impact on developers

◆ understand what may affect c/r code◆ at least notify c/r people when needed◆ awareness will grow with exposure

Linux Kernel Summit, November 2010 34 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 34 orenl@cs.columbia.edu

Design SummaryDesign Summary

◆ Save/restore state in-kernel◆ Checkpoint container/subtree/self◆ Image holds “user-visible” state◆ Userspace image conversion◆ Detailed error reporting

Linux Kernel Summit, November 2010 35 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 35 orenl@cs.columbia.edu

CheckpointCheckpoint

(1) Freeze process hierarchy

(2) Save global data

(3) Save process hierarchy

(4) Save state of all tasks

(?) Filesystem snapshot

(5) Thaw/kill process hierarchy

in-kernel

Linux Kernel Summit, November 2010 36 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 36 orenl@cs.columbia.edu

RestartRestart

(1) Create container

(?) Restore (stage) filesystem

(3) Create process hierarchy

(4) Restore state of all tasks

(5) Resume execution

in-kernel

Linux Kernel Summit, November 2010 37 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 37 orenl@cs.columbia.edu

Current StateCurrent State

◆ Supported subsystems:◆ tasks (threads, signals, credentials, etc)◆ namespaces (all but mounts-ns)◆ sysvipc (shm, msg, sem)◆ files, dirs (regular, fifos/pipes, epoll,

event, simple devices)◆ sockets (unix, ipv4, ipv6)◆ security (smack, selinux labels)

Linux Kernel Summit, November 2010 38 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 38 orenl@cs.columbia.edu

Current StateCurrent State

◆ What's missing◆ [reviewed] file locks, leases, owner◆ [reviewed] unlinked files/dirs◆ [wip] fanotify/inotify/dnotify◆ [wip] mounts, mount-ns◆ /proc filesystem◆ ptraced tasks◆ more devices

Linux Kernel Summit, November 2010 39 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 39 orenl@cs.columbia.edu

Current StateCurrent State

◆ Supported architectures:◆ x86-32◆ x86-64◆ s390x◆ PowerPC◆ ARM

Linux Kernel Summit, November 2010 40 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 40 orenl@cs.columbia.edu

Current StateCurrent State

◆ Code:◆ ~23K lines in total

● ~1200 lines documentation● ~600 lines per arch (x5)● ~8K lines in kernel/checkpoint/* (base)● ~7K lines “near-place” files (*/checkpoint.c)● ~2K lines “in-place” save/restore● ~2K lines for LSM

Linux Kernel Summit, November 2010 41 orenl@cs.columbia.eduLinux Kernel Summit, November 2010 41 orenl@cs.columbia.edu

Discussion ...Discussion ...

◆ Concrete path to mainline (mm/next?)◆ Exposure to subsystem maintainers ?◆ Image format tied to kernel version

(userspace conversion tools)

Many thanks to those who reviewed, tested, and provided suggestions !

● Web page: http://www.linux-cr.org/● Git tree(s): git://www.linux-cr.org/git/

top related