From CIEL to Firmament & DIOS
a heavenly tale of not just clouds
Joint work with: Steven Hand, Anil Madhavapeddy, Chris Smowton, Steven Smith, Derek Murray (MSR-SVC)
Disclaimer
Recap: CIEL [NSDI 2011]
[Figure: task graph with nodes A, B, G, M, R]
Dynamic task graphs
• Allow tasks to spawn more tasks
[Figure: dynamic task graph with nodes T, M, R, G, a, b, x]
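The idea of tasks spawning more tasks can be illustrated with a minimal sketch. This is not CIEL's API: `Task`, `run`, and `halve` are hypothetical names, and the single-process FIFO queue stands in for a distributed scheduler. It only shows how the shape of the graph can depend on data computed at run time.

```python
from collections import deque


class Task:
    """Hypothetical task: a name, a function, and its arguments."""

    def __init__(self, name, fn, args=()):
        self.name = name
        self.fn = fn
        self.args = args


def run(root):
    """Run tasks in FIFO order; each task returns (value, list_of_spawned_tasks)."""
    queue = deque([root])
    results = {}
    while queue:
        task = queue.popleft()
        value, spawned = task.fn(*task.args)
        results[task.name] = value
        queue.extend(spawned)  # the graph grows while the job is running
    return results


def halve(n, step):
    """Data-dependent control flow: spawn another iteration until n reaches 1."""
    if n <= 1:
        return n, []
    return n, [Task(f"halve-{step + 1}", halve, (n // 2, step + 1))]


results = run(Task("halve-0", halve, (8, 0)))
print(results)
```

The number of tasks is not known up front: it depends on the input value, which is the property a static task graph cannot express.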
Experiment from D. Murray, A distributed execution engine supporting data-dependent control flow. PhD thesis, University of Cambridge, 2011.
[interlude] polyglot CIEL
polyglot CIEL [unpublished]
Saving state – options
lightweight ↔ heavyweight
hardware / OS level: VM migration, BLCR (process checkpoint.)
application level: serializable continuations, Haskell monads
CIEL
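As a rough illustration of the application-level end of this spectrum, the sketch below keeps all task state in an explicit dictionary so that it can be serialised and resumed elsewhere. It is an assumption-laden toy, not the mechanism CIEL uses: `step`, `checkpoint`, and `resume` are hypothetical names, and JSON is chosen only because it is in the standard library.

```python
import json


def step(state):
    """Advance the computation by one item; return False when nothing is left."""
    if state["pos"] >= len(state["items"]):
        return False
    state["total"] += state["items"][state["pos"]]
    state["pos"] += 1
    return True


def checkpoint(state):
    """Serialise the explicit task state (hypothetical helper)."""
    return json.dumps(state)


def resume(blob):
    """Rebuild the task state from a checkpoint (hypothetical helper)."""
    return json.loads(blob)


state = {"items": [1, 2, 3, 4], "pos": 0, "total": 0}
step(state)
step(state)
blob = checkpoint(state)   # the blob could now be shipped to another worker

restored = resume(blob)
while step(restored):
    pass
print(restored["total"])
```

The trade-off is that the programmer must structure the task around explicit state, whereas the heavier OS-level options capture state without changes to the code.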
no need for Skywriting!
Java
Scala
Haskell
Stackless Python
OCaml
(C with BLCR)
Experiment from D. Murray, C. Smowton, M. Schwarzkopf, S. Smith, A. Madhavapeddy. A polyglot approach to cloud programming. Unpublished, 2011.
Binomial options pricing
What's next?!
many-core clusters
heterogeneity
[Figure: timespin on unmodified CIEL; axes: seconds vs. number of cores (less is better); rel. overhead annotations: 41.6x, 1.04x, 1.3x, 5.1x]
Enter Firmament
and DIOS
[Data-Intensive Operating System]
[Diagram: CIEL architecture; labels: User code, CIEL, Programming Model, 1st class exec., Skywriting, 2nd, Execution Engine, Master, W0, W1, ..., Wn, Host OS ..., Hardware ...]
[Diagram: proposed architecture; labels: User code, CIEL, Programming Model, 1st class exec., Skywriting, 2nd, Firmament: Coordination Engine, DIOS ..., Hardware ...]
Firmament
multi-scale
heterogeneity-aware
How much heterogeneity is there?
Google trace, machine platforms
CPU cores (normalized) Total RAM (normalized)
Google trace, machine specs
Google trace, platforms + specs
Google trace, machine attributes
Firmament
Cluster knowledge base
• historic task resource usage
• historic task performance info
• machine information
Efficient runtime
[Storage? Networking? Transfer management?]
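To make the cluster knowledge base idea concrete, here is one possible shape for it, sketched under stated assumptions. The slides list only the kinds of information stored (historic resource usage, historic performance, machine information); the class names, fields, and the `best_platform` query are hypothetical and do not reflect Firmament's actual design, which is marked as work in progress.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Machine:
    """Machine information (all fields are illustrative)."""
    name: str
    platform: str
    cores: int
    ram_gb: float


@dataclass(frozen=True)
class TaskRun:
    """One historic observation of a task on a machine."""
    task_type: str
    machine: str
    runtime_s: float
    peak_ram_gb: float


class KnowledgeBase:
    def __init__(self):
        self.machines = {}
        self.runs = defaultdict(list)

    def add_machine(self, machine):
        self.machines[machine.name] = machine

    def record(self, run):
        self.runs[run.task_type].append(run)

    def mean_runtime(self, task_type, platform):
        """Mean historic runtime of a task type on one platform, or None."""
        times = [r.runtime_s for r in self.runs[task_type]
                 if self.machines[r.machine].platform == platform]
        return mean(times) if times else None

    def best_platform(self, task_type):
        """Platform with the lowest mean historic runtime, or None if unseen."""
        platforms = {self.machines[r.machine].platform for r in self.runs[task_type]}
        return min(platforms, key=lambda p: self.mean_runtime(task_type, p), default=None)


kb = KnowledgeBase()
kb.add_machine(Machine("m1", "A", 4, 8.0))
kb.add_machine(Machine("m2", "B", 48, 64.0))
kb.record(TaskRun("sort", "m1", 12.0, 1.5))
kb.record(TaskRun("sort", "m1", 10.0, 1.4))
kb.record(TaskRun("sort", "m2", 30.0, 1.6))
print(kb.best_platform("sort"))
```

A heterogeneity-aware scheduler could consult such a query when choosing where to place a task, although the scheduling algorithms themselves are listed as future work in the slides.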
Firmament
It’s real!
• ~2k LOC, basic tests run
ToDo (aka WIP):
• knowledge base design & impl.
• scheduling algorithms
• interface to CIEL
[Diagram: proposed architecture; labels: User code, CIEL, Programming Model, 1st class exec., Skywriting, 2nd, Firmament: Coordination Engine, DIOS ..., Hardware ...]
DIOS
topology-aware
interference-aware
lightweight OS
Heterogeneity [again!]
Many-core => intra-machine
communication = important!
Intel Core i7-2600K @ 3.40GHz (native)
Joint work with Steven Smith, Anil Madhavapeddy, and Chris Smowton; cf. “The case for reconfigurable I/O” (RESoLVE 2012)
48-core AMD Opteron 6168 (native) (Xen)
Intel Xeon E5620 @ 2.40GHz (native)
Different physical core
Hyperthread
Joint work with Steven Smith, Anil Madhavapeddy, and Chris Smowton; cf. http://fable.io
Intel Core i7-2600K @ 3.40GHz (native)
Different physical core
Hyperthread
AMD Opteron 6168 @ 1.9 GHz (native)
Same MCM, same socket
Different MCM, different socket, 2-hop HyperTransport
Topology-awareness
OS responsibility? Yes.
General case = hard!
Workload-awareness helps!
hwloc
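A small sketch of what topology-aware placement could look like for two communicating tasks. Everything here is an assumption for illustration: the `Cpu` tuple, the cost constants, and the ranking (hyperthread sibling cheaper than same socket, cheaper than cross-socket) are not measurements from the slides, and a real system would discover the topology with a library such as hwloc rather than hard-coding it.

```python
from itertools import combinations
from typing import NamedTuple


class Cpu(NamedTuple):
    """Hypothetical logical CPU identified by its position in the topology."""
    socket: int
    core: int
    thread: int


# Illustrative relative costs only (assumed, not measured).
COST_HYPERTHREAD = 1
COST_SAME_SOCKET = 2
COST_CROSS_SOCKET = 4


def comm_cost(a, b):
    """Assumed communication cost between two logical CPUs."""
    if a.socket != b.socket:
        return COST_CROSS_SOCKET
    if a.core != b.core:
        return COST_SAME_SOCKET
    return COST_HYPERTHREAD


def place_pair(free_cpus):
    """Pick the pair of free CPUs with the lowest assumed cost.

    Raises ValueError if fewer than two CPUs are free.
    """
    return min(combinations(free_cpus, 2), key=lambda pair: comm_cost(*pair))


free = [Cpu(0, 0, 0), Cpu(1, 0, 0), Cpu(0, 1, 0), Cpu(1, 0, 1)]
pair = place_pair(free)
print(pair)
```

Whether the cheapest pair is actually the best choice depends on the workload, for example whether the tasks are latency-bound or compete for the same execution units, which is why workload-awareness matters alongside the topology.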
Interference
#include <results>
Make the OS do exactly (and just) what is needed.
Dedicate resources instead of sharing them.
Lightweight
Shell, Standard libs, Filesystem
Multi-threading, Locking, Concurrency primitives, Pre-emption
Process mgmt, I/O mgmt, Isolation, Resource multiplexing
Scheduling
[Figure: task graph with nodes T, a, b, x]
Scheduling
Program ...
Firmament: Coordination Engine
...DIOS
DIOS
Pieces exist
• currently combining ;-)
WIP:
• interference experiments
• related work reading group
• starting point? (Linux or Xen?)
BACKUP SLIDES
Binomial options pricing
[Figure legend: 200k (EC2), 200k (MC), 400k (EC2), 400k (MC), 800k (EC2), 800k (MC); higher is better]
Numbers and experiment by Sören Bleikertz: http://openfoo.org/blog/redis-native-xen.html
[Figure: Redis benchmark; y-axis ticks from 0 to 25000; operations: PING, PING (multi bulk), SET, GET, INCR, LPUSH, LPOP, SADD, SPOP, LPUSH, LRANGE (first 100), LRANGE (first 300), LRANGE (first 450), LRANGE (first 600); series: Xen stubdom, Linux VM, native]
Redis example