TRANSCRIPT
David Cock, Systems Group
Department of Computer Science
ETH Zurich
David Cock, and the Enzian team
Systems Group
ETH Zurich Department of Computer Science
www.enzian.systems
Enzian: in stores now
Enzian: in stores now soon (sigh)
What Enzian is about:
▪ Relevant academic systems software research in an age of custom hardware.
▪ A computer designed for academic research, and what we will use it for.
▪ How we actually built such a machine.
▪ How to get one.
02/10/2019 Building Enzian
Acknowledgements
▪ Mohsen Owaida: ECI link bringup
▪ Adam Turowski: Barrelfish, Linux, and FreeBSD OS bringup, BMC software lead
▪ Tobias Grosser: Application use cases
▪ Amit Kulkarni: Verilog hacking
▪ Reto Achermann: Trace processing, software emulation, Barrelfish bringup
▪ Zeke Wang: DDR4 controllers
▪ David Sidler: FPGA network implementation
▪ Alexander Hedges: Simulation environment
▪ Nikita Lazarev: Interconnect protocol specification and modelling
▪ Abishek Ramdas: ECI implementation
▪ Dario Korolija: FPGA Shell design
▪ Fabio Maschi: Verilog hacking
▪ Timothy Roscoe, Gustavo Alonso, David Cock
…and everyone else who has helped us so far!
The usual bla, bla, bla ...
▪ Moore’s Law
▪ Dennard scaling, physical limits
▪ Multicore
▪ GPU, TPU, FPGA
▪ Data centers and the cloud
▪ ...
▪ Corollary: Hardware is changing really fast (because they do not know what to do with the transistors) (courtesy Gustavo Alonso)
Reconfigurable and custom hardware is where the action is
Hardware today is complex and diverse
▪ Custom ASICs
▪ FPGAs
▪ On-chip accelerators
▪ It’s easier to build what you want
  ▪ High-end CAD systems
  ▪ Simulators and emulators
  ▪ FPGA development environments
  ▪ Rapid fabrication of boards and ASICs
▪ Big companies do
  ▪ HPE’s The Machine
  ▪ Oracle RAPID, SPARC M7
  ▪ Amazon F1
  ▪ Microsoft Catapult
  ▪ Google CloudTPU
  ▪ Baidu FPGA deployments
  ▪ appliances (e.g. PureStorage)
What if we had…
▪ A research platform for system software
▪ No unrealistic commodity platforms
▪ Leapfrog cost-optimized specialized designs
▪ Optimized for exploration
▪ Not for unit cost or performance/watt
▪ Overengineered relative to products
▪ Flexible and reconfigurable
▪ Viable rack-scale building block
▪ Able to function as a variety of different components at scale
Figure: the feasible hardware design space. Regions shown: available COTS hardware; specialized product hardware designs, dictated by hardware vendors; scope of most systems software research.
Sketch: the basic building block
Figure: a large server-class SoC (a realistic mid-range server machine) linked by native coherence to a high-end FPGA (a large and fast FPGA). Other labels: lots of network bandwidth; lots of DDR; lots of DDR and/or HBM; NVMe, PCIe.
Applications
Accelerator design
Figure: a large server-class SoC and a programmable data processing accelerator, linked by native coherence. Other labels: lots of DDR; lots of DDR and/or HBM; NVMe, PCIe.
Applications
Hardware/software co-synthesis
Figure: a large server-class SoC and a programmable data processing accelerator, with a device driver and a DSL. Other labels: lots of DDR; lots of DDR and/or HBM; NVMe, PCIe.
Applications
Smart storage devices
Figure: a large server-class SoC linked by native coherence to a block that handles scans, indexing, prefetching, etc. Other labels: lots of DDR; lots of DDR and/or HBM; NVMe, PCIe.
Applications
Runtime verification
Figure: a large server-class SoC beside a block listing program trace, cache transactions, I/O operations, and power monitoring, together with an LTL spec. Other labels: lots of DDR; lots of DDR and/or HBM; NVMe, PCIe.
Applications
Runtime verification
▪ TeSSLa runtime monitoring framework merged with our runtime verification system by a great student (Pirmin Schmid).
▪ Port to Enzian ongoing (trace via ECI).
▪ Joint project with Lübeck.
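To make the runtime verification use case concrete (checking a stream of trace events against an LTL spec), here is a toy monitor in Python. It is only a sketch under stated assumptions: it is not TeSSLa and not the Enzian verification system, the event names `req` and `ack` are hypothetical, and the property checked is the LTL response pattern G(req → F ack), evaluated over a finite recorded trace.

```python
from typing import Iterable


def check_response(trace: Iterable[str],
                   trigger: str = "req",
                   response: str = "ack") -> bool:
    """Check G(trigger -> F response) over a finite trace.

    On a finite trace the property holds exactly when no trigger
    is left without a later response, so a single flag suffices.
    Event names are hypothetical placeholders.
    """
    pending = False
    for event in trace:
        if event == trigger:
            pending = True       # an obligation is now open
        elif event == response:
            pending = False      # discharges every open obligation
    return not pending


if __name__ == "__main__":
    print(check_response(["req", "io", "ack"]))   # True
    print(check_response(["req", "ack", "req"]))  # False
```

A hardware monitor would process events as they arrive rather than from a list, but the state it needs for this property is the same single bit.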
But imagine:
50 of these in a rack
Figure: a rack of these machines with programmable switches.
*results may vary
Applications
Remote memory, and a cure for RDMA
Figure: a large server-class SoC linked by native coherence to a block that handles remote access, capability verification, revocation, and node sequestration, with a custom memory protocol. Other labels: lots of DDR; lots of DDR and/or HBM; NVMe, PCIe.
Applications
Offloading consensus
Figure: a large server-class SoC linked by native coherence to a block running PAXOS/RAFT/VSR/…. Other labels: lots of DDR; lots of DDR and/or HBM; NVMe, PCIe.
Obviously, this is an impossible dream
▪ No company will build this for us
▪ We asked many: they all said either “No!”, or “Why? No!”
▪ Universities obviously can’t build real computers any more
▪ Sheer complexity of 18-layer 5GHz board design
▪ Cost of design resources
Or not…
▪ Friends in the EE department who can help us build small boards
▪ Outsource final board design to contractors
▪ Get technical help from friends in industry
▪ High-end CAD tools available to EU universities at deep discounts
Enzian v1 (2017)
Photo labels: adaptor (in-house); Very Expensive Cable (donated); Science Shoes™ (stylish).
Enzian v2 (2018)
▪ CPU: Cavium ThunderX-1, 48x ARMv8-a processor, on a Cavium EBB88; 2x10GbE; 32 GB DDR4; PCIe, USB, UART
▪ FPGA: Xilinx XCVU9P UltraScale+, on a Xilinx VCU118; 2x100GbE; 8 GB DDR4; PCIe, USB, UART
▪ Interconnect: ECI, 24 lanes, 30 GB/s
▪ PCIe x8 loopback
Enzian v3 (2019)
Figure: block diagram of the eATX board. A ThunderX and an XCVU9P are linked by ECI (30GB/s).
Memory labels: 4xDDR4, 128GB @ 2133; 4xDDR4, 512GB @ 2133, 64GB @ 2400; 50-60GB/s; 50-70GB/s.
Network labels: QSFP+; QSFP28; 4x100Gb/s or 16x25Gb/s; 2x40Gb/s.
I/O labels: PCIe x8; PCIe x16; NVMe x4; 4xSATA3; FMC; IO shield; 4GB/s; 6GB/s; 8GB/s; 12GB/s; 16GB/s.
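The link figures on the v2 and v3 slides can be cross-checked with a little arithmetic. The sketch below is a back-of-the-envelope check only: the per-lane ECI rate is derived from the slide numbers (24 lanes, 30 GB/s) rather than stated on them, the rates are treated as raw with line-encoding and protocol overhead ignored, and the function names are made up for illustration.

```python
def gbit_to_gbyte(gbit_per_s: float) -> float:
    """Convert Gb/s to GB/s (8 bits per byte, overheads ignored)."""
    return gbit_per_s / 8


def per_lane_gbit(total_gbyte_per_s: float, lanes: int) -> float:
    """Per-lane rate in Gb/s implied by an aggregate rate in GB/s."""
    return total_gbyte_per_s * 8 / lanes


def aggregate_gbit(ports: int, gbit_per_port: float) -> float:
    """Aggregate rate in Gb/s of several identical ports."""
    return ports * gbit_per_port


if __name__ == "__main__":
    # ECI: 24 lanes carrying 30 GB/s in total (figures from the slides).
    print(per_lane_gbit(30, 24))                    # 10.0 Gb/s per lane

    # FPGA-side network: 4x100Gb/s or, equivalently, 16x25Gb/s.
    print(gbit_to_gbyte(aggregate_gbit(4, 100)))    # 50.0 GB/s
    print(gbit_to_gbyte(aggregate_gbit(16, 25)))    # 50.0 GB/s

    # 2x40Gb/s network ports.
    print(gbit_to_gbyte(aggregate_gbit(2, 40)))     # 10.0 GB/s
```

Under these assumptions, the two 100GbE port configurations are equivalent in aggregate, and their raw total exceeds the 30 GB/s quoted for ECI.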
The First Live* Specimen
*TBD
LynX: A Shell for Enzian
▪ With your Enzian, you will get free training wheels.
▪ You won’t have to reinvent everything straight away.
▪ Seriously: these guys (Dario Korolija, Gustavo Alonso, ...) know what they’re doing.
Lessons
Debugging a Fast Interconnect
▪ Turns out, sometimes engineers don’t quite get around to writing everything down.
▪ “The only thing that’ll ever talk to our chip is our chip.”
▪ ...
▪ sigh
Important note: the designers have been super helpful.
Lessons
Firmware Complexity
▪ Firmware is far more complex than it looked
▪ One regulator alone has hundreds of registers
▪ Hard-to-access, poorly-documented tools
Lessons
War stories
▪ Hardware is really complex – who knew?
▪ 18+-layer board, 5GHz…
▪ Weekly phone calls with the board designers
▪ Procuring chips
▪ Complex bureaucracies, POs, quotes
▪ Lots of NDAs just to get documentation
▪ Friends have been super helpful
▪ Donations, technical help from Xilinx and Cavium
▪ Dream Chip, our board contractors
▪ Colleagues in Electrical Engineering at ETH
▪ Delays, delays, delays…
Summary
Systems research needs its own hardware!
The mission:
▪ We need overengineered research platforms
▪ Our research should deliver options and techniques
The machine:
▪ Server-class SoC, large FPGA, plenty of RAM & bandwidth
▪ Balanced, coherent system
Current status:
▪ Test boards exist (respin after testing).
▪ Larger volumes after that (if it works)
▪ Open source design as much as possible
Who wants one? Who wants 50? Come to our demo.
www.enzian.systems