application driven datacenter computing
Application-Driven Datacenter Computing
Shiding Lin EDCS-HPCA, Shenzhen
2013/2/24
Let’s Start from the Search Engine…
Central Repository of
Web Pages
Inverted Index
Web Pages
Data Mining
Index Building
Web
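The index-building step on this slide can be illustrated with a minimal sketch. It assumes whitespace tokenization and an in-memory page repository; the function name `build_inverted_index` and the sample pages are hypothetical and not taken from the talk.

```python
from collections import defaultdict


def build_inverted_index(pages):
    """Map each term to the sorted list of page ids that contain it.

    `pages` is a hypothetical stand-in for the central repository:
    a dict of {page_id: page_text}.
    """
    index = defaultdict(set)
    for page_id, text in pages.items():
        # Assumption: lowercase whitespace tokenization, no stemming.
        for term in text.lower().split():
            index[term].add(page_id)
    return {term: sorted(ids) for term, ids in index.items()}


pages = {
    1: "datacenter storage system",
    2: "storage index",
    3: "web index building",
}
index = build_inverted_index(pages)
```

Looking up `index["storage"]` returns the ids of the pages that mention the term, which is the query path a search engine serves from.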
To Build a High-Throughput Storage System
Stream Block0
Block 1
…
Block X
In-Memory Records {<key, data>}
Log Block-N
Log Block-1 New stream
Block 0
Block 1
…
Block Y
Memory
Disk
Update Query
Dump
Commit
…
Log-based Structure
Block I/O
Batch Commit
Stream R/W
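The update, commit, dump, and query paths in the diagram can be sketched as a toy in-memory model. This is an illustration under assumptions, not the system described in the talk: the class `LogStore`, its method names, and the use of Python lists to stand in for disk blocks are all hypothetical.

```python
class LogStore:
    """Toy model of a log-based store: memory records, log blocks, stream blocks."""

    def __init__(self, block_size=4):
        self.block_size = block_size  # records per block (hypothetical unit)
        self.memory = {}              # in-memory records {key: data}
        self.pending = []             # updates not yet committed to the log
        self.log_blocks = []          # committed log blocks (stands in for disk)
        self.stream = []              # blocks of the current dumped stream

    def update(self, key, data):
        # Updates touch memory only; durability comes from the next commit.
        self.memory[key] = data
        self.pending.append((key, data))

    def commit(self):
        # Batch commit: write pending updates as whole log blocks.
        for i in range(0, len(self.pending), self.block_size):
            self.log_blocks.append(self.pending[i:i + self.block_size])
        self.pending = []

    def dump(self):
        # Merge the old stream with memory into a new stream, then drop the log.
        merged = {k: v for block in self.stream for k, v in block}
        merged.update(self.memory)
        items = sorted(merged.items())
        self.stream = [items[i:i + self.block_size]
                       for i in range(0, len(items), self.block_size)]
        self.memory = {}
        self.pending = []
        self.log_blocks = []

    def query(self, key):
        # Memory holds the newest data; fall back to the dumped stream.
        if key in self.memory:
            return self.memory[key]
        for block in self.stream:
            for k, v in block:
                if k == key:
                    return v
        return None
```

In this sketch every disk write is a whole block appended sequentially, which is the property the slide's "Block I/O" and "Stream R/W" bullets point at.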
Block
Block @disk0
Block @disk1
Block @diskN
<key, data>
Memory Disk
dump
…
Block …
A Big Virtual File by Blocks
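One way to read "a big virtual file by blocks" is as an address mapping from a byte offset in the virtual file to a block on one of the raw disks. The sketch below assumes round-robin placement of fixed-size blocks; the placement policy and the function name `locate` are assumptions, since the slide does not specify them.

```python
def locate(offset, block_size, num_disks):
    """Map a virtual-file byte offset to (disk, block index on that disk, offset in block).

    Assumption: blocks are placed round-robin across disks.
    """
    block_no = offset // block_size       # index of the block in the virtual file
    disk = block_no % num_disks           # round-robin disk choice
    block_on_disk = block_no // num_disks # position of the block on that disk
    return disk, block_on_disk, offset % block_size
```

With this mapping, consecutive blocks land on different disks, so a sequential stream read or write keeps all disks busy at once.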
To Build a High-Throughput Storage System
Maximize Parallelism
NO RAID, Raw Disk
Direct I/O
Independent of FS
3-Layer Architecture of a Typical Storage System
Block
Base Stream Mod Stream
Table
Block Block Block Block Block …
Index Stream Patch Stream
To Make It Large-Scale
Which Layer to Partition, and the Replication Granularity?
Complexity
Data Exchange Traffic
Reliability
Replication Scheme 1
3x Commit Cost Local I/O Only
Block
Base Stream
Mod Stream
Table
… … Block
Index Stream
Patch Stream
Replica 1
Block
Base Stream
Mod Stream
Table
… … Block
Index Stream
Patch Stream
Replica 2
Block
Base Stream
Mod Stream
Table
… … Block
Index Stream
Patch Stream
Replica 3
Replication Scheme 2
1x Commit Cost Network & Disk I/O
Base Stream
Block …
Replica 1
Block …
Replica 2
Block …
Replica 3
Mod Stream
Block …
Replica 1
Block …
Replica 2
Block …
Replica 3
Index Stream
Block …
Replica 1
Block …
Replica 2
Block …
Replica 3
Patch Stream
Block …
Replica 1
Block …
Replica 2
Block …
Replica 3
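The trade-off between the two schemes can be made concrete with a toy cost model. It rests on an interpretation of the slides that the talk does not spell out: in Scheme 1 each table replica runs its own commit against local disks, while in Scheme 2 one commit produces the blocks and copies are shipped over the network. The function names and the assumption that one copy is written locally are hypothetical.

```python
def scheme1_cost(num_blocks, replicas=3):
    """Table-level replication: every replica commits independently, local I/O only."""
    return {
        "commits": replicas,
        "disk_block_writes": replicas * num_blocks,
        "network_block_transfers": 0,
    }


def scheme2_cost(num_blocks, replicas=3):
    """Block-level replication: one commit, blocks copied to the other replicas.

    Assumption: the committing node keeps one copy locally.
    """
    return {
        "commits": 1,
        "disk_block_writes": replicas * num_blocks,
        "network_block_transfers": (replicas - 1) * num_blocks,
    }
```

Under these assumptions both schemes write the same number of blocks to disk; they differ in whether the commit work is repeated per replica or replaced by network traffic.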
Map to Physical Architecture
Logical Layer Table Stream Block
Physical Boundary Datacenter Cluster Rack Node
Physical Layer Memory Flash Disk
What Has Changed?
Single-User Multi-Task → Multi-User Single-Task
Scale & Cost
Speed of Delivery
Software Architecture Principles in Datacenter
Layered → Vertical
Out-of-the-Box
Datacenter as a Computer
To Tolerate Component Failure
Hardware Architecture Principles in Datacenter
Dummy
Control Logic Goes Software
Replication/Checksum/Buffer Goes Global
Programmable
Expose All Interfaces
Collect All Data
Hardware Architecture Principles in Datacenter
Modularized and Configurable
Reduce All the Unnecessary
Share All the Possible
Practice 1: Baidu SSD
Raw Channels
No Shadow Buffer
No Wear Leveling
Practice 2: Smart Disk Replacement
Failure and Repair Logs
Failure Model
Predict Failure
Reduce False-Alarm
Practice 3: ARM Server
2U, 6 Nodes, 12 HDD/U
Internal Network Switch