big data in hpc - ohio state...
Big Data in HPC
John Shalf
Lawrence Berkeley National Laboratory
Evolving Role of Supercomputing Centers
• Traditional pillars of science
  • Theory: mathematical models of nature
  • Experiment: empirical data about nature
• Scientific computing connects theory to experiment
  – Computational models are used to test theories involving complex phenomena that cannot be matched directly against experiments
  – Enable comprehension of complex experimental data by distilling complex data down to understandable information
• There is no scientific “discovery” without a theory about underlying mechanism
• There is no scientific “discovery” without experimental validation of model
[Diagram: Theory (mathematical models of nature) and Experiment (empirical data about nature), with the labels Simulation, Analysis, Mechanism, and Evidence (confirm/violate).]
Breakthrough science will occur at the interface of observation and simulation
Materials and Chemistry
Environment and Climate
Subsurface science
Combustion
Cosmology
New research challenges to make these work together
Data Intensive Arch for 2017 (as imagined in 2012)
Compute Intensive Arch vs. Data Intensive Arch, tier by tier:
• Compute: 5 TF/socket, 1-2 sockets vs. 5 TF/socket, 8+ sockets
• On-Package DRAM: 64-128 GB HMC or stack vs. 1-4 TB aggregate chained HMC
• Capacity Memory: 0.5-1 TB CDIMM (optional) vs. 5-10 TB memory-class NVRAM
• On-node Storage: n.a. vs. 10-100 TB SSD cache or local FS
• In-Rack Storage: organized for burst buffer vs. 1-10 PB distributed object DB (e.g. Whamcloud)
• Interconnect: spatially oriented (e.g. 3D-5D torus) vs. all-to-all oriented (e.g. Dragonfly or 3T)
• Global Shared Disk: ~1% of nodes for storage gateways vs. ~10-20% of nodes for storage gateways
• Off-System Network: ~1% of nodes for IP gateways vs. 40GBe Ethernet direct from each node
Bandwidth labels on the diagram: 1 TB/s; 200 TB/s; 50 GB/s/node, 10 TB/s/rack; 100 TB/s; 50 GB/s inject, 10 TB/s bisect; 50 GB/s inject, 0.5 TB/s aggregate; 4 GB/s per node.
[Inset figures: a hierarchical data object layout (root "/", Dataset0, Dataset1, subgrp, Dataset0.1, Dataset0.2, each dataset with type and space; attributes time=0.2345, validity=None, author=JoeBlow) and a parallel file system diagram (compute nodes, I/O servers, metadata server (MDS), interconnect fabric, RAID couplets, disks).]
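As a rough illustration of the comparison on this slide, the sketch below records three of the capacity tiers for the two node designs and computes how much larger the data-intensive tier is. The pairing of values to designs is an assumption read off the slide layout (the smaller capacities are taken as compute-intensive), the names `TIERS_TB` and `capacity_ratio` are hypothetical, and each range is reduced to its lower and upper bound in decimal TB.

```python
# Capacity ranges in TB as (low, high); None means the tier is absent ("n.a.").
# Which design each value belongs to is an assumption inferred from the slide.
TIERS_TB = {
    "On-package DRAM": {"compute": (0.064, 0.128), "data": (1.0, 4.0)},
    "Capacity memory": {"compute": (0.5, 1.0), "data": (5.0, 10.0)},
    "On-node storage": {"compute": None, "data": (10.0, 100.0)},
}


def capacity_ratio(tier):
    """Return (min_ratio, max_ratio) of data- vs compute-intensive capacity.

    Returns None when either design has no such tier.
    """
    compute = TIERS_TB[tier]["compute"]
    data = TIERS_TB[tier]["data"]
    if compute is None or data is None:
        return None
    # Smallest ratio: low data capacity over high compute capacity.
    # Largest ratio: high data capacity over low compute capacity.
    return (data[0] / compute[1], data[1] / compute[0])


if __name__ == "__main__":
    for tier in TIERS_TB:
        print(tier, capacity_ratio(tier))
```

Under these assumptions the data-intensive design carries roughly an order of magnitude more memory per node at each tier, which matches the stated aim of bringing more capacity near compute.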
Goal: Maximum computational density and local bandwidth for a given power/cost constraint.
Maximizes bandwidth density near compute.
Goal: Maximum data capacity and global bandwidth for a given power/cost constraint.
Bring more storage capacity near compute (or conversely embed more compute into the storage).
Requires software and programming environment support for such a paradigm shift.
Old/New Conception of Cloud/Datacenters (Simplified Conceptual Model of Interconnect Convergence)
The Datacenter/Cloud
Old Conception: designed for externally facing TCP/IP
Nearly 100% standard TCP/IP Ethernet inside and out
DMZ Router: 90+% of traffic
Backbone
Cloud/Datacenter: 80+% of traffic
High-Performance Fluid Center: low overhead, high bandwidth, semi-custom internal interconnect
Crunchy TCP/IP Exterior
New Conception: need to handle internal data mining/processing
Design for 80+% internal traffic
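One back-of-the-envelope way to read the two conceptions is to ask how much of a given traffic volume must cross the external TCP/IP boundary. The sketch below treats the slide's "90+%" through the DMZ router and "80+%" internal figures as exact fractions, which is a simplifying assumption; the function name `boundary_traffic` and the 100 TB/day total are hypothetical.

```python
def boundary_traffic(total, external_fraction):
    """Split a total traffic volume into (external, internal) parts."""
    if not 0.0 <= external_fraction <= 1.0:
        raise ValueError("external_fraction must be between 0 and 1")
    external = total * external_fraction
    return external, total - external


TOTAL_TB_PER_DAY = 100.0  # hypothetical volume, not from the slides

# Old conception: about 90% of traffic passes through the DMZ router.
old_external, old_internal = boundary_traffic(TOTAL_TB_PER_DAY, 0.90)

# New conception: about 80% of traffic stays internal, so about 20% is external.
new_external, new_internal = boundary_traffic(TOTAL_TB_PER_DAY, 0.20)

if __name__ == "__main__":
    print("old:", old_external, old_internal)
    print("new:", new_external, new_internal)
```

With these fractions the internal fabric goes from carrying a small minority of the traffic to carrying most of it, which is the case the slide makes for a low-overhead, high-bandwidth internal interconnect.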