pNeo – Mark Hereld – PetaFLOP Applications Working Group – January 16, 2004
Motivations
• Simulation of epileptiform activity in the neocortex
  – understand and fix seizures
• pNeo
  – from PGENESIS pNeocortex
  – Streamlined
  – Compiled
  – Customized
  – Instrument and profile
• Performance model
  – Computation
  – Communication
  – System overhead
• Scaling to a billion cells and millions of steps
  – 10K to 100K nodes
Nano Intro Neuro
• Hodgkin-Huxley Model
• Simulation layout
  – Multiple compartment cells
  – Multiple cell types
  – Multiple classes of dense interconnect
  – Parallel partitioning
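For readers who want the model behind the first bullet, the textbook form of the Hodgkin-Huxley membrane equation is given here. The slides do not say which variant or parameter set pNeo uses, so this is the standard single-compartment form, not necessarily what the simulator integrates.

```latex
C_m \frac{dV}{dt} = -\bar{g}_{Na}\, m^3 h\, (V - E_{Na})
                    - \bar{g}_{K}\, n^4\, (V - E_{K})
                    - g_L\, (V - E_L) + I_{ext}

\frac{dx}{dt} = \alpha_x(V)\,(1 - x) - \beta_x(V)\, x,
\qquad x \in \{m, h, n\}
```

Here V is the membrane potential, m, h and n are the gating variables of the Na and K channels, and I_ext is the injected or synaptic current.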
Modeling
[Figure: compartment diagrams of the four cell types. Labels in the diagrams: IS, Soma, Na, K, Spike, Ex, Inh.]
Execution time* per step per cell
  Superficial Pyramidal    9 ms
  Deep Pyramidal          12 ms
  Basket                   6 ms
  Chandelier               6 ms
• Integration of state in each compartment
• Minor housekeeping
• One spike connection
• dt = 0.00001 sec (10 usec)
• 0.1 sec => 10 sec
• 10,000 steps => 1 M steps
* 400 MHz Pentium 2 times
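The step counts on this slide follow directly from the time step. A minimal sketch of that arithmetic, where the function name `steps_for` is only illustrative:

```python
DT = 1e-5  # seconds per step (10 usec), value from the slide


def steps_for(duration_s: float, dt: float = DT) -> int:
    """Number of time steps needed to cover duration_s seconds."""
    # round() guards against floating-point error in the division.
    return round(duration_s / dt)


short_run = steps_for(0.1)   # current runs: 0.1 sec of simulated time
long_run = steps_for(10.0)   # target runs: 10 sec of simulated time
print(short_run, long_run)
```

Going from 0.1 sec to 10 sec of simulated time at a fixed dt is a hundredfold increase in steps, which is where the "1 M steps" target comes from.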
Wiring Diagram and Heartbeat
[Figure panels: Excitatory Connections; Inhibitory Connections]
[Figure: Detail from a slice of human focal neocortex]
[Figure panels: Real neural activity; Corresponding simulated activity]
Interconnect and Partitioning
• ~8000 connections per cell (within a factor of a few)
• 30 ns to process a spike event
• Cell grids
  – 6 cell types
  – 5 µm spacing typ.
  – ~10^5 cells / mm^2
• Connection template
  – Several conn. types
  – Annular with hole
  – 500 µm, 5 µm
  – 10% probability, e.g.
• Processor partitioning
  – Memory limited w/ current simulator
  – up to ~400 cells per node
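A rough consistency check on these numbers can be sketched as follows. It assumes the lengths on the slide are micrometers (5 µm grid spacing, an annulus with a 500 µm outer radius and a 5 µm hole) and that targets lie on a square grid; the function names are hypothetical and not part of pNeo.

```python
import math


def cells_per_mm2(spacing_um: float) -> float:
    """Cells per square millimeter on one square grid with this spacing."""
    per_side = 1000.0 / spacing_um  # cells along 1 mm
    return per_side ** 2


def expected_connections(spacing_um, r_outer_um, r_hole_um, probability):
    """Expected number of targets drawn from one annular template."""
    density = 1.0 / spacing_um ** 2                       # cells per um^2
    annulus_area = math.pi * (r_outer_um ** 2 - r_hole_um ** 2)
    return density * annulus_area * probability


grid_density = cells_per_mm2(5.0)
per_template = expected_connections(5.0, 500.0, 5.0, 0.10)
print(grid_density, round(per_template))
```

Under these assumptions a single grid holds 40,000 cells per mm^2 and one template yields roughly 3,100 connections, so a few connection types per cell would land within the slide's "factor of a few" of ~8000.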
Pseudocode
TIME = 0;
/* TIME LOOP */
do {
    foreach Object in MODEL {
        segment.INIT();
    }
    Synchronize;
    ExchangeData;
    foreach SpikingObject in MODEL {
        if (potential >= threshold)
            foreach SynapticConnection {
                CallEventAction(msg->dst);
            }
    }
    foreach ChannelObject in MODEL {
        foreach ContactPotential {
            Adjust(V);
        }
        integrate;
    }
    foreach CompartmentObject in MODEL {
        foreach ContactPotential {
            Adjust(Vm, Rm);
        }
        integrate;
    }
    step TIME;
} while ( TIME < TOTAL_SIM_TIME )
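The shape of this loop (a spike phase that delivers events, then an integration phase) can be tried out with a toy serial version. This sketch swaps the Hodgkin-Huxley compartments for a single leaky integrator per cell and omits the Synchronize and ExchangeData steps; the `Cell` class, its fields, and `run` are all hypothetical names, not pNeo code.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    v: float = 0.0            # membrane potential (arbitrary units)
    threshold: float = 1.0
    drive: float = 0.0        # constant input, a stand-in for real currents
    targets: list = field(default_factory=list)  # (cell index, weight) pairs
    pending: float = 0.0      # synaptic input queued for this step


def run(cells, dt, total_time, leak=1.0):
    """Advance all cells; return a list of (step, cell index) spikes."""
    spikes = []
    for step in range(round(total_time / dt)):
        # Spike phase: a cell at threshold sends an event to each target.
        for i, cell in enumerate(cells):
            if cell.v >= cell.threshold:
                spikes.append((step, i))
                for j, weight in cell.targets:
                    cells[j].pending += weight
                cell.v = 0.0  # reset after firing
        # Integration phase: forward Euler on a leaky integrator.
        for cell in cells:
            cell.v += dt * (cell.drive - leak * cell.v) + cell.pending
            cell.pending = 0.0
    return spikes


# Cell 0 is driven and excites cell 1, which has no drive of its own.
cells = [Cell(drive=2.0, targets=[(1, 1.0)]), Cell()]
spikes = run(cells, dt=0.01, total_time=2.0)
print(spikes[:2])
```

In this toy, cell 1 fires one step after cell 0 because the event is queued in the spike phase and applied in the following integration phase, which mirrors the ordering in the pseudocode.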
What I’ve Been Doing
• pNeocortex running on PGENESIS
  – aggregate: timing and memory
    • GENESIS and custom instrumentation
  – Chiba and Jazz [PVM]
  – 256 nodes
  – 100K neurons (30K unit cells)
• Serial pNeo (sNeo?)
  – component: timing, memory
    • gprof, custom instrumentation
  – special configurations
  – modeling
Scaling with Problem Size
[Figure: "Simulation Time (16x16 Processors)". Sim Time [seconds] (axis ticks 10 to 100000) plotted against Number of Cells (axis ticks 1000 to 100000). Plot annotation: N4.]
Memory Model
• Objects
  – by class
• Synaptic Connections
  – objects
  – Messages
• Aggregation
  – neuron types
    • Superficial Pyramidal
    • Deep Pyramidal
    • Basket cells
    • Chandelier cells
Execution Time Model
• Model building or loading
• Coarse phases of simulation loop
  – Integration
  – Spike
  – Communication
• Fine grain model
  – Compartment
  – Cell
  – Spike Event
Projections
• Million Steps
  – 0.1 ms
• 10K Nodes
• Spike Rate
  – 1E-3 / spikegen / step
• Connectivity
  – 12E+3 / spikegen
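Multiplying the spike rate by the connectivity gives the spike-event load these projections imply. The sketch below assumes one spike generator per cell and takes the billion-cell target from the Motivations slide spread evenly over 10K nodes; both are assumptions for illustration, and the function names are hypothetical.

```python
SPIKE_RATE = 1e-3      # spikes per spike generator per step (slide value)
CONNECTIVITY = 12e3    # connections per spike generator (slide value)
STEPS = 1_000_000      # "Million Steps"
NODES = 10_000         # "10K Nodes"
TOTAL_CELLS = 1e9      # assumption: billion-cell target, one spikegen per cell


def events_per_step(n_spikegens: float) -> float:
    """Spike events to process in one step for n_spikegens generators."""
    return n_spikegens * SPIKE_RATE * CONNECTIVITY


cells_per_node = TOTAL_CELLS / NODES
per_cell = events_per_step(1)
per_node_step = events_per_step(cells_per_node)
per_node_run = per_node_step * STEPS
print(per_cell, per_node_step, per_node_run)
```

With these values each cell generates about 12 spike events per step, so a node holding 100,000 cells would process on the order of a million events per step and about 10^12 over a million-step run.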
Optimizations: MEMORY USE
• Save memory by generating connection lists on the fly each time they are needed (seeded algorithm).
• Save memory by compressing connection sublists.
  – A large number of connections for a relatively small number of cells (per node) suggests there is a lot of redundancy in the connection patterns or subpatterns.
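One way to read "seeded algorithm" is that each cell's connection list is regenerated from a deterministic random seed instead of being stored. The sketch below illustrates that idea only; the seeding scheme, the uniform 10% draw over all cells, and the name `connections_for` are assumptions, not the method used in pNeo.

```python
import random


def connections_for(cell_id, n_cells, probability, base_seed=12345):
    """Regenerate the target list for cell_id; nothing is kept in memory."""
    # A per-cell seed makes every call for the same cell identical.
    rng = random.Random(base_seed * 1_000_003 + cell_id)
    return [target for target in range(n_cells)
            if target != cell_id and rng.random() < probability]


first = connections_for(7, 1000, 0.10)
again = connections_for(7, 1000, 0.10)
print(len(first), first == again)
```

The cost is that the list is recomputed on every use, which trades memory for computation per spike.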
Optimizations: EXECUTION TIME
• It looks like the time to process spike events is the dominant contributor.
  – Streamlining this would improve execution time for extremely large runs.
  – This goal is at odds with the memory-saving methods above: computation (replacing lists) might take more time rather than less to process connection lists.
QUESTIONS
• Why do the timings for S_Pyr only and D_Pyr only not add up to the timing for BOTH?
• Why is there a Tfreebee term to adjust for the very low first spike step in model timing runs?
• What's a good way to measure or estimate firing rate so that it can be used in the model?
• Is there a memory leak? Why does memory use increase during the simulation?
pNeo: Next Steps
• Limits
  – Memory is limiting current size of simulation per node
  – Communication dominates time at present
    • PVM => Ethernet => Slow
  – Computation hot spots (?)
• Redemptive tactics
  – Lightweight connections
    • Tighten up or compress data structures
    • Construct on the fly?
  – Myrinet
    • pNeo => MPI => Fast
  – Detailed performance analysis
    • Parallel version