pNeo

Mark Hereld

PetaFLOP Applications Working Group – January 16, 2004

Motivations

• Simulation of Epileptiform Activity in the Neocortex
  – understand and fix seizures
• pNeo
  – from PGENESIS pNeocortex
  – Streamlined
  – Compiled
  – Customized
  – Instrument and profile
• Performance model
  – Computation
  – Communication
  – System overhead
• Scaling to a billion cells, millions of steps
  – 10K to 100K nodes

Nano Intro Neuro

• Hodgkin-Huxley Model
• Simulation layout
  – Multiple compartment cells
  – Multiple cell types
  – Multiple classes of dense interconnect
  – Parallel partitioning
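To make the Hodgkin-Huxley bullet concrete, here is a minimal sketch of one compartment integrated with forward Euler. It is not the pNeo or GENESIS implementation: the parameter values are the standard textbook squid-axon set, the function names (`rates`, `step`, `simulate`) are hypothetical, and forward Euler is only an assumed integrator. The default step of 0.01 ms matches the 10 usec dt quoted in these slides.

```python
import math

# Textbook squid-axon Hodgkin-Huxley parameters (assumed; not taken from the slides).
C_M = 1.0                               # membrane capacitance, uF/cm^2
G_NA, G_K, G_L = 120.0, 36.0, 0.3       # maximal conductances, mS/cm^2
E_NA, E_K, E_L = 50.0, -77.0, -54.387   # reversal potentials, mV


def _ratio(x, y):
    """x / (1 - exp(-x / y)), with the removable singularity at x = 0 handled."""
    if abs(x) < 1e-7:
        return y
    return x / (1.0 - math.exp(-x / y))


def rates(v):
    """Voltage-dependent rate constants (1/ms) for the m, h and n gates."""
    a_m = 0.1 * _ratio(v + 40.0, 10.0)
    b_m = 4.0 * math.exp(-(v + 65.0) / 18.0)
    a_h = 0.07 * math.exp(-(v + 65.0) / 20.0)
    b_h = 1.0 / (1.0 + math.exp(-(v + 35.0) / 10.0))
    a_n = 0.01 * _ratio(v + 55.0, 10.0)
    b_n = 0.125 * math.exp(-(v + 65.0) / 80.0)
    return a_m, b_m, a_h, b_h, a_n, b_n


def step(state, i_inj, dt):
    """Advance (v, m, h, n) by one forward-Euler step of dt milliseconds."""
    v, m, h, n = state
    a_m, b_m, a_h, b_h, a_n, b_n = rates(v)
    # Ionic current in uA/cm^2: sodium, potassium and leak terms.
    i_ion = (G_NA * m**3 * h * (v - E_NA)
             + G_K * n**4 * (v - E_K)
             + G_L * (v - E_L))
    v_new = v + dt * (i_inj - i_ion) / C_M
    m_new = m + dt * (a_m * (1.0 - m) - b_m * m)
    h_new = h + dt * (a_h * (1.0 - h) - b_h * h)
    n_new = n + dt * (a_n * (1.0 - n) - b_n * n)
    return (v_new, m_new, h_new, n_new)


def simulate(i_inj, t_total_ms, dt_ms=0.01):
    """Return the membrane-potential trace (mV) for a constant injected current."""
    v = -65.0
    a_m, b_m, a_h, b_h, a_n, b_n = rates(v)
    # Start each gate at its steady-state value for the initial voltage.
    state = (v, a_m / (a_m + b_m), a_h / (a_h + b_h), a_n / (a_n + b_n))
    trace = []
    for _ in range(int(round(t_total_ms / dt_ms))):
        state = step(state, i_inj, dt_ms)
        trace.append(state[0])
    return trace
```

A multi-compartment cell would repeat this update per compartment and add axial coupling currents between neighbours, which this sketch leaves out.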

Modeling

Execution time* per step per cell
• Integration of state in each compartment
• Minor housekeeping
• One spike connection

  Cell type               Time
  Superficial Pyramidal   9 ms
  Deep Pyramidal          12 ms
  Basket                  6 ms
  Chandelier              6 ms

• dt = 0.00001 sec (10 usec)
• 0.1 sec => 10 sec
• 10,000 steps => 1 M steps

* 400 MHz Pentium 2 times
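The step counts and per-cell times above combine by simple multiplication. The sketch below only restates that arithmetic; the function names are hypothetical, and it assumes the per-step cost stays constant over a run.

```python
# Per-step, per-cell execution times quoted on the slide (400 MHz Pentium 2), in ms.
MS_PER_STEP_PER_CELL = {
    "Superficial Pyramidal": 9.0,
    "Deep Pyramidal": 12.0,
    "Basket": 6.0,
    "Chandelier": 6.0,
}


def steps_for(sim_seconds, dt_seconds=1e-5):
    """Number of time steps needed to cover sim_seconds of simulated time."""
    return round(sim_seconds / dt_seconds)


def serial_seconds(cell_type, n_cells, sim_seconds, dt_seconds=1e-5):
    """Wall-clock estimate for n_cells of one type on a single processor."""
    steps = steps_for(sim_seconds, dt_seconds)
    return MS_PER_STEP_PER_CELL[cell_type] * 1e-3 * n_cells * steps
```

For example, `serial_seconds("Superficial Pyramidal", 1, 0.1)` is about 90 seconds for a single cell, and the same cell over 10 simulated seconds (1 M steps) takes 100 times longer.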

Wiring Diagram and Heartbeat

[Figure: Excitatory Connections and Inhibitory Connections. Detail from a slice of human focal neocortex.]

[Figure: Real neural activity and corresponding simulated activity.]

Interconnect and Partitioning

• ~8000 connections per cell (within a factor of a few)
• 30 ns to process a spike event
• Cell grids
  – 6 cell types
  – 5 µm spacing typ.
  – ~10^5 cells / mm^2
• Connection template
  – Several conn. types
  – Annular with hole
  – 500 µm, 5 µm
  – 10% probability, e.g.
• Processor partitioning
  – Memory limited w/ current simulator
  – up to ~400 cells per node
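The connection template can be illustrated with a small sketch. It assumes that "500 µm, 5 µm" are the outer and inner radii of the annulus, that the target grid is aligned with the source cell, and that each candidate is kept independently with the stated probability. The names `annular_targets` and `expected_count` are hypothetical and not part of pNeo.

```python
import math
import random


def annular_targets(src_xy, r_inner, r_outer, spacing, prob, rng):
    """Grid points with r_inner < distance <= r_outer, each kept with probability prob."""
    sx, sy = src_xy
    k = int(math.floor(r_outer / spacing))
    targets = []
    for i in range(-k, k + 1):
        for j in range(-k, k + 1):
            dx, dy = i * spacing, j * spacing
            d = math.hypot(dx, dy)
            # The hole excludes the source cell itself and its nearest neighbours.
            if r_inner < d <= r_outer and rng.random() < prob:
                targets.append((sx + dx, sy + dy))
    return targets


def expected_count(r_inner, r_outer, spacing, prob):
    """Approximate expected number of targets: annulus area / cell area * probability."""
    area = math.pi * (r_outer**2 - r_inner**2)
    return prob * area / spacing**2
```

With the slide's numbers, `expected_count(5, 500, 5, 0.1)` gives roughly 3,100 connections for one connection type onto one target grid, which is the same order of magnitude as the ~8000 connections per cell quoted above.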

Pseudocode

    TIME = 0;
    /* TIME LOOP */
    do {
        foreach Object in MODEL {
            segment.INIT();
        }
        Synchronize;
        ExchangeData;
        foreach SpikingObject in MODEL {
            if (potential >= threshold)
                foreach SynapticConnection {
                    CallEventAction(msg->dst);
                }
        }
        foreach ChannelObject in MODEL {
            foreach ContactPotential {
                Adjust(V);
            }
            integrate;
        }
        foreach CompartmentObject in MODEL {
            foreach ContactPotential {
                Adjust(Vm, Rm);
            }
            integrate;
        }
        step TIME;
    } while ( TIME < TOTAL_SIM_TIME )

What I’ve Been Doing

• pNeocortex running on PGENESIS
  – aggregate: timing and memory
    • GENESIS and custom instrumentation
  – Chiba and Jazz [PVM]
  – 256 nodes
  – 100K neurons (30K unit cells)
• Serial pNeo (sNeo?)
  – component: timing, memory
    • gprof, custom instrumentation
  – special configurations
  – modeling

Scaling with Problem Size

[Plot: Simulation Time (16x16 Processors) versus Number of Cells, 1000 to 100000.]

Memory Model

• Objects
  – by class
• Synaptic Connections
  – objects
  – Messages
• Aggregation
  – neuron types
    • Superficial Pyramidal
    • Deep Pyramidal
    • Basket cells
    • Chandelier cells

Execution Time Model

• Model building or loading
• Coarse phases of simulation loop
  – Integration
  – Spike
  – Communication
• Fine grain model
  – Compartment
  – Cell
  – Spike Event

Projections

• Million Steps
  – 0.1 ms
• 10K Nodes
• Spike Rate
  – 1E-3 / spikegen / step
• Connectivity
  – 12E+3 / spikegen
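As a rough illustration of how these projection parameters combine, the sketch below estimates only the spike-event part of the run time. It assumes one spike generator per cell, an even spread of events across nodes, and the 30 ns per spike event figure from the Interconnect and Partitioning slide. The function name and the billion-cell problem size in the example are assumptions for illustration, not results from the talk.

```python
def spike_event_seconds(total_cells, nodes, steps, spike_rate, connectivity, t_event):
    """Per-node wall-clock time spent processing spike events over a whole run.

    spike_rate   : spikes per spike generator per step
    connectivity : connections (events) fanned out per spike
    t_event      : seconds to process one spike event
    """
    events_per_step = total_cells * spike_rate * connectivity
    return steps * events_per_step * t_event / nodes


# Example: a billion cells on 10K nodes for a million steps.
example = spike_event_seconds(
    total_cells=1e9, nodes=1e4, steps=1e6,
    spike_rate=1e-3, connectivity=12e3, t_event=30e-9,
)
```

Under these assumptions `example` comes to about 36,000 seconds (10 hours) per node for spike events alone. Integration and communication time would be added on top.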

Optimizations: MEMORY USE

• Save memory by generating connection lists on the fly each time they are needed (seeded algorithm).
• Save memory by compressing connection sublists.
  – Large number of connections for a relatively small number of cells (per node) says there's a lot of redundancy in the connection patterns or sub patterns.
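The seeded, on-the-fly idea can be sketched as follows: instead of storing each cell's connection list, store only a global seed and regenerate the list deterministically whenever a spike needs it. This is a minimal illustration, not the pNeo algorithm. `iter_connections` and its parameters are hypothetical, and uniformly random targets stand in for the real connection template.

```python
import random


def iter_connections(cell_id, n_cells, n_conn, global_seed=12345):
    """Yield the same n_conn target cell ids every time it is called for cell_id.

    Nothing is stored per cell: the list is a pure function of
    (global_seed, cell_id), so it can be rebuilt whenever it is needed.
    """
    rng = random.Random(global_seed * 1_000_003 + cell_id)
    for _ in range(n_conn):
        yield rng.randrange(n_cells)
```

Calling `list(iter_connections(7, 10_000, 100))` twice returns identical lists. The cost is that the random numbers are recomputed on every spike, which trades execution time for memory.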

Optimizations: EXECUTION TIME

• It looks like the time to process spike events is the dominant contributor.
  – Streamlining this would improve execution time for extremely large runs.
  – This goal is at odds with memory saving methods above: computation (replacing lists) might take more time rather than less time to process connection lists.

QUESTIONS

• Why do the timings for S_Pyr only and D_Pyr only not add up to the timing for BOTH?

• Why is there a Tfreebee term to adjust for the very low first spike step in modeltiming runs?

• What's a good way to measure or estimate firing rate so that it can be used in the model?

• Is there a memory leak? Why does memory use increase during the simulation?

pNeo: Next Steps

• Limits
  – Memory is limiting current size of simulation per node
  – Communication dominates time at present
    • PVM => Ethernet => Slow
  – Computation hot spots (?)
• Redemptive tactics
  – Light weight connections
    • Tighten up or compress data structures
    • Construct on the fly?
  – Myrinet
    • pNeo => MPI => Fast
  – Detailed performance analysis
    • Parallel version
