
Parallel Visualization on Leadership Computing Resources

One approach to meeting the increasing demands for analysis and visualization is to perform more of these tasks on supercomputers traditionally reserved for simulations. This can lead to increased performance, reduced cost, and tighter integration of analysis and visualization in computational science. Our team is developing and optimizing visualization algorithms to scale to architectures at over 10,000 cores. Performance is analyzed for ultrascale applications.

Algorithms

Parallelizing algorithms such as volume rendering into pipelines, each consisting of many cores, maximizes performance on these architectures.

Tom Peterka, Rob Ross (ANL), Han-Wei Shen (OSU), Kwan-Liu Ma (UCD), Wesley Kendall (UTK), Hongfeng Yu (SNL)

Analyses

Applications

From astrophysics to climate modeling, we are working one-on-one with scientists and their data to meet their large-scale visualization requirements.

Architectures

At scales of tens of thousands of cores, visualization algorithms need to be tuned to specific architectures, so we study systems in detail; I/O is one example.

A U.S. Department of Energy laboratory managed by UChicago Argonne, LLC

The Argonne Leadership Computing Facility's IBM Blue Gene/P (left) and the Oak Ridge Leadership Computing Facility's Cray XT4 (right). Images courtesy of ALCF and NCCS.

Architectural diagram of the Blue Gene/P I/O system. Components shown: compute nodes, gateway nodes, commodity network, storage nodes, enterprise storage. Link labels: BG/P tree, 5.1 GB/s; Ethernet, 10 Gb/s; Infiniband, 16 Gb/s; Serial ATA, 3 Gb/s.

Volume rendering of angular momentum and streamlines of shock wave vector field in supernova dataset, courtesy of John Blondin.

Time lag between first snowfall and green-up in MODIS dataset, courtesy of NASA.

Parallel volume rendering consists of parallel reading of the dataset in the I/O stage, software ray casting in the rendering stage, and blending individual images in the compositing stage.
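The blending step in the compositing stage can be illustrated with a small serial sketch. It assumes each process has already ray-cast its block into a partial image of premultiplied RGBA pixels, and that a single view depth per block is enough to order the blocks. The names `over` and `composite` are hypothetical, and the poster does not state which parallel compositing scheme the team uses, so this shows only the per-pixel "over" blend and not the communication pattern.

```python
from typing import List, Tuple

# Premultiplied color plus opacity: (r, g, b, a). Assumed pixel format.
RGBA = Tuple[float, float, float, float]


def over(front: RGBA, back: RGBA) -> RGBA:
    """Blend a premultiplied `front` pixel over a `back` pixel."""
    transmittance = 1.0 - front[3]
    return tuple(f + transmittance * b for f, b in zip(front, back))


def composite(partials: List[Tuple[float, List[RGBA]]]) -> List[RGBA]:
    """Blend partial images front to back.

    `partials` holds one (depth, image) pair per process; a smaller
    depth means the block is nearer the viewer (an assumption of this
    sketch). All images must have the same number of pixels.
    """
    ordered = sorted(partials, key=lambda p: p[0])
    result = list(ordered[0][1])
    for _, image in ordered[1:]:
        # The accumulated result is in front of each later image.
        result = [over(acc, px) for acc, px in zip(result, image)]
    return result
```

For example, `composite([(2.0, far_image), (1.0, near_image)])` blends `near_image` over `far_image` regardless of the order in which the partial images arrive.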

The cost of I/O in rendering a time series can be masked by visualizing multiple time steps in parallel pipelines. Each of the pipelines below is further parallelized among multiple nodes. The forwarder daemon runs on the login node and serializes final results.
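One way to picture the parallel pipelines is as a round-robin assignment of time steps, with the processes divided into equal-sized groups. The sketch below is an illustration under those assumptions; the function names are hypothetical, and the poster does not say how time steps are actually distributed. The last function is an idealized model that ignores contention for shared storage and the cost of serializing results.

```python
from typing import List


def assign_time_steps(num_steps: int, num_pipelines: int) -> List[List[int]]:
    """Round-robin: pipeline p handles steps p, p + P, p + 2P, ..."""
    if num_pipelines < 1:
        raise ValueError("need at least one pipeline")
    return [list(range(p, num_steps, num_pipelines))
            for p in range(num_pipelines)]


def pipeline_of_rank(rank: int, ranks_per_pipeline: int) -> int:
    """Pipeline index for a process rank, assuming equal-sized groups."""
    return rank // ranks_per_pipeline


def ideal_frame_interval(t_io: float, t_render: float,
                         num_pipelines: int) -> float:
    """Idealized steady-state time between finished frames.

    Each pipeline needs t_io + t_render per frame; with P independent
    pipelines running concurrently, frames complete P times as often.
    """
    return (t_io + t_render) / num_pipelines
```

With seven time steps and three pipelines, `assign_time_steps(7, 3)` gives the first pipeline steps 0, 3, and 6, so while one pipeline is reading its next step the others can be rendering theirs.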

Scalability over a variety of data, image, and system sizes.

The relative percentage of time in the stages of volume rendering as a function of system size.

Different numbers of processes performing I/O can affect overall writing performance.
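A simple way to vary the number of processes that perform I/O is to have a subset of ranks collect data from their neighbors and write on their behalf. The sketch below is a hypothetical mapping, not the scheme used in the study: it splits the ranks into contiguous groups and names the first rank of each group as the writer.

```python
from typing import List


def writer_of(rank: int, num_procs: int, num_writers: int) -> int:
    """Rank of the writer that collects data from `rank`.

    Ranks are split into contiguous groups of ceil(num_procs / num_writers);
    the first rank in each group writes. When the division is uneven, the
    number of groups actually formed can be smaller than num_writers.
    """
    if not 1 <= num_writers <= num_procs:
        raise ValueError("need 1 <= num_writers <= num_procs")
    group_size = -(-num_procs // num_writers)  # ceiling division
    return (rank // group_size) * group_size


def writer_ranks(num_procs: int, num_writers: int) -> List[int]:
    """Sorted list of the ranks that actually write."""
    return sorted({writer_of(r, num_procs, num_writers)
                   for r in range(num_procs)})
```

Sweeping `num_writers` from 1 up to `num_procs` and timing each run is one way to produce the kind of comparison the caption describes.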

[Plot: Volume Rendering End-to-End Performance. Horizontal axis: Number of Processes (ticks from 50 to 20000). Vertical axis: Total Frame Time (s) (ticks from 5 to 1000). Series: 4480^3 data, 4096^2 image; 2240^3 data, 2048^2 image; 1120^3 data, 1024^2 image.]

Aggregate and component results are analyzed to determine bottlenecks. Because ultrascale visualization is dominated by I/O, our team devotes considerable effort to its study, both from systems and application perspectives.
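A rough size estimate for the data sets named in the performance plot helps show why I/O weighs so heavily. The sketch below assumes 4 bytes per voxel (a single 32-bit value), which the poster does not state. The transfer rate is a free parameter, so any value passed in, such as a link rate from the I/O diagram, is an idealized figure for illustration only.

```python
def volume_bytes(n: int, bytes_per_voxel: int = 4) -> int:
    """Size in bytes of one n x n x n time step.

    bytes_per_voxel = 4 is an assumption (one 32-bit value per voxel).
    """
    return n ** 3 * bytes_per_voxel


def read_seconds(n: int, bytes_per_second: float,
                 bytes_per_voxel: int = 4) -> float:
    """Idealized time to read one time step at a sustained rate."""
    return volume_bytes(n, bytes_per_voxel) / bytes_per_second


if __name__ == "__main__":
    for n in (1120, 2240, 4480):
        gigabytes = volume_bytes(n) / 1e9
        print(f"{n}^3 voxels: {gigabytes:.1f} GB per time step")
```

Each doubling of the grid edge multiplies the volume by eight, so under this assumption the largest data set is 64 times the size of the smallest.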
