
Page 1: Experience with the Jülich HPS Regatta H+ Cluster

Forschungszentrum Jülich

K.Wolkersdorfer@fz-juelich.de

Page 2: Jump Architecture

IBM Regatta H+ SMP Cluster: 2 Login Nodes, 39 Compute Nodes
8.9 TFlops peak, 5.6 TFlops LINPACK, 5 TByte memory, 50 TByte disk

Network + HPS (2 adapters / 4 links per node)

SAN

4 I/O partitions, 2 TSM partitions

Storage devices

Page 3: IBM p690+ Frame

32 CPUs Power4+, 1.7 GHz

128 GB memory, 567 MHz (latency: 252 cycles)

Cache:
Internal L1 cache: 64/32 KB instr./data (per CPU)
Shared L2 cache: 1.5 MB per chip (10-12 cycles)
Shared L3 cache: 512 MB per frame (92-100 cycles)

GFlops: 4 x 1.7 x 32 ≈ 218
218 x 41 frames ≈ 8.9 TF
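
Spelled out (assuming, as is usual for Power4+, four floating-point operations per CPU per cycle from the two fused multiply-add units, and counting the 39 compute frames plus the 2 frames hosting the login/I/O/TSM partitions):

    \[
      P_{\text{frame}}   = 4\,\tfrac{\text{flop}}{\text{cycle}} \times 1.7\,\text{GHz} \times 32\ \text{CPUs} \approx 218\ \text{GFlop/s},
      \qquad
      P_{\text{cluster}} = 218\ \text{GFlop/s} \times 41\ \text{frames} \approx 8.9\ \text{TFlop/s}
    \]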

Page 4: IBM p690+ Frame

Page 5: p690+: Memory Affinity Feature

Memory Affinity (together with AIX 5.2)
Best performance if data is physically close to the processors
Default random memory allocation is not optimal
Better with vmo option: round-robin memory allocation
Much better with: export MEMORY_AFFINITY=MCM
Best after PTF 7: export MP_TASK_AFFINITY=MCM (see the sketch below)
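
A minimal sketch of how these settings might be applied before launching a parallel job (the poe call and the binary name are illustrative; the two environment variables are the ones named above):

    #!/usr/bin/ksh
    # Bind memory allocation and MPI tasks to the local MCM (AIX 5.2 / POE)
    export MEMORY_AFFINITY=MCM     # allocate pages on the MCM the task runs on
    export MP_TASK_AFFINITY=MCM    # after PTF 7: pin each POE task to one MCM
    poe ./my_mpi_app               # illustrative launch of an MPI application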

Affinity Partitions
Logical frame partitioning into segments of physical proximity
E.g. partitioning of a frame into 4 quarters
Pros:
Good memory performance

Cons:
More logical nodes (AIX images)
Higher complexity of the external network
Smaller granularity for shared-memory programming models

Page 6: p690+: Memory Cards

Memory Cards
Cards up to 8 GB have 1 memory port only
Cards of 16 GB or greater have 2 memory ports

IBM:
Stream benchmark runs 15% faster on systems with 2-port memory cards

FZJ:
Performance measurements with selected user benchmarks show performance gains on 16 GB cards (compared to 8 GB cards) in the 5% - 9% range

Most important is a full population of memory cards

Page 7: p690+: Large Page Feature

Large Pages (together with AIX 5.2)
Performance improvements for memory-intensive applications
LP performance improvements result from:
reduced translation lookaside buffer (TLB) misses, because the TLB can map a larger virtual memory range
improved memory prefetching, by eliminating the need to restart prefetch operations on 4 KB boundaries

The p690 architecture supports 2 page sizes: 4 KB and 16 MB (reboot required)
The HPS switch needs 109 LPs for a 128 GB p690+ after PTF 7
FZJ application experience (before PTF 7):
A few benchmarks gain up to 12% performance
Some (user) applications lose significantly (up to 70%)

After PTF 7 we could measure almost no gains with LPs
FZJ current decision: LP disabled for general use
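
For context, a hedged sketch of the usual AIX 5.2 steps for reserving and using 16 MB large pages (the region count is an arbitrary example, not an FZJ value):

    # Reserve 16 MB large-page regions; -r makes the change effective at the next reboot
    vmo -r -o lgpg_size=16777216 -o lgpg_regions=64

    # Run one application with large pages for its data segment
    export LDR_CNTRL=LARGE_PAGE_DATA=Y
    ./memory_intensive_app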

Page 8: p690+: LPAR Feature

Logical Partition (LPAR) currently required for HPS
Resources:
Processors
Memory
I/O slots (incl. HPS adapter slots)

Each LPAR has its own operating system image (AIX).
LPARs are grouped according to their functions:

Compute partitions
Parallel applications

Login partitions
Interactive user logins, program development, data handling etc.

I/O partitions
Disk peripherals for global file systems (GPFS)

TSM partitions
Tape peripherals for backup and hierarchical storage (HSM)

Page 9: p690+: LPAR Feature

(Diagram: two frames each host a Login partition, two I/O partitions and a TSM partition; the remaining frames are compute nodes.)

39 Compute Nodes

Compute: 32 CPUs, 128 GB memory
Login: 20 CPUs, 80 GB memory
I/O: 4 CPUs, 16 GB memory
TSM: 4 CPUs, 16 GB memory

Page 10: Upgrading to HPS Service Pack 7 (PTF 7)

Previously PTF 3 installed
All IBM software and firmware components were affected at the same time (not to mention compiler + TSM)
This major upgrade took 3 days and 4 hours (mid July)
Problems with RSCT cost us about 1 day
Everything else went very smoothly thanks to
our 2 IBM SEs (M. Hennecke and Ch. Krafft)
and permanent involvement of IBM Poughkeepsie (POK)
(B. Heldke, Jan Ranck-Gustafson acting for all involved)

IBM's 'scrub' procedure identified 2 bad riser cards
A benchmark slot was planned for performance tests before production restarted

Page 11: Problems after PTF 7

HPS
Many more entries in the HMCs: /var/hsc/log/FNM_Recov.log
Lots of 'SYSPLANAR0: SCAN_ERROR_CHRP' entries in the error log of many nodes
MPI program dies within 'lapi_init' with rc(656) (not reproducible)
FNM software still has to be restarted after riser card replacements (fixed in PTF 8)

HMC
IBM.ServiceRM dies after 5 seconds (fixed in PTF 8?)
During the upgrade eth0 and eth1 were swapped on all 12 HMCs, causing loss of connectivity
Most HW problems are still not reported through SFP

LL job did not start on 2 nodes (kernel deadlock?)
'NTBL_RESOURCE_BUSY' errpt entries
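
The entries above can be spotted per node with the standard AIX error-report command (label strings as quoted on this slide):

    # Search the detailed error report for the labels quoted above
    errpt -a | egrep 'SCAN_ERROR_CHRP|NTBL_RESOURCE_BUSY'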

Page 12: Output from FZJ LINKTEST program

Page 13: Performance Gains after PTF 7

The values given here are first measurements

FZJ LINKTEST and PMB (Pallas MPI Benchmark)
Bandwidth: 550 MB/s → 1420 MB/s (factor: 2.58)
(1 link, 1 MB messages, MPI ping-pong)
Latency: 10.34 us → 6.54 us (factor: 0.63)
(2 nodes with 1 processor used)
Latency: ~40-50 us → ~10-12 us (factor: ~0.25)
(2 nodes with 32 processors used)
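
Purely as an illustration of how such a two-task ping-pong run might be launched under POE (the benchmark binary name PMB-MPI1, the PingPong argument, and the user-space protocol choice are assumptions; MP_PROCS and MP_TASKS_PER_NODE are standard POE environment variables):

    # Run one MPI task on each of two nodes over the HPS and measure ping-pong
    export MP_PROCS=2              # two MPI tasks in total
    export MP_TASKS_PER_NODE=1     # one task per node
    export MP_EUILIB=us            # user-space protocol over the switch (assumption)
    poe ./PMB-MPI1 PingPong        # illustrative PMB invocation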

QCD - Quantum Chromodynamics, BiCGstab
Factor 0.89 without MP_TASK_AFFINITY
Factor 0.57 with MP_TASK_AFFINITY

CPMD - Car-Parrinello Molecular Dynamics (DFT)
Factor 0.90 without MP_TASK_AFFINITY
Factor 0.70 with MP_TASK_AFFINITY

Page 14: Performance Gains after PTF 7 (cont.)

MP_TASK_AFFINITY does not help much in every case,
but we decided to set it in the user profile anyway

ScaLAPACK - Eigenvectors of a real full symmetric matrix
Factor 0.84 without MP_TASK_AFFINITY
Factor 0.83 with MP_TASK_AFFINITY

Chiral - Quantum Chromodynamics Dirac operator
Factor 0.88 without MP_TASK_AFFINITY
Factor 0.88 with MP_TASK_AFFINITY

Trace - Solute transport in soil-aquifer systems
Factor 0.99 without MP_TASK_AFFINITY
Factor 0.98 with MP_TASK_AFFINITY

DMMD - Distributed Memory Molecular Dynamics
Factor 0.98 without MP_TASK_AFFINITY
Factor 0.97 with MP_TASK_AFFINITY

Page 15: Major TSM/HSM Problem

Tivoli HSM V5.2.2.0 installed
In early June we saw more and more migrated user data due to massive usage of our 50 TByte filesystems
Problem: users experienced hangs during 'recall' of migrated data (the job was finally killed by LL's time limit)
Very soon lots of users were affected
The first action plan created more problems:
Lots of user files were now in an invalid state and had to be repaired, which was successful (APAR IC41576)
Master and slave recall daemons were in a deadlock
On July 09 we got a first efix for this major problem
By all means install the efix for APAR IC41579
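
A node can be checked for that fix with the standard AIX instfix query (APAR number as on this slide):

    # Is the fix for APAR IC41579 marked as installed on this node?
    instfix -ik IC41579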

Page 16: More TSM/HSM Problems

Still problems during access of migrated files from parallel programs

Some parallel jobs were aborted due to hanging recalls
After LL kills the job there are hanging ('exiting') processes together with NTBL_SOURCE_BUSY entries in errpt every 10 sec

Design deficiencies:
No logging of HSM actions (migration and recall)
When problems arise we were never informed by HSM logs or actions, but by our users
All user commands (dls, dsmmigrate, dsmrecall) are possible on one node ONLY

HSM seems not yet to be ready for cluster operations
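
One workaround sketch, assuming the migrated input files are known in advance: recall them explicitly on the single node where the HSM user commands are available, before submitting the parallel job (dsmrecall is named on this slide; the GPFS path and file pattern are purely illustrative):

    # Recall migrated input files up front so the parallel job never blocks on a recall
    for f in /gpfs/project/input/*.dat; do      # illustrative file set
        dsmrecall "$f"
    done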

Page 17: LDAP

We made the decision in 2003 to try LDAP
Installed ldap.server and ldap.client 5.1.0.101
Early 2004 we had some problems:

"getgroup" resolution
LDAP caching problems
Mapping uids, e.g. in llq

It turned out all problems were in bos.rte.security
We very quickly received efixes thanks to Tom Lu from AIX security development (Austin)
Install APAR IY55980: bos.rte.security 5.2.0.32
Currently very stable without any problems
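
To confirm that a node actually runs the recommended level, the standard AIX queries are (fileset and APAR number as named above):

    # Installed level of the affected fileset
    lslpp -l bos.rte.security

    # Is the fix for APAR IY55980 present?
    instfix -ik IY55980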

Page 18: Paging

Sometimes users allocate more memory than is available (by a slip of the finger or of thought)
Problem: when a 128 GB node starts paging, the system very quickly is as dead as a dodo (not a single keystroke may be possible)
The LL keyword "ConsumableMemory" doesn't do the job (see XXL requirement 1.2.4)
Worse: even if LL (or somebody else) is able to kill the process and all memory is free again, the paging area stays full (because paged-out root processes won't be paged back in) and the system stays in a hung state ("fork not possible")
PMR 80238,033,724
IBM Feature 383936 and Defect 417453: not in AIX 5.2

Page 19: Paging (cont.)

The only solution is: try to avoid paging by all means
FZJ circumvention:
Since APAR IY49357, data and stack limits work in AIX 5.2
(Thanks to Luke Browning, IBM Austin, and his team)

In the LL submit filter we insert by default, for each processor:

data_limit = 3gb
stack_limit = 0.5gb

This sums up to 112 GB for our 128 GB nodes, and that just seems to be all right
> 95% of our users had to change nothing
The others have to deal with these 2 additional LL control cards, while the filter checks that the sum does not exceed 112 GB
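
A hedged sketch of such a submit filter; data_limit and stack_limit are the LoadLeveler keywords quoted above, while the filter mechanics (job command file on stdin, modified file on stdout) and the ksh code are only an illustration of the approach:

    #!/usr/bin/ksh
    # Illustrative LoadLeveler submit filter: read the job command file,
    # prepend default per-process limits unless the user set them, pass it on.
    job=$(cat)                                          # job command file from stdin

    echo "$job" | grep -qi "data_limit"  || echo "# @ data_limit  = 3gb"
    echo "$job" | grep -qi "stack_limit" || echo "# @ stack_limit = 0.5gb"
    # (The real filter would also check that the per-node sum stays below 112 GB.)

    echo "$job"                                         # emit the original job file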

Page 20: FZJ Conclusion

Smooth installation of the cluster in the planned time frame
HPS 100% stable (some riser card replacements)
HPS has (after PTF 7) nearly the promised performance,
but large applications running on many nodes require even smaller latencies!

Nearly no HW problems
AIX 5.2 + GPFS very stable
AIX needs more focus on resource limitations to fulfill our service level agreements with our HPC users

Hardly any problems porting programs from the CRAY T3E
1 GB memory for the CWS is not sufficient → 4 GB

Page 21: FZJ Conclusion (cont.)

Memory Affinity is very important for p690+ systems!
What are IBM's plans here for even larger SMP nodes?
Large Pages are currently not useful for our users!
A more dynamic approach is needed, without any static reconfiguration (reboot)!
The upgrade from one HPS Service Pack to the next is too complicated and takes too much time (especially RSCT needs some attention!)
The time frame it takes the HPC stack owners to adopt a new release of AIX is too long (for 5.3 about a year!)
Some LPPs are not yet fully cluster-enabled (TSM/HSM)!
Will IBM still focus on scientific HPC in the future?
What about announcing small SMP nodes first for Power 5?

Page 22: Jump

http://jumpdoc.fz-juelich.de