The K computer System Overview - Fujitsu

The K computer: System Overview. Atsuya Uno, System Development Team, Development Group, Next-Generation Supercomputer R&D Center, RIKEN (The Institute of Physical and Chemical Research). SC'11.


Page 1: The K computer System Overview - Fujitsu

The K computer System Overview

Atsuya Uno

System Development Team, Development Group
Next-Generation Supercomputer R&D Center
RIKEN (The Institute of Physical and Chemical Research)

SC'11

Page 2: The K computer System Overview - Fujitsu

K computer, “京”, is …

• “京 [kei]” is the nickname of the next-generation supercomputer system.
  – The name was chosen from public submissions in July 2010.
  – “京” is the unit for 10^16 (10 peta), which is also the performance target of our project.
• No. 1 system on the latest TOP500 list
  – 10.51 PFLOPS in the LINPACK benchmark (peak: 11.28 PFLOPS)
  – Efficiency: 93.2%
  – Execution time: 29 hours 28 minutes
  – Power consumption: 12.66 MW
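The efficiency figure follows directly from the LINPACK result and the peak. The sketch below reproduces it and also derives a power-efficiency value; the MFLOPS/W number is computed here from the slide's figures and is not quoted on the slide itself. Function and variable names are illustrative.

```python
# Figures quoted on the slide.
RMAX_PFLOPS = 10.51   # LINPACK result
RPEAK_PFLOPS = 11.28  # theoretical peak
POWER_MW = 12.66      # power consumption during the run

KEI = 10**16          # the unit "kei": 10 peta


def linpack_efficiency(rmax: float, rpeak: float) -> float:
    """Fraction of the theoretical peak achieved in LINPACK."""
    return rmax / rpeak


def mflops_per_watt(rmax_pflops: float, power_mw: float) -> float:
    """Power efficiency in MFLOPS/W (derived here, not quoted on the slide)."""
    flops = rmax_pflops * 1e15
    watts = power_mw * 1e6
    return flops / watts / 1e6


efficiency = linpack_efficiency(RMAX_PFLOPS, RPEAK_PFLOPS)
power_eff = mflops_per_watt(RMAX_PFLOPS, POWER_MW)

print(f"efficiency: {efficiency:.1%}")
print(f"power efficiency: {power_eff:.0f} MFLOPS/W")
```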

Page 3: The K computer System Overview - Fujitsu

Design targets (in 2006) and Schedule

• 10 peta-FLOPS in LINPACK benchmark
• Peta-FLOPS sustained performance in real applications
• Low power consumption system
• Highly reliable system

[Figure: schedule chart, FY2006 to FY2012. System: conceptual design; detailed design; prototype and evaluation; production, installation, and adjustment; tuning and improvement. Buildings (computer building and research building): design, then construction. A marker indicates "the present".]

Page 4: The K computer System Overview - Fujitsu


System Overview of the K computer

Page 5: The K computer System Overview - Fujitsu

System Configuration

[Figure: system configuration diagram with the following labeled components.]

• The K computer compute nodes, connected by the Tofu interconnect (6D mesh / torus network)
• IO nodes
• Local file system (11 PB~)
• Global file system (30 PB~)
• Global IO network
• Control and management network
• K computer front-end servers
• Pre/post-processing servers
• System management servers
• System control servers (system configuration, job / user management)
• User, connecting via the Internet

Number of CPUs: > 80K
Total memory capacity: > 1 PB

Page 6: The K computer System Overview - Fujitsu

Compute Nodes

• CPU: SPARC64™ VIIIfx (8 cores)
  – 128 GFLOPS
  – 2.0 GHz
• Memory: 16 GB
• Network: Tofu interconnect (6D mesh / torus network)
  – provides a logical 3D torus network for each job
  – 5 GB/s x 2 bandwidth on each link

[Figure: image of a 3D torus. © Fujitsu Limited.]

[Figure: SPARC64™ VIIIfx. © Fujitsu Limited.]

[Figure: compute node block diagram. CPU: 128 GFLOPS (8 cores; each core: SIMD (4 FMA), 16 GFLOPS); L2$: 6 MB; 64 GB/s; MEM: 16 GB; four link labels of 5 GB/s x 2 (peak).]

Page 7: The K computer System Overview - Fujitsu

CPU – SPARC64™ VIIIfx

• Superscalar multi-core processor (8 cores)
  – 128 GFLOPS (16 GFLOPS/core)
• 2 GHz
  – 256 FP registers (double precision) / core
  – 2 SIMD (FMA x 4) / core
• Shared 6 MB L2$ (12-way)
  – Software-controllable cache (sector cache)
• Memory throughput
  – L2$ - main memory: 0.5 B/FLOP (64 GB/s)
• Hardware barrier
  – High-speed synchronization of on-chip cores
• Power consumption
  – 58 W at 30℃ with water cooling

More detailed information is available in ‘SPARC64™ VIIIfx Extensions’ (http://img.jp.fujitsu.com/downloads/jp/jhpc/sparc64viiifx-extensions.pdf)

Specification

Performance (peak): 128 GFLOPS (16 GFLOPS x 8 cores)
Cores: 8
Clock: 2.0 GHz
Floating-point execution units (core spec): FMA x 4 (2 SIMD), DIVIDE x 2, COMPARE x 2
Floating-point registers (64-bit): 256
General-purpose registers (64-bit): 188
Cache: L1I$: 32 KB (2-way); L1D$: 32 KB (2-way); L2$: shared 6 MB (12-way)
Memory throughput: 64 GB/s (0.5 B/F)
Process: 45 nm CMOS
Chip size: 22.7 mm x 22.6 mm
Number of transistors: 760M
Power: 58 W (typ., 30℃), water cooling

© Fujitsu Limited.
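The peak and bytes-per-FLOP figures in the specification can be recomputed from the clock rate and the number of FMA units. The sketch below assumes the usual convention that one fused multiply-add counts as two floating-point operations; the slides do not state this explicitly. Variable names are illustrative.

```python
# Values taken from the specification above.
CLOCK_HZ = 2.0e9           # 2.0 GHz
FMA_UNITS_PER_CORE = 4     # "FMA x 4 (2 SIMD)"
CORES = 8
MEM_BANDWIDTH_BPS = 64e9   # 64 GB/s between L2$ and main memory

# Assumption: one FMA = one multiply + one add = 2 floating-point operations.
FLOPS_PER_FMA = 2

core_flops = CLOCK_HZ * FMA_UNITS_PER_CORE * FLOPS_PER_FMA
chip_flops = core_flops * CORES
bytes_per_flop = MEM_BANDWIDTH_BPS / chip_flops

print(f"per core: {core_flops / 1e9:.0f} GFLOPS")
print(f"per chip: {chip_flops / 1e9:.0f} GFLOPS")
print(f"memory balance: {bytes_per_flop} B/FLOP")
```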

Page 8: The K computer System Overview - Fujitsu

Tofu (Torus fusion) Interconnect

• High communication performance and fault-tolerant network
• Network topology: 6D mesh / torus network
  – Each node has 10 links (5 GB/s x 2 bandwidth)
    • 4 links form a basic unit (2x3x2 mesh / torus) and 6 links are XYZ links (X, Z: torus / Y: mesh)
  – Multi-path routing over a combination of the XYZ links and the 2x3x2 mesh / torus network makes it possible to detour around faulty nodes
• From the user's point of view, the network topology of a job is a 1D, 2D, or 3D torus network.
  – This torus network is dynamically configured when the nodes are assigned to the job.

[Figure: a basic unit (2x3x2 mesh / torus) with X+/X-, Y+/Y-, and Z+/Z- links along the X, Y, and Z axes. Neighboring basic units are connected by 12 links. Basic units form a 3D torus.]
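The link counts on this slide can be checked with a small neighbor enumeration. This is only a sketch under stated assumptions: the coordinate naming (x, y, z, a, b, c), the choice that the two length-2 axes of the basic unit contribute one link each, and that the length-3 axis is a ring contributing two links are assumptions made here to match "4 links" per basic unit; the slide gives only the 2x3x2 shape and the X, Z torus / Y mesh split.

```python
from itertools import product
from typing import List, Tuple

Coord = Tuple[int, int, int, int, int, int]  # (x, y, z, a, b, c), hypothetical naming

UNIT_SHAPE = (2, 3, 2)  # basic unit shape from the slide


def neighbors(node: Coord, xyz_shape: Tuple[int, int, int]) -> List[Coord]:
    """Directly linked nodes; xyz_shape sizes are assumed to be >= 3."""
    x, y, z, a, b, c = node
    nx, ny, nz = xyz_shape
    result: List[Coord] = []
    # XYZ links: X and Z wrap around (torus), Y does not (mesh).
    for d in (-1, 1):
        result.append(((x + d) % nx, y, z, a, b, c))
    for d in (-1, 1):
        if 0 <= y + d < ny:
            result.append((x, y + d, z, a, b, c))
    for d in (-1, 1):
        result.append((x, y, (z + d) % nz, a, b, c))
    # Basic-unit links (assumption): a and c have length 2, b is a ring of 3.
    result.append((x, y, z, 1 - a, b, c))
    for d in (-1, 1):
        result.append((x, y, z, a, (b + d) % UNIT_SHAPE[1], c))
    result.append((x, y, z, a, b, 1 - c))
    return result


XYZ = (4, 4, 4)  # illustrative size, not the real system's
inner = neighbors((1, 1, 1, 0, 0, 0), XYZ)
y_edge = neighbors((1, 0, 1, 0, 0, 0), XYZ)

# Links from the basic unit at (0, 1, 0) to its X+ neighbor at (1, 1, 0).
unit_nodes = [(0, 1, 0, a, b, c)
              for a, b, c in product(*(range(n) for n in UNIT_SHAPE))]
links_between_units = sum(
    1 for node in unit_nodes for other in neighbors(node, XYZ)
    if other[:3] == (1, 1, 0)
)

print(len(inner), len(y_edge), links_between_units)
```

Under these assumptions a node away from the Y boundary has 6 XYZ links plus 4 basic-unit links, and each of the 12 nodes in a basic unit has one link to the corresponding node in the neighboring unit.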

Page 9: The K computer System Overview - Fujitsu

System environments

• OS: Linux-based OS on compute nodes
• File system: FEFS (Fujitsu Exabyte File System), based on the Lustre file system
  – Two-level large-scale distributed file system
    • Local file system for temporary files used by jobs
    • Global file system for users' permanent files
  – Staging functions
    • Stage-in: input files on the global file system that are used in a job are copied to the local file system.
    • Stage-out: output files generated during job execution are moved back to the global file system after the job finishes.
• Batch job-oriented system
  – Interactive environments are available for debugging.

[Figure: file staging between the global file system and the local file system (input files, output files), with IO between the compute nodes and the local file system.]
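The stage-in / stage-out flow can be illustrated with ordinary directories standing in for the two file systems. This is a minimal sketch, not the actual FEFS or job-scheduler mechanism: the function names `stage_in`, `stage_out`, and `run_job`, and the use of temporary directories, are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Iterable


def stage_in(global_dir: Path, local_dir: Path, names: Iterable[str]) -> None:
    """Copy the job's input files from the global to the local file system."""
    local_dir.mkdir(parents=True, exist_ok=True)
    for name in names:
        shutil.copy2(global_dir / name, local_dir / name)


def stage_out(local_dir: Path, global_dir: Path, names: Iterable[str]) -> None:
    """Move the job's output files back to the global file system."""
    for name in names:
        shutil.move(str(local_dir / name), str(global_dir / name))


def run_job(global_dir: Path, local_dir: Path, inputs: Iterable[str],
            outputs: Iterable[str], job: Callable[[Path], None]) -> None:
    """Stage in, run the job against the local directory, then stage out."""
    stage_in(global_dir, local_dir, inputs)
    job(local_dir)
    stage_out(local_dir, global_dir, outputs)


def demo() -> str:
    # Two temporary directories stand in for the global and local file systems.
    with tempfile.TemporaryDirectory() as root:
        global_dir = Path(root) / "global"
        local_dir = Path(root) / "local"
        global_dir.mkdir()
        (global_dir / "input.txt").write_text("hello")

        def job(workdir: Path) -> None:
            text = (workdir / "input.txt").read_text()
            (workdir / "output.txt").write_text(text.upper())

        run_job(global_dir, local_dir, ["input.txt"], ["output.txt"], job)
        # The output was moved, so it no longer exists on the local side.
        assert not (local_dir / "output.txt").exists()
        return (global_dir / "output.txt").read_text()


result = demo()
print(result)
```

The job itself only ever touches the local directory, which mirrors the slide's point that temporary job files live on the local file system while permanent files stay on the global one.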

Page 10: The K computer System Overview - Fujitsu

Programming environments

• Programming languages and compilers
  – Fortran 95 and a subset of Fortran 2003, XPFortran
  – C99 and C++2003, including several GNU C and C++ extensions
  – Optimizing compilers that exploit the full capabilities of the SPARC64™ VIIIfx
    • SIMD, 256 FP registers, programmable L2 cache (sector cache), and so on
  – OpenMP 3.0 is supported
• MPI library based on the MPI-2.1 specification
  – Low latency and high throughput
  – Collective communications are optimized for the Tofu interconnect
    • Bcast, Allgather, Alltoall, and Allreduce
• Numerical libraries
  – BLAS, LAPACK, FFTW, the Fujitsu scientific numerical library SSL II, and so on will be provided.
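For readers unfamiliar with the four collectives named above, the following plain-Python model shows what each one delivers to every rank. It is not the MPI library and says nothing about how the operations are optimized for Tofu; it only models the results, with list index standing in for rank. All function names are illustrative.

```python
from functools import reduce
from operator import add
from typing import Callable, List, TypeVar

T = TypeVar("T")


def bcast(values: List[T], root: int) -> List[T]:
    """Every rank ends up with the root rank's value."""
    return [values[root] for _ in values]


def allgather(values: List[T]) -> List[List[T]]:
    """Every rank ends up with the values of all ranks, in rank order."""
    return [list(values) for _ in values]


def alltoall(send: List[List[T]]) -> List[List[T]]:
    """Rank i sends send[i][j] to rank j; rank j receives it as recv[j][i]."""
    n = len(send)
    return [[send[i][j] for i in range(n)] for j in range(n)]


def allreduce(values: List[T], op: Callable[[T, T], T]) -> List[T]:
    """Every rank ends up with the reduction of all ranks' values."""
    total = reduce(op, values)
    return [total for _ in values]


bcast_result = bcast([1, 2, 3], root=1)
allgather_result = allgather([1, 2])
alltoall_result = alltoall([[1, 2], [3, 4]])
allreduce_result = allreduce([1, 2, 3], add)

print(bcast_result, allgather_result, alltoall_result, allreduce_result)
```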

Page 11: The K computer System Overview - Fujitsu

Hardware Implementation

• A compute rack houses:
  – System board x 24 (top: 12, bottom: 12)
  – IO system board x 6
  – Power supply unit x 9
  – Service processor board x 2
  – System disk (RAID 5)
• Peak performance: 12.3 TFLOPS / rack
• Total memory: 1.5 TB / rack
• Weight: 1,500 kg (max.)

[Figure: rack, with dimension labels of 2060 mm and 796 mm. Labeled parts: system board x 12 (twice), IO system board x 6, system disk, service processor board x 2, power supply unit x 9, pipes for water cooling. © Fujitsu Limited.]

[Figure: system board (24 boards / 1 rack). Labeled parts: SPARC64™ VIIIfx (4 CPUs / 1 SB), ICC (Tofu NW LSI) (4 ICCs / 1 SB), DDR3 memory (16 GB x 4 / 1 SB), water-cooling module. © Fujitsu Limited.]

Page 12: The K computer System Overview - Fujitsu

Hardware view

Node: CPU x 1, ICC x 1, memory; 128 GFLOPS, 16 GB
System Board (SB): node x 4; 512 GFLOPS, 64 GB
Compute Rack: SB x 24, IOSB x 6; 12.3 TFLOPS, 1.5 TB
Racks: compute rack x 8, disk rack x 2; 98.4 TFLOPS, 12 TB
Full System: compute rack x > 800; > 10 PFLOPS, > 1 PB

[Figure: hardware hierarchy, with compute racks and disk racks labeled.]

Page 13: The K computer System Overview - Fujitsu

Image of the K computer


864 racks are housed in the computer room.
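The figures on the "Hardware view" slide can be multiplied out from the per-node values, and the 864-rack count allows a rough check against the 11.28 PFLOPS peak quoted earlier. The sketch below uses decimal units (1 TB = 1000 GB). The value `CPUS_PER_IOSB = 1` is an assumption that the slides do not state; it is included only because it makes the total agree with the quoted peak.

```python
# Per-level values from the slides.
NODE_GFLOPS = 128
NODE_MEM_GB = 16
NODES_PER_SB = 4
SBS_PER_RACK = 24
IOSBS_PER_RACK = 6
RACKS = 864

# ASSUMPTION (not stated in the slides): each IO system board carries one CPU.
CPUS_PER_IOSB = 1

sb_gflops = NODE_GFLOPS * NODES_PER_SB            # per system board
rack_gflops = sb_gflops * SBS_PER_RACK            # per compute rack
rack_mem_gb = NODE_MEM_GB * NODES_PER_SB * SBS_PER_RACK

compute_nodes = RACKS * SBS_PER_RACK * NODES_PER_SB
compute_pflops = compute_nodes * NODE_GFLOPS / 1e6
compute_mem_pb = compute_nodes * NODE_MEM_GB / 1e6

all_cpus = compute_nodes + RACKS * IOSBS_PER_RACK * CPUS_PER_IOSB
peak_with_io_pflops = all_cpus * NODE_GFLOPS / 1e6

print(f"system board: {sb_gflops} GFLOPS")
print(f"rack: {rack_gflops / 1e3:.1f} TFLOPS, {rack_mem_gb / 1e3:.1f} TB")
print(f"compute nodes: {compute_nodes}, {compute_pflops:.2f} PFLOPS, "
      f"{compute_mem_pb:.2f} PB")
print(f"CPUs including IO boards: {all_cpus}, {peak_with_io_pflops:.2f} PFLOPS")
```

The exact product for eight racks is 98.304 TFLOPS; the slide's 98.4 TFLOPS appears to come from multiplying the rounded 12.3 TFLOPS per rack by eight.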

Page 14: The K computer System Overview - Fujitsu


Facilities for the K computer

Page 15: The K computer System Overview - Fujitsu

Location of the K computer in Kobe

[Figure: map showing Tokyo, Kyoto, and Kobe.]

• Kobe is 450 km (280 miles) west of Tokyo.
• AICS (Advanced Institute for Computational Science) was established at RIKEN in July 2010.

Page 16: The K computer System Overview - Fujitsu

Layout of buildings

[Figure: site layout showing the computer building, the research building, the substation, and the chillers.]

Page 17: The K computer System Overview - Fujitsu

Buildings and Cooling System

Research Building
• Six stories above ground and one below
• Floor area: ~1,800 m², ~9,000 m² (total)

Computer Building
• Three stories above ground and one below
• Floor area: ~4,300 m², ~10,500 m² (total)

Chillers (area: 1,900 m²)
• Absorption refrigerating chiller x 4
• Centrifugal water chiller x 3
• CGS (5 MW) x 2

Substation (area: 200 m²)
• 30 MW (max.)
• 77,000 V (receiving) → 6,600 V

Page 18: The K computer System Overview - Fujitsu

Cooling System

• The K computer is cooled by air and water.

         Parts          Temperature
Water    CPU and ICC    Input: ~15℃, Output: ~17℃
Air      the others     Room temp.: ~20℃

[Figure: cooling system diagram with absorption refrigerating chillers and centrifugal water chillers, a heat exchanger, air handlers, and compute racks (SB).]

Page 19: The K computer System Overview - Fujitsu

Installation of the K computer

• Period of installation: 2010.9 ~ 2011.8
• First 8 racks were installed on 30 Sep. 2010.
• Last racks were installed at the end of Aug. 2011.

[Figure: the computer room on the third floor of the computer building, with dimension labels of 60 m and 50 m.]

• No pillars
  – flexible arrangement of computer racks
  – easy installation of network cables

Page 20: The K computer System Overview - Fujitsu

Installation of the K computer (movie)


Page 21: The K computer System Overview - Fujitsu

[Figure: a photo in the early evening]

Thank you for your attention!

AICS Booth #112