Perspective on Extreme Scale Computing in China
Depei Qian
Sino-German Joint Software Institute (JSI)
Beihang University
Co-design 2013, Guilin, Oct. 29, 2013
Outline
- Related R&D programs in China
- HPC system development
- Application service environment
- Applications
Related R&D programs in China
HPC-related R&D Under NSFC
NSFC key initiative "Basic algorithms for high performance scientific computing and computable modeling"
- 2011-2018, 180 million RMB
- Basic algorithms and their efficient implementation
- Computable modeling
- Verification by solving domain problems
HPC-related R&D Under 863 program
3 key projects in the last 12 years:
- High performance computer and core software (2002-2005)
- High productivity computer and Grid service environment (2006-2010)
- High productivity computer and application environment (2011-2016)
3 major projects:
- Multicore/many-core programming support (2012-2015)
- High performance parallel algorithms and parallel coupler development for earth system studies (2010-2013)
- HPC software support for earth system modeling (2010-2013)
HPC-related R&D Under 973 program
973 program:
- High performance scientific computing
- Large scale scientific computing
- Aggregation and coordination mechanisms in virtual computing environments
- Highly efficient and trustworthy virtual computing environments
There is no national long-term R&D program on extreme scale computing; coordination between the different programs is needed.
Shift of 863 program emphasis
1987: Intelligent computers, following the 5th generation computer program in Japan
1990: from intelligent computers to high performance parallel computers
1999: from individual HPC system to the national HPC environment
2006: from high performance computers to high productivity computers
History of HPC development under 863 program
1990: parallel computers identified as a priority topic of the 863 program; National Intelligent Computer R&D Center established
1993: Dawning 1, 640 MIPS, SMP
1995: Dawning 1000, 2.5 GFlops, MPP; Dawning company established in 1995
1996: Dawning 1000A, cluster system; first product-oriented system of Dawning
1998: Dawning 2000, 100 GFlops, cluster
History of HPC development under 863 program
2000: Dawning 3000, 400 GFlops, cluster; first commercialized system
2002: Lenovo DeepComp 1800, 1TFlops, cluster Lenovo entered the HPC market
2003: Lenovo DeepComp 6800, 5.3TFlops, cluster
2004: Dawning 4000A, 11.2TFlops
History of HPC development under 863 program
2008: Lenovo DeepComp 7000, 150 TFlops, heterogeneous cluster; Dawning 5000A, 230 TFlops, cluster
2010: Dawning 6000, 3 PFlops, heterogeneous CPU+GPU system; TH-1A, 4.7 PFlops, heterogeneous CPU+GPU
2011: Sunway BlueLight, 1 PFlops, based on domestic processors
2013: TH-2, heterogeneous system with CPU+MIC
863 key projects on HPC and Grid: 2002-2010
"High performance computer and core software"
- 4-year project, May 2002 to Dec. 2005
- 100 million Yuan funding from the MOST
- More than 2× associated funding from local government, application organizations, and industry
- Major outcome: China National Grid (CNGrid)
"High productivity computer and Grid service environment"
- Period: 2006-2010 (extended to now)
- 940 million Yuan from the MOST and more than 1B Yuan matching money from other sources
Current 863 key project
"High productivity computer and application environment"
- 2011-2015 (2016), 1.3B Yuan investment secured
- Develop leading-edge high performance computers
- Transform CNGrid into an application service environment
- Develop parallel applications in selected areas
Projects launched: the first round in 2011
- High productivity computer (1): 100PF by the end of 2015
- HPC applications (6): fusion simulation, simulation for aircraft design, drug discovery, digital media, structural mechanics for large machinery, simulation of the electro-magnetic environment
- Parallel programming framework (1)
The application service environment will be supported in the second round
- Emphasis on application service support
- Technologies for new modes of operation
HPC system development
Major challenges: power consumption, performance obtained by applications, programmability, resilience
Major obstacles: the memory wall, the power wall, the I/O wall, ...
Power consumption is the limiting factor in implementing extreme scale computers:
- Impossible to increase performance by expanding system scale alone
- Cooling the system is difficult and affects its reliability
- Energy cost is a heavy burden and prevents acceptance of extreme scale computers by end users
Performance obtained by applications
- Systems are installed at general-purpose computing centers, serving a large population of users and supporting a wide range of applications
- Linpack is not everything
- Need to be efficient for both general-purpose and special-purpose computing
- Need to support both compute-intensive and data-intensive applications
Programmability: must handle
- Concurrency/locality
- Heterogeneity of the system
- Porting of legacy programs
Lower the skill requirements for application developers
Resilience
- Very short MTBF for extreme scale systems, yet long-time continuous operation is required
- The system must self-heal/recover from hardware faults and failures
- The system must detect and tolerate errors in software
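The MTBF point above can be made concrete with a naive serial-reliability model (an illustrative sketch with assumed numbers, not figures from the talk): if any one of N independent components failing brings the system down, the aggregate MTBF shrinks roughly as the component MTBF divided by N.

```python
def system_mtbf_hours(component_mtbf_hours, n_components):
    """Naive serial-reliability model: any single component failure fails the system."""
    return component_mtbf_hours / n_components

# A hypothetical part with a 5-year (~43,800 h) MTBF, replicated 100,000 times,
# yields a system-level MTBF of well under an hour.
print(system_mtbf_hours(43_800, 100_000))  # 0.438 hours, i.e. ~26 minutes
```

This is why self-healing and fault tolerance, rather than ever-more-reliable parts alone, dominate resilience planning at extreme scale.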
Constrained design principle: we must set hard constraints on the extreme scale system implementation
- Power consumption: 5 GFlops/W in 2015, 50 GFlops/W or better before 2020
- System scale: <100,000 processors, <200 cabinets
- Cost: <300 million dollars (or <2B Yuan)
We can only design and implement an extreme scale system within those constraints
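A back-of-the-envelope check shows why the efficiency constraint is binding (a sketch using only the targets on this slide): total power is simply peak performance divided by energy efficiency.

```python
def power_mw(peak_flops, gflops_per_watt):
    """Convert a peak-performance target and an efficiency target into megawatts."""
    return peak_flops / (gflops_per_watt * 1e9) / 1e6

# A 100 PFlops machine at the 2015 target of 5 GFlops/W draws 20 MW;
# at the 2020 target of 50 GFlops/W the same machine needs only 2 MW.
print(power_mw(100e15, 5))   # 20.0
print(power_mw(100e15, 50))  # 2.0
```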
How to address the challenges?
Architectural support Technology innovation Hardware and software coordination
Architectural support
- Using the most appropriate architecture to achieve the goal
- Making trade-offs between performance, power consumption, programmability, resilience, and cost
Hybrid architecture (TH-1A & TH-2)
- General purpose + high density computing (GPU or MIC)
HPP architecture (Dawning 6000/Loongson)
- Enables different processors to co-exist
- Supports a global address space
- Multiple levels of parallelism
Multi-configuration and multi-scale adaptive architecture (SW/BL)
- Cluster implemented with Intel processors to support commercial software
- Homogeneous system implemented with domestic multicore processors for compute-intensive applications
- Supports parallelism at different levels
Classification of current major architectures
Classifying architectures by "homogeneous/heterogeneous" and "CPU only/CPU+accelerator" (homo-/hetero- refers to the ISA):

                CPU only                          CPU+Acc
Homogeneous     Sequoia, K computer, Sunway/BL    Stampede, TH-2
Heterogeneous   Dawning 6000/HPP (AMD+Loongson)   TH-1A, Dawning 6000/Nebulae, Tsubame 2.0
Comparison of different architectures

                 Power       Performance      Programmability/productivity   Resilience
Homo/CPU only    poor/fair   good/excellent   good/good                      varies
Heter/CPU only   poor        good             fair/fair                      varies
Homo/CPU+Acc     fair        good/excellent   good/poor?                     varies
Heter/CPU+Acc    good        good/excellent   fair/poor?                     varies
TH-1A architecture: hybrid system architecture
- Computing sub-system
- Service sub-system
- Communication networks
- Storage sub-system
- Monitoring and diagnosis sub-system
[Diagram: TH-1A organization — arrays of CPU+GPU compute nodes and operation nodes, an MDS/OSS storage sub-system, the communication sub-system, and the monitor and diagnosis sub-system]
Dawning/Loongson HPP (Hyper Parallel Processing) architecture
- Hyper-node composed of AMD and Loongson processors
- Separation of OS and application processors
- Multiple interconnects
- Hardware global synchronization
[Diagram: two hyper-nodes, each containing application CPUs with their memories and an OS CPU with its own memory and I/O, joined by an HPP controller; separate data and OS interconnects plus a global synchronization network span the hyper-nodes]
Sunway BlueLight Architecture
[Diagram: the BlueLight computer connects over a TCP/IP network and a global I/O network to I/O nodes, login nodes, job management nodes, system management and service nodes, storage and subnetwork managers, a console, database and security services, and online/nearline/offline storage; firewalls separate the data center from the intranet and Internet, serving local users, VPN remote users, the National Grid, and cloud services]
Technology innovations: innovation at different levels (device, component, system)
- New processor architectures: heterogeneous many-core, accelerators, reconfigurable logic
- Addressing the memory wall: new memory devices, 3D stacking, new cache architectures
- High performance interconnect: all-optical networks, silicon photonics
- High density system design
- Low power design
CPU SW1600
- Release time: Aug. 2010
- Processor cores: 16
- Peak performance: 140.8 GFlops @ 1.1 GHz
- Clock frequency: 0.975-1.1 GHz
- Process generation: 65nm
- Power: 35-70W
- A general-purpose multi-core processor; power efficient, achieving 2.0 GFlops/W
- The next generation processor is under development
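The SW1600 figures on this slide are internally consistent, which a quick sketch can verify: peak = cores × clock × flops per core per cycle, so 140.8 GFlops at 1.1 GHz over 16 cores implies 8 double-precision flops per core per cycle.

```python
# Consistency check of the quoted SW1600 numbers (sketch, derived from the slide).
cores, clock_hz, peak_flops = 16, 1.1e9, 140.8e9
flops_per_core_per_cycle = peak_flops / (cores * clock_hz)
print(flops_per_core_per_cycle)       # 8.0

# Power efficiency at the 70 W ceiling: ~2.0 GFlops/W, matching the slide.
print(peak_flops / 70 / 1e9)          # ~2.01
```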
FT-1500 CPU
- SPARC V9, 16 cores, 4-way SIMD
- 40nm, 1.8 GHz
- Performance: 144 GFlops
- Typical power: ~65W
- Similar ISA, different ALU
Heterogeneous compute node (TH-2)
- 2 Intel Ivy Bridge CPUs + 3 Intel Xeon Phi coprocessors
- 16 registered ECC DDR3 DIMMs, 64GB
- 3 PCI-E 3.0 x16 interfaces
- PDP comm. port, dual Gigabit LAN
- Peak performance: 3.432 TFlops
[Diagram: node block diagram — two CPUs linked by QPI, three MIC cards with GDDR5 memory on x16 PCI-E, PCH, CPLD/IPMB management, GE and PDP ports]
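The 3.432 TFlops node peak can be roughly reconstructed (a sketch using commonly reported TH-2 part specs that are not stated on this slide — 12-core 2.2 GHz Ivy Bridge CPUs and 57-core 1.1 GHz Xeon Phi cards are assumptions):

```python
# Assumed parts: 2x 12-core 2.2 GHz CPUs (8 DP flops/core/cycle with AVX),
# 3x 57-core 1.1 GHz Xeon Phi (16 DP flops/core/cycle).
cpu_peak = 2 * 12 * 2.2e9 * 8
phi_peak = 3 * 57 * 1.1e9 * 16
print((cpu_peak + phi_peak) / 1e12)   # ~3.432 TFlops, matching the slide
```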
Interconnection network (TH-2)
- Fat-tree topology using 13 576-port top-level switches
- Optical-electronic hybrid transport technology
- Proprietary network protocol
- High-radix router ASIC (NRC): 90nm feature size, 17.16mm x 17.16mm die, FC-PBGA package with 2577 pins, 2.56 Tbps throughput per NRC
- Network interface ASIC (NIC): same feature size and package type, 10.76mm x 10.76mm die, 675 pins, PCI-E 2.0 x16
[Diagram: compute nodes connected through 576-port switches 0 through 12]
High density system design (SW/BL)
- Computing node: basic element, one processor + memory
- Node complex: high density assembly, 2 computing nodes + network interface
- Supernode: 256 nodes (processors), tightly coupled interconnect
- Cabinet: 1024 computing nodes (4 supernodes)
[Diagram: assembly hierarchy from multi/many-core processor to computing node, node complex, supernode, and system]
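The assembly hierarchy on this slide can be tallied directly (a sketch restating the slide's own numbers):

```python
# SW/BL packaging hierarchy as stated: 2 nodes per complex,
# 256 nodes per supernode, 4 supernodes per cabinet.
nodes_per_complex   = 2
nodes_per_supernode = 256
supernodes_per_cab  = 4

complexes_per_supernode = nodes_per_supernode // nodes_per_complex
nodes_per_cabinet = nodes_per_supernode * supernodes_per_cab
print(complexes_per_supernode)  # 128 node complexes per supernode
print(nodes_per_cabinet)        # 1024 single-processor computing nodes per cabinet
```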
Low power design
Low power design at different levels: low power processors, low power interconnect, highly efficient cooling, highly efficient power supply
Low power management: fine-grain real-time power consumption monitoring, system status sensing, multi-layer power consumption control
Low power programming: default system tools like debugging and tuning? Modeling code power consumption, sampling code power consumption the way code performance is sampled, feedback to programming
Power supply (SW/BL)
- DC UPS
- Conversion efficiency: 77%
- Highly reliable
- Power monitoring associated
[Diagram: power distribution schematic — 10kV AC feed through high-voltage phase-shifting transformers to AC380V/AC240V distribution with dual-input switchover; "N+1" hot-standby DC UPS units producing DC300V; DC/DC conversion to 12V main supplies and "4+1" redundant TDK-Lambda board-level supplies feeding the many-core processors; two 500 kVA UPS units, 1000A busbars, and ~100kW peripheral-equipment power distribution panels]
Efficient Cooling (TH-2)
- Close-coupled chilled water cooling
- Customized liquid cooling unit (LCU), high cooling capacity: 80kW
- Uses the city cooling system to supply cooling water to the LCUs
Efficient Cooling (SW/BL)
Water cooling to the board (node complex): energy-saving, environment-friendly, allows high room temperature, low noise
HW/SW coordination
- Using a combination of hardware and software technologies to address the technical issues
- Achieving performance while maintaining flexibility: compilation support, parallel programming frameworks, performance tools
- HW/SW coordinated reliability measures: user-level checkpointing, redundancy-based reliability measures
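User-level checkpointing, mentioned above, amounts to the application periodically serializing its own state so a restarted job resumes from the last checkpoint instead of iteration zero. A minimal sketch in Python (illustrative only — production checkpointing on machines like these would use MPI-aware, parallel-I/O checkpoint libraries, and every name below is hypothetical):

```python
import os
import pickle
import tempfile

def save_checkpoint(state, path):
    # Write to a temp file, then rename atomically, so a crash
    # mid-write never corrupts the last good checkpoint.
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, path)

def load_checkpoint(path):
    with open(path, "rb") as f:
        return pickle.load(f)

path = os.path.join(tempfile.mkdtemp(), "ckpt.pkl")
state = {"iteration": 0, "total": 0}
for state["iteration"] in range(1, 11):
    state["total"] += state["iteration"]
    if state["iteration"] % 5 == 0:      # checkpoint every 5 iterations
        save_checkpoint(state, path)

# After a simulated failure, the restarted job recovers the saved state.
print(load_checkpoint(path)["total"])    # 55
```

The checkpoint interval trades I/O overhead against lost recomputation, which is exactly where the hardware monitoring described later can inform the software layer.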
Software stack of TH-2
Features
- Supports C, Fortran, and SIMD extensions
- Libc for the computing kernels
- Supports the storage hierarchy
- Programming model for many-core acceleration
- Collaborative cache data prefetch
- Instruction prefetch optimization
- Static/dynamic instruction scheduling optimization
[Diagram: base compiler for heterogeneous fusion — C/C++/Fortran/SIMD front ends; conventional, interprocedural, loop-nest, and global optimizations; heterogeneous many-core optimizations (IR conversion and code generation, many-core thread scheduling, machine descriptions and assembly generation for the compute and control cores); accelerated-thread support libraries (thread scheduling/control, thread creation/reclamation, async/mask support, interrupt/exception management); a heterogeneous program loader with pure control-core, hybrid, and pure compute-core modes; programming-model optimizations such as coordinated memory-access optimization, multi-level register allocation, combined static/dynamic scheduling, data-access instruction proxying, hot-function reordering and padding, and lightweight dynamic local-store allocation; plus assembler, linker, and disassembler]
Compiler for many-core
Basic math library based on the many-core structure
- Basic function lib, SIMD-extended function lib, Fortran function lib
Technical features
- Standard function call interface
- Customized optimization
- Supports accuracy analysis
[Diagram: basic math library system — basic, SIMD-extended, and Fortran function libraries covering exponential, logarithmic, trigonometric, hyperbolic, Bessel, and error functions plus numeric operation, numeric processing, and predicate functions; basic and SIMD algorithms with performance optimization, floating-point exception control, and accuracy control; conforming to the ISO C99 math function interface and IEEE 754]
Basic math lib for many-core
Technical features
- Unified architecture for heterogeneous many-cores
- Low overhead virtualization
- Highly efficient resource management
Parallel OS
Covering program development, testing, tuning, parallelization, and code translation
- Collaborative tuning framework
- Tools for parallelism analysis and parallelization
- Integrated translation tools for multiple source codes
[Diagram: parallel application development platform — automated project/file/template/configuration management; development workbench (editor, compilation, execution, and environment management); parallel debugging for multiple programming models (static and dynamic modes, SWGDB, microarchitecture-level command environment); integrated tuning across algorithm, parallel, and base languages (parameterized tuning, data collection, iterative/joint/policy optimization, automatic SIMD vectorization, performance monitoring); collaborative development and tuning framework with a help system; application service middleware (user authorization, containers, data); compile/execute, debugging, and tuning services; engine services (model plug-ins, instance management); extensions for parallelism recognition, auto-parallelization, and binary translation]
Parallel application development platform
Parallel programming framework
- Hide the complexity of programming millions of cores
- Integrate highly efficient implementations of fast parallel algorithms
- Provide efficient data structures and solver libraries
- Support software engineering concepts for code extensibility
[Diagram: the framework sits as middleware between applications and the supercomputer, bridging the "program wall" (think parallel, write sequential) as machines scale 100× from peta-scale to 100P flops]
High performance computing applications infrastructure: materials, climate, nuclear energy, ...
The infrastructure covers four types of computing: structured mesh, unstructured mesh, finite element, and combinatory geometry.
- JASMIN (J Adaptive Structured Meshes applications INfrastructure): parallel adaptive structured-mesh support framework
- JAUMIN (J Adaptive Unstructured Meshes applications INfrastructure): parallel adaptive unstructured-mesh support framework
- JCOGIN (J mesh-free COmbinatory Geometry INfrastructure): parallel 3D mesh-free combinatory geometry support framework
- PHG (Parallel Hierarchical Grid infrastructure): parallel adaptive finite element computing platform
Reliability design
- High-quality components, strict screening tests
- Water cooling to prolong the lifetime of components
- High density assembly to reduce wire lengths and improve data transfer reliability
- Multiple error correction codes to deal with transient errors
- Redundant design for memory, computing nodes, networks, I/O, power supply, and water cooling
Hardware monitoring (SW/BL)
- The basis for reliability, availability, and maintainability of the system
- Monitors major components; supports maintenance and diagnosis
- Dedicated management network
[Diagram: dedicated monitoring network — environment monitoring, emergency system, and system consoles on a 10GbE main switch and a monitoring switch; Ethernet switch modules, maintenance service cards, and maintenance controllers in each compute supernode and interconnect plug-in; per-node ARM Ethernet controllers, FPGAs, and BMC serial/parallel ports; IBA switches]
High availability (SW/BL)
- SW/HW coordinated multi-level fault-tolerant architecture
- Local fault suppression, fault isolation, replacement of faulty components, fault recovery
[Diagram: multi-level fault-tolerance stack — application layer: user applications with controlled fault-tolerance measures (proactive and passive fault tolerance, checkpoint/restore, job rollback, job degradation, degraded startup, service repair, dual-machine takeover/RAC, proactive migration); control layer: fault-tolerance master control, plug-in environment, fault data, early-warning information, and fault-tolerance policies over a system information base; base support: system maintenance, heartbeat detection, RAS; hardware system: nodes and network]
Delivered system: TH-1A
Tianhe: "Galaxy" in Chinese. Hybrid architecture: CPU & GPU. Peak performance 4.7PF, Linpack 2.57PF, power consumption 4.04MW.
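Two efficiency figures follow directly from the numbers on this slide (a sketch; the derived percentages are computed here, not quoted from the talk):

```python
# TH-1A figures as stated on the slide.
peak_pf, linpack_pf, power_in_mw = 4.7, 2.57, 4.04

# Linpack efficiency: fraction of peak actually delivered on the benchmark.
print(round(linpack_pf / peak_pf * 100, 1))             # 54.7 (%)

# Energy efficiency: sustained MFlops per watt during Linpack.
print(round(linpack_pf * 1e6 / (power_in_mw * 1e3), 1)) # 636.1 (MFlops/W)
```

Both numbers illustrate the earlier point that Linpack alone understates what the power and architecture constraints cost real applications.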
Items          Configuration
Processors     14,336 Xeon CPUs + 7,168 NVIDIA GPUs + 2,048 FT CPUs
Memory         262TB in total
Interconnect   Proprietary high-speed interconnect network
Storage        Global shared parallel storage system, 2PB
Racks          120 compute racks + 14 storage racks + 6 communication racks
Delivered system: Dawning 6000
- Hybrid system
- Service unit (Nebula): 3PF peak performance, 1.27PF Linpack performance, 2.6 MW
- Computing unit: experiment on using the Loongson processor
Delivered system: Sunway BlueLight
- Installed in September 2011 at the National Supercomputing Center in Jinan
- Implemented entirely with the domestic 16-core ShenWei processor SW1600
- 8,704 ShenWei processors in total
- Peak performance: 1.07 PFlops (with 8,196 processors)
- Linpack performance: 796 TFlops (with 8,196 processors)
- Power consumption: 1,074 kW (during Linpack execution)
Delivered system: TH-2
TH-2 specifications
Hybrid Architecture Xeon CPU & Xeon Phi
Application service environment
China National Grid (CNGrid): 14 sites
- SCCAS (Beijing, major site)
- SSC (Shanghai, major site)
- NSC-TJ (Tianjin)
- NSC-SZ (Shenzhen)
- NSC-JN (Jinan)
- Tsinghua University (Beijing)
- IAPCM (Beijing)
- USTC (Hefei)
- XJTU (Xi'an)
- SIAT (Shenzhen)
- HKU (Hong Kong)
- SDU (Jinan)
- HUST (Wuhan)
- GSCC (Lanzhou)
The CNGrid Operation Center (based at SCCAS)
CNGrid site   CPU/GPU        Storage
SCCAS         157TF/300TF    1.4PB
SSC           200TF          600TB
NSC-TJ        1PF/3.7PF      2PB
NSC-SZ        716TF/1.3PF    9.2PB
NSC-JN        1.1PF          2PB
THU           104TF/64TF     1PB
IAPCM         40TF           80TB
USTC          10TF           50TB
XJTU          5TF            50TB
SIAT          30TF/200TF     1PB
HKU           23TF/7.7TF     130TB
SDU           10TF           50TB
HUST          3TF            22TB
GSCC          13TF/28TF      40TB
CNGrid GOS Architecture
[Diagram: CNGrid GOS layered architecture — hosting environment: Tomcat (Apache) + Axis, GT4, gLite, OMII on Java J2SE over Linux/Unix/Windows grid servers; core layer: Agora (user/resource/Agora management), Grip runtime (Grip instance management, security, resource access control & sharing), naming, message, CA, and dynamic-deploy services; system layer: batch job management, meta-scheduling, account/file/metainfo management, DB service, workflow engine, DataGrid, GOS system calls and GOS library; tool/application layer: HPCG app & management portals, GSML browser/composer/workshop, IDE, compiler, debugger, Gsh & command-line tools, VegaSSH, grid workflow, and other domain-specific applications; deployed versions: J2SE 1.4.2_07/1.5.0_07, Tomcat 5.0.28 + Axis 1.2 rc2, with Axis handlers for message-level security]
CNGrid GOS deployment
- CNGrid GOS deployed at 11 sites and on some application grids
- Supports heterogeneous HPCs: Galaxy, Dawning, DeepComp
- Supports multiple platforms: Unix, Linux, Windows
- Uses public network connections, with only the HTTP port enabled
- Flexible clients: web browser, special client, GSML client
CNGrid: Resources
14 sites, >3PF aggregated computing power, >15PB storage
CNGrid: Service and Users
>450 services, >2,800 users, including the Commercial Aircraft Corporation of China, Bao Steel, automobile makers, CAS institutes, universities, ...
CNGrid: applications
- Supporting >700 projects: 973, 863, NSFC, CAS Innovative, and engineering projects
Application villages
- Support domain applications: industrial product design optimization, new drug discovery, digital media
- Introducing the cloud computing concept: CNGrid as IaaS and partially PaaS; application villages as SaaS and partially PaaS
- Build up business models for HPC applications
Applications
CNGrid applications
Grid applications:
- Drug discovery
- Weather forecasting
- Scientific Data Grid and its application in research
- Water resource information system
- Grid-enabled railway freight information system
- Grid for Chinese medicine database applications
- HPC and Grid for the aerospace industry (AviGrid)
- National forestry project planning, monitoring, and evaluation
HPC applications:
- Computational chemistry
- Computational astronomy
- Parallel program for large fluid machinery design
- Fusion ignition simulation
- Parallel algorithms for bio- and pharmacy applications
- Parallel algorithms for weather forecasting based on GRAPES
- 10,000+ core scale simulation for aircraft design
- Seismic imaging for oil exploration
- Parallel algorithm libraries for petaflops systems
China’s status in the related fields
- Significant progress in developing HPC systems and the HPC service environment
- Lack of long-term strategic study and planning; still far behind in many aspects
- Lack of kernel technologies: processors, memory, interconnect, system software, algorithms...
- Especially weak in applications: multi-disciplinary research needed; shortage of cross-disciplinary talent
- Sustainable development is crucial: no regular budget for e-Infrastructure; always competing for funding with other disciplines
Pursuing international Cooperation
We wish to cooperate with international HPC communities
- Joint work on grand challenge problems: climate change, new energy, environmental protection, disaster mitigation
- Jointly address challenges of extreme scale systems: low power system design and implementation, performance obtained by applications, heterogeneous system programming, resilience of large systems
Thank you!