![Page 1: High Performance Computing Cluster OSCAR Team Member Jin Wei, Pengfei Xuan CPSC 424/624 Project ( 2011 Spring ) Instructor Dr. Grossman](https://reader036.vdocuments.us/reader036/viewer/2022062717/56649e5e5503460f94b585b7/html5/thumbnails/1.jpg)
High Performance Computing Cluster: OSCAR
Team Members: Jin Wei, Pengfei Xuan
CPSC 424/624 Project (2011 Spring)
Instructor: Dr. Grossman
![Page 2: High Performance Computing Cluster OSCAR Team Member Jin Wei, Pengfei Xuan CPSC 424/624 Project ( 2011 Spring ) Instructor Dr. Grossman](https://reader036.vdocuments.us/reader036/viewer/2022062717/56649e5e5503460f94b585b7/html5/thumbnails/2.jpg)
Outline
1. Background
2. Installation
3. Management
4. Security
5. Administration
![Page 3: High Performance Computing Cluster OSCAR Team Member Jin Wei, Pengfei Xuan CPSC 424/624 Project ( 2011 Spring ) Instructor Dr. Grossman](https://reader036.vdocuments.us/reader036/viewer/2022062717/56649e5e5503460f94b585b7/html5/thumbnails/3.jpg)
DIY Supercomputer?
HPC = Computer + Network + OS + Management Software
![Page 4: High Performance Computing Cluster OSCAR Team Member Jin Wei, Pengfei Xuan CPSC 424/624 Project ( 2011 Spring ) Instructor Dr. Grossman](https://reader036.vdocuments.us/reader036/viewer/2022062717/56649e5e5503460f94b585b7/html5/thumbnails/4.jpg)
Background Introduction
Clemson Palmetto: 12,392 cores, 92.48 TeraFlops
TOP1: Tianhe-1A (China): 186,368 cores, 4,701 TeraFlops
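To put the two systems on a common scale, the quoted figures can be divided out to a per-core rate. The sketch below uses only the numbers on the slide; the slide does not say whether they are peak or sustained values, so the result is a rough comparison rather than a benchmark. The function name is made up for illustration.

```python
# Per-core throughput implied by the figures quoted on the slide.
# Whether these are peak or sustained numbers is not stated in the source.
systems = {
    "Clemson Palmetto": {"cores": 12_392, "teraflops": 92.48},
    "Tianhe-1A": {"cores": 186_368, "teraflops": 4_701.0},
}


def gflops_per_core(cores, teraflops):
    """Convert a system-wide TeraFlops figure to GigaFlops per core."""
    return teraflops * 1_000 / cores


for name, figures in systems.items():
    print(f"{name}: {gflops_per_core(**figures):.2f} GFlops per core")
```

The gap between the two per-core values suggests the systems differ in more than core count, which is worth keeping in mind when reading core totals alone.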
![Page 5: High Performance Computing Cluster OSCAR Team Member Jin Wei, Pengfei Xuan CPSC 424/624 Project ( 2011 Spring ) Instructor Dr. Grossman](https://reader036.vdocuments.us/reader036/viewer/2022062717/56649e5e5503460f94b585b7/html5/thumbnails/5.jpg)
HPC Network Topology
3 Sets of Networks
- Management
- Parallel Computing
- Storage
Centralized Storage
![Page 6: High Performance Computing Cluster OSCAR Team Member Jin Wei, Pengfei Xuan CPSC 424/624 Project ( 2011 Spring ) Instructor Dr. Grossman](https://reader036.vdocuments.us/reader036/viewer/2022062717/56649e5e5503460f94b585b7/html5/thumbnails/6.jpg)
Installation
- Easy management
- Batch OS install
- Batch software install
![Page 7: High Performance Computing Cluster OSCAR Team Member Jin Wei, Pengfei Xuan CPSC 424/624 Project ( 2011 Spring ) Instructor Dr. Grossman](https://reader036.vdocuments.us/reader036/viewer/2022062717/56649e5e5503460f94b585b7/html5/thumbnails/7.jpg)
Management
Cluster Management
- Partitions a cluster into multiple logical computers
- Maps logical computers (clusters) onto servers (nodes)
- Multiple independent OS configurations
- Manages and monitors logical computer (cluster) status
- Cluster status to management system
Job Scheduling and Management
- Manages and monitors operating system instances (nodes)
- Node status to management system
System Management
- Management of overall system configuration
- Redundant management servers with automatic failover
- Designed to anticipate and tolerate failures
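The idea of partitioning a cluster into logical computers and reporting their status upward can be sketched in a few lines. This is an illustrative model only, not OSCAR code: the class, the function names, the node names, and the first-come allocation policy are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalCluster:
    """A logical computer: a name, an OS configuration, and its nodes."""
    name: str
    os_config: str
    nodes: list = field(default_factory=list)


def partition(nodes, requests):
    """Map logical clusters onto physical nodes, in request order.

    requests is a list of (name, os_config, node_count) tuples.
    Returns the logical clusters and the nodes left unassigned.
    """
    free = list(nodes)
    clusters = []
    for name, os_config, count in requests:
        if count > len(free):
            raise ValueError(f"not enough free nodes for {name}")
        clusters.append(LogicalCluster(name, os_config, free[:count]))
        free = free[count:]
    return clusters, free


def status_report(clusters, node_up):
    """Summarise per-cluster node status for the management system."""
    return {
        c.name: {"up": sum(1 for n in c.nodes if node_up.get(n, False)),
                 "total": len(c.nodes)}
        for c in clusters
    }


# Hypothetical six-node machine split into two logical clusters.
nodes = [f"node{i}" for i in range(1, 7)]
clusters, spare = partition(nodes, [("sim", "os-a", 4), ("viz", "os-b", 1)])
print(status_report(clusters, {n: True for n in nodes}))
```

Each logical cluster carries its own OS configuration, which mirrors the "multiple independent OS configurations" point above.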
![Page 8: High Performance Computing Cluster OSCAR Team Member Jin Wei, Pengfei Xuan CPSC 424/624 Project ( 2011 Spring ) Instructor Dr. Grossman](https://reader036.vdocuments.us/reader036/viewer/2022062717/56649e5e5503460f94b585b7/html5/thumbnails/8.jpg)
Management
Server Management
- Automatic discovery of server hardware
- Remote server control (Power On/Off, Cycle)
- Scalable fast diskless or data-less booting for large node count systems
- Server redundancy and failover
- Provides server status to the management system
Network Management
- Automatic discovery of interconnect hardware
- Multiple interconnect fabric topologies
- Redundant paths and networks
- Load balancing and failover
- Network status to the management system
Storage Management
- Scalable root file systems for diskless or data-less nodes
- Multiple global storage configurations
- High bandwidth to secondary storage for data and checkpointing
- Provides server status to the management system
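Server redundancy with failover, combined with status reporting to the management system, is often built on heartbeats. The sketch below assumes that model; the slides do not say how failover is actually implemented, and the server names, timeout, and function names are hypothetical.

```python
def healthy(last_heartbeat, now, timeout):
    """A server counts as healthy if its heartbeat is recent enough."""
    return now - last_heartbeat <= timeout


def choose_active(servers, heartbeats, now, timeout=10.0):
    """Return the first healthy server in priority order, or None.

    servers: names in priority order (primary first).
    heartbeats: name -> time of the last heartbeat, in seconds.
    """
    for name in servers:
        seen = heartbeats.get(name)
        if seen is not None and healthy(seen, now, timeout):
            return name
    return None


def status_summary(heartbeats, now, timeout=10.0):
    """Status as it might be reported to the management system."""
    return {name: "up" if healthy(seen, now, timeout) else "down"
            for name, seen in heartbeats.items()}


# Hypothetical pair of management servers; mgmt1 has gone silent.
beats = {"mgmt1": 100.0, "mgmt2": 118.0}
print(choose_active(["mgmt1", "mgmt2"], beats, now=120.0))
print(status_summary(beats, now=120.0))
```

Keeping the priority list separate from the health check means the primary takes over again as soon as its heartbeats resume.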
![Page 9: High Performance Computing Cluster OSCAR Team Member Jin Wei, Pengfei Xuan CPSC 424/624 Project ( 2011 Spring ) Instructor Dr. Grossman](https://reader036.vdocuments.us/reader036/viewer/2022062717/56649e5e5503460f94b585b7/html5/thumbnails/9.jpg)
Security Control Model
![Page 10: High Performance Computing Cluster OSCAR Team Member Jin Wei, Pengfei Xuan CPSC 424/624 Project ( 2011 Spring ) Instructor Dr. Grossman](https://reader036.vdocuments.us/reader036/viewer/2022062717/56649e5e5503460f94b585b7/html5/thumbnails/10.jpg)
Administration (C3 Tool Suite)
'Cluster Command & Control' (C3)
- cexec: executes any standard command on all cluster nodes, e.g. cexec mkdir /tmp
- ckill: terminates a user-specified process on all cluster nodes, e.g. ckill my_program_abc
- cget: retrieves files or directories from all cluster nodes
- cpush: distributes files or directories to all cluster nodes
- cpushimage: updates the system image on all cluster nodes using an image captured by the SystemImager tool
- crm: removes files or directories from all cluster nodes
- cshutdown: shuts down or restarts all cluster nodes
- cnum: returns a node range number based on node name
- cname: returns node names based on node ranges
- clist: returns all clusters and their type in a configuration file
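Two ideas run through the tool list above: translating between node names and node range numbers (as cnum and cname do), and fanning one command out to every node (as cexec does). The following is a minimal model of both ideas, not C3's code or its option syntax. The node names, zero-based numbering, and the pluggable `runner` callback are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical node list; a real cluster would read this from configuration.
NODES = [f"node{i}" for i in range(1, 5)]


def node_number(name, nodes=NODES):
    """cnum-like lookup: position of a node name (zero-based here)."""
    return nodes.index(name)


def node_names(start, end, nodes=NODES):
    """cname-like lookup: names for an inclusive range of positions."""
    return nodes[start:end + 1]


def run_everywhere(command, nodes, runner):
    """cexec-like fan-out: call runner(node, command) for all nodes in parallel.

    runner is supplied by the caller; a real one would presumably execute
    the command remotely, which this sketch deliberately leaves out.
    """
    with ThreadPoolExecutor(max_workers=max(len(nodes), 1)) as pool:
        results = pool.map(lambda node: runner(node, command), nodes)
        return dict(zip(nodes, results))


def fake_runner(node, command):
    """Stand-in that only records what would have been run."""
    return f"{node}: would run {command!r}"


print(node_number("node3"))
print(node_names(1, 2))
print(run_everywhere("mkdir /tmp", NODES, fake_runner))
```

Passing the runner in as a parameter keeps the fan-out logic testable without any cluster at hand.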
![Page 11: High Performance Computing Cluster OSCAR Team Member Jin Wei, Pengfei Xuan CPSC 424/624 Project ( 2011 Spring ) Instructor Dr. Grossman](https://reader036.vdocuments.us/reader036/viewer/2022062717/56649e5e5503460f94b585b7/html5/thumbnails/11.jpg)
Other Administration Tools
- System Installation Suite (SIS): Installs the client nodes. SIS also provides the database from which OSCAR obtains its cluster configuration information. The main concept to understand about SIS is that it is an image-based install tool. An image is basically a copy of all the files that get installed on a client. This image is stored on the server and can be accessed for customizations or updates. You can even chroot into the image and perform builds.
- Switcher Environment Manager: Provides a simple mechanism that allows users to manipulate their environment.
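The image-based approach described for SIS can be pictured as comparing a client's files against the image held on the server and working out what has to change. The sketch below models an image as a mapping from path to content hash. It is a conceptual illustration under that assumption, not how SIS stores or transfers images, and all names and file contents are invented.

```python
import hashlib


def digest(data: bytes) -> str:
    """Content fingerprint used to detect changed files."""
    return hashlib.sha256(data).hexdigest()


def manifest(files):
    """Build a path -> hash manifest from a path -> bytes mapping."""
    return {path: digest(data) for path, data in files.items()}


def plan_sync(image, client):
    """Work out how to make a client match the image.

    Returns (paths to copy from the image, paths to delete on the client).
    """
    to_copy = sorted(p for p, h in image.items() if client.get(p) != h)
    to_delete = sorted(p for p in client if p not in image)
    return to_copy, to_delete


# Hypothetical image on the server and a client that has drifted from it.
image = manifest({"/etc/hosts": b"10.0.0.1 head\n", "/opt/app/bin": b"v2"})
client = manifest({"/etc/hosts": b"10.0.0.1 head\n", "/opt/app/bin": b"v1",
                   "/tmp/stale": b"x"})
print(plan_sync(image, client))
```

Because the image is just a set of files on the server, customizing it once and re-syncing is enough to update every client built from it.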
![Page 13: High Performance Computing Cluster OSCAR Team Member Jin Wei, Pengfei Xuan CPSC 424/624 Project ( 2011 Spring ) Instructor Dr. Grossman](https://reader036.vdocuments.us/reader036/viewer/2022062717/56649e5e5503460f94b585b7/html5/thumbnails/13.jpg)
Questions?