SAN DIEGO SUPERCOMPUTER CENTER

at the UNIVERSITY OF CALIFORNIA, SAN DIEGO

TeraGrid Coordination Meeting, June 10, 2010

TeraGrid Forum Meeting, June 16, 2010

The Gordon Sweet Spot

Data Mining

• De novo genome assembly from sequencer reads & analysis of galaxies from cosmological simulations and observations.

• Federations of databases and Interaction network analysis for drug discovery, social science, biology, epidemiology, etc.

Predictive Science

• Solution of inverse problems in oceanography, atmospheric science, & seismology.

• Modestly scalable codes in quantum chemistry & structural engineering.

Large Shared Memory; Low Latency, Fast Interconnect; Fast I/O system

The Usual (HPC) Suspects are, well, suspect.

Typical HPC I/O has very little random I/O, which is the sweet spot for SSDs and data-intensive computing

• For example, a NERSC study* of 50 applications found:
  • Random access is rare for HPC applications; the I/O access is dominated by sequential operations.
  • Application I/O is dominated by append-only writes.
  • The majority of applications have adopted a one-file-per-processor approach to disk I/O, where each process of a parallel application writes to its own separate file rather than using parallel/shared I/O APIs to write from all of the processors into a single file.

* Source: Characterizing and Predicting the I/O Performance of HPC Applications Using a Parameterized Synthetic Benchmark (Shalf, et al, SC ‘08)
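The two write patterns named in the study findings above can be contrasted with a minimal sketch. It is illustrative only: the "processes" are simulated by a loop inside one Python process, and the function names, file names, and fixed record size are assumptions made for this example, not anything taken from the NERSC study or from a real parallel I/O library.

```python
import os
import tempfile


def make_payload(rank, record_size=8):
    """Hypothetical per-rank data: record_size bytes, all equal to the rank."""
    return bytes([rank % 256]) * record_size


def write_file_per_process(directory, num_procs, record_size=8):
    """One-file-per-processor pattern: each simulated rank appends to its own file."""
    paths = []
    for rank in range(num_procs):
        path = os.path.join(directory, f"out.{rank:04d}.dat")
        with open(path, "ab") as f:  # append-only, purely sequential
            f.write(make_payload(rank, record_size))
        paths.append(path)
    return paths


def write_shared_file(directory, num_procs, record_size=8):
    """Shared-file pattern: every simulated rank writes at its own offset in one file."""
    path = os.path.join(directory, "out.shared.dat")
    with open(path, "wb") as f:
        for rank in range(num_procs):
            f.seek(rank * record_size)  # offset derived from the rank
            f.write(make_payload(rank, record_size))
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        per_proc = write_file_per_process(d, 4)
        shared = write_shared_file(d, 4)
        print(len(per_proc), "separate files;",
              os.path.getsize(shared), "bytes in the shared file")
```

In a real parallel code the ranks would run concurrently and the shared-file variant would normally go through a parallel I/O API; the sketch only shows how the file layout differs between the two approaches.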

Data Intensive Workshop, October 26-29, 2010

• Identify "Grand Challenges" in data-intensive science across a broad range of topics

• Identify applications and disciplines that will benefit from Gordon's unique architecture and capabilities

• Invite potential users of Gordon to speak and participate

• Make leaders in data-intensive science aware of what SDSC is doing in this space

• Raise awareness among disciplines poorly served by current HPC offerings

• Better understand Gordon's niche in the data-intensive cosmos and potential usage modes

• Logistics:
  • ~100 attendees; @SDSC; incl. 1-day hands-on; plenary speakers; astronomy, geoscience, neuroscience, physics, engineering, social science, and data-related technologies

Gordon Highlights

• 245TF; 1024 Nodes; 64GB/node (64TB)

• Sandy Bridge processor
  • Dual socket
  • Core count TBD
  • 8 flops/clock/core via AVX instruction set

• 256TB Enterprise Intel SSD via 64 Nehalem/Westmere I/O Nodes (4TB per node)

• Dual rail, QDR 3D torus IB Interconnect

• Shared memory supernodes via ScaleMP vSMP Foundation
  • 32 Compute nodes/supernode
  • 128 node version launching in fall
  • Message passing between supernodes coming

• 4PB Data Oasis Disk
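The headline numbers on this slide can be cross-checked with a little arithmetic. The sketch below is only a consistency check under stated assumptions: it treats 1 TB as 1024 GB and 1 TF as 1000 GF, and the supernode count, the per-node peak, and the "core-GHz" figure are derived here rather than stated on the slide. The function name `gordon_totals` is made up for the example.

```python
def gordon_totals():
    """Derive system-wide figures from the per-node numbers quoted on the slide."""
    nodes = 1024                 # compute nodes (from the slide)
    ram_per_node_gb = 64         # GB per node (from the slide)
    peak_tflops = 245            # system peak (from the slide)
    io_nodes = 64                # I/O nodes (from the slide)
    ssd_per_io_node_tb = 4       # TB of SSD per I/O node (from the slide)
    nodes_per_supernode = 32     # compute nodes per supernode (from the slide)
    flops_per_clock_per_core = 8  # AVX figure (from the slide)

    gflops_per_node = peak_tflops * 1000 / nodes
    supernodes = nodes // nodes_per_supernode
    return {
        # Assumes 1 TB = 1024 GB, which reproduces the slide's 64 TB.
        "total_ram_tb": nodes * ram_per_node_gb / 1024,
        "total_ssd_tb": io_nodes * ssd_per_io_node_tb,
        # Derived: about 239 GF per node.
        "gflops_per_node": gflops_per_node,
        # Derived: number of supernodes and I/O nodes per supernode.
        "supernodes": supernodes,
        "io_nodes_per_supernode": io_nodes / supernodes,
        # Derived: cores per node times clock in GHz implied by the peak.
        # The core count is TBD on the slide, so this product is not split further.
        "core_ghz_per_node": gflops_per_node / flops_per_clock_per_core,
    }


if __name__ == "__main__":
    for name, value in gordon_totals().items():
        print(f"{name}: {value}")
```

Under these assumptions the memory and SSD totals match the slide (64 TB and 256 TB), and the per-node peak comes out just under 240 GF.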

Gordon Supernode Architecture

• 32 Appro GreenBlade
  • Dual processor Intel Sandy Bridge
  • 240 GFLOPS
  • 64 GB/node
  • # Cores TBD

• 2 Appro IO nodes/32 SN
  • Intel SSD drives
  • 4 TB ea.
  • 560,000 IOPS

• ScaleMP vSMP virtual shared memory
  • 2 TB RAM aggregate (64GBx32)
  • 8 TB SSD aggregate (256GBx32)

[Figure: supernode diagram. Compute nodes (240 GF, 64 GB RAM each) and a 4 TB SSD I/O node, joined by vSMP memory virtualization]
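The supernode aggregates quoted above follow directly from the per-node figures. The sketch below reproduces them as a check, assuming 1 TB = 1024 GB and 1 TF = 1000 GF; the aggregate peak per supernode is derived here and does not appear on the slide, and `supernode_totals` is a name invented for this example.

```python
def supernode_totals():
    """Aggregate one supernode from the per-node figures on the slide."""
    compute_nodes = 32        # Appro GreenBlade nodes per supernode
    ram_per_node_gb = 64      # GB per compute node
    gflops_per_node = 240     # GFLOPS per compute node
    io_nodes = 2              # I/O nodes per supernode
    ssd_per_io_node_tb = 4    # TB of SSD per I/O node
    ssd_share_gb = 256        # the slide's "256GBx32" breakdown

    return {
        # 64 GB x 32 = 2 TB of RAM visible through vSMP.
        "ram_tb": compute_nodes * ram_per_node_gb / 1024,
        # Two ways to reach the 8 TB SSD aggregate; both should agree.
        "ssd_tb_from_io_nodes": io_nodes * ssd_per_io_node_tb,
        "ssd_tb_from_shares": compute_nodes * ssd_share_gb / 1024,
        # Derived, not on the slide: peak of one supernode in TF.
        "peak_tflops": compute_nodes * gflops_per_node / 1000,
    }


if __name__ == "__main__":
    for name, value in supernode_totals().items():
        print(f"{name}: {value}")
```

Both routes to the SSD aggregate give 8 TB, which suggests the "256GBx32" figure is simply the two 4 TB I/O nodes divided evenly across the 32 compute nodes.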

Project Milestones

• Dash is now a TeraGrid resource
  • Allocation processes
  • Allocated users
  • Account setup
  • Application Environment

• 16-Way vSMP Acceptance Approved

• SDSC is becoming a flash center of excellence in HPC. Working closely with Dr. Steve Swanson in UCSD’s Center for Magnetic Recording Research (CMRR)

• Education, Outreach and Training
• Data Intensive Workshop set for October 26-29 at SDSC.
• NVM Workshop at UCSD in April
• SC ‘10 Papers submitted
• TeraGrid 2010 papers, tutorial, BOF submitted
• Data intensive use cases being developed

Production Dash as of April 1

• Two 16 node virtual clusters
  • SSD-only
    • 16 nodes; Nehalem, dual socket, 8 core; 48GB; 1 TB SSD (16)
    • SSDs are local to the nodes
    • Standard queues available
  • vSMP + SSD
    • 16 nodes; Nehalem, dual socket, 8 core; 48GB; 960GB SSD (15)
    • SSDs are local to the nodes
    • Treated as a single shared resource

• GPFS-WAN

• Additional 32 nodes will be brought online after the vSMP 32-way acceptance testing in July

Gordon Timeline

The Road Ahead

• Understanding data intensive applications and how they can benefit from Gordon’s unique architecture

• Identifying new user communities
• Education, Outreach and Training
• Managing to the schedule and milestones
• Track and assess flash technology developments
• I/O performance
• Parallel file systems
• InfiniBand/3D torus routing
• Individual roles and responsibilities
• Systems management processes
• Staffing ramp-up in October
• Have fun doing this!

TeraGrid Support has been Instrumental

• Diane Baxter
• Jeff Bennett
• Leo Carson
• Larry Diegel
• Jerry Greenberg
• Dave Hart
• Jiahua He
• Eva Hocks
• Tom Hutton
• Arun Jagatheesen
• Adam Jundt
• Richard Moore
• Mike Norman
• Wayne Pfeiffer
• Susan Rathbun
• Scott Sakai
• Allan Snavely
• Mark Sheddon
• Shawn Strande
• Mahidhar Tatineni

• And many others…

SDSC’s Summer Education Program

• TeacherTech summer workshops http://education.sdsc.edu/teachertech
  • Conference of New Teachers in Genomics
  • Modeling Instruction in High School Physics: An Introduction
  • Introduction to Adobe Photoshop and the World of Digital Art
  • TeacherTECH Begins a Collaboration with UCSD-TV – Tune In!
  • Newton’s Laws of Gravity: From the Celestial to the Terrestrial
  • Earthquake Science: Beyond Static Images and Flat Maps

• Student summer workshops http://education.sdsc.edu/teachertech/index.php?module=ContentExpress&func=display&ceid=18
  • Exploring the World of Digital Art and Design
  • Introduction to Matlab: An Interactive Visual Math Experience
  • UCSD Biotechnology Academy
  • "Full Color Heroes" in Digital Art & Design: Comic Book Coloring!
  • 2D – 3D Insani-D!
  • 3D Photography: Experience It!
  • Photography + Photoshop = Fun!
  • Exploring Digital Photography and the Wonders of Photoshop
  • Introduction to Maya and 3D Modeling

SDSC’s Summer Education Program (cont.)

• Research Experience For High School Students (REHS) (21 students) http://education.sdsc.edu/teachertech/index.php?module=ContentExpress&func=display&ceid=37

  • Supercomputer-based Workflow for Managing Large Biomedical Images
  • Refinement of Data Mining Software and Application to Space Plasmas for Data Analysis and Visualization
  • Sonification of UCSD Campus Energy Consumption
  • Visualization and 3D Content Creation
  • The Cooperative Association for Internet Data Analysis Web Development Intern
  • Documentation Assistant – Health Info Databases Project