Enabling Knowledge Discovery in a Virtual Universe
Harnessing the Power of Parallel Grid Resources for Astrophysical Data Analysis. Jeffrey P. Gardner, Andrew Connolly, Cameron McBride. Pittsburgh Supercomputing Center, University of Pittsburgh, Carnegie Mellon University.
![Page 1: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/1.jpg)
Enabling Knowledge Discovery in a Virtual Universe
Harnessing the Power of Parallel Grid Resources for Astrophysical Data Analysis
Jeffrey P. Gardner, Andrew Connolly, Cameron McBride
Pittsburgh Supercomputing Center, University of Pittsburgh, Carnegie Mellon University
![Page 2: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/2.jpg)
How to turn simulation output into scientific knowledge
Step 1: Run simulation
Step 2: Analyze simulation on workstation
Step 3: Extract meaningful scientific knowledge
(happy scientist)
Using 300 processors (circa 1995)
![Page 3: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/3.jpg)
How to turn simulation output into scientific knowledge
Step 1: Run simulation
Step 2: Analyze simulation on server (in serial)
Step 3: Extract meaningful scientific knowledge
(happy scientist)
Using 1000 processors (circa 2000)
![Page 4: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/4.jpg)
How to turn simulation output into scientific knowledge
Step 1: Run simulation
Step 2: Analyze simulation on ???
(unhappy scientist)
Using 4000+ processors (circa 2006)
![Page 5: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/5.jpg)
Mining the Universe can be (Computationally) Expensive
The size of simulations is no longer limited by computational power
It is limited by the parallelizability of data analysis tools
This situation will only get worse in the future.
![Page 6: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/6.jpg)
How to turn simulation output into scientific knowledge
Step 1: Run simulation
Step 2: Analyze simulation on ???
Using 100,000 processors? (circa 2012)
By 2012, we will have machines with many hundreds of thousands of cores!
![Page 7: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/7.jpg)
The Challenge of Data Analysis in a Multiprocessor Universe
Parallel programs are difficult to write!
  Steep learning curve to learn parallel programming
Parallel programs are expensive to write!
  Lengthy development time
Parallel world is dominated by simulations:
  Code is often reused for many years by many people
  Therefore, you can afford to invest lots of time writing the code.
  Example: GASOLINE (a cosmology N-body code) required 10 FTE-years of development
![Page 8: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/8.jpg)
The Challenge of Data Analysis in a Multiprocessor Universe
Data Analysis does not work this way:
  Rapidly changing scientific inquiries
  Less code reuse
Simulation groups do not even write their analysis code in parallel!
Data Mining paradigm mandates rapid software development!
![Page 9: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/9.jpg)
How to turn observational data into scientific knowledge
Step 1: Collect data
Step 2: Analyze data on workstation
Step 3: Extract meaningful scientific knowledge
(happy astronomer)
![Page 10: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/10.jpg)
The Era of Massive Sky Surveys
Paradigm shift in astronomy: Sky Surveys
Available data is growing at a much faster rate than computational power.
![Page 11: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/11.jpg)
Good News for “Data Parallel” Operations
Data Parallel (or “Embarrassingly Parallel”):
Example: 1,000,000 QSO spectra
  Each spectrum takes ~1 hour to reduce
  Each spectrum is computationally independent from the others
There are many workflow management tools that will distribute your computations across many machines.
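The data-parallel case above can be sketched in a few lines of Python. This is only an illustration: `reduce_spectrum` is a hypothetical stand-in for a real reduction pipeline, and a thread pool stands in for the workflow management tools the slide refers to (a real deployment would more likely use separate processes or a batch scheduler).

```python
from concurrent.futures import ThreadPoolExecutor


def reduce_spectrum(spectrum):
    """Hypothetical stand-in for a per-spectrum reduction: return the mean flux."""
    return sum(spectrum) / len(spectrum)


def reduce_all(spectra, max_workers=4):
    """Reduce every spectrum independently; no task needs data from another."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results[i] belongs to spectra[i].
        return list(pool.map(reduce_spectrum, spectra))


spectra = [[1.0, 2.0, 3.0], [4.0, 4.0], [10.0]]
results = reduce_all(spectra)
```

Because the tasks never exchange data, adding workers changes only how quickly the list is consumed, not the structure of the program.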
![Page 12: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/12.jpg)
Tightly-Coupled Parallelism (what this talk is about)
Data and computational domains overlap
Computational elements must communicate with one another
Examples:
  Group finding
  N-Point correlation functions
  New object classification
  Density estimation
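As an illustration of why group finding is tightly coupled, here is a minimal serial friends-of-friends sketch. It is not code from the talk: the function name, the brute-force pair loop, and the use of union-find are assumptions chosen for brevity.

```python
import math


def friends_of_friends(points, linking_length):
    """Label each point with a group id; points within linking_length are linked."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent pointers to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Brute-force O(N^2) pair search; a tree would prune most of these pairs.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= linking_length:
                parent[find(i)] = find(j)

    return [find(i) for i in range(len(points))]


points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (5.0, 5.0)]
labels = friends_of_friends(points, linking_length=0.6)
```

The first and third points are farther apart than the linking length, yet they end up in one group through the point between them. If those points lived on different processors, the processors would have to communicate to discover the link.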
![Page 13: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/13.jpg)
The Challenge of Astrophysics Data Analysis in a Multiprocessor Universe
Build a library that is:
  Sophisticated enough to take care of all of the nasty parallel bits for you.
  Flexible enough to be used for your own particular astrophysics data analysis application.
  Scalable: scales well to thousands of processors.
![Page 14: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/14.jpg)
The Challenge of Astrophysics Data Analysis in a Multiprocessor Universe
Astrophysics uses dynamic, irregular data structures:
Astronomy deals with point-like data in an N-dimensional parameter space
Most efficient methods on this kind of data use space-partitioning trees.
The most common data structure is a kd-tree.
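A minimal serial kd-tree sketch, assuming median splits along alternating axes and a dictionary-based node layout (both are illustrative choices, not details from the talk). The range count shows the pruning that makes such trees efficient for point-like data.

```python
import math


def build_kdtree(points, depth=0, leaf_size=2):
    """Recursively split points at the median along alternating axes."""
    if len(points) <= leaf_size:
        return {"points": points}
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "axis": axis,
        "split": points[mid][axis],
        "left": build_kdtree(points[:mid], depth + 1, leaf_size),
        "right": build_kdtree(points[mid:], depth + 1, leaf_size),
    }


def count_within(node, center, radius):
    """Count points within radius of center, skipping subtrees that cannot contain any."""
    if "points" in node:
        return sum(1 for p in node["points"] if math.dist(p, center) <= radius)
    count = 0
    c = center[node["axis"]]
    # Left subtree holds coordinates <= split, right subtree holds coordinates >= split.
    if c - radius <= node["split"]:
        count += count_within(node["left"], center, radius)
    if c + radius >= node["split"]:
        count += count_within(node["right"], center, radius)
    return count


points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (8.0, 8.0), (9.0, 9.0)]
tree = build_kdtree(points)
neighbors = count_within(tree, center=(1.0, 1.0), radius=1.5)
```

In this example the query never descends into the subtree holding the three distant points, which is the source of the speedup over a brute-force scan.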
![Page 15: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/15.jpg)
Challenges for scalable parallel application development:
Things that make parallel programs difficult to write:
  Thread orchestration
  Data management
Things that inhibit scalability:
  Granularity (synchronization)
  Load balancing
  Data locality
![Page 16: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/16.jpg)
Overview of existing paradigms: GSA
There are existing globally shared address space (GSA) compilers and libraries:
  Co-Array Fortran
  UPC
  ZPL
  Global Arrays
The Good: These are quite simple to use.
The Good: Can manage data locality well.
The Bad: Existing GSA approaches tend not to scale very well because of fine granularity.
The Ugly: None of these support irregular data structures.
![Page 17: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/17.jpg)
Overview of existing paradigms: GSA
There are other GSA approaches that do lend themselves to irregular data structures:
  e.g. Linda (tuple-space)
The Good: Almost universally flexible
The Bad: These tend to scale even worse than the previous GSA approaches.
  Granularity is too fine
![Page 18: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/18.jpg)
Challenges for scalable parallel application development:
Things that make parallel programs difficult to write:
  Thread orchestration
  Data management
Things that inhibit scalability:
  Granularity
  Load balancing
  Data locality
GSA
![Page 19: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/19.jpg)
Overview of existing paradigms: RMI
“Remote Method Invocation”
rmi_broadcast(…, (*myFunction));
(Diagram: a master thread with a computational agenda sits above an RMI layer; on each of Proc. 0, Proc. 1, Proc. 2 and Proc. 3, an RMI layer invokes myFunction().)
myFunction() is coarsely grained
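The RMI pattern on this slide can be sketched as follows. The arguments of `rmi_broadcast` are elided in the slide, so the function below, `rmi_broadcast_sketch`, is a hypothetical stand-in with an invented signature; threads play the role of processors, and the round-robin split of the agenda is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def rmi_broadcast_sketch(agenda, func, n_workers=4):
    """Hypothetical illustration: hand each worker one coarse chunk of the agenda
    and invoke func on it remotely (here, in a worker thread)."""
    # Round-robin split: worker k receives items k, k + n_workers, ...
    chunks = [agenda[k::n_workers] for k in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(func, chunks))


def my_function(chunk):
    """Coarse-grained work: one invocation processes a whole chunk."""
    return sum(x * x for x in chunk)


partials = rmi_broadcast_sketch(list(range(8)), my_function)
total = sum(partials)
```

The master only hands out work and collects results. Each invocation does a substantial amount of computation, so the workers rarely need to synchronize.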
![Page 20: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/20.jpg)
Challenges for scalable parallel application development:
Things that make parallel programs difficult to write:
  Thread orchestration
  Data management
Things that inhibit scalability:
  Granularity
  Load balancing
  Data locality
RMI
![Page 21: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/21.jpg)
N tropy: A Library for Rapid Development of kd-tree Applications
No existing paradigm gives us everything we need.
Can we combine existing paradigms beneath a simple, yet flexible API?
![Page 22: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/22.jpg)
N tropy: A Library for Rapid Development of kd-tree Applications
Use RMI for orchestration
Use GSA for data management
![Page 23: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/23.jpg)
A Simple N tropy Example: N-body Gravity Calculation
Cosmological “N-Body” simulation
  100,000,000 particles
  1 TB of RAM
(Diagram: simulation volume labeled “100 million light years”, divided among Proc 0 through Proc 8.)
![Page 24: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/24.jpg)
A Simple N tropy Example: N-body Gravity Calculation
ntropy_Dynamic(…, (*myGravityFunc));
(Diagram: a master thread holds the computational agenda, the particles P1, P2, …, Pn on which to calculate gravitational force, above the N tropy master layer; on each of Proc. 0, Proc. 1, Proc. 2 and Proc. 3, an N tropy thread service layer invokes myGravityFunc().)
![Page 25: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/25.jpg)
A Simple N tropy Example: N-body Gravity Calculation
Cosmological “N-Body” simulation
  100,000,000 particles
  1 TB of RAM
To resolve the gravitational force on any single particle requires the entire dataset
(Diagram: simulation volume labeled “100 million light years”, divided among Proc 0 through Proc 8.)
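A direct-summation sketch makes the dependence on the entire dataset concrete. This is a textbook Newtonian sum, not the method used by N tropy or GASOLINE; the function name, the unit gravitational constant, and the optional softening term are assumptions for illustration.

```python
import math


def acceleration_on(i, positions, masses, G=1.0, softening=0.0):
    """Sum the gravitational pull of every other particle on particle i."""
    ax = ay = az = 0.0
    xi, yi, zi = positions[i]
    for j, (xj, yj, zj) in enumerate(positions):
        if j == i:
            continue  # a particle exerts no force on itself
        dx, dy, dz = xj - xi, yj - yi, zj - zi
        # With softening = 0.0, coincident particles would divide by zero.
        r2 = dx * dx + dy * dy + dz * dz + softening * softening
        inv_r3 = 1.0 / (r2 * math.sqrt(r2))
        ax += G * masses[j] * dx * inv_r3
        ay += G * masses[j] * dy * inv_r3
        az += G * masses[j] * dz * inv_r3
    return (ax, ay, az)


positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
masses = [1.0, 1.0, 8.0]
acc = acceleration_on(0, positions, masses)
```

The inner loop visits every particle, so a processor that owns particle `i` still needs positions and masses held by all the other processors.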
![Page 26: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/26.jpg)
A Simple N tropy Example: N-body Gravity Calculation
(Diagram: on each of Proc. 0, Proc. 1, Proc. 2 and Proc. 3, myGravityFunc() runs on an N tropy thread service layer, which sits on an N tropy GSA layer; beneath the GSA layers is a tree with nodes numbered 0 to 14.)
![Page 27: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/27.jpg)
N tropy Performance Features
GSA allows performance features to be provided “under the hood”:
  Interprocessor data caching
    < 1 in 100,000 off-PE requests actually result in communication.
RMI allows further performance features:
  Dynamic load balancing
    Workload can be dynamically reallocated as computation progresses.
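The idea behind interprocessor data caching can be sketched with a toy globally addressed array. Everything here is hypothetical: the class name, the block distribution, and the counters are illustrative assumptions, and the sketch assumes the data is not modified while it is cached.

```python
class CachedGlobalArray:
    """Hypothetical globally addressed array, block-distributed over processors,
    with a local read cache for elements owned by other processors."""

    def __init__(self, data, n_procs, my_rank):
        self.data = data  # stands in for memory spread across all processors
        self.block = -(-len(data) // n_procs)  # ceiling division: elements per processor
        self.my_rank = my_rank
        self.cache = {}
        self.off_pe_requests = 0
        self.messages_sent = 0

    def owner(self, index):
        """Rank of the processor that owns this element."""
        return index // self.block

    def get(self, index):
        if self.owner(index) == self.my_rank:
            return self.data[index]  # local element, no communication
        self.off_pe_requests += 1
        if index not in self.cache:
            self.messages_sent += 1  # a real library would communicate here
            self.cache[index] = self.data[index]
        return self.cache[index]


arr = CachedGlobalArray(list(range(100)), n_procs=4, my_rank=0)
for _ in range(1000):
    arr.get(90)  # element owned by another processor
```

After the loop, 1000 off-processor requests have triggered a single fetch. Repeated access to the same remote elements is what lets a cache absorb nearly all requests.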
![Page 28: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/28.jpg)
N tropy Performance
10 million particles, spatial 3-point, 3->4 Mpc
(Plot comparing three configurations:)
  No interprocessor data cache, no load balancing
  Interprocessor data cache, no load balancing
  Interprocessor data cache, load balancing
![Page 29: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/29.jpg)
Why does the data cache make such a huge difference?
(Diagram: myGravityFunc() on Proc. 0 above a tree with nodes numbered 0 to 14.)
![Page 30: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/30.jpg)
N tropy “Meaningful” Benchmarks
The purpose of this library is to minimize development time!
Development time for:
1. Parallel N-point correlation function calculator
   2 years -> 3 months
2. Parallel Friends-of-Friends group finder
   8 months -> 3 weeks
![Page 31: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/31.jpg)
Conclusions
Most approaches for parallel application development rely on a single paradigm
  Inhibits scalability
  Inhibits generality
Almost all current HPC programs are written in MPI (“paradigm-less”):
  MPI is a “lowest common denominator” upon which any paradigm can be imposed.
![Page 32: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/32.jpg)
Conclusions
Many “real-world” problems, especially those involving irregular data structures, demand a combination of paradigms
N tropy provides:
  Remote Method Invocation (RMI)
  Globally Shared Addressing (GSA)
![Page 33: Enabling Knowledge Discovery in a Virtual Universe](https://reader034.vdocuments.us/reader034/viewer/2022052401/5681484f550346895db56334/html5/thumbnails/33.jpg)
Conclusions
Tools that selectively deploy several parallel paradigms (rather than just one) may be what is needed to parallelize applications that use irregular/adaptive/dynamic data structures.
More Information: Go to Wikipedia and search “Ntropy”