nagiza f. samatova (ornl) srikanth yoginath (ornl) guruprasad kora (ornl) chongle pan (utk/ornl)

34
SDM Cente r Advancing, integrating and deploying efficient statistical computing to high-throughput scientific applications Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL) SDM AHM @ NC State October 5-7, 2005 Contact : Nagiza Samatova, [email protected]

Upload: isanne

Post on 11-Jan-2016

75 views

Category:

Documents


0 download

DESCRIPTION

Advancing, integrating and deploying efficient statistical computing to high-throughput scientific applications. Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL). SDM AHM @ NC State October 5-7, 2005. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

Advancing, integrating and deploying efficient statistical computing to high-

throughput scientific applications

• Nagiza F. Samatova (ORNL)

• Srikanth Yoginath (ORNL)

• Guruprasad Kora (ORNL)

• Chongle Pan (UTK/ORNL)

SDM AHM @ NC StateOctober 5-7, 2005

Contact: Nagiza Samatova, [email protected]

Page 2: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

Outline

About Parallel R (pR) Motivation About R and its parallelization efforts Task and data parallelism with Parallel R (pR)

Initial/Future Deployment across Applications GIS data analysis with GRASS and pR Clustered Climate Regimes using pR Quantitative Proteomics in Biology using pR High Energy Physics (HEP) with ROOT, R and pR

Towards SDM Components Integration Kepler scenario with pR CCA scenario with pR Web Services scenario with pR

Future Extensions Generic Parallel Agent for ease of pR extension Task Parallel Engine optimized for data-intensive analyses

Page 3: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

Tera-(Flop & Byte) Analyses Could Be Routine for Scientific Applications But…

HitsAlgorithmic Complexity:

Calculate means O(n)

Calculate FFT O(n log(n))

Calculate PCA O(r • c)

Hierarchical clust. O(n2)

Algorithmic Complexity:

Calculate means O(n)

Calculate FFT O(n log(n))

Calculate PCA O(r • c)

Hierarchical clust. O(n2)

Climate Now: 20-40TB per simulated year 5 yrs: 100TB/yr 5-10PB/yr

Astrophysics Now and 5 yrs: Can soak up anything!

Fusion Now: 100Mbytes/15min 5 yrs: 1000Mbytes/2 min

1Tflop/sec

Page 4: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

Statistical Computing with R

About R (http://www.r-project.org/):

• R is an Open Source (GPL), most widely used programming environment for statistical analysis and graphics; similar to S.

• Provides good support for both users and developers.

• Highly extensible via dynamically loadable add-on packages.

• Originally developed by Robert Gentleman and Ross Ihaka.

Towards Enabling Parallel Computing in R:

> …> dyn.load( “foo.so”) > .C( “foobar” )> dyn.unload( “foo.so” )

> …> dyn.load( “foo.so”) > .C( “foobar” )> dyn.unload( “foo.so” )

> library(mva)> pca <- prcomp(data)> summary(pca)

> library(mva)> pca <- prcomp(data)> summary(pca)

• Rmpi (Hao Yu): R interface to LAM-MPI.

• rpvm (Na Li and Tony Rossini): R interface to PVM; requires knowledge of parallel programming.

• snow (Luke Tierney): general API on top of message passing routines to provide high-level (parallel apply) commands; mostly demonstrated for embarrassingly parallel applications .

> library (rpvm)> .PVM.start.pvmd ()> .PVM.addhosts (...)> .PVM.config ()

> library (rpvm)> .PVM.start.pvmd ()> .PVM.addhosts (...)> .PVM.config ()

snow API

Page 5: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

Motivation behind Parallel R (pR)

Ideal Programming Requirements: Be able to use existing high level (i.e. R) code Require minimal extra efforts for parallelizing Have Identical/similar (presumably easy-to-use) interface to R’s Be able to test codes in sequential settings Provide efficient and scalable (in terms of problem size and

number of processors) performance

Page 6: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

Providing Task and Data Parallelism in pR

• Likelihood Maximization.

• Re-sampling schemes: Bootstrap, Jackknife, etc.

• Animations

• Markov Chain Monte Carlo (MCMC).– Multiple chains.– Simulated Tempering: running parallel chains at different “temperature“ to improve mixing.

Task-parallel analyses:

• k-means clustering

• Principal Component Analysis (PCA)

• Hierarchical (model-based) clustering

• Distance matrix, histogram, etc. computations

Data-parallel analyses:

Page 7: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

Parallel R (pR) Distribution

http://www.ASPECT-SDM.org/Parallel-RReleases History:

• pR enables both data and task parallelism (includes task-pR and RScaLAPACK) (version 1.8.1)

• RScaLAPACK provides R interface to ScaLAPACK with its scalability in terms of problem size and number of processors using data parallelism (release 0.5.1)

• task-pR achieves parallelism by performing out-of-order execution of tasks. With its intelligent scheduling mechanism it attains significant gain in execution times (release 0.2.7)

•pMatrix provides a parallel platform to perform major matrix operations in parallel using ScaLAPACK and PBLAS Level II & III routines

Also: Available for download from R’s CRAN web site (www.R-Project.org) with 37 mirror sites in 20 countries

Page 8: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

About GRASS (grass.itc.it)

• GRASS (Geographic Resources Analysis Support System) is a raster/vector GIS, image processing system, and graphics production.

• GRASS contains over 350 programs and tools to render maps and images on monitor and paper; manipulate raster, vector, and sites data; process multi spectral image data; create, manage, and store spatial data.

• It is Free (Libre) Software/Open Source released under GNU GPL.

About GRASS (grass.itc.it)

• GRASS (Geographic Resources Analysis Support System) is a raster/vector GIS, image processing system, and graphics production.

• GRASS contains over 350 programs and tools to render maps and images on monitor and paper; manipulate raster, vector, and sites data; process multi spectral image data; create, manage, and store spatial data.

• It is Free (Libre) Software/Open Source released under GNU GPL.

Page 9: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

Page 10: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

• Subtract background noise from data• Generate Covariance Chromatogram• Apply Savitzky-Golay Smoother• Calculate cut-off for search• Find Window with Max. SN ratio• …..

Page 11: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

Ratio Calculations with Parallel R

:::::::::::chroList<-list.files(pattern="*.chro");cat ("Chro", "samSN", "refSN", "PPCSN", "HR", "PCA", "PCASN", file="Pratio-Peptide.txt");

for (i in 1:length(chroList)) { currResult [i] Rratio(filename=chroList[i]); }for (i in 1:length(chroList)) { cat (chroList[i], currResult$samSN, … file="Pratio-Peptide.txt"); }:::::::::::

:::::::::::chroList<-list.files(pattern="*.chro");cat ("Chro", "samSN", "refSN", "PPCSN", "HR", "PCA", "PCASN", file="Pratio-Peptide.txt");

for (i in 1:length(chroList)) { currResult [i] Rratio(filename=chroList[i]); }for (i in 1:length(chroList)) { cat (chroList[i], currResult$samSN, … file="Pratio-Peptide.txt"); }:::::::::::

Serial Version:

:::::::chroList<-list.files(pattern="*.chro");cat ("Chro", "samSN", "refSN", "PPCSN", "HR", "PCA", "PCASN", file="Pratio-Peptide.txt");

PE ( for (i in 1:length(chroList)) { currResult [i] Pratio(filename=chroList[i]); } )for (i in 1:length(chroList)) { cat (chroList[i], currResult$samSN, … file="Pratio-Peptide.txt"); }:::::::::::::

:::::::chroList<-list.files(pattern="*.chro");cat ("Chro", "samSN", "refSN", "PPCSN", "HR", "PCA", "PCASN", file="Pratio-Peptide.txt");

PE ( for (i in 1:length(chroList)) { currResult [i] Pratio(filename=chroList[i]); } )for (i in 1:length(chroList)) { cat (chroList[i], currResult$samSN, … file="Pratio-Peptide.txt"); }:::::::::::::

Parallel Version:

Page 12: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

Initial Feedback is Positive!

Page 13: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

R in High Energy Physics (HEP, Root)(Adam Lyon,Fermi Lab)

library(lattice)d=read.table("data.dat",head=T)w=data[data$event=="OpenFile",]w$min = w$dur/60.0bwPlot( fromStation ~ min | station,data=w, subset=(min<60), xlab= "Minutes",main="Wait Time for …" )

modpollux=nls(speed ~ alpha*(1-alpha*beta/(alpha*beta+mb)), data= client[pollux,], start=c(alpha=2.0, beta=0.50), trace=T)

Page 14: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

R in High Energy Physics (HEP, Root)

In HEP, we typically run successive skims to reduce the data size (601 TB down to 100s of Meg or a few Gig)

R seems to want to hold everything in memory If data is small enough, bring it into R If can reduce data to something R can hold, bring that subset of data into

R -- have the full power of R If can't even do above, then have some R apparatus to read in data one

row at a time and update an R object (e.g. histograms) [But you don't get the full power of R]

Rprof significantly sped up fit (replace ^) Lessons learned:

We explore using R, a statistical analysis package from the statistics community, in an HEP enviornment

R has already proven useful for analyzing monitoring and benchmarking data

We have ideas on how R can be used to read large datasets We've done some "proof of principle" studies of physics analysis with R As we learn more about R, we expect to be more surprised at its

capabilities

(Adam Lyon,Fermi Lab)

Page 15: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

Options for R and Root Interfacing

no interest from R community in non-I/O functions of Root

In order of work required :

1) R and Root remain separate-- use the more appropriate tool for the task. Use text files to communicate between the two if necessary.

2) Root loads R's math and low level statistical libraries as shared objects

Minimalist approach for some functionality

Some access to the math and statistics C code functions from R

These C functions take basic C types, so no translation necessary

But: no upper level functions written in the R language available

3) R and Root remain separate, but: R package to read Root Trees

directly into R data frames. Still use best tool for particular task Now easier to get HEP data into R

4) Allow calling of selected high level R functions from within Root

Root runs the R interpreter translation is necessary

R functions: understand Root objects Root: understand R return objects

Expose only some R functions may reduce amount of translation

(Adam Lyon,Fermi Lab)

Page 16: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

More Advanced Integration Options

5) R prompt from the Root prompt R needs seamless knowledge of

objects in current Root session At end of R session, new R variables

translated into Root objects Root runs the R interpreter Translation for all types of Root

variables into R and all types of R variables returned to Root.

A major undertaking

6) Root prompt from within R Harder than 5: R is C but Root is C++ I don't see much interest in this

Things get interesting starting at 3)

I have a version 0.0.1 prototype for reading Root trees into R.

Required for all options above 3. I’ll try to work on this as time permits

Both Root and R interface to Python Translate with Python as

intermediary? Not sure if that's performant enough

(Adam Lyon,Fermi Lab)

Page 17: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

KEPLER/R/pR Integration

RScript “Xread.table(DataFileName)

library(RScaLAPACK)

asva.prcomp(X)

summary(a)

plot(a$loadings)

RScript “Xread.table(DataFileName)

library(RScaLAPACK)

asva.prcomp(X)

summary(a)

plot(a$loadings)

RExpression actor by Dan Higgins, Kepler/SEEKRExpression actor by Dan Higgins, Kepler/SEEK

…process=runtime.exec(Command);System.out.println(process)…

…process=runtime.exec(Command);System.out.println(process)…

Kepler has a mathematical operations actor RExpression to virtually access/plot any function/graphs from R/pR.

Page 18: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

CCA-based Component Integration

Might be used if there is a need: To effectively develop/deploy/reuse parallel R (pR). To expose the power of pR and R to the outside world. To easily access task and data parallel modules of pR by

applications written in different languages. To integrate pR modules with other technologies of SDM

center (e.g. SciRUN2). To provide a usage model that hides implementation

details. To have plug-n-play flexibility in deployment.

CCACommon Component Architecture

Page 19: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

algorithms.sidldriver.sidl

BabelCompiler

Driver_SVD_ImplSVDFunction_Impl libsvd.so

BabelRuntime

RLibrary

1. Define component’s public interface in SIDL.

2. Generate native language stub code using Babel

3. Add in function and driver implementation.

4. Compile and link with Babel and R libraries runtime libraries.

Four Easy Steps:

Componentizing Singular Value Decomposition function of pR/R

Lots of THANKS to Jim Kohl & Wael Elwasif for their help!!!

Page 20: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

algorithms.sidldriver.sidl

package algorithms version 1.0 {

class SVDFunction implements function.FunctionPort, gov.cca.Component {

void rsvd( in int iSize );

// gov.cca.Component methods:

void setServices(in gov.cca.Services servicesHandle) throws gov.cca.CCAException;

}

}

package drivers version 1.0 {

class SVDDriver implements gov.cca.ports.GoPort, gov.cca.Component {

int go();

void setServices(in gov.cca.Services services) throws gov.cca.CCAException;

}

}

1. Create SVD’s interface in SIDL

Page 21: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

BabelCompiler

babel –s C algorithms.sidl

babel –s C driver.sidl

… /* Driver_SVD_Impl */

struct algorithms_SVD__data *psvd;

/* Access private data structure */

pd = algorithms_SVD__get_data(self);

algorithms_SVDPort function;

/* Get the function pointer. */

iReturn = function->rsvd(i);

2-3. Generate Native Language Stub Code using Babel & Add Implementations

... /* SVDFunction - rsvd() */

/* postscript() */

PROTECT(fun = Rf_findFun(Rf_install("postscript"), R_GlobalEnv));

eval(e, R_GlobalEnv);

...

/* svd(rnorm(iSize),1,1) */

fun = Rf_findFun(Rf_install("svd"), R_GlobalEnv);

...

/* svd( 1:10, 1, 1 )*/

ret = eval(e, R_GlobalEnv);

...

fun = Rf_findFun(Rf_install("plot"), R_GlobalEnv);

...

/* plot( svd$u )*/

eval(e, R_GlobalEnv);

No R scripts – Direct C Efficiency!

Page 22: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

SVDFunction

SVDPortSVDPortGoPort

Driver

Ccaffeine

Ready for Deployment of R’s SVD!

Page 23: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

Web Service based Integration of pR

Request-Response ports

defined by WSDL

Task-Parallel job

Parallel-solve

Parallel-kmeans

Parallel-R [pR] Service

Task-pR

RScaLAPACK

Other parallellibraries through

Generic PA

InternetInternet

Custom or

Native RInterface

Biology

Custom or

Native RInterface

Climate

Custom or

Native RInterface

Fusion

Might be used if there is a need: To provide easy access to HP parallel

statistical computations. To make pR a loosely coupled component

that can be dynamically composed. To initiate & control parallel analyses from

a less powerful machine.

Page 24: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

SVD.wsdl JavaCompiler

SVDService_impl libsvdServer.so

AxisRuntime Library

RLibrary

1. Create SVD’s interface in WSDL

2. Develop and deploy the SVD Service Object (libsvdServer.so).

3. Develop the SVD Client Interface (libSVDClient.so).

Three Major Stages:

Enabling Parallel R as a Web Service using Apache-AxisC++: SVD Example

libsvdClient.soRSVDInterface_Impl

ab c c

c

Page 25: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

<!-- Message types: define input and output types   --> - <wsdl:message name="GetSVDRequest">  <wsdl:part name=“inputVectort" type="xsd:double" /> </wsdl:message>- <wsdl:message name="GetSVDResponse">  <wsdl:part name="SVDOut" type="xsd:double" /> </wsdl:message> <!-- Port type: define a SOAP operations   --> - <wsdl:portType name="SVDService">- <wsdl:operation name="GetSVD" parameterOrder=“inputVector">  <wsdl:input message="local:GetSVDRequest" name="GetSVDRequest" />   <wsdl:output message="local:GetSVDResponse" name="GetSVDResponse" />   </wsdl:operation> </wsdl:portType> <!-- Binding: Bind a location in this service to an operation   --> - <wsdl:binding name="SVDSoapBinding" type="local:SVDService">  <wsdlsoap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap /http" /> <wsdl:operation name="GetSVD">  <wsdlsoap:operation soapAction="SVDService#GetSVD" /> - <wsdl:input name="GetSVDRequest">  <wsdlsoap:body encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" namespace="http://localhost/axis/SVD" use="encoded" /></wsdl:input>- <wsdl:output name="GetSVDResponse">  <wsdlsoap:body encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" namespace="http://localhost/axis/SVD" use="encoded" />   </wsdl:output> </wsdl:operation> </wsdl:binding> <!-- Service name: define the name and URL of this service   --> - <wsdl:service name="SVDService">- <wsdl:port binding="local:SVDSoapBinding" name="SVDService">  <wsdlsoap:address location="http://localhost/axis/SVDService" />   </wsdl:port> </wsdl:service> </wsdl:definitions>

1. Create SVD’s interface in WSDL

Page 26: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

java org.apache.axis.wsdl.wsdl2ws.WSDL2Ws -lc++ -sserver –oserver ./SVD.wsdl

gcc –shared –o libsvdServer.so *.cpp -lR -L$AXISCPP_DEPLOY/lib

SVDService.cpp…xsd__double SVDService: :GetSVD(xsd__double Value[]) {……./* svd(rnorm(iSize),1,1) */fun = Rf_findFun(Rf_install("svd"), R_GlobalEnv);.../* svd( 1:10, 1, 1 )*/ret = eval(e, R_GlobalEnv);...return (svd$d)}

AxisServiceException.cpp

AxisServiceException.h

SVDService.h

Server related cpp and corresponding header files

libsvdServer.so

SVD.wsdl

2. Develop the SVD Service Object

Comments:• WSDL file is processed to create cpp files.

• The cpp file functions are modified to compute R svd on the input data.

• A shared library libsvdServer.so is built using gcc.

•This library is used by Apache-axis to service the user’s request.

a

b

c c

Page 27: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

java org.apache.axis.wsdl.wsdl2ws.WSDL2Ws -lc++ -sclient –oclient ./SVD.wsdl

gcc –shared –o libsvdClient.so *.cpp -lR -L$AXISCPP_DEPLOY/lib

RSVDInterface.cpp….SEXP RSVDInterface(SEXP InputVector) { SVDService srv; SEXP retval; int size; …. PROTECT(allocVector(REALSXP,size); …. retVal =srv.GetSVD(InputVector) ; UNPROTECT(1); return retVal;}

SVDService.h

Server related cpp and corresponding header files

libsvdClient.so

SVD.wsdl

2. Develop the SVD Client Interface

Comments:• WSDL file is processed to create cpp files•A RSVDInterface.cpp file is created to get the input and service detail from R and request for required service. •A shared library libsvdClient.so is built using gcc.•This library is loaded into R environment.•The service is requested through R interface.

a

b

c c

SVDService.cpp

Page 28: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

pR Service:<service name=“SVDService" provider="CPP:RPC" description="Simple SVD Service"> <parameter name="allowedMethods" value="GetSVD "/> <parameter name="className" value="/usr/local/axiscpp_deploy/lib/libServersvd.so" /> </service>

•Above XML statements are added in axis “server.wsdd” file. •It specifies axis to deploy a service called SVDService and accept calls for method GetSVD

R Client: Meanwhile the R user •Loads the libSVDClient.so into his environment•Calls the higher-level R-function to access the SVDService.

N-tier Architecture for pR-Service and R-Client Interaction

libS

VD

Clien

t.so

> dyn.load(libSVDClient.so)> RSVDInterface(x)

R-Client Environment

Axis-WebServiceModule

libSVDServer.so

APACHE Web Server801

23

ResultsResults 4

InputVector

pR Service

Page 29: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

Extensibility of Parallel R (pR)

REnvironment

REnvironment

ParallelAgent

ParallelAgent

Third Party Parallel Codes

ScaLAPACK

Matrix

Parallel k-means

Alok’s Data Mining

RScaLAPACK

pMatrix

pAlok

•Define R function parameters & returns

•Map R functions to defined function interfaces

•Define the function interfaces

•Set parallel environment limits for your functions

•Define data distribution function (Optional)

•Convert your MPI/PVM routine(s) into a set of functions.

•Create a shared library of your functions.

•Place it in a predefined location.

C/Fortran MPI

Robject

Page 30: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

Generic Parallel Agent for Ease of Extension

Features: It is an interface that allows third party MPI routines to be called

from within ParallelR environment. Users can readily plug-in and use their MPI codes along with

ParallelR provided routines. It is MPI implementation independent. No knowledge of R or ParallelR is required to get MPI codes

included in to ParallelR. A simple XML schema for generating interface definitions. Different classes of MPI routines can be included in to the

ParallelR environment. Status:

Completed the design phase

Goal: To provide a framework to easily add different parallel MPI algorithms to the R-environment.

Page 31: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

Task Parallel Engine Optimized for Data-Intensive Analyses Tasks

Jointly with Dr. Xiaosong Ma (NC State) and her student, Jain Li

Current Progress: Research phase, semi-complete design document

Goal: To provide a generic task-parallel engine that would be capable of optimizied tasks scheduling, monitoring, and resource mapping.

Page 32: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

Parallelization and Scheduling

Parallelize independent tasks – task parallelism Task itself can be parallelized – data parallelism Overall goal – Given a system, minimize parallel R’s

execution time

Task 1Task 2

Task 3

Task 4

Task 5

Page 33: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

Problems and Challenges

To detect independent tasks Data dependency analysis (NP-hard)

To schedule independent tasks Granularity Load balance Data locality

To perform resource allocation Data locality Minimize communications or I/O

masterworker 2

worker 1

worker n

.

.

.

assign

assign

assign

collect

collect

collect comm

comm

comm

Scheduling & Resource allocation problem (NP-hard):

Given a sequence of tasks t1, t2, …, tn, each task has data size di and requires ci computation time and may require more than one processor. There is network latency li to transfer data size di and overhead oi to initialize the transfer. Our objective is to minimize total execution time of tasks sequence while preserving dependence relations and subject to P processors available.

Page 34: Nagiza F. Samatova (ORNL) Srikanth Yoginath (ORNL) Guruprasad Kora (ORNL) Chongle Pan (UTK/ORNL)

SDMCenter

Integrated Data & Task Parallelism

To parallelize a single task, use third-party package (e.g., SUIF, Polaris)

Task scheduling subject to limited resources, e.g., worker processors

Need to estimate task runtime Performance database Runtime performance monitoring