TRANSCRIPT
Advanced Parallel Primitives in SPM.Python for Inheriting Fault-Tolerance, and Scalable Processing of Data and Graphs
Minesh B. Amin (mamin@mbasciences.com)
http://www.mbasciences.com
HPC Advisory Council / Stanford Workshop 2011
Stanford University, CA
Dec 7, 2011
© 2011 MBA Sciences, Inc. All rights reserved.
Problem Statement
... exploiting parallelism using parallel primitives, frameworks, and libraries

Parallel primitives (pclosures):
● Clone
● CloneRepeat
● PartitionAggregate (Decentralized)
● PartitionAggregate (Centralized)
● PartitionList
● PartitionDAG

● Single, self-contained parallel environment
● Patented Technology ...

Partition/OpenMPI: enable any OpenMPI application to inherit support for:
● fault tolerance
● timeout
● detection of deadlocks

Partition/HybridFlow: suites of parallel primitives to process data and graphs in parallel
Terminology: "Exploiting Parallelism"

Parallelism: the management of a collection of serial tasks.

Management: the policies by which:
● tasks are scheduled,
● premature terminations are handled,
● preemptive support is provided,
● communication primitives are enabled/disabled, and
● resources are obtained and released.

Serial tasks: classified as either:
● coarse grain ... where tasks may not communicate prior to conclusion, or
● fine grain ... where tasks may communicate prior to conclusion.

Management policies codify how serial tasks are to be managed ... independent of what those tasks may be.
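These policies can be made concrete with a small sketch in plain Python (using the standard concurrent.futures module, not SPM.Python; `run_coarse` and `square` are hypothetical names): a coarse-grain policy schedules tasks that do not communicate, bounds their wall time, and classifies premature terminations instead of aborting the whole session.

```python
import concurrent.futures

def square(x):
    # A coarse-grain serial task: it does not communicate before concluding.
    return x * x

def run_coarse(inputs, timeout):
    # Hypothetical management policy: schedule tasks, bound total wall
    # time, and record premature terminations rather than crash.
    results, failures = {}, {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
        futures = {pool.submit(square, x): x for x in inputs}
        done, pending = concurrent.futures.wait(futures, timeout=timeout)
        for f in done:
            try:
                results[futures[f]] = f.result()
            except Exception as exc:       # premature termination
                failures[futures[f]] = repr(exc)
        for f in pending:                  # timeout: release the resource
            f.cancel()
    return results, failures

results, failures = run_coarse([1, 2, 3], timeout=30.0)
```

The point of the slide is that this policy code is entirely independent of what `square` computes; SPM.Python's pclosures package such policies up for reuse.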
Terminology: "The Big Picture"

Question: Is exploiting parallelism easy or hard? What makes it so?

Supposition: the gap between the developer's intent and the API of the PET (parallel enabling technology).
Terminology: "Parallel Enabling Technologies"

Means to the end:

Bottom-up (OpenMPI, OpenMP, CUDA, OpenGL):
● Maximum flexibility
● Maximum headaches
● Must implement fault tolerance

Top-down (Hadoop, GoldenOrb, GraphLab):
● Limited flexibility
● Fewer headaches
● Fault tolerance is inherited
● N environments/installations for N frameworks

Self-contained environment (SPM.Python):
● Maximum flexibility
● Fewest headaches
● Fault tolerance is inherited
● One environment/installation, N suites of pclosures

>>> createVirtualCloud -async
>>> cmdA        >>> cmdA -parallel
>>> cmdB        >>> cmdB -parallel
>>> cmdC        >>> cmdC -parallel
>>> cmdD        >>> cmdD -parallel
SPM.Python: Typical Flow

Application domains: Visualization, Life Sciences, Finance, IT, Software Development, EDA, Analytics

Closing the gap between intent and the API of parallel primitives:

Architectural:
● Scalable vocabulary

Developer:
● Correct-by-construction (fault tolerance, self-cleaning)
● Construct-by-correction (rapid prototyping)

IT:
● No certification (!)
Fundamental Prerequisite: the ability to express parallelism in terms of parallel primitives (pclosures).
Partition/OpenMPI: Prologue

GNU/Linux [] mpirun ... ./hello_world -prefix "api"

A typical OpenMPI application lacks support for:
● fault tolerance
● timeout
● detection of deadlocks

⇒ Prototyping is (deeply) frustrating
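To see what is missing, consider what a developer must hand-roll today just for the timeout item; a minimal watchdog sketch in ordinary Python (the mpirun command line is illustrative, and `launch_with_timeout` is a hypothetical helper, not part of SPM.Python):

```python
import subprocess

def launch_with_timeout(cmd, seconds):
    # Run a command; report a timeout instead of hanging forever.
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True,
                              timeout=seconds)
        return proc.returncode, proc.stdout
    except subprocess.TimeoutExpired:
        return None, "timed out after %s s" % seconds

# e.g. launch_with_timeout(["mpirun", "./hello_world", "-prefix", "api"], 10)
rc, out = launch_with_timeout(["echo", "hello"], 5)
```

Even this sketch only covers the local mpirun process: it does not reap orted daemons left on remote servers, detect deadlocks, or distinguish a crash from a slow task, which is precisely the support Partition/OpenMPI lets the unmodified application inherit.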
Partition/OpenMPI: Problem Statement

Prototyping should be frictionless:
● Must use the original OpenMPI application (original source code, original binary)
● The original OpenMPI application must inherit support for: fault tolerance, timeout, detection of deadlocks

GNU/Linux [] spm.python ...
             mpirun ... ./hello_world -prefix "api"
Partition/OpenMPI: Problem Statement (Cont'd)

GNU/Linux [] spm.python ...
             mpirun ... ./hello_world -prefix "api"

Exploiting two very different forms of parallelism:
● using the same resources,
● at the same time.

● Drop-in replacement for mpirun
● Multiple sessions of mpirun within a single session of spm.python
● Can use the same resources for:
  ● checkpoint-based parallelism
  ● what-if analysis
  ● stress testing
Partition/OpenMPI: Anatomy - Timeline

GNU/Linux [] spm.python ...
             mpirun ./hello_world -prefix "api"
Partition/OpenMPI: Anatomy - Timeline (Cont'd)

[Timeline diagram: Hub → mpirun → Spoke → orted → wrapper → Application, with a numbered sequence of exit() events during normal execution]

● Launch: mpirun; Monitor: mpirun, Spokes
● Launch: orted; Monitor: orted, wrapper
● Launch: Application; Monitor/Timeout: Application

Establish a nervous system over the OpenMPI application.
Populate the nervous system with streams of time-series data.
Partition/OpenMPI: Anatomy - Breakdown

[Same Hub → mpirun → Spoke → orted → wrapper → Application timeline diagram]

Built-in Package Management System:
● Selectively change the default OpenMPI env

Redirection of library calls:
● Augment libmpi.so, libc.so ... with libSPM.so

Second Parallel Capability:
● ~60-line Python script
● Authored by the developer
Partition/OpenMPI: Second Parallel Capability

@spm.util.dassert(predicateCb = spm.sys.sstat.amOffline)
@spm.util.dassert(predicateCb = spm.sys.pstat.amHub)
def __init():
    return spm.pclosure.macro.papply.template.openMPI.\
        policyA.defun(signature = 'signature::Hub',
                      stage1Cb  = __taskStat,);

__pc = __init();

Declaration + Definition of Pclosure
Partition/OpenMPI: Second Parallel Capability

@spm.util.dassert(predicateCb = spm.sys.sstat.amOffline)
@spm.util.dassert(predicateCb = spm.sys.pstat.amHub)
def main(pool, taskApiArgs, taskTimeout):
    # Initialize 'stage0'.
    __pc.stage0.init.main(typedef = ...);
    hdl = __pc.stage0.payload.tie();
    # Populate the template task
    hdl.spm.meta.label   = '***';  # Not interested.
    hdl.spm.meta.apiArgs = taskApiArgs;
    hdl.spm.meta.timeout = taskTimeout;
    # Invoke the pmanager
    __pc.stage0.event.manage(pool = pool,
                             nSpokesMin = ...,
                             nSpokesMax = ...,
                             timeoutWaitForSpokes = ...,
                             timeoutExecution = ...);
    return;

Population + Invocation of Pclosure
Partition/OpenMPI: Second Parallel Capability

r"""task<template> ::struct {
  # SPM component ...
  spm ::struct {
    meta ::struct {
      label   ::scalar<stringSnippet> = deferred;
      apiArgs ::dict<string,mixed>    = deferred;
      timeout ::scalar<timeout>       = deferred;
    };
    core ::struct {
      relaunchPre  ::scalar<bool> = None;
      relaunchPost ::scalar<bool> = None;
      nameHost     ::scalar<auto> = None;
      whoAmI       ::scalar<auto> = None;
    };
    stat ::struct {
      exception   ::scalar<auto>   = None;
      returnValue ::scalar<record> = None;
    };
  };
  # non-SPM component ...
};"""

Typedef for Template Task
Partition/OpenMPI: Second Parallel Capability

@spm.util.dassert(predicateCb = spm.sys.sstat.amOnline)
@spm.util.dassert(predicateCb = spm.sys.pstat.amHub)
def __taskStat(pc):
    try:
        hdl = pc.stage1.payload.tie();
        returnValue = hdl.spm.stat.returnValue;
        if (returnValue.Has(attr = 'stdOut')):
            print("\tstdOut   : %s" % returnValue.stdOut);
        if (returnValue.Has(attr = 'stdErr')):
            print("\tstdErr   : %s" % returnValue.stdErr);
        if (returnValue.Has(attr = 'stdOutErr')):
            print("\tstdOutErr: %s" % returnValue.stdOutErr);
    except (SPMTaskDropped,
            SPMTaskLoad,
            SPMTaskEval,), (hdl,):
        pass;
    return (pc.stage1.event.done(), None,)[-1];

Callback for Status Reports
Partition/OpenMPI: SPM.Python Session

GNU/Linux [] spm.3.111116.trial.A.python (Trial Edition)
Spm.Python 3.111116 / Python 2.4.6
[GCC 4.4.3 (64 bit) on linux2]
NOTE
>>>> Trial period ends at <<<<
>>>> 24:00 hrs (Pacific Standard Time) <<<<
>>>> December 29, 2011 <<<<
Type "help", "copyright", "credits", "license" or "spm.Api()" for more information.
Type "spm.DemoExtract(dirname = ...)" to extract demo scripts.
Please visit www.mbasciences.com for the latest and growing
collection of scripts and technical briefs classified in terms of
parallel management patterns.
>>> import pool
>>> import demo
>>> import os;
>>> taskApiArgs = \
...     dict(app = os.getcwd() + '/hello_world',
...          appOptions = "-prefix='app'",
...          );
>>> taskTimeout = spm.util.timeout.after(seconds = 10);
>>> demo.main(pool = pool.intraAll(),
...           taskApiArgs = taskApiArgs,
...           taskTimeout = taskTimeout)
#: MetaStatus (hub): Waiting - ForSpokes ...
#: MetaStatus (hub): Tasks - Eval
app => 0
app => 1
#: MetaStatus (hub): Tasks - EvalDone
>>> demo.main(pool = pool.intraOnePerServer(),
...           taskApiArgs = taskApiArgs,
...           taskTimeout = taskTimeout)
#: MetaStatus (hub): Waiting - ForSpokes ...
#: MetaStatus (hub): Tasks - Eval
#: MetaStatus (hub): Tasks - EvalDone
>>> exit()
GNU/Linux []
Partition/HybridFlow: Basic Template
while (not done):
try:
for work in pc.generate(...):
eval(work); # Local Python/C/C++/GPU computation
pc.counter.async += 1; # Update parallel data structure(s)
if (some condition):
raise pc.exception(...); # Parallel exception
if (some condition):
pc.emit(...); # Emit work/report
done = True;
except (pc.exception,), (val,):
if (some condition):
continue; # Repeat with new consensus (’val’)
done = True;
Partition/HybridFlow: Suite of Parallel Primitives

The basic template is built from four parallel primitives:
● pc.generate(...)
● pc.emit(...)
● pc.exception(...)
● pc.counter.async

Parallel management patterns supported: BAP, Speculative BSP, BSP, DAG, ...
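The DAG end of that spectrum can be sketched with the standard-library topological scheduler (the task graph and `run_dag` are hypothetical; PartitionDAG would manage the equivalent under its own fault-tolerance policies):

```python
from graphlib import TopologicalSorter

def run_dag(deps):
    # Evaluate a task graph wave by wave: every task in a wave is
    # independent and could be handed to a different spoke in parallel.
    order, ts = [], TopologicalSorter(deps)
    ts.prepare()
    while ts.is_active():
        wave = sorted(ts.get_ready())   # tasks whose predecessors are done
        for task in wave:               # eval(work) would happen here
            order.append(task)
            ts.done(task)
    return order

# Hypothetical build-style graph: node -> set of predecessors.
deps = {"compile": set(), "lint": set(),
        "test": {"compile"}, "package": {"test", "lint"}}
order = run_dag(deps)
```

The scheduling skeleton is again independent of what each task computes, which is why a single suite of primitives can cover BAP, BSP, and DAG styles.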
Conclusion

... exploiting parallelism using parallel primitives, frameworks, and libraries

Parallel primitives (pclosures):
● Clone
● CloneRepeat
● PartitionAggregate (Decentralized)
● PartitionAggregate (Centralized)
● PartitionList
● PartitionDAG

● Single, self-contained parallel environment
● Patented Technology ...

Partition/OpenMPI: enable any OpenMPI application to inherit support for:
● fault tolerance
● timeout
● detection of deadlocks

Partition/HybridFlow: suites of parallel primitives to process data and graphs in parallel
Conclusion (Cont'd)

http://www.mbasciences.com offers:
● SPM.Python distribution
● Technical Briefs
● Parallel Management Patterns

Elementary Parallel Primitives:
● Clone (Once, Repeat)
● Partition (DAG, List)
● Partition (Aggregate: Centralized, Decentralized)

HPC Parallel Primitives:
● Partition (Grid/OpenMPI) ... in Limited Beta

Data / Graph Parallel Primitives:
● Partition (Data Flow Graph) ... Limited Beta Jan 24, 2012