Combinatorial Optimization on the Computational Grid

Experiments on Grid5000

Nouredine Melab, member of the Grid5000 steering committee

Laboratoire d’Informatique Fondamentale de Lille, Parallel Cooperative Optimization Research Group

INRIA DOLPHIN Project


Combinatorial optimization problems

High-dimensional and complex optimization problems arise in many areas of industrial concern

Parallel hybrid optimization methods provide effective solutions efficiently, but they remain insufficient for large problems …

… hence the need for large-scale parallelism (Grid computing)

(Multi-Objective)

$$\min f(x) = \bigl(f_1(x), f_2(x), \ldots, f_n(x)\bigr), \quad n \ge 2, \quad \text{subject to } x \in S \text{ (constraints)}$$

(Mono-Objective)

$$\min f(x), \quad x \in S$$
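In the multi-objective case there is usually no single minimizer of all objectives at once, so candidate solutions are commonly compared by Pareto dominance. The sketch below is a minimal illustration under that assumption, with every objective minimized; the names `dominates` and `pareto_front` are illustrative and do not come from the slides.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep the objective vectors that no other vector dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

With two objectives, a vector such as (1, 3) and a vector such as (2, 2) are incomparable, so both stay in the front.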

A taxonomy of optimization methods

Exact algorithms (optimal solutions for small problem instances): Branch and X, Dynamic Programming, A*

Heuristics (near-optimal solutions for large problem instances): specific heuristics, meta-heuristics

Single-solution meta-heuristics: Local Search, Simulated Annealing, Tabu Search

Population-based meta-heuristics: Evolutionary Algorithms, Scatter search, Swarm search

Design and implementation of Grid-based algorithms solving challenging problems in combinatorial optimization

Meta-heuristics (near-optimal): parallel hybrid design, implementation (ParadisEO@Grid)

Exact algorithms: parallel design, implementation (B&B@Grid)

Cooperation between meta-heuristics and exact algorithms

Applications: Protein Structure Prediction, Flow-Shop scheduling problem

Supported by ANR-GRID DOCK, ACI-GRID DOC-G and ANR-GRID CHOC

Meta-heuristics: Parallel models and hybridization mechanisms

Parallel models improve efficiency and effectiveness

Population-based meta-heuristics: island model, parallel evaluation of the population, parallel evaluation of a single solution

Single solution-based meta-heuristics: multi-start model, parallel exploration of the neighborhood, parallel evaluation of a single solution

Hybridization mechanisms combine different methods for better robustness and effectiveness, but are CPU-time intensive

N. Melab, E-G. Talbi, S. Cahon, E. Alba and G. Luque. Parallel Meta-heuristics: Algorithms and Frameworks. Chapter 6 in “Parallel Combinatorial Optimization”, Wiley Series on Parallel and Distributed Computing, ISBN: 0-471-72101-8, Nov 2006.
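The "parallel evaluation of the population" model listed above can be sketched in a few lines. This is only an illustration of the model's structure, not ParadisEO code: a thread pool stands in for remote workers, and the function name `evaluate_population` and the `workers` parameter are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def evaluate_population(
    population: Sequence[Sequence[float]],
    fitness: Callable[[Sequence[float]], float],
    workers: int = 4,
) -> List[float]:
    """Evaluate every individual concurrently and return fitnesses in order.

    The algorithm that owns the population is unchanged: only the
    evaluation step is farmed out, which is what makes this model easy
    to add to an existing population-based meta-heuristic.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves the order of the population.
        return list(pool.map(fitness, population))


def sphere(x: Sequence[float]) -> float:
    """Toy fitness function used only for the example."""
    return sum(v * v for v in x)
```

In CPython, threads do not speed up CPU-bound fitness functions; on a cluster or grid the same structure would distribute evaluations to processes on other machines.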

“Gridification” of parallel hybrid meta-heuristics

Major properties of computational grids: multi-administrative domains, heterogeneity, dynamic availability of resources, large scale

Major adaptations of the different models and mechanisms:

Asynchronous design and implementation

Granularity management and load balancing

Checkpointing-based fault tolerance (a memory for each model)

Adaptation of the parameters of each model (e.g. migration topology for the island model)

N. Melab, S. Cahon and E-G. Talbi. Grid Computing for Parallel Bioinspired Algorithms. Journal of Parallel and Distributed Computing (JPDC), Elsevier Science, Vol.66(8), Pages 1052-1061, 2006.
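To make the "migration topology" parameter of the island model concrete, here is a minimal sketch of one migration step on a unidirectional ring. It is an assumption-laden illustration: individuals are reduced to their fitness values (lower is better), the step is synchronous for readability whereas the slides call for an asynchronous design on a grid, and `migrate_ring` and `n_migrants` are hypothetical names.

```python
from typing import List


def migrate_ring(islands: List[List[float]], n_migrants: int = 1) -> List[List[float]]:
    """One migration step on a unidirectional ring (lower fitness is better).

    Each island sends copies of its n_migrants best individuals to its
    successor, where they replace the worst individuals.
    """
    # Select emigrants from every island before modifying any of them.
    emigrants = [sorted(island)[:n_migrants] for island in islands]
    result = []
    for i, island in enumerate(islands):
        incoming = emigrants[(i - 1) % len(islands)]  # from the predecessor
        survivors = sorted(island)[: len(island) - len(incoming)]
        result.append(sorted(survivors + incoming))
    return result
```

Changing the topology amounts to changing which island index supplies `incoming`; on a grid this choice matters because links between sites are slower than links inside a cluster.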

Our contributions

ParadisEO (PARAllel and DIStributed Evolving Objects) for parallel hybrid metaheuristics

http://www2.lifl.fr/OPAC/Softwares/ParadisEO/

Multi-Objective EO (MOEO) for the design of multi-objective evolutionary algorithms

Moving Objects (MO) for the design of local search algorithms

Message passing (MPI, PVM): clusters, networks of workstations

Multi-programming (PThreads): shared-memory multi-processors (SMP)

Parallel distributed computing: clusters of SMPs (CLUMPS)

Grid computing: Condor-MW and Globus (MPICH-G2)

Architecture figure (labels only): EO, MO, MOEO, ParadisEO@Grid; PVM, PThreads, MPI (LAM, CH), Condor-MW, Globus

S. Cahon, N. Melab and E-G. Talbi. ParadisEO: A Framework for the Reusable Design of Parallel and Distributed Metaheuristics. Journal of Heuristics, Elsevier Science, Vol.10(3), pages 357-380, May 2004.

Evolving Objects framework (EO)

European project (Geneura Team, INRIA, LIACS)

http://eodev.sourceforge.net

Transparent use

ParadisEO-G4: ParadisEO on Globus 4

Design and implementation: gridification of the parallel models and hybridization mechanisms provided in ParadisEO, with MPICH-G2 as the communication library

Deployment on the computational Grid (Grid5000): building of a system image for Globus 4 including MPICH-G2, and a virtual Globus Grid on Grid5000 for the Grid-based deployment of the parallel hybrid meta-heuristics provided in ParadisEO

Protein Structure Prediction on the Grid: Modelling

The problem consists in finding the ground-state (stable tertiary) conformation of a protein from its primary structure, which is a sequence of amino acids (residues)

Modelled as a bi-objective optimization problem

Candidate solutions: molecular conformations (geometries), encoded as vectors of torsion angles

Objectives: molecular conformations with lower free energies (bonded atoms and non-bonded atoms)

Protein Structure Prediction on the Grid: Complexity and landscape analysis

For a molecule of 40 residues with 10 conformations per residue, 10^40 conformations are obtained on average … about 10^18 years are required at 10^14 conformations explored per second!

Landscape analysis: the landscape is multi-modal, hence the need for parallel hybrid (global and local) meta-heuristics and Grid computing
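The order of magnitude quoted on this slide (10^18 years for 10^40 conformations at 10^14 conformations per second) can be checked with a few lines of arithmetic. The only assumption added here is the length of a year, taken as 365.25 days.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: Julian year

conformations = 10 ** 40  # 10 conformations per residue, 40 residues
rate = 10 ** 14           # conformations explored per second

seconds = conformations // rate        # exhaustive enumeration time in seconds
years = seconds / SECONDS_PER_YEAR     # same duration expressed in years
```

The enumeration takes 10^26 seconds, which is a little over 3 x 10^18 years, in line with the figure on the slide.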

Parallel asynchronous hierarchical hybrid meta-heuristic

Components shown on the slide: cooperative GAs (island model), parallel evaluation of the population, multi-start model, high-level co-evolutionary hybridization

A-A. Tantar, N. Melab, E-G. Talbi, O. Dragos and B. Parain. A Parallel Hybrid Genetic Algorithm for Protein Structure Prediction on the Computational Grid. FGCS, Elsevier Science, Vol.23(3), 398-409, 2007.

Figure: individuals ∂1, ∂2, ..., ∂n of the Genetic Algorithm population go through Local Search, giving optimized individuals ∂'1, ∂'2, ..., ∂'n

Preliminary experimental results on Grid5000

Implementation with ParadisEO-G4

Protein: Tryptophan-cage from the Protein Data Bank (PDB ID 1L2Y)

Grid5000: 7 sites, 800 CPUs on average. Execution time: 1 hour. Cumulative time: 1 month.

Average quality improvement: 62%

Interconnection Grid5000-DAS

Benefits: more resources for dealing with very large proteins with grid-based meta-heuristics

New scientific challenge: scalability of ParadisEO-G

Requirements:

Need for a virtual Globus grid between Grid5000 and DAS. Common certification authority?

Extend the default run time of jobs on DAS: deploying the virtual Globus grid takes about 10 minutes, which leaves only 5 minutes for the combinatorial optimization process on DAS!

Parallel models for exact optimization (B&B inspired)

B&B = exploration + bounding of tree nodes

Parallel models: parallel multi-parametric model, parallel exploration of the search tree, parallel evaluation of the bounds, parallel evaluation of a single bound/solution

Parallel exploration of the search tree: massive parallelism needing a computational grid, so gridification is required

The proposed approach: objectives

Efficient work distribution during the exploration: need for low-cost communication of work units

Efficient checkpointing-based fault tolerance: search for an exact solution in a volatile environment, with low-cost communication and storage of work units

Efficient termination detection: may be implicit

Principles of the approach

The approach uses a special coding: each node has a number, and a work unit (a collection of nodes) is an interval

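Figure: a search tree whose root is numbered 0, whose three children are numbered 0, 2 and 4, and whose six leaves are numbered 0 to 5. The whole tree is the interval [0,5], which splits into the work units [0,2] and [3,5].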
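The numbering in the tree above is consistent with a permutation tree over 3 elements in which each node carries the number of its leftmost leaf. The sketch below reproduces that numbering under this assumption. The slides name folding and unfolding operators without defining them, so `fold` and `unfold` here are one plausible reading, and `node_interval` is a hypothetical helper.

```python
from math import factorial
from typing import List, Sequence, Tuple

Path = Tuple[int, ...]  # ranks chosen from the root; path[i] < n - i


def node_interval(path: Sequence[int], n: int) -> Tuple[int, int]:
    """Closed interval of leaf numbers covered by a node of an n-element permutation tree."""
    number = sum(rank * factorial(n - i - 1) for i, rank in enumerate(path))
    return number, number + factorial(n - len(path)) - 1


def fold(paths: Sequence[Sequence[int]], n: int) -> Tuple[int, int]:
    """Fold a contiguous collection of nodes into a single interval."""
    bounds = [node_interval(p, n) for p in paths]
    return min(lo for lo, _ in bounds), max(hi for _, hi in bounds)


def unfold(a: int, b: int, n: int, path: Path = ()) -> List[Path]:
    """Unfold the closed interval [a, b] into the fewest nodes that cover it exactly."""
    lo, hi = node_interval(path, n)
    if b < lo or a > hi:
        return []            # node lies outside the interval
    if a <= lo and hi <= b:
        return [path]        # node lies entirely inside the interval
    nodes: List[Path] = []
    for rank in range(n - len(path)):   # partial overlap: descend
        nodes += unfold(a, b, n, path + (rank,))
    return nodes
```

For n = 3, `node_interval((), 3)` gives the root interval and `unfold(0, 2, 3)` returns the first child of the root plus one node of the second subtree, so a work unit is communicated as two numbers and expanded into tree nodes only on the worker.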

The approach follows a dispatcher-worker scheme based on the work-stealing paradigm

Dispatcher: maintains a pool of work units (intervals) and the global solution found so far

Worker: performs B&B on a given interval and updates the global solution

Work distribution and checkpointing: communication of intervals (two numbers)

Two efficient operators: folding and unfolding of intervals
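A dispatcher that hands out intervals can be sketched as follows. This is a simplified illustration rather than the B&B@Grid implementation: the splitting rule (halve the largest interval), the class name `Dispatcher` and its methods are assumptions, and worker progress reports are left out.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # closed interval [begin, end] of node numbers


class Dispatcher:
    """Keeps the pool of unexplored intervals and the best solution so far."""

    def __init__(self, begin: int, end: int) -> None:
        self.pool: List[Interval] = [(begin, end)]
        self.best_cost: Optional[int] = None

    def request_work(self) -> Optional[Interval]:
        """Serve a worker request by splitting the largest interval in the pool."""
        if not self.pool:
            return None  # empty pool: termination is detected implicitly
        self.pool.sort(key=lambda iv: iv[1] - iv[0])
        begin, end = self.pool.pop()         # largest interval
        if begin == end:
            return (begin, end)              # a single node cannot be split
        middle = (begin + end) // 2
        self.pool.append((begin, middle))    # left half stays in the pool
        return (middle + 1, end)             # right half goes to the worker

    def report_solution(self, cost: int) -> int:
        """Record a solution found by a worker and return the global best."""
        if self.best_cost is None or cost < self.best_cost:
            self.best_cost = cost
        return self.best_cost

    def checkpoint(self) -> List[Interval]:
        """State to save for fault tolerance: two numbers per work unit."""
        return sorted(self.pool)
```

Starting from the interval [0,5], the first request leaves [0,2] in the pool and returns [3,5], matching the split shown in the figure. Because each work unit is just a pair of integers, both messages and checkpoints stay small regardless of how many tree nodes the unit represents.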

The Flow Shop Scheduling Problem

N jobs to be scheduled on M machines

Each machine cannot be assigned to two jobs simultaneously

Jobs must be scheduled in the same order on all machines

One objective must be minimized: Cmax, the makespan (total completion time)

Figure: example schedule of 4 jobs (one color per job) on 3 machines (M1, M2, M3)
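The makespan of a given job order follows from the two constraints above: a job starts on a machine once that machine is free and the job has finished on the previous machine. A minimal sketch, assuming a processing-time table `proc[j][m]` (job j on machine m); the function name and the table layout are illustrative.

```python
from typing import List, Sequence


def makespan(permutation: Sequence[int], proc: List[List[int]]) -> int:
    """Cmax of a permutation flow shop, with proc[j][m] the time of job j on machine m."""
    n_machines = len(proc[0])
    # completion[m] = time at which machine m finishes its most recent job
    completion = [0] * n_machines
    for job in permutation:
        for m in range(n_machines):
            # The job is ready for machine m once it has left machine m - 1.
            ready = completion[m - 1] if m > 0 else 0
            completion[m] = max(completion[m], ready) + proc[job][m]
    return completion[-1]
```

A B&B worker evaluates this function, or a lower bound derived from it, at the nodes of the permutation tree, so its cost is paid a very large number of times during the search.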

A grid of more than 2000 processors

Figure: the network of the campus of Université de Lille1 (labels: 123; FIL (Lille1) 170; IUT A 118; 1718) and the Grid5000 node at Lille, connected through RENATER to the other sites of Grid’5000. Labels on the Grid’5000 side: front-end, IP forwarding, NAT, dispatcher on a computation node.

Experimental results

Standard Taillard’s benchmark: Ta056, 50 jobs on 20 machines

Best known solution: 3681 (Ruiz & Stützle, 2004)

Exact solution: 3679 (Mezmaz, Melab & Talbi, 2006)

Running wall clock time: 25 days 46 min

CPU time on a single processor: 22 years 185 days 16 hours

Avg. num. of exploited processors: 328

Maximum number of exploited processors: 1195

Parallel efficiency: 97%

Sites (number of processors): Bordeaux (88), Orsay (360), Sophia (190), Lille (98), Toulouse (112), Rennes (456), Univ. Lille1 (304)

M. Mezmaz, N. Melab, E-G. Talbi. A Grid-enabled Branch and Bound Algorithm for Solving Challenging Combinatorial Optimization Problems. Research Report, INRIA 5945, July 2006 (https://hal.inria.fr/inria-00083814).

Interconnection Grid5000-DAS

Benefits: more resources for solving larger problem instances efficiently and optimally with grid-based combinatorial optimization

New scientific challenge: scalability (limits and solutions). The dispatcher has never crashed on Grid5000 (up to 2500 processors)

Requirements: avoiding the special configuration of the front-end, to allow transparent inter-grid communications between the dispatcher and the workers. Viewing DAS as a Grid5000 site and vice versa?

Best-effort reservation mode in DAS: for long-running problems, using the nodes as long as they are not requested for reservation