E-infrastructure shared between Europe and Latin America

High-Performance GRID Computing. Activities within EELA Project in Biomedicine and Climate. José Manuel Gutiérrez ([email protected]), Applied Meteorology Group (http://www.meteo.unican.es). 2nd International Seminar on Genomics, Proteomics and Bioinformatics, Popayán (Colombia), 25-27 Oct. 2006. EELA is a project funded by the European Union under contract 026409. www.eu-eela.org


Page 1: E-infrastructure shared between Europe and Latin America

www.eu-eela.org

High-Performance GRID Computing. Activities within EELA Project in Biomedicine and Climate

José Manuel Gutiérrez
[email protected]
Applied Meteorology Group
http://www.meteo.unican.es

2nd International Seminar on Genomics, Proteomics and Bioinformatics
Popayán (Colombia), 25-27 Oct. 2006.

EELA is a project funded by the European Union under contract 026409.

Page 2: Local clusters in Santander

Page 3: Applications

In many disciplines there are problems that require high-performance computing, through the parallelisation of processes and/or the execution of multiple jobs.

[Figure labels: surgery planning & visualisation; flooding control; MIS; HEP data analysis; weather & pollution modelling. HEP data flow: 40 MHz (40 TB/sec); level 1 - special hardware; 75 kHz (75 GB/sec); level 2 - embedded processors; 5 kHz (5 GB/sec); level 3 - PCs; 100 Hz (100 MB/sec); data recording & offline analysis. Task 1.0: Co-ordination & management.]

Page 4: CrossGrid Project

CrossGrid - International Testbed Organisation

UCY Nicosia
DEMO Athens
Auth Thessaloniki
CYFRONET Cracow
ICM & IPJ Warsaw
PSNC Poznan
II SAS Bratislava
FZK Karlsruhe
UvA Amsterdam
CSIC Valencia
UAB Barcelona
CSIC Santander
CSIC Madrid
LIP Lisbon
USC Santiago
TCD Dublin

Page 5: GRID Computing

• Developed in the mid-nineties.

• Use of distributed, heterogeneous, dynamic and, typically, parallel computing resources.

• Globus Toolkit and OGSA: middleware and a standard for building applications.

• Various research projects and commercial products are developing this technology.

It would be fantastic if computing power were available in the same way as electricity (the grid) (Ian Foster).

Page 6: Structure of the GRID

[Diagram labels: Grid Resource Allocation Manager (GRAM); Monitoring and Discovery System (MDS); Grid Resource Information Service (GRIS); Grid Index Information Service (GIIS); GridFTP.]

Page 7: Example of a "job"

[Diagram labels: UI; Broker; Optimal Resource Allocation; Replica/Data Manager?; WEB SERVICE FINDER?; Master; Slave; CE; SE (repeated); CACHED.]

Page 8: EELA. Goal and Objectives

• Goal: To build a bridge between consolidated e-Infrastructure initiatives in Europe and emerging ones in Latin America.

• Objectives:
– Establish a human collaboration network between Europe and Latin America.
– Set up a pilot e-infrastructure in Latin America.
– Identify and promote a sustainable framework for e-Science in Latin America.

Page 9: Partners

EU
Spain: CIEMAT, CSIC, UPV, RED.ES, UC
Italy: INFN
Portugal: LIP

International: CLARA, CERN

Latin America
Venezuela: ULA
Cuba: CUBAENERGIA
Chile: UTFSM, REUNA, UDEC
Peru: SENAMHI
Mexico: UNAM
Argentina: UNLP
Brazil: UFRJ, CNEN, CECIERJ/CEDERJ, RNP, UFF

Page 10: Structure

WP1. Project administrative and technical management

WP2. Pilot testbed operation and support
GEANT, RedCLARA and European and Latin American NRENs will provide the network infrastructure. The grid infrastructure will be based on the EGEE middleware framework.

WP3. Identification and support of Grid-Enhanced applications

WP4. Dissemination activities

Page 11: WP3. Applications

Task 3.1. Biomed Applications

Task 3.2. HEP Applications

Task 3.3. Additional Applications: E-Learning, Climate

Deliverable D3.1.1. Selection Report Biomedicine and HEP Applications

Page 12: Biomedical Applications

• Context:
– The biomedical applications being deployed on the pilot EELA infrastructure have been identified from existing ones already in use in EGEE, and from the expertise and research activity of the LA and EU partners in EELA.
– The target of the biomedical part of EELA is to deploy Grid applications for the biomedical LA community to improve their research excellence and to foster the use of Grids in this community.
– Applications are selected considering their relevance for LA partners from the portfolio of existing and new applications.

• Project:
– Two applications from the portfolio of mature EGEE biomedical applications have been selected by LA partners: GATE and WISDOM.
– Two new applications were identified from the specific needs of LA partners: BLAST and Phylogenetics.
– EELA has joined the Ibero-American Portal of Bioinformatics.

Page 13: GATE

• GATE: Geant4 Application for Tomographic Emission
– GATE is a C++ platform based on the Monte Carlo Geant4 software, designed to model nuclear medicine applications (PET, SPECT). This platform is also suitable for radiotherapy and brachytherapy treatment planning.
– The objective of GATE is to use the Grid environment to reduce the computing time of Monte Carlo simulations, in order to provide higher accuracy in a reasonable period of time.
– The main benefit of using the Grid is that it has enabled medical users to access realistic Monte Carlo simulations for their research in radiotherapy planning. The EELA Grid provides enough computational resources to deal with the large requirements of this processing.
– GATE is already installed at several EELA partner sites.
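The reduction in computing time described for GATE relies on a general property of Monte Carlo methods: simulated events are independent, so one long run can be split into many shorter jobs with different random seeds and the partial results merged afterwards. A minimal sketch of that idea follows. It estimates pi instead of running Geant4, and the function names, seeds and job sizes are illustrative assumptions, not part of GATE.

```python
import random

def run_job(seed, n_events):
    """One independent Monte Carlo job: count random points inside the unit quarter-circle."""
    rng = random.Random(seed)  # each job gets its own seeded generator
    hits = 0
    for _ in range(n_events):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits, n_events

def merge(results):
    """Combine the partial counts of all jobs into one estimate of pi."""
    total_hits = sum(h for h, _ in results)
    total_events = sum(n for _, n in results)
    return 4.0 * total_hits / total_events

# On a grid each run_job call would be a separate job; here they run in sequence.
jobs = [(seed, 50_000) for seed in range(8)]
results = [run_job(seed, n) for seed, n in jobs]
estimate = merge(results)
print(f"pi is approximately {estimate:.3f} from {sum(n for _, n in results)} events")
```

Because the jobs share nothing but the merge step, doubling the number of jobs doubles the number of simulated events for roughly the same wall-clock time, which is where the higher accuracy comes from.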

Page 14: WISDOM

• WISDOM: Wide In Silico Docking On Malaria
– The objective of WISDOM is the creation of new inhibitors for a family of proteins produced by Plasmodium falciparum, the protozoan parasite that causes malaria.
– The application consists of the deployment of high-throughput virtual screening for in silico drug discovery for neglected diseases.
– Interest of EELA partners: selection of new targets for malaria; study of new targets for new parasitic diseases; and contribution of resources to the WISDOM data challenge.
– The benefit of Grids is the reduction of the development cycle of new drugs for neglected diseases, by providing in silico simulations for the selection of adequate reactors for specific targets, together with the infrastructure needed to supply the required computational power.

Page 15: PHYLOGENY (MrBayes)

• Phylogeny with the MrBayes program:
– A phylogeny is a reconstruction of the evolutionary history of a group of organisms.
– Bayesian inference is a powerful mathematical method, implemented in the MrBayes program, for estimating phylogenetic trees based on the posterior ("a posteriori") probability distribution of the trees.
– Phylogenetic tools are in wide demand in the LA bioinformatics community.
– A Grid service for the parallelised version of the MrBayes application will be developed, and a simple interface will be deployed on the Ibero-American Portal of Bioinformatics. This Grid-enabled service will use EELA resources to run phylogenetic studies at high performance.
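For reference, the posterior probability of a tree can be written with Bayes' theorem. The notation below is a conventional choice and does not come from the slides: $\tau_i$ is a candidate tree, $X$ the aligned sequence data, and $B(s)$ the number of possible trees for $s$ species.

```latex
P(\tau_i \mid X) = \frac{P(X \mid \tau_i)\, P(\tau_i)}{\sum_{j=1}^{B(s)} P(X \mid \tau_j)\, P(\tau_j)}
```

The sum in the denominator runs over every possible tree, so it cannot be evaluated directly for realistic data sets. MrBayes instead samples the posterior with Markov chain Monte Carlo, and its parallel version distributes the chains across processors.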

Page 16: BLAST

• BLAST: Basic Local Alignment Search Tool
– BLAST finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches.
– The process of finding homologues of sequences is computationally intensive. The size of the available non-redundant databases increases daily. Since databases are periodically updated, previous studies should be updated periodically as well.
– The use of the Grid will make it possible to increase the number of fragments analysed and to update this information periodically.
– A Grid service for running MPIBlast on the EELA grid, using the Ibero-American Portal of Bioinformatics (CECALC-ULA), has been developed.
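To make "regions of local similarity" concrete, the sketch below scores the best local alignment between two short sequences with the Smith-Waterman dynamic programme. This is not BLAST's own algorithm: BLAST uses a faster seed-and-extend heuristic to approximate such alignments. The scoring values (match +2, mismatch -1, gap -2) and the two example sequences are illustrative assumptions.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best Smith-Waterman local alignment score between a and b."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # a local alignment may restart anywhere, hence the floor at zero
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

query = "AGTACGTAGTAG"
subject = "TTTTACGTAGCCC"
score = local_alignment_score(query, subject)
print(score)  # the shared segment TACGTAG (7 matches) gives 14
```

The table has one cell per pair of positions, so the cost grows with the product of the query length and the database size. That growth is why exhaustive comparison against large, frequently updated databases benefits from distributing the work.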

Page 17: Blast in Grids (BiG)

Ignacio Blanquer
Universidad Politécnica de Valencia

IST-2006-026409

Page 18

• BLAST (Basic Local Alignment Search Tool) is a Bioinformatics Procedure Applied to Identify Compatible Protein and Nucleotide Sequences in Protein and DNA Databases.

• BLAST can be Applied, Among Other Uses, to Annotate the Estimated Function of Unknown Sequences.

• BLAST is Computationally Intensive.

Page 19

• BLAST in Grids (BiG)
– Grid Interface to MPI Blast.
– Access Through a Web Portal (http://portal-bio.ula.ve/).
– Access to EELA Grid Through Gate-to-Grid Using a Web Service Resource Framework Interface.

[Diagram labels. WEB Environment: Bioinformatics Portal; Gate-to-Grid. EELA Grid Infrastructure: SE aker.dsic.upv.es; CE ramses.dsic.upv.es; WNs. Inputs: FASTA File (Input Sequence), e.g. AGTACGTAGTAGCTGCTGCTACGTGGCTAGCTAGTACGTCAGACGTAGATGCTAGCTGACTCGA; Execution Parameters; Protein Database (Non Redundant e.g.). Output: Output Matches.]

Page 20

• Design Objectives
– Easy Interface with High Compatibility (Web Service + NCBI Based): Same Parameters as BLAST. User-friendly and Intuitive.
– Support for Searching Simultaneously on Multiple Databases: Parallel Process on Multiple Database Queries.
– Architecture Exportable to Other Common Problems: Modular Structure of the System Components. Fast Capability to Migrate to Other Problems.
– Scalability: Data Partition in Grid Approach Gives Scalability with Huge Quantities of Data.
– High Performance: Grid Computing + MPI Parallel Jobs in Dedicated Clusters.
– Robust: Fault Tolerance on Server and Client.
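The scalability objective rests on data partition: the workload is cut into pieces that independent grid jobs can process in parallel. The slides do not say whether BiG partitions the query set, the database, or both. The sketch below assumes the query set, given as multi-record FASTA text, and its helper names and chunk size are hypothetical rather than taken from the BiG implementation.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) records."""
    records = []
    header, parts = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(parts)))
            header, parts = line[1:], []
        elif header is not None:
            parts.append(line)  # sequences may span several lines
    if header is not None:
        records.append((header, "".join(parts)))
    return records

def partition(records, chunk_size):
    """Split the records into consecutive chunks, one per hypothetical grid job."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]

fasta_text = """>seq1
AGTACGTAGTAG
>seq2
CTGCTGCTACGT
GGCTAGCTAG
>seq3
TACGTCAGACGT
"""
records = parse_fasta(fasta_text)
chunks = partition(records, chunk_size=2)
for job_id, chunk in enumerate(chunks):
    print(job_id, [name for name, _ in chunk])
```

Each chunk can be searched without reference to the others, so a failed job only requires resubmitting its own chunk, which also fits the fault-tolerance objective.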

Page 21

• Hosted by the Ibero-American Portal of Bioinformatics (http://portal-bio.ula.ve), installed at the National Centre for Scientific Computation of the Universidad de Los Andes in Venezuela.
– The Application is Available Through the Bioinformatics Portal of CeCalcULA, Being Accessible for Registered Users. http://www.cecalc.ula.ve/blast/
– This portal also provides several on-line applications for registered users. It currently has almost 600 registered users from 70 countries (although 90% come from 10 countries).

• The Service is Also Being Used by the Genomic Centre of the Valencian Institute of Research on Agriculture (Centro de Genómica, Instituto Valenciano de Investigaciones Agrarias).

• Executions
– 309 Runs Since June 2006.
– 3200 CPU Hours (133) Consumed.

Page 22

• Alignment with BLAST


Page 26

demo

msraalrlkipmpatmadfafpslrafsivvaldkqhgigdgesipwrvpedmaffkdqttllrnkkpptekkrnavvmgrktwesvpvkfrplkgrlnivlsskatveellaplpegkraaaaqdvvvvngglaealrllarppycssietaycvggaqvyadamlspcveklqevyltriyttapactrffpfppentttawdlassqgrrkseadglefeickyvprnheerqylel

Page 27

demo

Page 28

Vicente Hernández, Ignacio Blanquer
Universidad Politécnica de Valencia
Camino de Vera s/n
46022 Valencia, Spain
Tel: +34-963879743
Fax: +34-963877274
E-mail: [email protected]
[email protected]

Page 29: Climate Models

Conservation of energy, mass, momentum and water vapour; equation of state for gases.

360x180x32 x nvar

v = (u, v, w), T, p, ρ = 1/α and q
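Reading 360x180x32 as a one-degree longitude-latitude grid with 32 vertical levels (an interpretation, not stated on the slide), the size of a single model state can be estimated. The sketch assumes nvar = 7 (three wind components, temperature, pressure, density and humidity) and 8-byte floating-point values. Both figures are assumptions.

```python
NLON, NLAT, NLEV = 360, 180, 32   # grid dimensions from the slide
NVAR = 7                          # assumed: u, v, w, T, p, density, q
BYTES_PER_VALUE = 8               # assumed: double precision

grid_points = NLON * NLAT * NLEV
values_per_state = grid_points * NVAR
megabytes = values_per_state * BYTES_PER_VALUE / 1e6

print(f"{grid_points} grid points, {values_per_state} values, {megabytes:.1f} MB per state")
```

Under these assumptions a single time step holds about two million grid points and roughly 116 MB of data, so long simulations stored at many time steps quickly reach volumes that call for distributed storage and remote subsetting.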

Page 30: ESG Home

Page 31: Subsetting List

Page 32: Grid + OpenDAP

[Diagram: three data-access configurations, compared in terms of transparency and performance.
– Typical Application: Application; netCDF lib; Data (local).
– Distributed Application, OpenDAP via http: Application; OpenDAP Client; OpenDAP Server; Data (remote).
– OpenDAP via Grid (ESG, Grid + DODS): Application; ESG client; ESG Server; Big Data (remote). Adds security, resource management and analysis functions.]
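OpenDAP access over http is transparent to the application because a subset request is just a URL. The client appends a constraint expression naming the variable and an index range per dimension, and the server returns only that slice. The sketch below builds such a URL with the DAP2 hyperslab syntax [start:stride:stop]. The server address, file name, variable name and dimension order are hypothetical.

```python
def dap_subset_url(base_url, variable, ranges, suffix=".ascii"):
    """Build a DAP2 request URL for a hyperslab of one variable.

    ranges holds one (start, stride, stop) index triple per dimension;
    stop is inclusive in DAP2 constraint expressions.
    """
    slab = "".join(f"[{start}:{stride}:{stop}]" for start, stride, stop in ranges)
    return f"{base_url}{suffix}?{variable}{slab}"

# Hypothetical dataset and variable, for illustration only.
url = dap_subset_url(
    "http://example.org/opendap/model_output.nc",
    "T",
    [(0, 1, 0), (0, 1, 179), (0, 1, 359)],  # assumed order: time, latitude, longitude
)
print(url)
```

Only the requested slice crosses the network, which is what makes remote files usable through the same interface as local netCDF data.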