OpenNebulaConf 2016 - Provisioning Flexible and High Available Climate Data Services by Marco...

Marco Mancini, Ph.D. Senior Scientist - Advanced Scientific Computing Division CTO - Supercomputing Center [email protected] http://github.com/km4rcus @marcomancini72 https://www.linkedin.com/in/marco-mancini-0787551 Provisioning Flexible and High Available Climate Data Services

Upload: opennebula-project

Post on 07-Jan-2017


Category:

Technology



TRANSCRIPT


Provisioning Flexible and High Available Climate Data Services

About CMCC

•  CMCC is a non-profit research institution (a Foundation since 10 Dec. 2015)

•  Established in 2005, with the financial support of the Ministry of Education, University and Research (MIUR), the Ministry of the Environment and Protection of Land and Sea (MATT), the Ministry of Agricultural and Forestry Policies (MIPAF) and the Ministry of Finance (MEF)

•  CMCC’s mission is to investigate and model our climate system and its interactions with society and the environment, in order to guarantee reliable, rigorous, and timely scientific results that stimulate sustainable growth, protect the environment, and support science-driven adaptation and mitigation policies in a changing climate.

•  6 Consortium Members: National Institute of Geophysics and Volcanology (INGV); University of Salento; Italian Aerospace Research Center (CIRA S.c.p.a); Ca’ Foscari University of Venice; University of Tuscia; University of Sassari.

•  8 Research Divisions: ASC, CSP, ECIP, IAFES, ODA, OPA, RAAS, REMHI

•  1 Supercomputing Center with HPC and Storage facilities

The big challenge is to model this complex system

•  Several complex processes to be simulated

•  Several interacting processes

•  Great range of time scales to be analyzed

•  Great range of spatial scales to be considered

•  Need interdisciplinary sciences (physics, chemistry, biology, geology,…)

•  Inherently non-linear governing equations

•  Need sophisticated numerics

•  Need huge computational resources

•  …and large volumes of data can be produced

Warren M. Washington - NCAR

Scientific Grand Challenges Workshop Series: Challenges in Climate Change Science and the Role of Computing at the Extreme Scale

DOE Workshop (ASCR-BER) November 6-7, 2008

CLIMA: CMCC information LIfecycle Management plAtform

High Performance Computing

Analysis and Visualization

Sharing and Publication

Archiving and Retrieval

Objectives

•  Enforcing Data Policies

•  Optimizing Storage Cost

•  Improving Data High Availability

•  Robust Implementation of Operational Chains

•  Easing Search & Discovery, Data Sharing and Collaboration

Federation of Data Services

CLIMA Data Service

Ingestion

Operational Chains

Data Access

Portal Gateway

Search & Discovery

Data Management

iRODS is open-source data management software:

•  Virtualization

•  Data Discovery

•  Workflow Automation

•  Data Sharing

Solr is an open-source enterprise search server that provides faceted navigation, clustering, grouping, and other search features
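As a rough illustration of the faceted navigation mentioned above, the sketch below builds a Solr select URL with faceting enabled. The host, the collection name `clima`, and the field names `model` and `variable` are hypothetical and not taken from the slides; `q`, `rows`, `wt`, `facet`, `facet.field` and `facet.mincount` are standard Solr query parameters.

```python
from urllib.parse import urlencode


def build_facet_query(base_url, collection, text, facet_fields, rows=10):
    """Return a Solr /select URL that also requests facet counts.

    base_url, collection and the facet field names are placeholders,
    not values from the CLIMA deployment.
    """
    params = [
        ("q", text),
        ("rows", rows),
        ("wt", "json"),
        ("facet", "true"),
        ("facet.mincount", 1),  # hide facet values with zero matches
    ]
    # Solr expects one facet.field parameter per field to facet on.
    params += [("facet.field", field) for field in facet_fields]
    return f"{base_url.rstrip('/')}/{collection}/select?{urlencode(params)}"


url = build_facet_query(
    "http://localhost:8983/solr", "clima", "temperature", ["model", "variable"]
)
print(url)
```

The facet counts in the response are what a portal would typically use to render the "narrow by model / variable" side panel.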

THREDDS is a data access server that provides bulk file transfer, remote access, subsetting, and web map services
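To make the subsetting feature more concrete, here is a minimal sketch that composes a request in the style of the THREDDS NetCDF Subset Service. The server name, dataset path and variable name are hypothetical, and the `/thredds/ncss/` path prefix is an assumption: the exact prefix depends on the server version and configuration.

```python
from urllib.parse import urlencode


def build_subset_url(server, dataset_path, variable, bbox, time_range):
    """Return a subset-request URL for one variable.

    bbox is (north, south, east, west) in degrees; time_range is a pair
    of ISO 8601 strings. server and dataset_path are placeholders.
    """
    north, south, east, west = bbox
    start, end = time_range
    params = {
        "var": variable,
        "north": north,
        "south": south,
        "east": east,
        "west": west,
        "time_start": start,
        "time_end": end,
        "accept": "netcdf",  # ask for the subset as a NetCDF file
    }
    # Assumed path layout; check the catalog of the target server.
    return f"{server.rstrip('/')}/thredds/ncss/{dataset_path}?{urlencode(params)}"


subset_url = build_subset_url(
    "http://example.org",
    "clima/demo.nc",
    "tas",
    (47.5, 35.0, 19.0, 6.0),
    ("2016-01-01T00:00:00Z", "2016-01-31T00:00:00Z"),
)
print(subset_url)
```

Only the requested variable, bounding box and time window are transferred, which is the main benefit over a bulk file download.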

Servers

Disks

Networking

VLAN

ONEFLOW

Storage Service Compute & Networking Service

Physical Resources Storage Networking Virtualization Authentication

Multi-tier Infrastructure Orchestration (VMs)

Multi-tier Service Orchestration (Containers)

Multi-tier Application Provisioning - Scaling - Self Healing
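The scaling and self-healing behaviour named on this slide is provided by the orchestration layers themselves. Purely as a conceptual sketch, and not as the implementation used by OneFlow or Rancher, the following hypothetical `reconcile` function compares a desired instance count with observed health and returns the actions a controller would take.

```python
def reconcile(desired, health):
    """Return the actions needed to reach the desired number of instances.

    health maps an instance name to True (healthy) or False (unhealthy).
    The result is a list of (action, target) tuples; nothing is executed.
    """
    unhealthy = sorted(name for name, ok in health.items() if not ok)
    healthy = sorted(name for name, ok in health.items() if ok)

    # Self healing: unhealthy instances are always removed.
    actions = [("remove", name) for name in unhealthy]

    if len(healthy) < desired:
        # Scale out (or replace removed instances) up to the target.
        actions += [("start", None)] * (desired - len(healthy))
    else:
        # Scale in by stopping the surplus healthy instances.
        surplus = len(healthy) - desired
        actions += [("stop", name) for name in healthy[len(healthy) - surplus:]]
    return actions


print(reconcile(3, {"web-1": True, "web-2": False}))
print(reconcile(1, {"web-1": True, "web-2": True}))
```

Running the same function repeatedly against fresh health data gives a simple control loop: a failed instance is removed and replaced, and a change in the desired count results in start or stop actions.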

Portal Gateway, Operational Chains (Workflow Automation), Data Access, Search & Discovery, Data Management, Ingestion

CLIMA Rest Engine

CLIMA Backend

Create Data Service

ONEFLOW

CLIMA Backend

Create Environment

Create API Key

Create Registration Token

Create OneFlow Service Template

Instantiate OneFlow Service Template

Create S3 Bucket

Instantiate Rancher Stack

Create Container Volumes
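The "Create Data Service" steps listed above can be pictured as an ordered pipeline. The sketch below is an assumption about how such a sequence might be driven, not the CLIMA backend code: the step names are derived from the slide labels, the order follows the order in which the labels appear in the transcript (which may not match the real call sequence), and the handlers are stubs that perform no real calls.

```python
STEPS = [
    "create_environment",
    "create_api_key",
    "create_registration_token",
    "create_oneflow_service_template",
    "instantiate_oneflow_service_template",
    "create_s3_bucket",
    "instantiate_rancher_stack",
    "create_container_volumes",
]


def create_data_service(name, handlers):
    """Run every step in order, sharing results through a context dict.

    handlers maps a step name to a callable(context) -> value. If a step
    raises, the error names the steps that already completed, so a
    caller could undo them.
    """
    context = {"name": name}
    done = []
    for step in STEPS:
        try:
            context[step] = handlers[step](context)
        except Exception as exc:
            raise RuntimeError(f"{step} failed after {done}") from exc
        done.append(step)
    return context


# Stub handlers that only record what they would do.
stubs = {step: (lambda ctx, s=step: f"{s}:{ctx['name']}") for step in STEPS}
result = create_data_service("demo", stubs)
print(result["create_s3_bucket"])
```

Passing a shared context lets a later step, for example the Rancher stack instantiation, read values produced earlier, such as an API key or a registration token.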

Data Service OneFlow Template
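The slides do not show the template body, so the following is only a sketch of what a OneFlow service template for a data service could look like. The role names, cardinalities and VM template IDs are hypothetical; `name`, `deployment`, `roles`, `cardinality`, `vm_template`, `parents`, `min_vms` and `max_vms` are fields of the OneFlow service template format. The helper `deployment_order` is an illustration of how a "straight" deployment starts parent roles before their children, not OneFlow code.

```python
import json

# Hypothetical template: role names and vm_template IDs are placeholders.
template = {
    "name": "clima-data-service",
    "deployment": "straight",
    "roles": [
        {"name": "data_management", "cardinality": 1, "vm_template": 0},
        {"name": "search", "cardinality": 1, "vm_template": 1,
         "parents": ["data_management"]},
        {"name": "data_access", "cardinality": 2, "vm_template": 2,
         "parents": ["data_management"], "min_vms": 1, "max_vms": 4},
        {"name": "gateway", "cardinality": 1, "vm_template": 3,
         "parents": ["search", "data_access"]},
    ],
}


def deployment_order(tpl):
    """Order role names so that every role comes after its parents."""
    pending = {r["name"]: set(r.get("parents", [])) for r in tpl["roles"]}
    order = []
    while pending:
        ready = sorted(n for n, parents in pending.items() if parents <= set(order))
        if not ready:
            raise ValueError("cyclic or unknown parent role")
        order.extend(ready)
        for n in ready:
            del pending[n]
    return order


print(deployment_order(template))
print(json.dumps(template, indent=2))
```

The JSON printed at the end is the kind of document that would be registered as a service template and then instantiated.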

Data Service Rancher Stack

High Available Data Services

Amazon EC2 Amazon S3

ONEFLOW

VPN

VMs

ONEFLOW

Federation + File Replication

Cross Data Center Replication

Federation

Slave Zone

Master Zone

Current/Future Works

ONEFLOW

Thank you.