Metadata Development in the Earth System Curator Spanning the Gap Between Models and Datasets Rocky Dunlap, Georgia Tech

Post on 14-Jan-2016

Metadata Development in the Earth System Curator

Spanning the Gap Between Models and Datasets
Rocky Dunlap, Georgia Tech

Motivation

Two primary products of the climate community: datasets and the models used to produce them

Models Datasets

Motivation

Many efforts in place to provide uniform access to datasets
Additionally, groups like ESMF are working to develop frameworks for component exchanges and interoperability
  e.g., couple two different ocean models with the same atmosphere

Motivation

However, there is currently a gap between models and datasets
Models and datasets are currently treated as distinct and separate entities
Earth System Curator’s claim: This gap is actually an artificial barrier that inhibits access to resources and results

What is the Earth System Curator?

The goal of the Earth System Curator project is to provide a transparent interface to climate models and their output data

Models Datasets

What do we need to make this happen?

Metadata

Metadata is data about data
What would it take to completely describe a particular climate model run?
  “completely” means you could reproduce the output bit for bit

Diagram: Model Metadata, Model Run, Output
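
To make the "bit for bit" question concrete, the sketch below shows one way a run description could be recorded and compared. It is only an illustration: the `RunMetadata` class, its field names, and all example values are assumptions made here, not part of the Curator schema, and a real description would need many more descriptors.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class RunMetadata:
    """Hypothetical record of what a reproducible run description might hold."""
    model_name: str
    code_version: str        # e.g. a source-control tag or revision
    compiler: str            # compiler name and version
    platform: str            # machine or operating system
    processor_count: int     # parallel layout can change bit-level results
    input_files: tuple = ()  # names or checksums of initial and boundary data
    parameters: tuple = ()   # (name, value) pairs from the run configuration

    def fingerprint(self) -> str:
        """Hash a canonical JSON form so identical descriptions compare equal."""
        canonical = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# All values below are made up for illustration.
run_a = RunMetadata(
    model_name="example_agcm",
    code_version="v1.0",
    compiler="example-fortran 9.1",
    platform="example-cluster",
    processor_count=64,
    input_files=("initial_state.nc",),
    parameters=(("timestep_seconds", 1800),),
)
run_a_copy = replace(run_a)                    # identical description
run_b = replace(run_a, processor_count=128)    # one descriptor changed
```

Two runs with the same fingerprint share the same recorded description; whether that is enough to reproduce the output bit for bit depends on whether every relevant descriptor has actually been captured.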

Convergence of Models and Data

ESC begins with a crucial insight: the descriptors used for comprehensively specifying a model configuration are also needed for a scientifically useful description of the model output data

This leads to the convergence of models and data
There is a need for a common metadata formalism to unify the treatment of models and data

(Simplified) System Overview

Diagram: Graphical User Interface, Query, Metadata, Models, Simulation Datasets
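
As a rough illustration of the overview above, the following sketch keeps models and datasets in one metadata catalog so that a single query can return both. The `MetadataRecord` class, the `query` helper, and every record shown are hypothetical; the slides do not specify the actual query interface or storage.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MetadataRecord:
    """Hypothetical catalog entry; the same descriptors serve models and datasets."""
    kind: str                                   # "model" or "dataset"
    name: str
    descriptors: Dict[str, str] = field(default_factory=dict)


def query(catalog: List[MetadataRecord], **criteria: str) -> List[MetadataRecord]:
    """Return records whose descriptors match every given key/value pair."""
    return [
        record
        for record in catalog
        if all(record.descriptors.get(key) == value for key, value in criteria.items())
    ]


# Made-up records for illustration only.
catalog = [
    MetadataRecord("model", "example_ocean_model", {"realm": "ocean"}),
    MetadataRecord(
        "dataset",
        "example_ocean_run_output",
        {"realm": "ocean", "produced_by": "example_ocean_model"},
    ),
    MetadataRecord("dataset", "example_atmos_run_output", {"realm": "atmosphere"}),
]

ocean_records = query(catalog, realm="ocean")
```

Here `ocean_records` contains both the model and the dataset it produced, which is the kind of unified view a shared metadata layer is meant to allow.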

Research Approach

Study metadata structures of existing projects in the climate community
Create a common ontology that aligns the metadata models, while also allowing for eventual inclusion of other metadata sources
  Extensibility is a priority!

The resulting metadata description will be the foundation of the Earth System Curator database

Current Efforts

Earth System Modeling Framework (ESMF)
  NSF, NCAR, DoD, NASA, NOAA, DoE, MIT, UCLA, University of Michigan

Earth System Grid (ESG)
  DoE SciDAC sponsored Labs: ANL, LBNL, LLNL, NCAR, ORNL

GFDL Curator database

Earth System Modeling Framework

Common modeling infrastructure for climate and weather models
Components have standard interfaces, which facilitates coupling
ESMF already contains a number of metadata-rich structures for describing climate models
  gridded component, coupler component, field, bundle, state, clock, grid

ESMF Coupled Model: GEOS-5

Diagram: hierarchy of GEOS-5 components (history, agcm, dynamics, physics, fvcore, gravity_wave_drag, chemistry, moist_processes, radiation, turbulence, infrared, solar, surface, lake, land_ice, data_ocean, land, vegetation, catchment) joined by couplers

Our goal is to extract the metadata needed to adequately describe hierarchical, coupled climate models
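
One way to picture metadata extraction from a hierarchical model is a tree walk that lists every component with its depth. The sketch below is an assumption-laden illustration: the `Component` class and `walk` function are invented here, and the nesting shown is a guess at a small part of the diagram, reusing a few component names from it, not the confirmed GEOS-5 structure.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Component:
    """Hypothetical node in a coupled-model hierarchy."""
    name: str
    kind: str = "gridded"                       # "gridded" or "coupler"
    children: List["Component"] = field(default_factory=list)


def walk(component: Component, depth: int = 0) -> Iterator[Tuple[int, str, str]]:
    """Yield (depth, name, kind) for the component and all of its descendants."""
    yield depth, component.name, component.kind
    for child in component.children:
        yield from walk(child, depth + 1)


# Assumed nesting, for illustration only.
agcm = Component("agcm", children=[
    Component("dynamics", children=[
        Component("fvcore"),
        Component("gravity_wave_drag"),
    ]),
    Component("coupler", kind="coupler"),
    Component("physics", children=[
        Component("radiation", children=[
            Component("infrared"),
            Component("solar"),
        ]),
        Component("turbulence"),
    ]),
])

rows = list(walk(agcm))
for depth, name, kind in rows:
    print("  " * depth + f"{name} ({kind})")
```

A flat list like `rows` could be stored in a database, while the depth and parent links preserve the hierarchy needed to reassemble the model description.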

Earth System Grid

Make output of high-resolution, long-duration climate simulations available to global-change impacts researchers
Enable analysis and knowledge development from earth system models
Increase productivity by linking users with needed data

Earth System Grid

GFDL Curator

An initial shot at a database that describes both climate models and data
Multiple compartments
  Models, Variables, Workflow, Post Processing, Data Portal

Conceptual Modeling

Why is this hard?

Disagreement about what terms mean
  What is a model? What is a component? What is a coupler? What is a code base?

Metadata must be as generic as possible while still being useful

Deliverables

Allow researchers to archive and query Earth system models, experiments, model components, and model output data
Perform technical compatibility checking
  How can we determine if two components will run together?
  What about scientific compatibility?
Prototype auto-assembly of components to facilitate model runs
  Involves automatic code generation of simple couplers
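
A technical compatibility check could, at its simplest, compare the fields one component requires against the fields another provides. The sketch below assumes that each component's metadata lists its imported and exported fields with units; `ComponentInterface`, `compatibility_problems`, and the example fields are all hypothetical, and the check says nothing about scientific compatibility.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ComponentInterface:
    """Hypothetical interface metadata: field name mapped to its units."""
    name: str
    imports: Dict[str, str] = field(default_factory=dict)   # fields required
    exports: Dict[str, str] = field(default_factory=dict)   # fields provided


def compatibility_problems(provider: ComponentInterface,
                           consumer: ComponentInterface) -> List[str]:
    """List reasons the consumer's imports are not met by the provider's exports."""
    problems = []
    for field_name, units in consumer.imports.items():
        if field_name not in provider.exports:
            problems.append(
                f"{consumer.name} needs '{field_name}', "
                f"which {provider.name} does not export"
            )
        elif provider.exports[field_name] != units:
            problems.append(
                f"'{field_name}' units differ: {provider.name} exports "
                f"{provider.exports[field_name]}, {consumer.name} expects {units}"
            )
    return problems


# Made-up components for illustration.
ocean = ComponentInterface(
    "example_ocean",
    exports={"sea_surface_temperature": "K"},
)
atmosphere = ComponentInterface(
    "example_atmosphere",
    imports={"sea_surface_temperature": "K", "sea_ice_fraction": "1"},
)

problems = compatibility_problems(ocean, atmosphere)
```

An empty list would mean every required field is available with matching units. A unit mismatch is the kind of case where a simple generated coupler might insert a conversion, which is one reason to report it separately from a missing field.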

Broader Impacts

Improve climate prediction for policy makers
Facilitate Model Intercomparison Projects (MIPs) by allowing fast setup and execution of experiments using different model components
Encourage Curator-like activity in other domains
Promote the use of Curator as a normative ontology

ESC Collaborators

NSF Funded
National Center for Atmospheric Research
NOAA Geophysical Fluid Dynamics Laboratory
MIT
Georgia Tech

Thanks!

Website: http://www.cc.gatech.edu/projects/curator/

Questions?