Post on 27-Dec-2015
IGARSS 2011, Vancouver, Canada 1
A PROPOSED EARTH SCIENCE COLLABORATORY
K-S Kuo (1,2), Chris Lynnes (1), Rahul Ramachandran (3)
(1) NASA Goddard Space Flight Center, USA
(2) Caelum Research Corporation, USA
(3) University of Alabama-Huntsville, USA
7/27/11
Why ESC?
Data Intensive Science
• Many forms and sources of data
  o In situ measurements
  o Remote sensing observations
  o Model simulations
• Large volumes of data
Effectiveness as a scientist
• Increasing proportion of effort in data management
• Threatening:
  o Reproducibility
  o Correctness
  o Productivity
What is an ESC?
Vision of a rich model development/simulation and data analysis environment that:
Provides access to various Earth Science models
Facilitates model and analysis software development
Provides access across a wide spectrum of Earth Science data
Provides a diverse set of science analysis services and tools
Supports the application of services and tools to data
Supports collaboration, i.e. sharing of data, tools and results
Supports discovery and publication of all science artifacts
Basically, a new and natural place for Earth scientists to conduct their work and
collaborate with others!
The Situation Today
Islands of data and services with selective connectivity
[Figure: Data Center A, Data Center B, Data Center C]
High-Level View
Cyberinfrastructure
Tool Library
Data Library
Laboratory Notebook
Workflow
Mediator
Data Centers
Tool Library
• Discovery
• Social
  o Sharing
  o Tagging
  o Discussion
• Configuration Management
  o Testing
  o Versioning
Packager
• autoconf
• RPM
• Web wrapper
PROVISIONED
• GrADS
• IDL
• MatLab
• ncl
• nco
• cdat
COMMUNITY
• Quality filter
• Coincidence
• Feature detection
• Event service
• Visualization
CONTRIBUTED
• [Tool 1]
• [Tool 2]
• [Tool 3]
• [Tool 4]
• [Tool 5]
• …
PERSONAL
• [Tool 1]
• [Tool 2]
• [Tool 3]
• [Tool 4]
• [Tool 5]
• …
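One way to picture the four tiers of the Tool Library is as a visibility rule on catalog entries. The sketch below is illustrative only: the class names, the `visible_to` rule (personal tools private, everything else shared), the tool "my regridder" and the users "alice" and "bob" are all assumptions and are not taken from the ESC proposal.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    PROVISIONED = "provisioned"
    COMMUNITY = "community"
    CONTRIBUTED = "contributed"
    PERSONAL = "personal"


@dataclass(frozen=True)
class ToolEntry:
    name: str
    tier: Tier
    owner: str = ""      # only meaningful for PERSONAL entries (assumption)
    version: str = "1"   # "Versioning" from the slide, kept as an opaque label


def visible_to(entry: ToolEntry, user: str) -> bool:
    """Assumed rule: personal tools are private, all other tiers are shared."""
    if entry.tier is Tier.PERSONAL:
        return entry.owner == user
    return True


def discover(library, user, text=""):
    """Return names of visible tools whose name contains `text` (case-insensitive)."""
    needle = text.lower()
    return [e.name for e in library
            if visible_to(e, user) and needle in e.name.lower()]


library = [
    ToolEntry("GrADS", Tier.PROVISIONED),
    ToolEntry("Quality filter", Tier.COMMUNITY),
    ToolEntry("my regridder", Tier.PERSONAL, owner="alice"),  # hypothetical tool
]
print(discover(library, "alice"))
print(discover(library, "bob"))
```

The same tier field could in principle be reused for the data, workflow and notebook libraries, which share the provisioned/community/personal layout on the slides.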
Data Library
• Cache
• Discovery
• Social
  o Sharing
  o Tagging
  o Discussion
• Configuration Management
  o Testing
  o Versioning
Packager
• data probe
• format check
• metadata wizard
PROVISIONED
• EOSDIS
COMMUNITY
• Field campaigns
• MEaSUREs
• ACCESS
• Validation
CONTRIBUTED
• [Dataset 1]
• [Dataset 2]
• [Dataset 3]
• [Dataset 4]
• [Dataset 5]
• …
PERSONAL
• [Dataset 1]
• [Dataset 2]
• [Dataset 3]
• …
Workflow Library
• Discovery
• Social
  o Sharing
  o Tagging
  o Discussion
• Configuration Management
  o Testing
  o Versioning
Packager
• Workflow editor
PROVISIONED
• Processing Algorithms
COMMUNITY
• GeoBrain
• SciFlo
• Data Mining
• Giovanni
CONTRIBUTED
• [Workflow 1]
• [Workflow 2]
• [Workflow 3]
• [Workflow 4]
• [Workflow 5]
• …
PERSONAL
• [Workflow 1]
• [Workflow 2]
• [Workflow 3]
• …
Laboratory Notebook
• Discovery
• Social
  o Sharing
  o Tagging
  o Discussion
• Configuration Management
  o Versioning
Packager
• Project Manager
• Experiment manager
• Notebook editor
PROVISIONED
• Tutorials
• User guides
• Example uses
• Educational packages
COMMUNITY
• Project results
• Publications
• Example cases
• Educational packages
PROJECT
• [Project 1]
• [Project 2]
• [Project 3]
• [Project 4]
• [Project 5]
• …
PERSONAL
• Notes
• Journals
• …
Mediator
• Mediates tool interaction with data
• OPeNDAP – a common data model (accessible by most tools)
• Custom modules reformat data for the rest of the tools
• Ontology matches tools with data, and vice versa.
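The mediation idea on this slide can be sketched as a small lookup: a tool either reads the common model directly, or a custom module reformats the data for it. This is a toy illustration, not the ESC design. The dictionaries stand in for the ontology, `legacy_plotter` and the `ascii_table` format are made up, and the entry saying GrADS reads OPeNDAP directly is an assumption made for the example.

```python
# Hypothetical tool descriptions: which data representations each tool accepts.
TOOL_INPUTS = {
    "GrADS": {"opendap"},
    "legacy_plotter": {"ascii_table"},  # made-up tool that cannot read OPeNDAP
}

# Hypothetical custom reformatting modules, as (from, to) pairs.
CONVERTERS = {("opendap", "ascii_table")}


def plan_access(tool: str, served_as: str = "opendap") -> list:
    """Return the chain of data representations needed for `tool`, or [] if none."""
    accepted = TOOL_INPUTS.get(tool, set())
    if served_as in accepted:
        return [served_as]              # tool reads the common model directly
    for target in sorted(accepted):
        if (served_as, target) in CONVERTERS:
            return [served_as, target]  # one custom module bridges the gap
    return []                           # no known way to feed this tool


print(plan_access("GrADS"))
print(plan_access("legacy_plotter"))
```

An empty list signals that the mediator knows no route, which is the case where a new custom module would have to be contributed.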
Cyberinfrastructure
Services used by all other components
Security
• authentication
• authorization
• code audit/padded cell
• integrity checking
Social
• tagging
• sharing
• discussions
• groups
Cloud
• elastic provisioned storage and computing
Discovery
• data, tools, workflows, experiments
• search by keyword, variable, time, author
Information Management
• provenance
• identifiers
• archive
Semantic Web
• data ontology
• tools ontology
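The Discovery service above names four search criteria (keyword, variable, time, author) over four kinds of artifact. A minimal in-memory sketch of such a search might look like the following. The `Artifact` fields, the matching rules and the sample catalog entries are assumptions made for illustration; the slides do not specify a schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Artifact:
    kind: str                      # "data", "tool", "workflow" or "experiment"
    title: str
    author: str
    variables: tuple = ()
    start: Optional[date] = None   # temporal coverage, if any
    end: Optional[date] = None


def search(catalog, keyword=None, variable=None, when=None, author=None):
    """Return artifacts matching every criterion that was supplied."""
    hits = []
    for a in catalog:
        if keyword and keyword.lower() not in a.title.lower():
            continue
        if variable and variable not in a.variables:
            continue
        if author and author != a.author:
            continue
        if when is not None:
            # Assumption: artifacts without a time range never match a time query.
            if a.start is None or a.end is None or not (a.start <= when <= a.end):
                continue
        hits.append(a)
    return hits


# Hypothetical catalog entries.
catalog = [
    Artifact("data", "Rain rate granules", "team-a", ("precipitation",),
             date(2010, 1, 1), date(2010, 12, 31)),
    Artifact("tool", "Rain rate quality filter", "team-b"),
]
for hit in search(catalog, keyword="rain"):
    print(hit.kind, hit.title)
```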
Key Advantages of ESC
• Tool availability will be a force multiplier
  o More tools will be usable with more datasets
  o More tools will be more available to more users
• Knowledge sharing evolves from text on paper to a rich mixture of data, tools, workflows and articles
  o A “wikihow” for Earth Science data analysis
  o Incorporating live data, services and workflows
• ESC maintains a record of the analysis process
  o Share, repeat, build upon analysis techniques
  o Transparency of the process is built in
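To make "a record of the analysis process" concrete, here is a rough sketch of how each analysis step could be logged so that it can be repeated later. It is only an illustration under stated assumptions: the in-memory `RECORD` list, the `recorded` decorator and the `threshold` step are hypothetical, and results are assumed to be JSON-serializable so they can be hashed.

```python
import functools
import hashlib
import json

RECORD = []  # in-memory stand-in for a notebook's provenance log (assumption)


def recorded(step):
    """Wrap an analysis step so that each call is appended to RECORD."""
    @functools.wraps(step)
    def wrapper(*args, **kwargs):
        result = step(*args, **kwargs)
        # Hash the result so a later re-run can be compared with the original.
        payload = json.dumps(result, sort_keys=True).encode()
        RECORD.append({
            "step": step.__name__,
            "args": list(args),
            "kwargs": kwargs,
            "result_sha256": hashlib.sha256(payload).hexdigest(),
        })
        return result
    return wrapper


@recorded
def threshold(values, minimum=0.0):
    """Hypothetical analysis step: keep values at or above `minimum`."""
    return [v for v in values if v >= minimum]


kept = threshold([0.2, 1.5, 3.0], minimum=1.0)

# Repeat the step from its record and compare the result hashes.
first = RECORD[0]
threshold(*first["args"], **first["kwargs"])
print(RECORD[0]["result_sha256"] == RECORD[1]["result_sha256"])
```

Replaying a step from its stored arguments and comparing hashes is one simple way to check that an analysis is repeatable.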
Prior Art
• Talkoot, myExperiment.org – workflow sharing, virtual notebooks
• Earth System Grid – provisioned tools, format standards/checkers
• NASA Earth Exchange (NEX)
• Land Information System – OPeNDAP as access infrastructure
• Earth System Modeling Framework – programmatic approach to integration
• Giovanni, LAS – community services/tools
• Canadian Space Science Data Portal (EOS, Feb. 22, 2011)
• Nebula – cloud provisioning
A Use Case
GPM Precipitation Retrieval Algorithm Development
• GPM Core Satellite: Dual-Frequency Precipitation Radar (JAXA) and GPM Microwave Imager (NASA)
• GPM Constellation: International partner satellites with mostly microwave radiometers
• Retrieval algorithms – 3 types
  o Radar-only
  o Radiometer-only
  o Radar-radiometer-combined
• Participants in algorithm development are distributed across Japan, NASA centers (GSFC, MSFC, JPL), NCAR, and universities (FSU, Uwisc, etc.)
A Use Case
GPM Algorithm Development – Current Situation
• Interdependence among 3 types of algorithms
• Communication/Coordination – Narrow bandwidth
  o Periodic workshop meetings and teleconferences
• Data access – Duplicative
  o Each location/group has a copy or subset of required data
• Sharing of data/tools – Individual, not concerted
  o through ftp/email
• Knowledge sharing – Delayed
A Use Case
GPM Algorithm Development – with ESC
[Figure: ESC Clients A and Z (each with Tools, Data, mySci Cat.); Cloud VM Images A and B (each with Tools, Data); Community Catalog; ESC]
A Use Case
GPM Algorithm Development – Multi-level Membership
[Figure: members A to M; group labels: GPM, Radar-Only, Radiometer-Only, Combined, Algorithm]
A Use Case
GPM Algorithm Development – in ESC
• Enhanced communication/coordination – wide bandwidth
• Efficient data access – less duplication
• Improved sharing – more pervasive
• Effective knowledge sharing – immediate
Why now?
Because we can do it (finally)!
• Advances in standards acceptance and implementation (OPeNDAP, autoconf)
• A consistent, loosely coupled architecture encapsulates complexity and maximizes flexibility
• Social networking has reached the mainstream
• Key lessons can be learned from prior efforts
The need is growing
• Interest in working with multiple datasets is growing
• Calls for transparency and reproducibility are growing
What’s New?
Macro View (forest-level)
• Systematic approach to making data available to services and vice versa
• Integration of all major analysis components
• Consistent view of all architectural components
• Cyberinfrastructure services for all architectural components
Micro View (tree-level): Nothing!