Earth Data Science Planning Meeting #2
March 7, 2013
Agenda
• Recap from 1st meeting
• Discuss near term action items
• Construct WGs
• Revisit key questions
Recap for Meeting #1
• Introductions
• Discussed the history of forming the group (e.g., 8X Retreat)
• Study Objectives
• Initial Plans
• Questions
Study Objective (1)
• Evaluation of the business case of targeting “data science” as a technology growth area in earth data systems research
• Identification of near-term science questions/challenges to address
• Identification of Data Science vs. Big Data synergies and differences
• Development of a capabilities roadmap
• Current state of JPL vs. competitors
• Required staffing needs and gaps
Study Objective (2)
• Key partnerships
• Necessary facilities support vs. current state
• Recommendations on how to structure a long-term program
• Identify opportunities to work with NASA ESD Program and propose
Today’s Discussion: Proposed Near-Term Actions
• Data Lifecycle
• Data Science White Paper
• IT for Climate Research Workshop
• Use Cases (Near-term, Long-term)
• Invited Speakers
Earth Science Data Lifecycle
For JPL Internal Use Only
[Diagram: Earth science data lifecycle. Labels in the diagram:]
• On Board Processing
• TDRS Network
• Data Acquisition and Command
• Mission Operations
• Instrument Operations
• EDOS/Ground Data Systems (GDS): L0A Processing
• Science Data Systems (SDS): Science Data Processing (L0B, L1, L2, L3, L4)
• EOSDIS Data Centers (DAAC): Science Data Management, Archive & Distribution
• Analysis, Modeling and Application Environments/Gateways
• Applications
• Decision Support
• Research
• Science Teams
• Outreach
Data Science White Paper
• Real example of a problem
• What are the problems we are trying to solve?
• Traceability matrix (link to use cases)
• Data Science Concepts
– Massive Data Analysis
– Long-tail Science Data Analysis
• Gap Analysis
– Existing JPL activities (including investments)
– Benchmarking (both our competitors and others)
• Opportunities
– Climate, Shifting Archives, Applications->Decision Support
• Recommendations
– Targets
• NASA Earth Science Technology Office (ESTO)
• Jack Kaye’s programs; Steve Volz’s programs (Maiden)
IT for Climate Research Workshop
• JPL held 1st IT for Climate Research Workshop in 2009 at SMC-IT in Pasadena, CA
– Focus on motivating the problem of model-to-data comparison
• JPL/GSFC held 2nd IT for Climate Research Workshop in 2010 at GSFC
– Focused on integration of ESG with satellite data
• Proposed 3rd workshop
– Focus on data science aspects for climate-model comparison
Use Case Planning
• Identify key use cases that focus on data science challenges
• Identify a use case leader
• Capture the use case in a template
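The three planning steps above could be captured in a small record type. This is only a sketch: the slides do not define the template, so the class name `UseCase`, its fields, the `"TBD"` placeholder, and the `is_ready` check are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UseCase:
    """Hypothetical use case template; field names are assumptions."""
    name: str                      # e.g., the use case title from the slides
    leader: str = "TBD"            # use case leader, "TBD" until identified
    challenges: List[str] = field(default_factory=list)  # data science challenges

    def is_ready(self) -> bool:
        # Treat a use case as captured once it has a leader
        # and at least one data science challenge written down.
        return self.leader != "TBD" and len(self.challenges) > 0


# Example entry: the leader is not yet identified, so it is not ready.
climate = UseCase(
    name="Climate Model/Obs Comparison",
    challenges=["Analyze distributed data with distributed computation"],
)
print(climate.name, "ready:", climate.is_ready())
```

A record like this would also give the traceability matrix in the white paper something concrete to link to.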
Proposed Use Case #1: Climate Model/Obs Comparison
• Growing, distributed, massive record of observational and climate model output
– CMIP3: ~34 Terabytes
– CMIP5: ~3 Petabytes
– CMIP6: 350 PBs – 3 Exabytes (per D. Williams and 2011 Climate Knowledge Discovery Workshop)
• A new paradigm is required to shift focus from data access and independent data analysis to online analysis services for highly distributed, heterogeneous data to
– Fuse data together for long-term records
– Compute higher order data products on request
– Analyze distributed data (e.g., climate model output, satellite data, etc.) with distributed computation
– Establish a scalable computing infrastructure for missions and science projects
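To put the archive sizes quoted on this slide on a common scale, the growth factors between CMIP generations can be worked out as below. This is a rough sketch that assumes decimal units (1 PB = 1,000 TB, 1 EB = 1,000,000 TB) and treats the approximate slide figures as exact. The variable and function names are illustrative.

```python
# Archive sizes quoted on the slide, converted to terabytes.
# Assumption: decimal units (1 PB = 1,000 TB; 1 EB = 1,000,000 TB).
TB_PER_PB = 1_000
TB_PER_EB = 1_000_000

cmip3_tb = 34                    # CMIP3: ~34 Terabytes
cmip5_tb = 3 * TB_PER_PB         # CMIP5: ~3 Petabytes
cmip6_low_tb = 350 * TB_PER_PB   # CMIP6 lower estimate: 350 PB
cmip6_high_tb = 3 * TB_PER_EB    # CMIP6 upper estimate: 3 Exabytes


def growth_factor(old_tb, new_tb):
    """Return how many times larger new_tb is than old_tb."""
    return new_tb / old_tb


cmip3_to_cmip5 = growth_factor(cmip3_tb, cmip5_tb)
cmip5_to_cmip6_low = growth_factor(cmip5_tb, cmip6_low_tb)
cmip5_to_cmip6_high = growth_factor(cmip5_tb, cmip6_high_tb)

print(f"CMIP3 -> CMIP5: about {cmip3_to_cmip5:.0f}x")
print(f"CMIP5 -> CMIP6: about {cmip5_to_cmip6_low:.0f}x to {cmip5_to_cmip6_high:.0f}x")
```

Under these assumptions the archive grows by roughly two orders of magnitude from CMIP3 to CMIP5, and by another two to three from CMIP5 to the projected CMIP6 range, which is the scale argument behind moving analysis to the data.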
Proposed Use Case #2: CO2 Research
[Diagram: CO2 research data flow. Labels in the diagram:]
• L2 data products from Data Centers: AIRS, OCO-2, TES, GOSAT
• Bias Update, etc.
• Data Assim. (two stages), with CO2 f(x,y,z,t) and d[CO2]/dt at surface
• Primarily a GCM, e.g. GEOS-5
• A coupled chemistry-transport model with surface model, e.g. GEOS-Chem
• In Situ, Aircraft, TCCON: provide data records tied to WMO standard
• Existing measurements are integrated from disparate data centers
• Methods for generating long-term records are under development
• Initial data records from ACOS and AIRS have been captured
• Support for generating OCO-2 L3 products will be in place
Proposed Use Case #3: Extreme Weather Events
Can we use historical data to identify links to extreme weather events (e.g., hurricanes)?
Use Cases Beyond Earth
• We started at Earth, but do want to wear a bigger hat
• Astronomy
• Engineering
• Planetary
Working Groups
• Data Lifecycle
• Data Science White Paper
• Use Cases
– Climate
– CO2
– Extreme Weather
– Astronomy
– Planetary
– Engineering
Speakers?
• Begin to socialize efforts from colleagues
• Recommendation: Focus on those people analyzing massive scientific data sets in physical sciences
– Remote sensing would be ideal
Questions
• What is Big Data?
• What is the life cycle of the entire flow?
• What are the problems to solve?
– CO2 Sink/Source on ….
• What are the opportunities?
• What is the long term plan?
• What are some low-hanging fruit?
• What are our next steps?
• What benefits do we expect?