The NOAA Environmental Software Infrastructure and Interoperability Group
Cecelia DeLuca, Robert Oehmke, Sylvia Murphy
SEE Presentation, September 27, 2011
Outline
• Overview
• Part 1: Modeling frameworks
  ◦ ESMF
  ◦ NUOPC
  ◦ Climate-Hydrological Coupling
  ◦ ESMF Regridding (Oehmke)
• Part 2: Modeling workflows and data services
  ◦ Earth System Curator
  ◦ TeraGrid Environmental Science Gateway
  ◦ CoG Portal
  ◦ Global Interoperability Program
  ◦ Curator demonstration (Murphy)
The Basics
• NESII focuses on development of software infrastructure for Earth system modeling
• Arrived at ESRL / CIRES on Nov 1, 2009
• Formerly the Earth System Modeling Infrastructure section at the National Center for Atmospheric Research
• Started with the Earth System Modeling Framework (ESMF) project – has grown to include numerous others
• Partners and customers are from research and operational centers, weather and climate, across U.S. agencies and international organizations
The Vision
• Develop interoperable modeling components that can connect in multiple ways
  ◦ Improve predictions and support research
• Build advanced utilities that many models can use
  ◦ Enable research, promote cost efficiency
• Enable models to be self-describing
  ◦ Increase understanding and defensibility of outputs
• Create workflows that automate the modeling process from beginning to end
  ◦ Improve productivity
• Build workspaces that encourage collaborative, distributed development of models and data analysis
  ◦ Leverage distributed expertise
Key Strategies
• Use and evolve standards (data, metadata, component interfaces, services) to maximize interoperability
• Community-driven development and community ownership
  ◦ Formal governance processes in which customers set priorities
  ◦ Frequent public design reviews and demonstrations
• Openness of project metrics, code, and information
  ◦ Public storage of project records wherever possible
• Commitment to a globally distributed and diverse development and customer base
  ◦ Leverage a broad base of code and expertise to build complex systems
  ◦ Distributed development and routine international collaboration
The Team

Person               Role                              Location
Cecelia DeLuca       Technical Manager                 ESRL
Sylvia Murphy        Operations & Project Manager      ESRL
Silverio Vasquez     Test and Integration Lead         ESRL
Gerhard Theurich     Senior Developer - architecture   CA
Bob Oehmke           Senior Developer - grids          ESRL
Luca Cinquini        Senior Developer - database       ESRL/CA
Peggy Li             Developer - performance           CA
Allyn Treshansky     Developer - metadata              CA
Walter Spector       Developer - porting               CA
Ryan O’Kuinghttons   Developer - everything            ESRL
Fei Liu              Developer - applications          New Jersey
Kathy Saint          Developer - web                   Florida
Earl Schwab          Developer - utilities             CO
NESII Collaborators and Customers
If it’s a large center engaged in Earth system modeling, chances are good NESII is working with someone there …
• NOAA: ESRL GSD/PSD, GFDL, NCDC, PMEL, NCEP Environmental Modeling Center
• NASA: JPL, Goddard Space Flight Center, GISS
• DOE: PCMDI, Argonne National Laboratory, ORNL, Sandia
• DoD: NRL Stennis and Monterey, Naval Oceanography, Army ERDC, Air Force Weather Agency
• DOI: USGS
• NCAR: Community Earth System Model, WRF, MPAS, HAO, Unidata
• University of Michigan, Purdue University, University of South Carolina, University of Colorado, Colorado State University, Georgia Institute of Technology
• GO-ESSP, CUAHSI, CSDMS, OpenMI, OGC, CCA, METAFOR
• Delft Hydraulics, British Atmospheric Data Center, CERFACS, IPSL, Univ. Reading, UK Met Office, DKRZ, MPI
• Many more …
Outline
• Overview
• Part 1: Modeling frameworks
  ◦ ESMF
  ◦ NUOPC
  ◦ Climate-Hydrological Coupling
  ◦ ESMF Regridding (Oehmke)
• Part 2: Modeling workflows and data services
  ◦ Earth System Curator
  ◦ TeraGrid Environmental Science Gateway
  ◦ CoG Portal
  ◦ Global Interoperability Program
  ◦ Curator demonstration (Murphy)
Earth System Modeling Framework
Started: 2002
Collaborators: Co-developed and used by NASA (GEOS-5 climate model), NOAA (NCEP weather models), the Navy (global and regional models), the Community Earth System Model, and others
Sponsors: NASA MAP, NOAA NWS and CPO, NSF SEIII, DoD HPCMP

• ESMF increases code reuse and interoperability in climate, weather, coastal, and other Earth system models
• ESMF is based on the idea of components: sections of code that are wrapped in standard calling interfaces
• Most U.S. weather and climate models now use the framework in some fashion (~100 components)
ESMF Architecture
• Each box in the diagram is a component
• Components can be arranged hierarchically, helping to organize the structure of complex models
• Different modeling groups may create different kinds or levels of components

[Diagram: architecture of the GEOS-5 atmospheric general circulation model, shown as a hierarchy of components]

[Diagram: the ESMF "sandwich" architecture. The ESMF superstructure (components layer: gridded components and coupler components) sits above user code (model layer), which rests on the ESMF infrastructure (fields and grids layer, low-level utilities) and external libraries such as MPI and NetCDF]
• ESMF provides a superstructure for assembling model components into applications
• ESMF provides an infrastructure that modelers use to
  ◦ generate and apply interpolation weights
  ◦ handle metadata, time management, data I/O and communications, and other functions
Standard Interfaces
• All ESMF components have the same three standard methods:
  ◦ Initialize
  ◦ Run
  ◦ Finalize
• Each standard method has the same simple interface:

  call ESMF_GridCompRun(myComp, importState, exportState, clock, …)

  where:
  ◦ myComp points to the component
  ◦ importState is a structure containing input fields
  ◦ exportState is a structure containing output fields
  ◦ clock contains timestepping information
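Below is a minimal, hedged sketch of a driver exercising these three standard methods on one component. It is not code from this presentation: the component module name AtmModel and its SetServices routine are hypothetical placeholders, and the argument keywords follow current public ESMF conventions (the Fortran module is named ESMF in recent releases, ESMF_Mod in older ones).

program esmf_driver_sketch
  use ESMF                                           ! ESMF_Mod in older releases
  use AtmModel, only: AtmSetServices => SetServices  ! hypothetical user component
  implicit none
  type(ESMF_GridComp)     :: atmComp
  type(ESMF_State)        :: importState, exportState
  type(ESMF_Clock)        :: clock
  type(ESMF_Time)         :: startTime, stopTime
  type(ESMF_TimeInterval) :: timeStep
  integer :: rc

  call ESMF_Initialize(rc=rc)

  ! Create the component and register its Initialize/Run/Finalize entry points
  atmComp = ESMF_GridCompCreate(name="atmosphere", rc=rc)
  call ESMF_GridCompSetServices(atmComp, userRoutine=AtmSetServices, rc=rc)

  ! Import/export states hold the fields exchanged with other components
  importState = ESMF_StateCreate(name="atm import", rc=rc)
  exportState = ESMF_StateCreate(name="atm export", rc=rc)

  ! The clock carries the timestepping information passed to every phase
  call ESMF_TimeSet(startTime, yy=2011, mm=9, dd=27, rc=rc)
  call ESMF_TimeSet(stopTime,  yy=2011, mm=9, dd=28, rc=rc)
  call ESMF_TimeIntervalSet(timeStep, h=6, rc=rc)
  clock = ESMF_ClockCreate(timeStep=timeStep, startTime=startTime, &
                           stopTime=stopTime, rc=rc)

  ! The three standard methods share the same simple interface
  call ESMF_GridCompInitialize(atmComp, importState=importState, &
         exportState=exportState, clock=clock, rc=rc)
  do while (.not. ESMF_ClockIsStopTime(clock, rc=rc))
    call ESMF_GridCompRun(atmComp, importState=importState, &
           exportState=exportState, clock=clock, rc=rc)
    call ESMF_ClockAdvance(clock, rc=rc)
  end do
  call ESMF_GridCompFinalize(atmComp, importState=importState, &
         exportState=exportState, clock=clock, rc=rc)

  call ESMF_Finalize(rc=rc)
end program esmf_driver_sketch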
Steps to Adopting ESMF
• Divide the application into components (without ESMF)
• Copy or reference component input and output data into ESMF data structures
• Register components with ESMF
• Set up ESMF couplers for data exchange
• Interfaces are wrappers and can often be set up in a non-intrusive way (see the sketch below)
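As a hedged illustration of the registration and wrapping steps above, the sketch below shows the component side: a SetServices routine that registers the three standard methods, with placeholders where the native model code would be referenced. The routine names and the my_atm_step call are hypothetical, not drawn from any model named in this talk.

module AtmModel
  use ESMF                      ! ESMF_Mod in older releases
  implicit none
  private
  public :: SetServices
contains

  ! Register the standard entry points with the framework
  subroutine SetServices(comp, rc)
    type(ESMF_GridComp)  :: comp
    integer, intent(out) :: rc
    call ESMF_GridCompSetEntryPoint(comp, ESMF_METHOD_INITIALIZE, userRoutine=AtmInit,  rc=rc)
    call ESMF_GridCompSetEntryPoint(comp, ESMF_METHOD_RUN,        userRoutine=AtmRun,   rc=rc)
    call ESMF_GridCompSetEntryPoint(comp, ESMF_METHOD_FINALIZE,   userRoutine=AtmFinal, rc=rc)
  end subroutine SetServices

  subroutine AtmInit(comp, importState, exportState, clock, rc)
    type(ESMF_GridComp)  :: comp
    type(ESMF_State)     :: importState, exportState
    type(ESMF_Clock)     :: clock
    integer, intent(out) :: rc
    rc = ESMF_SUCCESS
    ! The existing model allocates its data here; the wrapper references or
    ! copies that data into ESMF_Field objects placed in the export state.
  end subroutine AtmInit

  subroutine AtmRun(comp, importState, exportState, clock, rc)
    type(ESMF_GridComp)  :: comp
    type(ESMF_State)     :: importState, exportState
    type(ESMF_Clock)     :: clock
    integer, intent(out) :: rc
    rc = ESMF_SUCCESS
    ! call my_atm_step(...)     ! hypothetical: one timestep of the native model
  end subroutine AtmRun

  subroutine AtmFinal(comp, importState, exportState, clock, rc)
    type(ESMF_GridComp)  :: comp
    type(ESMF_State)     :: importState, exportState
    type(ESMF_Clock)     :: clock
    integer, intent(out) :: rc
    rc = ESMF_SUCCESS
    ! Native model cleanup goes here.
  end subroutine AtmFinal

end module AtmModel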
Component Overhead
• The diagram at right shows the overhead of an ESMF-wrapped CCSM4 component versus native code
• For this example, ESMF wrapping required no code changes to the scientific modules
• No significant performance overhead (< 3% is typical)
• Few code changes are needed for codes that are already modular

Test configuration:
• Platform: IBM Power 575 (bluefire) at NCAR
• Model: Community Climate System Model (CCSM)
• Versions: CCSM_4_0_0_beta42 and ESMF_5_0_0_beta_snapshot_01
• Resolution: 1.25 degree x 0.9 degree global grid with 17 vertical levels for both the atmosphere and land models (288x192x17); the data resolution for the ocean model is 320x384x60
Metadata Handling and Usage
• Complete provenance of codes and data is critical as Earth system models are scrutinized and employed for decision making
• Metadata is represented in ESMF by the Attribute class as name/value pairs (see the sketch below)
  ◦ Generate comprehensive simulation descriptions
  ◦ Automate aspects of model execution and coupling
  ◦ Archive metadata with output data and automatically link the two in a display (TeraGrid Environmental Science Gateway)
• Standard metadata is organized by Attribute packages
  ◦ Aggregate, store, and output in XML and other formats
• Attribute packages include the following conventions, which can be linked and nested:
  ◦ Climate and Forecast (CF)
  ◦ Select ISO standards
  ◦ METAFOR Common Information Model (CIM)
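A minimal sketch of working with the Attribute class is shown below. The field name and attribute values are illustrative only; the package-level calls (ESMF_AttributeAdd, ESMF_AttributeWrite) are mentioned in comments rather than shown because their argument conventions vary between ESMF versions.

subroutine annotate_field(temperature, rc)
  use ESMF
  implicit none
  type(ESMF_Field), intent(inout) :: temperature
  integer, intent(out)            :: rc

  ! Attach individual name/value pairs (CF-style names used as an example)
  call ESMF_AttributeSet(temperature, name="standard_name", &
         value="air_temperature", rc=rc)
  call ESMF_AttributeSet(temperature, name="units", value="K", rc=rc)

  ! Attribute packages (CF, ISO, METAFOR CIM) group such pairs and can be
  ! exported in XML; see ESMF_AttributeAdd and ESMF_AttributeWrite in the
  ! ESMF reference manual for the package-level interfaces.
end subroutine annotate_field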
ESMF as an Information Layer

[Diagram: native model data structures (modules, fields, grids, timekeeping) are referenced or copied into standard ESMF data structures (Component, Field, Grid, Clock), which carry standard metadata as Attributes following CF conventions, ISO standards, and the METAFOR Common Information Model]

Applications of the information layer:
• Parallel generation and application of interpolation weights (see the regridding sketch below)
• Run-time compliance checking of metadata and time behavior
• Fast parallel I/O
• Redistribution and other parallel communications
• Automated documentation of models and simulations (new)
• Ability to run components in workflows and as web services (new)
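As a hedged sketch of the first item above, the fragment below uses the two-step regridding pattern from the public ESMF Fortran API: weights are generated once and applied repeatedly. The fields srcField and dstField are assumed to have been created on their source and destination grids already, and the method flag spelling follows current ESMF releases.

subroutine regrid_sketch(srcField, dstField, rc)
  use ESMF
  implicit none
  type(ESMF_Field), intent(inout) :: srcField, dstField
  integer, intent(out)            :: rc
  type(ESMF_RouteHandle)          :: routeHandle

  ! Step 1: generate the interpolation weights in parallel (done once)
  call ESMF_FieldRegridStore(srcField=srcField, dstField=dstField, &
         regridmethod=ESMF_REGRIDMETHOD_BILINEAR, &
         routehandle=routeHandle, rc=rc)

  ! Step 2: apply the stored weights; typically repeated every coupling step
  call ESMF_FieldRegrid(srcField, dstField, routehandle=routeHandle, rc=rc)

  ! Release the precomputed weights when no longer needed
  call ESMF_FieldRegridRelease(routeHandle, rc=rc)
end subroutine regrid_sketch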
Portability and Testing
• ESMF is comprehensively tested and extremely portable
• Many tests and examples are bundled with the software
  ◦ More than 4600 unit tests
  ◦ An additional, automated test harness covering the many options related to grids and distributions
  ◦ Dozens of examples
  ◦ Dozens of system tests
  ◦ External demonstrations showing ESMF linked to applications
• Users can separately download use test cases, which have more realistic problem and data sizes
• Regression tests run nightly on 24+ platform/compiler combinations
ESMF Summary
• Widely used: ~4000 downloads, ~2300 members of the informational mailing list
• Stable calling interface, largely guaranteed to be backward compatible
• Fast parallel remapping for unstructured or logically rectangular grids, with many options
• Core methods are scalable to tens of thousands of processors
• Supports hybrid (threaded/distributed) programming for optimal performance on many computer architectures
• Multiple coupling and execution modes for flexibility
• Time management utility with many calendars, forward/reverse time operations, alarms, and other features (see the sketch below)
• Metadata utility that enables comprehensive metadata to be written out in standard formats
• Supports most current platform/compiler combinations, with an exhaustive test suite and documentation
• Couples Fortran or C-based model components
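As a hedged sketch of the time management utility mentioned above, the fragment below builds a Gregorian-calendar clock with a daily alarm. The dates are illustrative, and keyword spellings follow the current ESMF Fortran reference rather than anything specific to this presentation.

subroutine clock_sketch(rc)
  use ESMF
  implicit none
  integer, intent(out)    :: rc
  type(ESMF_Calendar)     :: gregorian
  type(ESMF_Time)         :: startTime, stopTime
  type(ESMF_TimeInterval) :: timeStep, alarmStep
  type(ESMF_Clock)        :: clock
  type(ESMF_Alarm)        :: dailyAlarm

  ! A clock with a Gregorian calendar, 6-hour timestep, one simulated month
  gregorian = ESMF_CalendarCreate(ESMF_CALKIND_GREGORIAN, rc=rc)
  call ESMF_TimeSet(startTime, yy=2011, mm=9,  dd=27, calendar=gregorian, rc=rc)
  call ESMF_TimeSet(stopTime,  yy=2011, mm=10, dd=27, calendar=gregorian, rc=rc)
  call ESMF_TimeIntervalSet(timeStep, h=6, rc=rc)
  clock = ESMF_ClockCreate(timeStep=timeStep, startTime=startTime, &
                           stopTime=stopTime, rc=rc)

  ! An alarm that rings once per simulated day
  call ESMF_TimeIntervalSet(alarmStep, d=1, rc=rc)
  dailyAlarm = ESMF_AlarmCreate(clock=clock, ringInterval=alarmStep, rc=rc)

  do while (.not. ESMF_ClockIsStopTime(clock, rc=rc))
    call ESMF_ClockAdvance(clock, rc=rc)
    if (ESMF_AlarmIsRinging(dailyAlarm, rc=rc)) then
      ! Daily work (e.g., writing output) would go here
      call ESMF_AlarmRingerOff(dailyAlarm, rc=rc)
    end if
  end do
end subroutine clock_sketch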
ESMF Key Investments
• Maintain and promote the framework with multi-agency partners
• Increase outreach and the number of people trained
• Pilots with other NOAA centers, especially GFDL and OHD, resulting in more code reuse
• Explore ESMF regridding for reuse in the data services community via a Python interface, avoiding massive redundant development
National Unified Operational Prediction Capability (NUOPC)
• ESMF allows for many levels of components, types of components, and types of connections
• In order to achieve greater interoperability, usage and content conventions and component templates are needed
• This collaboration is building a “NUOPC Layer” that constrains how ESMF is used, and introduces metadata and other content standards
• The initial pilot project (delivered June 2011) focuses on atmosphere-ocean coupling in NCEP NEMS and Navy NOGAPS and COAMPS codes
Started: 2010
Collaborators: Tri-agency (NOAA, Navy, Air Force) consortium of operational weather prediction centers, with participation from NOAA GFDL and NASA modelers
Sponsors: NOAA NWS and Navy
A Common Model Architecture
NUOPC partners have agreed on a subset of components whose interactions will be standardized
Operational Modeling Systems using (or planning to use) the ESMF/NUOPC Layer in some form
• Global Forecast System (GFS)
• Global Ensemble Forecast System (GEFS)
• North American Mesoscale Model (NMM)
• Finite Element Icosahedral Model (FIM)
• NOAA Environmental Modeling System (NEMS)
• Global Assimilation of Ionospheric Measurements (HAF-GAIM)
• Weather Research and Forecasting Model (WRF)
• Land Information System (LIS)

List courtesy of David McCarren, NOAA/Navy
Builds on the Battlespace Environments Institute’s (2005-2010) reliance on ESMF:
• Naval Operational Global Atmospheric Prediction System (NOGAPS)
• Coupled Ocean Atmosphere Mesoscale Prediction System (COAMPS)
• Navy Coastal Ocean Model (NCOM)
• Hybrid Coordinate Ocean Model (HYCOM)
• Wave Watch 3 (WW3)
• Community Ice Code (CICE)
• Ensemble Forecast System (EFS)
• Simulating Waves Near Shore (SWAN)
• Advanced Circulation Model (ADCIRC)
NUOPC Layer Features and Functions
Delivered with the June 2011 prototype:
• Establish an architecture in which major components are siblings. The initial design supports explicit coupling and concurrent or sequential execution of components.
• Allow inheritance from generic component templates.
• Couple components with pre-fabricated connectors.
• Standardize the number and function of initialize phases, creating a standard setup pattern.
• Constrain the data that may be sent between components with standardized field data structures and a field dictionary based on the Climate and Forecast standard.
• Implement a compliance checker to provide feedback during component development.
• Use compatibility checking to determine whether the required import fields for a component were supplied. Other run-time reporting alerts users to any issues with compliance.
NUOPC Key Investments
• Ensure the NUOPC Layer is adopted by target operational applications (So far … NOGAPS-HYCOM, NEMS … next COAMPS)
• Increase use of the NUOPC Layer within the broader ESMF customer base in order to increase research to operations potential
Curator Hydrology
• A new perspective on climate impacts modeling
• Instead of asking what we “put in” the climate model …
• … how do we create a linked network of models that multiple communities can use?

Started: 2009
Collaborators: University of South Carolina, CUAHSI, University of Michigan
Sponsors: NOAA GIP
Climate-Hydro Coupling
• Hydrological impact studies can be improved when forced with data from climate models [Zeng et al., 2003; Yong et al., 2009]
• Ideally the coupling would be two-way
• There are scale issues, but these are lessening as hydrological models increase the size of the region covered and climate models increase in resolution
• A technology gap exists:
  ◦ Many hydrological models run on personal computers
  ◦ Most climate models run on high-performance supercomputers
• Existing frameworks, ESMF (climate/weather) and OpenMI (hydrology), can connect these types of models
  ◦ ESMF and OpenMI components can be operated as web services that can be used to communicate across a distributed network
  ◦ Both ESMF and OpenMI are widely used
Design Goals for Climate Impacts
Goal: Modeling systems can be reconfigured easily to include different models or solve different problems
Strategy: Leverage model interface and data standards

Goal: Modeling systems are highly accessible and can be integrated into workflows that include analysis, visualization, and other processing of outputs
Strategy: Service-oriented architecture

Goal: Communities formed around local/regional modeling and climate are able to utilize the social and technical structures that have evolved in their domains
Strategy: Models retain their native codes, computing platforms, and data formats as much as possible
Prototype Climate-Hydro System
[Diagram: SWAT, its OpenMI wrapper, the driver, and data files run on a personal computer; the ESMF CAM component runs on a high-performance computer; a CAM OpenMI wrapper connects the two sides through ESMF Web Services]

• SWAT (hydrology model) runs on a PC
• CAM (atmospheric model) runs on an HPC
• Wrappers for both SWAT and CAM provide an OpenMI interface to each model
• The driver (OpenMI Configuration Editor) uses the OpenMI interface to timestep through the models via the wrappers
• Access to CAM across the network is provided by ESMF Web Services
• CAM output data is written to NetCDF files and streamed to the CAM wrapper via ESMF Web Services
• The prototype is being used to explore the feasibility of two-way coupling
From Saint, iEMSs 2010
Target Coupled System
• The target system is informed by an exploration of the parameter space for different strategies (estimated SWAT and CAM run times and transfer times)
• SWAT covering the southeast U.S. is coupled to CAM/CLM (the purple region in the figure)
• Restricting the finest SWAT resolution to watersheds of interest (Neuse and Savannah) makes calibration somewhat easier
• SWAT is forced by CAM fields (precipitation, temperature, wind speed, etc.); evapotranspiration (ET) from SWAT nudges values in CAM
Curator-Hydrology Key Investments
• Link to emerging NOAA efforts (e.g., IWRSS, OHD) to co-evolve and promote standards and an interoperable component pool
• Link to community efforts such as the DOE integrated Regional Earth System Model (iRESM)
Outline
• Overview
• Part 1: Modeling frameworks
  ◦ ESMF
  ◦ NUOPC
  ◦ Climate-Hydrological Coupling
  ◦ ESMF Regridding (Oehmke)
• Part 2: Modeling workflows and data services
  ◦ Earth System Curator
  ◦ TeraGrid Environmental Science Gateway
  ◦ CoG Portal
  ◦ Global Interoperability Program
  ◦ Curator demonstration (Murphy)
Earth System Curator
• Intergovernmental Panel on Climate Change (IPCC) assessments rely on data generated by Coupled Model Intercomparison Projects (CMIPs), in which different scenarios are tested across many models
• For the last IPCC assessment, there was little metadata available about the runs performed
• The Curator project collaborated on a comprehensive metadata schema for climate models, and implemented a metadata display in the Earth System Grid data distribution portal
Started: 2005
Collaborators: METAFOR, NCAR, DOE PCMDI, Earth System Grid Federation, NOAA GFDL, Georgia Institute of Technology
Sponsors: NSF SEIII, NASA MAP, NOAA GIP
ESG Metadata Display
Result: much more information about climate models used in assessments, in browsable, searchable form
This screen shot shows a real CMIP5 run as it appears in an ESGF portal
Curator Key Investments
• Add features to the Curator project to address science-driven requirements collected and prioritized during reviews: search on forcings, differencing, and a dynamic comparison table
• Engage key partners (PCMDI, British Atmospheric Data Center, NCAR) to evolve the Curator architecture for new data systems
TeraGrid Environmental Science Gateway
Started: 2008
Collaborators: NCAR CISL and CESM, Purdue University
Sponsors: NSF TeraGrid

• Creates an end-to-end, self-documenting workflow for running the Community Earth System Model (CESM)
• GUI configuration and submission of runs through the Purdue CESM portal
• ESMF is used within CESM to organize and output extensive model metadata
• Data and metadata are archived back to an Earth System Grid Federation Gateway, where they can be searched and browsed
• A working prototype currently exists
Workflow Architecture
[Diagram: workflow components spanning the CESM portal and the ESG gateway, including user requests, authentication/authorization (TG MyProxy, token manager, account DB), case creation/configuration/submission, job management and status tracking, debugging, post-processing, file transfer, iRODS scratch storage, CESM Web Services, and data/metadata publication via the ESG Data Publisher]
A paper describing this architecture recently won a best paper award at the TG11 conference:
Zhao, L., C. Song, C. Thompson, H. Zhang, M. Lakshminarayanan, C. DeLuca, S. Murphy, K. Saint, D. Middleton, N. Wilhelmi, E. Nienhouse, and M. Burek, "Developing an integrated end-to-end TeraGrid climate modeling environment," TeraGrid '11, Salt Lake City, Utah, July 18-21, 2011.
Environmental Science Gateway Key Investments
• Support the automated collection of provenance information for post-processing of climate and related model output (a critical gap!) to support efforts like the NCA
Earth System CoG
• Will facilitate collaborative model building, evaluation, and analysis
• Particularly well-suited for model intercomparison projects (MIPs)
• Project hosting and indexing with connections to data and analysis services through the Earth System Grid Federation
• Projects will have standard metadata and can be linked to form networks
• Pilot projects include a 2012 workshop at NCAR on comparison of atmospheric dynamical cores (a 2008 workshop was previously supported) and a 2013 downscaling techniques workshop coordinated by the National Climate Predictions and Projections Platform

Started: 2009
Collaborators: NCAR, Earth System Grid Federation, University of Michigan, CU Community Surface Dynamics Modeling System
Sponsors: NSF CDI
2008 Dynamical Core Colloquium on CoG
CoG prototype includes data search, bookmarking, wikis, and communications
CoG Key Investments
• Establish CoG as a center for collaborative modeling and analysis
• Engage the broader community (e.g., the British Atmospheric Data Center) in co-development of this capability in order to increase the resource base
National Climate Predictions and Projections Platform
Started: 2010
Collaborators: Mainly NOAA PSD and NCAR RAL; many others participating
Sponsors: NOAA CPO
• Developing a portal and associated services to deliver information about the local and regional effects of climate change
• Still in formative stages
• Defining pilot projects in collaboration with USGS and others
• Thinking about implementation strategies
• Investigating prior art and related projects
• NESII is serving as a technical coordinator
NCPP Key Investments
• Unify access to climate related holdings within DOE Earth System Grid, USGS GeoData Portal, NOAA NOMADS and other sources through standard data formats and tools (CF conventions, OGC standards, THREDDS, NWAVE)
• Promote standards for user services (e.g. climate indices generation) that can be applied to these unified data holdings (REST, OGC WPS)
NOAA Global Interoperability Program
• GIP builds software infrastructure that
  ◦ can be used in the weather, water, and climate disciplines, and for training modelers
  ◦ integrates and automates workflows
• NESII lead DeLuca coordinates the project

Started: 2009
Collaborators: NOAA GFDL, PMEL, GSD, and NCDC; Unidata, NCAR, CSU, University of Michigan
Sponsors: NOAA CPO
Building Along Workflows

[Matrix slide: GIP projects arranged by workflow stage (climate simulations, application of climate information, weather and water forecasting, training modelers) and by infrastructure type (model utilities and coupling, metadata standards, data services and workflows)]

Example projects:
• Standardized analysis workflows for climate models
• Metadata display for CMIP5
• ESMF in CESM
• The NOAA Climate Projection Pilot
• A common model architecture for operational weather centers (NUOPC)
• NOAA Environmental Modeling System (NEMS)
• Geodesic grids in NEMS
• Summer School in Atmospheric Modeling
• The Art of Climate Modeling course
Building Across Disciplines

[Matrix slide: GIP projects that cut across climate simulations, application of climate information, weather and water forecasting, and training modelers]

• Model Utilities and Coupling: ESMF core support; hydrological-climate coupling with the ESMF and OpenMI modeling frameworks
• Metadata Standards: Gridspec integration into the Unidata LibCF library
• Data Services and Workflows: merger of Ferret and CDAT analysis services

Result: better coordination of infrastructure development across disparate groups
GIP Key Investments
• Align GIP with NOAA strategic plans and objectives and “institutionalize” the program
• Formally associate GIP with global infrastructure coordination organizations, especially the Global Organization of Earth System Science Portals
The Vision
• Develop interoperable modeling components that can connect in multiple ways
  ◦ Improve predictions and support research
• Build advanced utilities that many models can use
  ◦ Enable research, promote cost efficiency
• Enable models to be self-describing
  ◦ Increase understanding and defensibility of outputs
• Create workflows that automate the modeling process from beginning to end
  ◦ Improve productivity
• Build workspaces that encourage collaborative, distributed development of models and data analysis
  ◦ Leverage distributed expertise
Questions?
• For more information, links and references, see our newish group page: http://esrl.noaa.gov/nesii/