Applying Semantics in Dataset Summarization for Solar Data Ingest Pipelines
James Michaelis, Deborah L. McGuinness, Stephan Zednik, Patrick West, Peter Arthur Fox
Rensselaer Polytechnic Institute, 110 8th St., Troy, NY 12180, United States
Poster: MT15A-08
Glossary:
RPI – Rensselaer Polytechnic Institute
TWC – Tetherless World Constellation at Rensselaer Polytechnic Institute
VSTO – Virtual Solar Terrestrial Observatory
FITS – Flexible Image Transport System
Acknowledgments:
Sapan Shah and Naveen Sridhar from the Tetherless World Constellation at RPI
Joan Burkepile, Steve Tomczyk and Leonard Sitongia at the High Altitude Observatory
Sponsors:
National Science Foundation
Motivations and Challenges:
• Analysis of solar data is necessary for space weather modeling and forecasting, which have broad implications for terrestrial activity (e.g., communication grid reliability).
• Time series visualizations of solar activity, created by the High Altitude Observatory [1], enable the needed analyses.
• This work focuses on two challenges:
  • Only small sections of the data will typically contain content of interest to scientists.
  • Subsets of time-series data may correspond to an event of interest at a particular time (e.g., a solar event).
• Based on these challenges, one goal of this work was to enable scientists to retrieve data sets corresponding to desired data products, to facilitate further analysis.
Data Management Strategies:
• Provenance records for individual visualizations.
• Ontological classification of visualizations, using DQ and STOM.
• Encoding records in RDF Datacube [2] (proposed).
IN51D-1713
Next Steps
Get the poster at http://bit.ly/V8NkBo
Mauna Loa Solar Observatory (MLSO), Hawaii: raw image data captured.
National Center for Atmospheric Research (NCAR) Data Center, Boulder, CO: follow-up processing on raw data; publishes intensity visualizations.
Time-stamped Observation Logs, maintained by MLSO staff: comments on weather and instrument conditions.
Case Study: Coronal Multi-channel Polarimeter (CoMP):
Semantic Visualization Provenance Records:
Datacube Basics:
Properties attached to datasets/slices/observations:
Dimensions: Year, Metric
Attributes: GBU Metric
Measures: 146 (the value)
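As a rough illustration of the properties above, the following sketch lays out one observation as RDF Data Cube triples, written as plain Python tuples. The `qb:` terms (`Observation`, `dataSet`, `DimensionProperty`, `AttributeProperty`, `MeasureProperty`) come from the RDF Datacube Vocabulary [2]. The `EX` namespace, the property names, the year 2012, and the label "good" are hypothetical placeholders; only the value 146 is taken from the poster.

```python
QB = "http://purl.org/linked-data/cube#"  # RDF Data Cube Vocabulary namespace [2]
EX = "http://example.org/hao#"            # hypothetical namespace, for illustration only
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Component declarations: which property plays which role in the cube.
# All EX property names are assumptions, not names used by the authors.
components = [
    (EX + "year", RDF_TYPE, QB + "DimensionProperty"),
    (EX + "metric", RDF_TYPE, QB + "DimensionProperty"),
    (EX + "gbuMetric", RDF_TYPE, QB + "AttributeProperty"),
    (EX + "value", RDF_TYPE, QB + "MeasureProperty"),
]


def observation_triples(obs_id, dataset_id, year, metric, value):
    """Return (subject, predicate, object) triples for one qb:Observation."""
    obs = EX + obs_id
    return [
        (obs, RDF_TYPE, QB + "Observation"),
        (obs, QB + "dataSet", EX + dataset_id),
        (obs, EX + "year", year),      # dimension value
        (obs, EX + "metric", metric),  # dimension value
        (obs, EX + "value", value),    # measure value
    ]


# The year and metric label are placeholders; 146 is the value shown on the poster.
triples = observation_triples("obs1", "gbuSummary", 2012, "good", 146)
```

In a deployed system these triples would more likely be built with an RDF library and stored in a triple store; tuples are used here only to keep the sketch self-contained.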
Datacube Usage:
For HAO visualization records, Datacube can be used in two ways:
- Returning aggregations of statistics for images (e.g., GBU results).
- Returning sets of visualizations (data points) for further exploration, based on constraints (e.g., temporal range).
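The two usage modes can be sketched over an in-memory list of records. This is only an illustration under stated assumptions: the record layout, the GBU labels, the file names, and the helper names `gbu_counts` and `in_range` are all hypothetical, and a real deployment would presumably query the encoded Datacube rather than a Python list.

```python
from collections import Counter
from datetime import datetime

# Hypothetical visualization records: (timestamp, GBU label, image file name).
records = [
    (datetime(2012, 6, 1, 18, 0), "good", "a.png"),
    (datetime(2012, 6, 1, 19, 0), "bad", "b.png"),
    (datetime(2012, 6, 2, 18, 30), "good", "c.png"),
]


def gbu_counts(recs):
    """Aggregation: number of images per GBU label."""
    return Counter(label for _, label, _ in recs)


def in_range(recs, start, end):
    """Selection: records whose timestamp lies in the closed range [start, end]."""
    return [r for r in recs if start <= r[0] <= end]


counts = gbu_counts(records)
june_first = in_range(records, datetime(2012, 6, 1), datetime(2012, 6, 1, 23, 59))
```

Here `counts` corresponds to the first mode (an aggregate statistic) and `june_first` to the second (a set of data points returned for further exploration).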
Use Cases:
- Activity Log Usage: Return images corresponding to a specific solar event record.
- Provenance (utilized data product): Return the set of images that utilized a given flat field configuration file.
- Provenance (utilized process): Return the set of images generated by version 2.0 of the process “Extract Intensity”.
- Observer Log Usage: For a given observer log comment, return visualizations within 2 hours of the comment timestamp.
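The provenance and observer log use cases above amount to filters over visualization records. The sketch below shows one way they might look in Python; the `Visualization` fields, the file names, the timestamps, and the function names are assumptions made for illustration and do not reflect the authors' actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Visualization:
    name: str
    timestamp: datetime
    process: str          # process that generated the image
    process_version: str  # version of that process
    flat_field: str       # flat field configuration file that was used


def near_comment(vizs, comment_time, hours=2):
    """Observer log use case: visualizations within `hours` of a comment timestamp."""
    window = timedelta(hours=hours)
    return [v for v in vizs if abs(v.timestamp - comment_time) <= window]


def by_process(vizs, process, version):
    """Provenance use case: images generated by a given process version."""
    return [v for v in vizs if v.process == process and v.process_version == version]


def by_flat_field(vizs, flat_field):
    """Provenance use case: images that used a given flat field configuration file."""
    return [v for v in vizs if v.flat_field == flat_field]


# Hypothetical sample data.
vizs = [
    Visualization("v1.png", datetime(2012, 7, 4, 17, 0), "Extract Intensity", "2.0", "flat_a.cfg"),
    Visualization("v2.png", datetime(2012, 7, 4, 21, 0), "Extract Intensity", "1.0", "flat_a.cfg"),
    Visualization("v3.png", datetime(2012, 7, 5, 9, 0), "Extract Intensity", "2.0", "flat_b.cfg"),
]
comment_time = datetime(2012, 7, 4, 18, 30)
```

The window in `near_comment` is symmetric, so it covers visualizations up to 2 hours before or after the comment; a one-sided window would be a small change if only later images are wanted.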
Time-stamped Activity Logs, maintained by MLSO staff: comments on solar events (Coronal Mass Ejections, Active Regions).
Next Steps:
- Deployment of provenance record retrieval as part of the Virtual Solar Terrestrial Observatory.
- Semantic encoding of MLSO event logs, or data from Lockheed Martin's Heliophysics Events Knowledge Base [3].
- Expanded use of dimensions in the data cube, to include FITS header data.
References:
[1] Mauna Loa Solar Observatory (High Altitude Observatory Site): http://mlso.hao.ucar.edu/
[2] RDF Datacube Vocabulary: http://www.w3.org/TR/vocab-data-cube/
[3] Heliophysics Events Knowledge Base: http://www.lmsal.com/hek/index.html