Usability Issues Facing 21st Century Data Archives

Post on 18-Jan-2016

Joey Mukherjee and David Winningham
joey@swri.org

Current Archiving Goal

[Diagram labels: Mission Team; Raw Data; Processed Data; Write Papers; Data Iteration; Quality Data; Archive; Future Scientists; Quality Data]

Current Archiving Reality

[Diagram labels: Mission Team; Raw Data; Processed Data; Write Papers; Data Iteration; Data Subsets; Permanent Archive; Future Scientists; Unchecked Data; Home Institution; Archive; Public Data]

New Goal

[Diagram labels: Mission Team; Raw Data; Processed Data; Write Papers; Data Iteration; Processed Data; Archive; Future Scientists; Processed Data]

Standardizing HOWTO

Make it easy
Make it useful
Make it extensible

Make it Easy

Reading / writing files must be super easy (i.e. cheap!)
– Either with tools or libraries
Tools can be command line or GUI
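
As a rough illustration of how cheap reading and writing could be with a small library, the sketch below round-trips a time series through JSON using only the Python standard library. The record layout (`name`, `units`, `times`, `values`) and the function names `write_series` and `read_series` are assumptions made for this example, not part of any existing format.

```python
import json
import os
import tempfile


def write_series(path, name, units, times, values):
    """Write one time series to a JSON file (hypothetical layout)."""
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    record = {"name": name, "units": units, "times": times, "values": values}
    with open(path, "w", encoding="utf-8") as f:
        json.dump(record, f, indent=2)


def read_series(path):
    """Read a file written by write_series and return it as a dict."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__":
    # Round-trip a two-sample series through a temporary file.
    path = os.path.join(tempfile.mkdtemp(), "example.json")
    write_series(path, "electron_flux", "counts/s",
                 ["2006-01-01T00:00:00Z", "2006-01-01T00:00:01Z"],
                 [12.0, 15.5])
    print(read_series(path)["values"])
```

One call to write and one call to read is the level of effort the slide is asking for; a command line or GUI tool could wrap the same two functions.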

Make it Useful

How do I look at it?
– Plots/Analysis
What else can I do with it?
– Read into IDL, Matlab, Excel, etc.
Must have immediate benefits
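
One inexpensive way to deliver an immediate benefit is an export that spreadsheet and analysis tools already understand. The sketch below turns a time series into CSV text, which Excel and most analysis packages can import. The function name `series_to_csv` and the two-column layout are assumptions made for this example.

```python
import csv
import io


def series_to_csv(times, values, value_label):
    """Return CSV text: a header row, then one 'time,value' row per sample."""
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["time", value_label])
    for t, v in zip(times, values):
        writer.writerow([t, v])
    return buf.getvalue()


if __name__ == "__main__":
    print(series_to_csv(["2006-01-01T00:00:00Z", "2006-01-01T00:00:01Z"],
                        [12.0, 15.5], "flux"))
```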

Make it Extensible

Must be possible for others to add value-added services
Must be able to hold varieties of data
Must agree to give up control over content

Case Studies: HTML

Easy to create!
Once done, look at it in a browser
Embrace / Extend

Case Studies: SPASE

Creation is slow and difficult
Once created, no real benefits yet
VxOs have embraced it, but no one has extended it yet

Case Studies: IDFS

Until recently, difficult to create, complex
Once in, easy to look at, use, archive, etc.
Somewhat extensible

Things right with IDFS

Efficient
Self documenting
Calibrations stored in text file
Science units derived instead of stored
Little to no reprocessing ever needed
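
The idea of deriving science units from stored raw values and a text calibration file can be sketched as follows. This is not the actual IDFS layout or API: the one-line-per-sensor text format, the polynomial calibration, and the names `parse_calibration` and `to_science_units` are all assumptions made for illustration.

```python
def parse_calibration(text):
    """Parse lines like 'temperature: -40.0 0.5' into {name: [c0, c1, ...]}."""
    tables = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, coeffs = line.split(":", 1)
        tables[name.strip()] = [float(c) for c in coeffs.split()]
    return tables


def to_science_units(raw_counts, coeffs):
    """Evaluate c0 + c1*x + c2*x**2 + ... for each raw telemetry value."""
    result = []
    for x in raw_counts:
        total = 0.0
        for c in reversed(coeffs):  # Horner's rule, highest order first
            total = total * x + c
        result.append(total)
    return result


# Hypothetical calibration text: a constant offset and a linear gain.
CAL_TEXT = """
# sensor: c0 c1
temperature: -40.0 0.5
"""

if __name__ == "__main__":
    cal = parse_calibration(CAL_TEXT)
    print(to_science_units([0, 80, 200], cal["temperature"]))  # [-40.0, 0.0, 60.0]
```

In a scheme like this, only the raw counts are stored, so a corrected calibration means editing the text file rather than reprocessing the archived data.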

Other IDFS Benefits

Can store most types of space physics data from raw telemetry to highly processed science units
Reversible from science units to raw telemetry
Usable by data processor, scientist, and data archiver

Things wrong with IDFS

Overly complex format and API
Not enough support in other tools - poor buy-in
Analysis routines merged with the file format - tried to do too much!

Implementation Plan

Develop a simple file format that can contain any and all types of time series space physics data
Develop tools that allow someone to create and inspect files in this format
Merge in the best parts of IDFS, CDF, netCDF, HDF, FITS, etc., without breaking the paradigm of simplicity
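
An inspection tool for such a format could be very small. The sketch below summarizes a time-series record in one line. The JSON record layout (`name`, `units`, `times`, `values`) and the function name `summarize` are assumptions made for this example, since the plan does not yet fix a format.

```python
import json


def summarize(record):
    """Return a one-line, human-readable summary of a time-series record."""
    values = record["values"]
    times = record["times"]
    if not values:
        return f"{record['name']}: empty"
    return (f"{record['name']} [{record['units']}]: {len(values)} samples, "
            f"{times[0]} to {times[-1]}, "
            f"min={min(values)}, max={max(values)}")


# Hypothetical record, as it might be read from a file in the new format.
EXAMPLE = json.loads("""
{"name": "b_field", "units": "nT",
 "times": ["2006-01-01T00:00:00Z", "2006-01-01T00:00:01Z",
           "2006-01-01T00:00:02Z"],
 "values": [4.0, 5.5, 3.25]}
""")

if __name__ == "__main__":
    print(summarize(EXAMPLE))
```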

Simple File Format

Format might already exist:
– HDF5
– XML
– JSON
– Other data models?

Making it useful

Get buy-in from visualization tools (SDDAS, DataShop, VisBard, IDL DLM, etc.)
Get buy-in from archive sites (PDS, PSA, NSSDC, etc.)
Seed money is essential

Advantages

Providers
Users
Management

Advantages: Providers

Instrument teams now have something to work toward
Can develop expertise

Advantages: Users

Quick ways to create plots or access data
Expertise again!

Advantages: Management

Homogeneous archives are infinitely easier to manage and maintain
Value-added services are a natural extension of quality archives

Conclusion

Why now? Because SPASE is gaining traction, this is the next logical step.
This will save money for everyone in the long run.
Everyone benefits from value-added services.
