A Multi-Decade Case: The Evolution of Data Products and Designated Audiences
NISO 2016
Karen S. Baker, Graduate School of Information Sciences, University of Illinois Urbana-Champaign
A Story About Data Product Development
The story traces the evolution of a set of data products, asking:
• How is knowledge mobilized?
• What are the data products?
• Who are the designated communities?
We present a three-decade data story:
• Karen Baker, Ruth Duerr, and Mark Parsons. Scientific Knowledge Mobilization: Co-evolution of Data Products and Designated Communities. International Journal of Digital Curation 10(2), 2015.
Note on coauthors: Ruth Duerr is now at the Ronin Institute for Independent Scholarship; Mark Parsons is now Secretary General of the Research Data Alliance (RDA).
Where the Story Takes Place: National Snow and Ice Data Center (NSIDC)
From Baker & Duerr, in press, Data & the Diversity of Repositories. In Curating Research Data: A Handbook of Current Practice.
Digital Data Products
A data product is data at a particular stage of processing that can be identified uniquely and described.
Kinds of data products:
• Initial recorded data
• Calibrated data
• Cleaned data
• Gridded/Interpolated data
• Interpreted data
• Derived data
• Transformed data
• Synthesized data
Note: Data product development is influenced by the intended use of the product.
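The definition on this slide can be made concrete with a small record type that pairs a unique identifier and a description with a processing stage. This is a minimal sketch for illustration only: the names `DataProduct`, `ProcessingStage`, `intended_use`, and `register`, and the example identifiers, are assumptions and do not come from NSIDC or from the talk.

```python
from dataclasses import dataclass
from enum import Enum


class ProcessingStage(Enum):
    """Kinds of data products as listed on the slide."""
    INITIAL_RECORDED = "initial recorded"
    CALIBRATED = "calibrated"
    CLEANED = "cleaned"
    GRIDDED_INTERPOLATED = "gridded/interpolated"
    INTERPRETED = "interpreted"
    DERIVED = "derived"
    TRANSFORMED = "transformed"
    SYNTHESIZED = "synthesized"


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical record: data at one stage, uniquely identified and described."""
    identifier: str
    description: str
    stage: ProcessingStage
    intended_use: str  # assumed field; the slide notes that use shapes development


def register(catalog: dict, product: DataProduct) -> None:
    """Add a product to the catalog, refusing duplicate identifiers."""
    if product.identifier in catalog:
        raise ValueError(f"identifier already in use: {product.identifier}")
    catalog[product.identifier] = product


# Made-up entries, used only to exercise the sketch.
catalog = {}
register(catalog, DataProduct("example-001", "Made-up gridded product",
                              ProcessingStage.GRIDDED_INTERPOLATED, "mapping"))
register(catalog, DataProduct("example-002", "Made-up derived index",
                              ProcessingStage.DERIVED, "public communication"))
```

Rejecting a repeated identifier is the one rule the sketch enforces, since unique identification is the part of the definition that a catalog can check mechanically.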
Discussion Points
• Data Product Description
  ▪ Collection of data products
  ▪ Data product teams
• Data Product Development
  ▪ Multi-level collection
  ▪ Multi-cycle trajectory
• Data Product Delivery
  ▪ Diverse audiences
  ▪ Multi-mode communication
Collection of Sea Ice Data Products
Redrawn circa 2010 from original work by Donna Scott, who manages the NSIDC Passive Microwave Product Team.
Legend:
• Source – brown box
• Preliminary – gold box
• Final – green hexagon
• Near real-time – blue oval
• Value added – red octagon
From the Sea Ice Data Products Collection
• NSIDC-0081 Near-Real-Time DMSP SSM/I Daily Polar Gridded Sea Ice Concentrations
• Remote Sensing Systems F17 Tbs (Wentz)
• NSIDC-0001 SSM/I Polar Gridded Tbs
• NSIDC-0051 Preliminary Sea Ice Concentrations from Nimbus-7 SMMR and DMSP SSM/I
• NSIDC-0051 Sea Ice Concentrations from Nimbus-7 SMMR and DMSP SSM/I
• G02135 Sea Ice Index
• Arctic Sea Ice News and Analysis
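A collection like this can be treated as a dependency graph, where each product is built from one or more upstream products. The sketch below shows the idea with Python's standard `graphlib` module. The edges are assumptions made for illustration: the figure that carries the real dependencies does not survive in this text, so the `DEPENDS_ON` mapping and the shortened product labels should be read as hypothetical.

```python
from graphlib import TopologicalSorter

# Shortened labels for products named on the slide.
RSS_TBS = "Remote Sensing Systems F17 Tbs"
GRIDDED_TBS = "SSM/I Polar Gridded Tbs"
NRT_CONC = "NSIDC-0081 Near-Real-Time Sea Ice Concentrations"
FINAL_CONC = "NSIDC-0051 Sea Ice Concentrations"
SEA_ICE_INDEX = "G02135 Sea Ice Index"
ASINA = "Arctic Sea Ice News and Analysis"

# Each key maps to the set of products it is built from.
# ASSUMPTION: these edges are illustrative, not taken from the slide's figure.
DEPENDS_ON = {
    RSS_TBS: set(),
    GRIDDED_TBS: {RSS_TBS},
    NRT_CONC: {RSS_TBS},
    FINAL_CONC: {GRIDDED_TBS},
    SEA_ICE_INDEX: {NRT_CONC, FINAL_CONC},
    ASINA: {SEA_ICE_INDEX},
}


def build_order(depends_on):
    """Return the products so that every product follows its inputs."""
    return list(TopologicalSorter(depends_on).static_order())


def upstream(product, depends_on):
    """Return every product that `product` depends on, directly or indirectly."""
    seen = set()
    stack = [product]
    while stack:
        for parent in depends_on.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


order = build_order(DEPENDS_ON)
inputs_to_index = upstream(SEA_ICE_INDEX, DEPENDS_ON)
```

Under these assumed edges, a change to a source product reaches every downstream product, which is one way to see why a collection needs to be curated as a whole rather than one product at a time.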
Data Product Teams
Roles - Skill Sets
• Data managers
• Programmers
• Technical writers
• Scientists
• Instrument engineers
• Science communicators
• Systems/Database managers
• User support specialists
Data Product Team Intermediaries
“This active human element of data management is not always recognized by funding agencies, nor is it explicit in the OAIS Reference Model …” – Parsons and Duerr, 2005
Parsons, M. A., & Duerr, R. (2005). Designating user communities for scientific data: challenges and solutions. Data Science Journal, 4, 31-38.
OAIS Reference Model
A Narrative Framework: Open Archival Information System
Functional model
Figure: OAIS functional model. Inside the OAIS Archive: Ingest, Archive, Data Mgmt, Administration, Preservation Planning, and Access. Outside the archive: Producer, Consumer, and Management. Labeled flows: SIP, AIP, DIP, and Descriptive Information.
CCSDS. (2012). Consultative Committee for Space Data Systems, Reference Model for an Open Archival Information System (OAIS). Washington DC: CCSDS 650.0-M-2, Magenta Book. Issue 2. June 2012.
OAIS Reference Model
Information Package Concepts
• Submission Information Package (SIP)
• Archival Information Package (AIP)
• Dissemination Information Package (DIP)
CCSDS. (2012). Consultative Committee for Space Data Systems, Reference Model for an Open Archival Information System (OAIS). Washington DC: CCSDS 650.0-M-2, Magenta Book. Issue 2. June 2012.
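The three package types can be read as stages in a flow: a producer submits a SIP, the archive turns it into the package it preserves, and a consumer receives a DIP. The sketch below illustrates that flow under stated assumptions. The preserved package is called `AIP` here, following the AIP label in the functional model; the field names, the `ingest` and `disseminate` functions, and the SHA-256 fixity check are all hypothetical choices and are not prescribed by the OAIS Reference Model.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SIP:
    """Submission Information Package: what a producer hands to the archive."""
    producer: str
    content: bytes
    description: str


@dataclass(frozen=True)
class AIP:
    """Package held by the archive: content plus description, provenance, fixity."""
    content: bytes
    description: str
    provenance: str
    sha256: str


@dataclass(frozen=True)
class DIP:
    """Dissemination Information Package: what a consumer receives."""
    content: bytes
    description: str
    sha256: str


def ingest(sip: SIP) -> AIP:
    """Build the preserved package, recording a checksum at ingest time."""
    digest = hashlib.sha256(sip.content).hexdigest()
    return AIP(sip.content, sip.description, f"submitted by {sip.producer}", digest)


def disseminate(aip: AIP) -> DIP:
    """Build a package for a consumer after re-checking the stored checksum."""
    if hashlib.sha256(aip.content).hexdigest() != aip.sha256:
        raise ValueError("fixity check failed: content differs from ingest")
    return DIP(aip.content, aip.description, aip.sha256)


# Made-up submission, used only to exercise the sketch.
example_sip = SIP("example producer", b"ice concentration grid", "Made-up daily grid")
example_dip = disseminate(ingest(example_sip))
```

Carrying the checksum through to the DIP is one simple way to supply evidence of authenticity alongside the data; a real archive would hold much richer preservation metadata.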
OAIS Reference Model
OAIS Archive Responsibilities
• Negotiate for and accept information
• Obtain sufficient control to ensure long-term preservation
• Designate one or more communities as the designated audience who should be able to understand what is provided
• Ensure that the information is independently understandable to them
• Follow documented procedures and policies for data preservation and access
• Make the information available with evidence supporting its authenticity
CCSDS. (2012). Consultative Committee for Space Data Systems, Reference Model for an Open Archival Information System (OAIS). Washington DC: CCSDS 650.0-M-2, Magenta Book. Issue 2. June 2012.
https://public.ccsds.org
The Data Landscape: In Development
Terms in the landscape: Data System, Information System, Data Repository, Data Archive, Dataset, Data set, Data Package, Metadata, web of repositories, Data, Data Element & Interconnections
Discussion Points
• Data Product Description
  ✓ Collection of data products
  ✓ Data product teams
• Data Product Development
  ▪ Multi-level collection
  ▪ Multi-cycle trajectory
• Data Product Delivery
  ▪ Diverse audiences
  ▪ Multi-mode communication
Sea Ice Data Products: Dependencies & Levels
Levels of Data Products
Continuing Development of Data Products
Data Products: Multi-cycle Trajectory
Figure 2. A simplified view of the continuing development of scientific data products. Each cycle is initiated by one or more events that create a new audience that leads to generation of a new data product in response to the needs of a recently identified designated user community.
Discussion Points
• Data Product Description
  ✓ Collection of data products
  ✓ Data product teams
• Data Product Development
  ✓ Multi-level collection
  ✓ Multi-cycle trajectory
• Data Product Delivery
  ▪ Diverse audiences
  ▪ Multi-mode communication
Who is the audience? What is their worldview?
To a remote sensing community, the world is:
• Large-scale earth coverage using well-defined platforms
• A series of images with gridded pixels that can be manipulated computationally
To ecologists, the world is:
• A set of observations/measurements captured as parameters such as temperature and population counts
• A system of interacting systems with dependencies among the parameters that vary continuously
To the public, the world is:
• The place within which their neighborhood resides
• A place where decision-making is increasing in complexity due to the interdependencies of natural systems and human systems
* following Mark Parsons, Ben Domenico, and Stefano Nativi
Greenland Ice Sheet Melt Data Products
Knowledge Mobilized via Data Product Generation
1. Data workforce and data work are changing
• Data product description
  ✓ Collection of data products
  ✓ Data product teams
2. Data products gain value curated as a continuing collection
• Data product development
  ✓ Multi-level collection
  ✓ Multi-cycle trajectory
3. Data product delivery takes many forms
• Data product delivery
  ✓ Diverse audiences
  ✓ Multi-mode communication
Developing the Workforce for Data
NRC (2015). Preparing the Workforce for Digital Curation. Committee on Future Career Opportunities and Educational Requirements for Digital Curation; Board on Research Data and Information; Policy and Global Affairs.
Developing Workforce for Data Work
Making the time to tell the story
… to multiple audiences
… in multiple formats
… with multiple intermediaries
Karen Baker [email protected]
Acknowledgement: Data Curation Education in Research Centers (DCERC) project, funded by the Institute of Museum and Library Services (RE-02-10-0004-10), co-led by Carole Palmer. Participants at the National Snow and Ice Data Center, including Donna Scott, who manages the NSIDC Passive Microwave Product Team.