Christine Borgman keynote
DESCRIPTION
RDA Fourth Plenary Keynote - Prof. Christine L. Borgman, Professor and Presidential Chair in Information Studies at UCLA: "Data, Data, Everywhere, Nor Any Drop to Drink." Tuesday 23rd Sept 2014, Amsterdam, the Netherlands https://rd-alliance.org/plenary-meetings/fourth-plenary/plenary4-programme.html
TRANSCRIPT
Data, Data, Everywhere, Nor Any Drop to Drink
Keynote presentation, Research Data Alliance, Fourth Plenary Meeting, Amsterdam, September 2014
Christine L. Borgman, Professor and Presidential Chair in Information Studies, University of California, Los Angeles
Gustave Dore, Rime of the Ancient Mariner, Woodcut, 1798
Gustave Dore, Ancient Mariner Illustration, 1798
Day after day, day after day, We stuck, nor breath nor motion; As idle as a painted ship Upon a painted ocean. Water, water, every where, And all the boards did shrink; Water, water, every where, Nor any drop to drink.
Stanzas from The Rime of the Ancient Mariner Samuel Taylor Coleridge, 1798
Big Data, Little Data, No Data: Scholarship in the Networked World*
• Part I: Data and Scholarship – Ch 1: Provocations – Ch 2: What Are Data? – Ch 3: Data Scholarship – Ch 4: Data Diversity
• Part II: Case Studies in Data Scholarship – Ch 5: Data Scholarship in the Sciences – Ch 6: Data Scholarship in the Social Sciences – Ch 7: Data Scholarship in the Humanities
• Part III: Data Policy and Practice – Ch 8: Releasing, Sharing, and Reusing Data – Ch 9: Credit, Attribution, and Discovery – Ch 10: What to Keep and Why
*C. L. Borgman (2015, January). MIT Press
Neelie Kroes, VP European Commission:
• To collect, curate, preserve and make available ever-increasing amounts of scientific data, new types of infrastructures will be needed. The potential benefits are enormous but the same is true for the costs. We therefore need to lay the right foundations and the sooner we start the better.
Wood, J., Andersson, T., Bachem, A., Best, C., Genova, F., Lopez, D. R., … Hudson, R. L. (2010). Riding the wave: How Europe can gain from the rising tide of scientific data. Final report of the High Level Expert Group on Scientific Data. Retrieved from http://cordis.europa.eu/fp7/ict/e-infrastructure/docs/hlg-sdi-report.pdf
Precondition:
Researchers share data
Scholars’ perspectives on data sharing
• Rewards • Responsibility • Data • Incentives
Persistent URL: photography.si.edu/SearchImage.aspx?id=5799 Repository: Smithsonian Institution Archives
Rewards
• Publications • Publications • Publications • Publications • Publications • Publications • Grants • Awards and honors • Teaching • Service • Data
http://blog.starjreshtoday.com/Portals/170402/images/improve-credit-score1.jpg
Functions of Scholarly Publications
• Legitimization – Authority, quality – Priority, trustworthiness
• Dissemination – Awareness – Diffusion – Publicity
• Access, preservation, curation – Availability – Discovery – Retrieval – Persistence
Borgman, C.L. (2007). Scholarship in the Digital Age: Information, Infrastructure, and the Internet. MIT Press.
Scholars’ perspectives on data sharing
• Rewards • Responsibility • Data • Incentives
Persistent URL: photography.si.edu/SearchImage.aspx?id=5799 Repository: Smithsonian Institution Archives
Responsibility
Publications are arguments made by authors, and data are the evidence used to support the arguments.
C.L. Borgman (2015). Big Data, Little Data, No Data: Scholarship in the Networked World. MIT Press
Responsibility
• Publications – Independent units – Authorship is negotiated
• Data – Compound objects – Ownership is rarely clear – Attribution
• Long term responsibility: Investigators • Expertise for interpretation: Data collectors and analysts
hudsonalpha.org
Attribution of data
• Legal responsibility – Licensed data – Specific attribution required
• Scholarly credit: contributorship – “Author” of data – Contributor of data to this publication – Colleague who shared data – Software developer – Data collector – Instrument builder – Data curator – Data manager – Data scientist – Field site staff – Data calibration – Data analysis, visualization – Funding source – Data repository – Lab director – Principal investigator – University research office – Research subjects – Research workers, e.g., citizen science…
Scholars’ perspectives on data sharing
• Rewards • Responsibility • Data • Incentives
Persistent URL: photography.si.edu/SearchImage.aspx?id=5799 Repository: Smithsonian Institution Archives
http://www.census.gov/population/cen2000/map02.gif
What are data?
ncl.ucar.edu http://onlineqda.hud.ac.uk/Intro_QDA/Examples_of_Qualitative_Data.php
Marie Curie’s notebook aip.org
hudsonalpha.org
NASA Astronomy Picture of the Day
Center for Embedded Networked Sensing
• NSF Science & Tech Ctr, 2002-2012 • 5 universities, plus partners • 300 members • Computer science and engineering • Science application areas
Slide by Jason Fisher, UC-Merced, Center for Embedded Networked Sensing (CENS)
UCLA USC UCR CALTECH UCM CENTER FOR EMBEDDED NETWORKED SENSING
CENS data variation
Data categories shown in the figure: Sensor Collected Application Data; Sensor Collected Proprioceptive Data; Sensor Collected Performance Data; Hand Collected Application Data
Variables shown in the figure: Flow, Water depth, Ammonium, Ammonia, Phosphate, Water temp, pH, Temperature, Conductivity, Chlorophyll, GPS/location, Time, Sap flow, CO2, Humidity, Rainfall, Packets transmitted, Packets received, ORP, PAR, Motor speed, Rudder angle, Heading, Roll/pitch/yaw, Soil moisture, Nitrate, Calcium, Chloride, Water potential, Wind speed, Wind direction, Wind duration, Leaf wetness, Routing table, Neighbor table, Fault detection, Awake time, Organism presence, Organism concentration, Battery voltage, Mercury, Methylmercury, Nutrient concentration, Nutrient presence, LandSat images, Mosscam, CDOM, Bird calls
Borgman, et al. (2007). Drowning in data: Digital library architecture to support scientific use of embedded sensor networks. JCDL
Documenting Data for Interpretation
Engineering researcher: “Temperature is temperature.”
Biologist: “There are hundreds of ways to measure temperature. ‘The temperature is 98’ is low-value compared to, ‘the temperature of the surface, measured by the infrared thermopile, model number XYZ, is 98.’ That means it is measuring a proxy for a temperature, rather than being in contact with a probe, and it is measuring from a distance. The accuracy is plus or minus .05 of a degree. I [also] want to know that it was taken outside versus inside a controlled environment, how long it had been in place, and the last time it was calibrated, which might tell me whether it has drifted.”
CENS Robotics team
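The biologist's point can be illustrated with a minimal sketch of a measurement record that carries its own context. The class and field names below are hypothetical and are not a CENS schema. The example values come from the quotation, which states no unit, so the unit is left empty.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass(frozen=True)
class TemperatureReading:
    """Hypothetical record; field names are illustrative, not a CENS schema."""
    value: float
    unit: Optional[str] = None                    # the quotation gives "98" with no unit
    target: Optional[str] = None                  # what was measured, e.g. "surface"
    instrument: Optional[str] = None              # e.g. "infrared thermopile"
    model: Optional[str] = None                   # "XYZ" is the placeholder in the quotation
    contact: Optional[bool] = None                # False: a proxy measured from a distance
    accuracy: Optional[float] = None              # plus or minus, in the reading's unit
    setting: Optional[str] = None                 # "outside" or "controlled environment"
    days_deployed: Optional[int] = None           # how long the sensor had been in place
    days_since_calibration: Optional[int] = None  # hints at possible drift

    def missing_context(self) -> List[str]:
        """Names of the context fields that were left undocumented."""
        return [
            f.name
            for f in fields(self)
            if f.name != "value" and getattr(self, f.name) is None
        ]


# "Temperature is temperature": a bare number with no context at all.
bare = TemperatureReading(98.0)

# The reading as the biologist describes it, still lacking some context.
documented = TemperatureReading(
    98.0,
    target="surface",
    instrument="infrared thermopile",
    model="XYZ",
    contact=False,
    accuracy=0.05,
)

print(bare.missing_context())
print(documented.missing_context())
```

Listing what is missing, rather than rejecting incomplete records, is a design choice made for this sketch: it shows how much interpretation a later reuser would have to guess at.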
Center for Dark Energy Biosphere Investigations
Repository for seafloor cores. Photo: Peter Darch
International Ocean Discovery Program Iodp.tamu.org
• NSF Science & Tech Ctr, 2010-2020 • 20 universities, plus partners (35 institutions) • 90 scientists • Biological sciences • Physical sciences
Social science data
http://dss.princeton.edu/images/gss.gif
Data are representations of observations, objects, or other entities used as evidence of phenomena for the purposes of research or scholarship.
C.L. Borgman (2015). Big Data, Little Data, No Data: Scholarship in the Networked World. MIT Press
hudsonalpha.org
Scholars’ perspectives on data sharing
• Rewards • Responsibility • Data • Incentives
Persistent URL: photography.si.edu/SearchImage.aspx?id=5799 Repository: Smithsonian Institution Archives
Incentives
• Publications that report the research vs. • Data that are reusable by others
Image: Alyssa Goodman, Harvard Astronomy
Pepe, A., Mayernik, M. S., Borgman, C. L. & Van de Sompel, H. (2010). From Artifacts to Aggregations: Modeling Scientific Life Cycles on the Semantic Web. Journal of the American Society for Information Science and Technology, 61(3): 567–582.
Metadata
• Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource.* – descriptive – structural – administrative
*National Information Standards Organization 2004
photo by @kissane
Provenance
• Libraries: Origin or source • Museums: Chain of custody • Internet: Provenance is information about entities, activities, and people involved in producing a piece of data or thing, which can be used to form assessments about its quality, reliability or trustworthiness.*
*World Wide Web Consortium (W3C) Provenance working group
British Library, provenance record: Bestiary - caption: 'Owl mobbed by smaller birds'
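The W3C definition quoted above names three ingredients: entities, activities, and people. A minimal sketch, assuming a toy in-memory store, shows how recording those links lets a later user trace where a piece of data came from. The class, method, and example names are hypothetical. The relation names echo W3C PROV vocabulary, but this is not an implementation of the PROV data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ProvenanceGraph:
    """Toy provenance store (illustrative only, not the W3C PROV model)."""
    generated_by: Dict[str, str] = field(default_factory=dict)     # entity -> activity
    used: Dict[str, List[str]] = field(default_factory=dict)       # activity -> input entities
    associated_with: Dict[str, str] = field(default_factory=dict)  # activity -> person

    def record(self, activity: str, person: str, inputs: List[str], output: str) -> None:
        """Record that a person carried out an activity turning inputs into an output."""
        self.generated_by[output] = activity
        self.used[activity] = list(inputs)
        self.associated_with[activity] = person

    def lineage(self, entity: str) -> Set[str]:
        """All upstream entities from which the given entity was derived."""
        upstream: Set[str] = set()
        stack = [entity]
        while stack:
            activity = self.generated_by.get(stack.pop())
            if activity is None:
                continue  # a source entity with no recorded origin
            for source in self.used[activity]:
                if source not in upstream:
                    upstream.add(source)
                    stack.append(source)
        return upstream

    def people(self, entity: str) -> Set[str]:
        """Everyone associated with producing the entity or any of its inputs."""
        involved: Set[str] = set()
        for item in {entity} | self.lineage(entity):
            activity = self.generated_by.get(item)
            if activity is not None:
                involved.add(self.associated_with[activity])
        return involved


# Hypothetical two-step chain: raw readings are calibrated, then analyzed.
graph = ProvenanceGraph()
graph.record("calibration", "data collector", ["raw_readings"], "calibrated_readings")
graph.record("analysis", "analyst", ["calibrated_readings"], "figure_1")

print(sorted(graph.lineage("figure_1")))
print(sorted(graph.people("figure_1")))
```

Tracing both the upstream data and the people involved is what supports the assessments of quality, reliability, or trustworthiness that the definition mentions.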
Reuse across place and time
• Reuse by investigator • Reuse by collaborators • Reuse by colleagues • Reuse by unaffiliated others • Reuse at later times
– Months – Years – Decades – Centuries
http://chandra.harvard.edu/photo/2013/kepler/kepler_525.jpg
Gustave Dore, Ancient Mariner Illustration, 1798
Day after day, day after day, We stuck, nor breath nor motion; As idle as a painted ship Upon a painted ocean. Water, water, every where, And all the boards did shrink; Water, water, every where, Nor any drop to drink.
Stanzas from The Rime of the Ancient Mariner Samuel Taylor Coleridge, 1798
Emerging themes in data practices
• Scarcity or abundance of data • Centrality of data to research • Time frame of research • Heterogeneity of expertise • Maturity of standards • Community building
Borgman, C. L., et al. (2014). The Ups and Downs of Knowledge Infrastructures in Science: Implications for Data Management. IEEE/ACM Digital Libraries Conference, London
Economics of the Knowledge Commons
Exclusion / Subtractability (rivalry) | Low subtractability | High subtractability
Difficult exclusion | Public Goods: general knowledge, public domain data | Common-pool resources: libraries, data archives
Easy exclusion | Toll or Club Goods: subscription journals, subscription data | Private Goods: printed books, raw or competitive data
Adapted from C. Hess & E. Ostrom (Eds.), Understanding knowledge as a commons: From theory to practice. MIT Press.
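The two-by-two grid adapted from Hess and Ostrom can be read as a simple classification rule. The sketch below is only an illustration of that reading. The function name and boolean parameters are hypothetical, and the example resources are the ones listed on the slide.

```python
def classify_good(exclusion_easy: bool, subtractability_high: bool) -> str:
    """Map the two dimensions of the grid to the type of good in that cell."""
    if exclusion_easy:
        return "private good" if subtractability_high else "toll or club good"
    return "common-pool resource" if subtractability_high else "public good"


# Data examples taken from the slide, one per cell of the grid.
examples = {
    "public domain data": classify_good(exclusion_easy=False, subtractability_high=False),
    "data archives": classify_good(exclusion_easy=False, subtractability_high=True),
    "subscription data": classify_good(exclusion_easy=True, subtractability_high=False),
    "raw or competitive data": classify_good(exclusion_easy=True, subtractability_high=True),
}

for resource, kind in examples.items():
    print(f"{resource}: {kind}")
```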
http://environment.nationalgeographic.com/environment/habitats/water-pressure/
To share data, scholars need
• Fresh water – Tools – Services – Skills – Resources – Incentives
To share data, scholars need
• Life boats – Repositories – Governance models – Provenance models – Data stewardship workforce
Patent Model, Life Boat, 1841; Smithsonian American History Museum
Image: Alyssa Goodman, Harvard Astronomy
Knowledge Infrastructures
http://knowledgeinfrastructures.org
Acknowledgements
UCLA Data Practices team
• Peter Darch, Milena Golshan, Irene Pasquetto, Ashley Sands, Sharon Traweek
• Former members: Rebekah Cummings, David Fearon, Ariel Hernandez, Elaine Levia, Jaklyn Nunga, Matthew Mayernik, Alberto Pepe, Kalpana Shankar, Katie Shilton, Jillian Wallis, Laura Wynholds, Kan Zhang
• Research funding: National Science Foundation, Alfred P. Sloan Foundation, Microsoft Research
• University of Oxford: Balliol College, Oliver Smithies Fellowship, Oxford Internet Institute, Oxford eResearch Center, Bodleian Library