From DARPA to Shakespeare: All the Data We Can Handle
DESCRIPTION
Big Data and Digital Humanities overview presented to the CUA LSC874 Digital Humanities class, February 2014.
TRANSCRIPT
![Page 1: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/1.jpg)
![Page 2: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/2.jpg)
From DARPA to Shakespeare and all the data we can handle
Big Data and Digital Humanities
February 2014
http://www.darpa.mil/newsevents/releases/2012/03/29.aspx
![Page 3: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/3.jpg)
1. Big Data
2. Libraries & Librarians
3. University Researchers & Beyond
4. Digital Humanities
![Page 4: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/4.jpg)
1. Big Data
![Page 5: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/5.jpg)
High-Performance Computing (HPC) Act of 1991 (Public Law 102-194), as amended by the Next Generation Internet Research Act of 1998 (Public Law 105-305) and the America COMPETES Act of 2007 (Public Law 110-69).
It’s the law!
These laws authorize Federal agencies to set goals, prioritize their investments, and coordinate their activities in networking and information technology research and development.
George O. Strawn, Networking and Information Technology Research and Development (NITRD) Program
From: Hot Topics in Big Data: What You Need to Know Now!
FEDLINK, NFAIS, CENDI; December 11, 2012
![Page 6: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/6.jpg)
![Page 7: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/7.jpg)
Big data...
• is a mystery
• is a child of the internet
Big Data has grown from...
• CPUs of information
• Disks of information
...to
• Networks of information
• Sensors everywhere
George O. Strawn, NITRD
![Page 8: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/8.jpg)
Urban computing also aims to deeply understand the nature and science behind phenomena occurring in urban spaces, using a variety of heterogeneous data sources such as traffic flows, human mobility, geographic and map data, environment, energy consumption, populations, and economics. Recently, real-world data reflecting city dynamics has become widely available, including users’ mobile phone signals, GPS traces of vehicles and people, ticketing data in public transportation systems, user-generated content (tweets, micro-blogs, check-ins, photos), data from transportation sensor networks (camera and loop sensors) and environment sensor networks (temperature and air quality), as well as data from the Internet of Things.
http://www.meetup.com/UrbanComputing/
Smart Cities
![Page 9: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/9.jpg)
Examples of big data:
• Electronic Health Records
• Text vs tables
• Textual analytics TEI
• Sentiment analysis - FB posts, Twitter
• Distributed data, distributed computing
• Atmospheric sensors, undersea sensors
• Hubble telescope
• Library ERM
![Page 10: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/10.jpg)
Big Data & Science...
• Analyzing output from simulations
• Analyzing instrument output - LHC, Curiosity
• Creating databases to support wide collaboration: Human Genome Project
• Creating knowledge bases from textual information: Semantic Medline
• Proteomics will be bigger than genomics
How do you move 100TB of information within a University or a research area?
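One way to get a feel for the 100TB question is a back-of-the-envelope transfer-time calculation. The sketch below is illustrative only: the link speeds and the `efficiency` factor are assumptions chosen for the example, not figures from the presentation, and real transfers are also limited by disks, protocols, and shared traffic.

```python
def transfer_time_hours(size_terabytes: float, link_gbps: float,
                        efficiency: float = 1.0) -> float:
    """Idealized hours needed to move a dataset over a network link.

    size_terabytes: dataset size in decimal terabytes (1 TB = 10**12 bytes).
    link_gbps: nominal link speed in gigabits per second.
    efficiency: fraction of the nominal speed actually achieved
                (an assumed fudge factor, 1.0 = perfect).
    """
    bits = size_terabytes * 1e12 * 8            # bytes -> bits
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical link speeds: 100 Mbit/s, 1 Gbit/s, 10 Gbit/s, 100 Gbit/s.
    for gbps in (0.1, 1, 10, 100):
        hours = transfer_time_hours(100, gbps)
        print(f"{gbps:>5} Gbit/s: {hours:8.1f} hours ({hours / 24:.1f} days)")
```

Under these assumptions, a perfectly used 1 Gbit/s link needs roughly 222 hours, a little over nine days, for 100TB, which is why moving data at this scale is a planning problem and not just a copy command.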
![Page 11: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/11.jpg)
http://www.ibmbigdatahub.com/infographic/four-vs-big-data
![Page 12: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/12.jpg)
4th Paradigm of Science
1. Experimental Science
2. Theoretical Science
3. Computational Science
4. Data Science - Big Data
![Page 13: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/13.jpg)
From bits to its...
Does the world consist of ... matter, energy and information?
• Newton - matter and motion
• Steam engine - thermodynamics, matter, energy
• Computer - science of information, matter, energy and information
Data intensive science is revolutionary science
Big Data is TOO BIG TO KNOW!
The dust hasn't settled; dust is swirling all around us; it is FUN dust!
George O. Strawn
![Page 14: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/14.jpg)
See presentation: Philosophy & Big Data: Big Data, the Individual, and Society by Melanie Swan January 24, 2013 http://www.slideshare.net/lablogga/philosophy-and-big-data-big-data-the-individual-and-society
![Page 15: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/15.jpg)
![Page 16: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/16.jpg)
2. Libraries & Librarians
3. University Researchers (YOU) & Beyond
![Page 17: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/17.jpg)
http://d2c2.lib.purdue.edu/publications
Purdue University: D. Scott Brandt and Jake Carlson
![Page 18: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/18.jpg)
Michael Furlough, Associate Dean for Research and Scholarly Communications, Penn State University Libraries
Libraries' roles and challenges:
• Libraries will have to operate on faith
• Libraries will need deep collaboration
![Page 19: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/19.jpg)
Librarians - new roles
• Instruction - Best Practices: Data Information Literacy
• Collaborate - DMP & more: Data Management Plans
• Preserving/curating research: DO
• Manage - RDS Services
• Keeping up!
![Page 20: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/20.jpg)
• Conversion & Interoperability
• Cultures of Practice
• Databases & Data Formats
• Data Curation & Reuse
• Data Management & Organization
• Data Processing & Analysis
• Data Quality & Documentation
• Discovery & Acquisition
• Ethics & Attribution
• Metadata & Data Description
• Preservation
• Visualization & Representation
See more at: Data Information Literacy Competencies http://wiki.lib.purdue.edu/display/ste/Materials+for+the+DIL+Symposium
Data is information
![Page 21: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/21.jpg)
![Page 22: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/22.jpg)
Build on successes
• MANTRA - Research Data Management Training http://datalib.edina.ac.uk/mantra/
• Data Management Course 2014 - University of Minnesota https://sites.google.com/a/umn.edu/data-management-workshop-series/
• Data Train http://archaeologydataservice.ac.uk/learning/DataTrain#section-DataTrain-AimsObjectives
![Page 23: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/23.jpg)
See Data Management Modules from the University of Minnesota, Lisa Johnston https://sites.google.com/a/umn.edu/data-management-workshop-series/module1
![Page 24: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/24.jpg)
http://www.oucs.ox.ac.uk/oxgarage/
![Page 25: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/25.jpg)
![Page 26: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/26.jpg)
What do researchers care about?
• Where can I put my stuff?
• What is a data management plan?
Data needs to be...
• available
• findable
• re-usable
• citable
![Page 27: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/27.jpg)
![Page 28: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/28.jpg)
![Page 29: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/29.jpg)
DO
![Page 30: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/30.jpg)
DataNet from NSF http://datafed.org/
Digital Preservation from the LoC http://www.digitalpreservation.gov/
HathiTrust Digital Library http://www.hathitrust.org/
Digital Preservation Network http://www.dpn.org/
![Page 31: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/31.jpg)
Title: State of Sustainability Practices among Minnesota Tourism Businesses, 2007-2013
Authors: Qian, Xinyi (Lisa); Schneider, Ingrid E.
![Page 32: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/32.jpg)
Title: Public-Use Data from the Obstetrics and Periodontal Therapy (OPT) Study, a randomized trial of periodontal therapy to prevent pre-term birth
Authors: Hodges, James S.; Michalowicz, Bryan S.
![Page 33: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/33.jpg)
Title: "Laundry Soap" from the Ojibwe Conversational Archives Project
Authors: Hermes, Mary; Tainter, Rose; Kingbird-Porter, Margaret
![Page 34: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/34.jpg)
https://www.lib.umn.edu/datamanagement/archiving
![Page 35: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/35.jpg)
https://www.lib.umn.edu/datamanagement/archiving
![Page 36: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/36.jpg)
![Page 37: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/37.jpg)
Research Data Services
• University of Minnesota https://www.lib.umn.edu/datamanagement/archiving
• George Mason University http://dataservices.gmu.edu/resources/data-management
• University of Maryland http://www.lib.umd.edu/data
![Page 38: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/38.jpg)
For all links please see: http://guides.lib.cua.edu/hoffman (BigData tab)
Keeping Research Data Safe http://www.beagrie.com/krds.php
![Page 39: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/39.jpg)
4. Digital Humanities WHY?
![Page 40: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/40.jpg)
4. Digital Humanities ...Using data to tell our story
![Page 41: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/41.jpg)
Data Visualization Catalog
http://blog.visual.ly/the-data-visualization-catalogue/
![Page 42: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/42.jpg)
Visualization
http://www.edwardtufte.com/tufte/posters
http://www.masswerk.at/minard/
http://vannevar.blogspot.com/2009/03/minard-napolean-russia-1812-best-chart.html
![Page 43: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/43.jpg)
http://research.google.com/bigpicture/music/?utm_content=buffer662d6&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer#
![Page 44: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/44.jpg)
http://www.ucl.ac.uk/infostudies/melissa-terras/DigitalHumanitiesInfographic.pdf
![Page 45: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/45.jpg)
http://www.folgerdigitaltexts.org/
![Page 46: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/46.jpg)
![Page 47: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/47.jpg)
Geography of the London Ballad Trade 1500-1700 http://ebba.english.ucsb.edu/balladprintersite/LBP_main.html
World War I Document Archive http://www.gwpda.org/
![Page 48: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/48.jpg)
![Page 49: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/49.jpg)
Examples and Tools for DH projects http://miriamposner.com/blog/how-did-they-make-that/#more-1571
ScrollKit https://www.scrollkit.com/
![Page 51: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/51.jpg)
![Page 52: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/52.jpg)
![Page 53: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/53.jpg)
Examples of TEI:
• American Memory (uses a TEI-conformant DTD) http://memory.loc.gov/ammem/index.html
• Early Canadiana Online http://www.canadiana.org/
• Victorian Women Writers Project http://www.indiana.edu/~letrs/vwwp/index.html
• Oxford Text Archive http://ota.ahds.ac.uk/
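To make the TEI examples above more concrete, here is a minimal sketch of reading TEI-encoded text with Python's standard library. The sample document, its speakers, and the `lines_per_speaker` helper are invented for illustration and are not taken from any of the listed collections; real TEI files from those projects are larger and may use different elements or an older, non-namespaced DTD.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI P5 documents.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# A tiny, made-up TEI document: two speeches with verse lines.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample play (illustrative)</title></titleStmt>
      <publicationStmt><p>Unpublished sample</p></publicationStmt>
      <sourceDesc><p>Invented for this sketch</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <sp who="#A"><speaker>A</speaker>
        <l>First line of verse.</l>
        <l>Second line of verse.</l>
      </sp>
      <sp who="#B"><speaker>B</speaker>
        <l>A reply in one line.</l>
      </sp>
    </body>
  </text>
</TEI>"""


def lines_per_speaker(tei_xml: str) -> dict:
    """Count verse lines (<l>) inside each speech (<sp>), keyed by speaker."""
    root = ET.fromstring(tei_xml)
    counts = {}
    for speech in root.iterfind(".//tei:sp", TEI_NS):
        speaker = speech.findtext("tei:speaker", default="(unknown)",
                                  namespaces=TEI_NS)
        verse_lines = speech.findall("tei:l", TEI_NS)
        counts[speaker] = counts.get(speaker, 0) + len(verse_lines)
    return counts


if __name__ == "__main__":
    print(lines_per_speaker(SAMPLE))
```

Because the markup records who speaks and where each line begins, a question such as "how much does each character say?" becomes a short query over the text instead of a manual count.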
![Page 54: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/54.jpg)
Metadata Standards (UNM) http://libguides.unm.edu/content.php?pid=137795&sid=2556043
Data Formats: types and formats of data
• HDF http://en.wikipedia.org/wiki/Hierarchical_Data_Format
• Common Data Format http://cdf.gsfc.nasa.gov/
• [Also use of protocol buffers]
![Page 55: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/55.jpg)
NEVER DONE
![Page 56: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/56.jpg)
• Data is information
• Libraries can be partners in providing value - access and analytics
• Deep Collaboration - Federal, University, Business, Researchers/Industry, Future of Research
• Data Policies
• Renaissance of Archivists
• Librarians as information consultants
• Librarians as researchers
![Page 57: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/57.jpg)
Images:
http://www.darpa.mil/newsevents/releases/2012/03/29.aspx
http://www.darpa.mil/uploadedImages/Content/NewsEvents/Releases/2012/cyber_c.jpg
http://www.ibm.com/smarterplanet/ie/en/smarter_cities/overview/index.html?re=CS1
http://upload.wikimedia.org/wikipedia/commons/4/4b/OSU_William_Oxley_Thompson_Memorial_Library_Stacks.JPG
http://www.lib.ua.edu/wiki/sura/index.php/Data_Life_Cycle_Models
![Page 58: From DARPA to Shakespeare: All the Data we Can Handle](https://reader034.vdocuments.us/reader034/viewer/2022051816/54630038b1af9f92238b5243/html5/thumbnails/58.jpg)
References
Boyle, D. E., Yates, D. C., & Yeatman, E. M. (2013). Urban sensor data streams: London 2013. IEEE Internet Computing, 17(6), 12-20. doi:10.1109/MIC.2013.85
Cummings, J. (2013). JADH 2013: ODDly pragmatic: Documenting encoding practices in digital humanities projects [Prezi]. Retrieved from http://prezi.com/af2auinap-ug/jadh-2013-oddly-pragmatic-documenting-encoding-practices-in-digital-humanities-projects/
DARPA. (2012, March 29). DARPA calls for advances in big data to help the warfighter. Retrieved from http://www.darpa.mil/newsevents/releases/2012/03/29.aspx
Domingo, A., Bellalta, B., Palacin, M., Oliver, M., & Almirall, E. (2013). Public open sensor data: Revolutionizing smart cities. IEEE Technology and Society Magazine, 32(4), 50-56. doi:10.1109/MTS.2013.2286421
Gladney, H. M. (2012). Long-term digital preservation: A digital humanities topic? Historical Social Research / Historische Sozialforschung, 37(3), 201-217.
IBM. (n.d.). IBM Smarter Cities - Overview - Ireland. Retrieved from http://www.ibm.com/smarterplanet/ie/en/smarter_cities/overview/index.html?re=CS1
Johnston, L. (2014). A workflow model for curating research data in the University of Minnesota Libraries: Report from the 2013 Data Curation Pilot. University of Minnesota Digital Conservancy.
Pepi, M. (2013). The postmodernity of big data. The New Inquiry. Retrieved from http://thenewinquiry.com/essays/the-postmodernity-of-big-data/
Van den Eynden, V., Corti, L., Woollard, M., Bishop, L., & Horton, L. (2011). Managing and sharing data: Best practice for researchers