notes and letters of support for crowdsourcing ground truth - factminers, prima, & emop knight...


Upload: jim-salmons

Post on 10-Feb-2017


Knight Prototype Fund Submission: Nov. 2015

Title: “Turn Text Soup into Smart Data in Newspaper and Magazine Archives: Step 1. Crowdsourcing Ground-Truth”

Who We Are

Jim Salmons and Timlynn Babitsky are Iowa-based Citizen Scientists who will serve as Project Coordinators (Principal Investigators) on the proposed prototyping project. We are submitting on behalf of the to-be “non-profit startup” (the most appropriate submission form category) FactMiners.org. This is the legal entity to be created in support of the FactMiners.org Open Source developer community. The custom version of the Zooniverse crowdsourcing platform that will be created by this project, for example, will be among the Open Source and Open Data repositories created and maintained by the FactMiners.org community.

The Hackathon Project Method

An Open Source “hackathon” – where a highly motivated, multi-skilled group of creative folks get together to design and write software over two to three days of intensive small-group coding sessions – is an ideal format for producing the custom version of Zooniverse, which will add Ground-Truth development features to this acclaimed Citizen Science Open Source platform. The project is designed around two events at two locations (U. Salford, UK and Texas A&M, USA), and these “anchor events” will produce follow-through collaborative activity on-line.

In addition, we will ensure that both events are fully streamed for remote participation by interested individuals or groups. A special effort will be made to ensure full-event remote participation by students, faculty, and researchers at the U. of Salford and Texas A&M. A secondary goal of our project is to stimulate the development of “cross-pond” collaborations among Salford & Texas A&M Digital Humanities students and researchers.

The full $35,000 USD budget will be used for travel and lodging bursaries and other expenses directly related to the production of these events. These bursaries will be used to get the Core Team members and select VIP Designer/Developer invitees to the two in-person working events. No project staff salaries or required award-servicing project overhead expenses will be incurred.

Risk Reduction/Mitigation

The Open Source Hackathon project model is itself a risk-mitigating feature of this submission. While the ideal format for a hackathon is face-to-face in a shared physical location, this is not an absolute requirement. The exact mix of who goes where, and when, for the physical events remains a planning task.

We can say, however, that every effort will be made to work with the event-hosting universities to significantly reduce the direct costs of holding the traditional “lean and simple” hackathon events on campus. Through the additional use of AirBnB, Uber, etc. we can further reduce event expenses. And, if required, we can always supplement our project funding with individual or project-based crowdfunding once the project is visible and promoted.

Appendix. Letters of Support

We are very pleased to provide the following letters of support from our research partners.

Initiative for Digital Humanities, Media, and Culture

Serving the Colleges of Architecture, Education, Engineering, and Liberal Arts

446 Liberal Arts and Arts & Humanities Building
4227 TAMU
College Station, TX 77843-4227
Email: [email protected]

Tel. 979.458.1552 Fax 979.862.4334
http://idhmc.tamu.edu

Laura [email protected]: (979) 845-8345
C: (513) 560-7860

November 16, 2015

To Whom It May Concern:

I write to offer support for the Knight Prototype Fund Entry called "Crowd-Sourcing Ground-Truth," and to confirm our desire to participate in the project. We would like to host a Hackathon and Demo at our Center for 2 to 3 days, broadcasting the events as a Webinar, and projecting the Demo on our big screen at the Humanities Visualization Space. Texas A & M is home to the Mellon-funded Early Modern OCR Project (http://emop.tamu.edu), and our need for crowd-sourced correction of Ground Truth documents is paramount. We hope through our efforts to prevent the creation of "dark archives" of texts that are not machine-readable: Europeana has issued a report in which they worry that, without cleanly digitized documents, we could enter another "dark ages." Our goal is to prevent that, and we look forward to working with FactMiners and PRImA on recreating our cultural heritage as readable, smart data.

Sincerely,

Laura Mandell
Director, Initiative for Digital Humanities, Media, and Culture
Professor, English