HATHITRUST A Shared Digital Repository
HathiTrust: A Second Life for Library Collections
Jeremy York
Exploring Humanities Cyberinfrastructure
April 30, 2013
Partnership
Arizona State University
Baylor University
Boston College
Boston University
Brandeis University
Brown University
California Digital Library
Carnegie Mellon University
Columbia University
Cornell University
Dartmouth College
Duke University
Emory University
Florida State University
Getty Research Institute
Harvard University Library
Indiana University
Iowa State University
Johns Hopkins University
Kansas State University
Lafayette College
Library of Congress
Massachusetts Institute of Technology
McGill University
Michigan State University
New York Public Library
New York University
North Carolina Central University
North Carolina State University
Northwestern University
The Ohio State University
The Pennsylvania State University
Princeton University
Purdue University
Stanford University
Syracuse University
Texas A&M University
Tufts University
Universidad Complutense de Madrid
University of Alberta
University of Arizona
University of Calgary
University of California: Berkeley, Davis, Irvine, Los Angeles, Merced, Riverside, San Diego, San Francisco, Santa Barbara, Santa Cruz
The University of Chicago
University of Connecticut
University of Delaware
University of Florida
University of Houston
University of Illinois
University of Illinois at Chicago
The University of Iowa
University of Kansas
University of Maryland
University of Miami
University of Michigan
University of Minnesota
University of Missouri
University of Nebraska-Lincoln
The University of North Carolina at Chapel Hill
University of Notre Dame
University of Pennsylvania
University of Pittsburgh
University of Utah
University of Vermont
University of Virginia
University of Washington
University of Wisconsin-Madison
Utah State University
Vanderbilt University
Virginia Tech
Wake Forest University
Washington University
Yale University Library
Digital Repository
• Launched 2008
• Initial focus on digitized book and journal content
– 10.7 million total volumes
– 5.6 million book titles
– 278,000 serial titles
– 3.3 million public domain (~31%)
Full-text Search
Reading
Collections
Lawful uses under specific terms and conditions: http://www.hathitrust.org/access_use#ic-access
APIs
Datasets
Research Center
* Photo by ben.gallaher CC-BY http://bit.ly/ZY040K
Research Center Key Ideas
• Bring researchers to the data
• Evolve around user demand
• Start with public domain materials
• Support non-consumptive research
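As a rough illustration of the "non-consumptive" idea above, the sketch below runs an algorithm next to the texts and hands back only aggregate counts, never the text itself. It is not the Research Center's API: the function names (`term_frequencies`, `run_on_workset`), the volume IDs, and the sample texts are all hypothetical.

```python
import re
from collections import Counter


def term_frequencies(volume_text, top_n=5):
    """Hypothetical algorithm: return the top_n (word, count) pairs.

    Only derived counts leave this function; the text is not returned.
    """
    tokens = re.findall(r"[a-z]+", volume_text.lower())
    return Counter(tokens).most_common(top_n)


def run_on_workset(workset, algorithm):
    """Apply an algorithm to every volume in a workset.

    `workset` is assumed to be a dict of volume ID -> full text that
    stays on the repository side; the caller sees only the results.
    """
    return {vol_id: algorithm(text) for vol_id, text in workset.items()}


# Made-up volumes standing in for public domain texts.
workset = {
    "vol1": "the cat and the hat",
    "vol2": "a rose is a rose",
}
results = run_on_workset(workset, lambda text: term_frequencies(text, top_n=2))
print(results)
```

The researcher supplies the algorithm and receives word counts per volume, which is one way to read "bring researchers to the data".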
Research Center
• FAQ: http://bit.ly/XkZKev
• Listserv: http://bit.ly/12KB3t7
• Portal: browse volume lists and algorithms, execute algorithms, view results
• Catalog (Blacklight): assemble worksets
• Sandbox: can run own algorithms