Google Books, UMI and Other Intriguing Trends in Digital Publishing
Joe Wible
Hopkins Marine Station of Stanford University
October 9, 2006
Caveats
  Weak moment
  HMS participation in Google Books Project
  Confidentiality
  Publishers suing Google
Categories of Materials
  Journals
  Books
    Recently published books
    Historical book collections
    Dissertations
  Archival Materials
Study to Determine Availability of Recently Published Books in Electronic Format
  Books purchased between 9/04 and 2/06 (18 months)
  Publication years 2002-2006
  Stratified random sample
    9271 titles (10.2% of total titles)
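A stratified random sample like the one described above can be sketched in a few lines. The slide does not say which strata were used, so grouping by fund cluster is an assumption here, and the function and parameter names (`stratified_sample`, `stratum_of`, `fraction`) are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(titles, stratum_of, fraction, seed=0):
    """Draw roughly `fraction` of the titles from each stratum.

    `stratum_of` maps a title record to its stratum label
    (assumed here to be a fund cluster; the slide does not specify).
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable

    # Group the titles by stratum.
    by_stratum = defaultdict(list)
    for title in titles:
        by_stratum[stratum_of(title)].append(title)

    # Sample each stratum separately, taking at least one title from each.
    sample = []
    for name in sorted(by_stratum):
        group = by_stratum[name]
        k = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, k))
    return sample
```

Sampling each stratum separately keeps small groups represented in the sample, which a simple random draw over the whole purchase list would not guarantee.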
Sources Searched
  Netlibrary
  Ebrary
  MyiLibrary
  Questia
  Overdrive
  Other (e.g. publishers, associations, free internet)
Fund Cluster                  No. Titles   English          Non-English
General Reference                    539       510                   29
US/UK History/Lang/Lit              7345      7104                  241
All other Area & Language          49679      7651                42028
Humanities                         18091      9551                 8540
Interdisciplinary                    928       867                   61
Social Science & Education          7162      5665                 1497
Sciences                            6720      6621                   99
Media, Reserves                      602       601                    1
Total                              91066     38569 (42.4%)        52496 (57.6%)
All Subjects/Languages Pct E-book Available
  All Subjects/Languages                   7.60%
  Sciences & Technology                   45.10%
  Social Sciences & Education             19.30%
  US/UK History/Language & Literature      9.60%
  Humanities                               8.10%
  Area Studies                             0.85%
  English-language, all subjects          17.50%
  Foreign-language, all subjects           0.10%
SciTech Books Pct E-available
  Biology            36.40%
  Chemistry          36.70%
  Earth Scis         17.00%
  Engineering        49.50%
  Hopkins Marine     28.60%
  Math/CompSci       61.10%
  Physics            65.80%
  All SciTech        45.10%
(N = 15)
What would you do with an offer
  To digitize every book in your library with no damage to the book,
  To return to you a digital copy for preservation and other purposes, and
  To present you and the world with a combined word index to millions of books
???
Stanford’s Purposes
  Digital Preservation
  Virtual Bookshelves in Stanford Digital Repository
    Under construction as part of the Stanford Digital Repository
    For Stanford use only
  Other searching and research functions
    Subtle searching
    Taxonomic & Associative Searching
    Citation linking & “InfoTools”
    Alerts & recommendations
    Better navigation, e.g. Grokker
  Digitized books from all sources as test bed for new research
Google Books Partner Libraries
  Original partners
    Stanford
    University of Michigan
    Harvard
    New York Public Library
    Oxford
  Later additions
    University of California
    Universidad Complutense de Madrid
Mechanics of Google Books
  3,000 books scanned per day (UC)
  Scanning done by hand
    http://books.google.com/books?vid=OCLC03812955&id=1GB1kuY5-pkC&pg=PA3&lpg=PA3
    http://books.google.com/books?vid=0sVgqoZH8_0vk2uEA6uPPZ&id=n-28bvRNoroC&pg=RA1-PR1000
    http://books.google.com/books?vid=OCLC03812955&id=1GB1kuY5-pkC&pg=PR32
  Mechanism to get books back if needed
  Impact on collection
    Better care than most patrons
    Identifies materials for preservation
U.S. Copyright Basics
  Published before 1923: Public Domain
  Published from 1923 through 1963
    After 28 years had to be renewed
    Approximately 15% renewed (200,000)
    Remaining 85% in Public Domain
  Published after 1963: in copyright
    Life of the author plus 70 years
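The three date ranges on this slide amount to a small decision rule. The sketch below encodes only the slide's simplified version of that rule; the function name `copyright_status` and the "undetermined" label for missing renewal information are assumptions made for illustration, and real determinations involve more cases than the slide covers.

```python
from typing import Optional


def copyright_status(pub_year: int, renewed: Optional[bool] = None) -> str:
    """Classify a US book using the simplified rules from the slide.

    `renewed` is True or False when the renewal record is known,
    and None when no renewal information is available.
    """
    if pub_year < 1923:
        return "public domain"
    if pub_year <= 1963:
        # Renewal was required for this range, so the answer
        # depends entirely on the renewal record.
        if renewed is None:
            return "undetermined"
        return "in copyright" if renewed else "public domain"
    # Published after 1963: treated as in copyright on the slide.
    return "in copyright"
```

For example, `copyright_status(1940, renewed=False)` returns `"public domain"`, while `copyright_status(1940)` returns `"undetermined"` because the renewal record has not been checked.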
Orphan Works
  http://www.copyright.gov/orphan/
  Deals with works in copyright for which the copyright holder cannot be found
  Report delivered to Congress
    Requires “reasonably diligent search”
    Need to meet that standard with Determinator
    Also limits remedies
  Unknown when Congress will act
Copyright Determinator - Why?
  Renewal required for US books published 1923-1963
    Books not renewed are in the public domain
  Renewals took place 1950-1992
    No electronic records of renewals 1950-1977
    Limited access to 1978-1992 records
Determinator – Database Module
  Create machine-readable database of 1923-1963 renewals
    First portion of data now available
    http://sulwebappdev1.stanford.edu:4040/determinator/bin/page?forward=home
    Hand coding has been outsourced for remaining 1950-1977 records
    1978-1992 records to be uploaded soon
  Match database records to catalog records
    Limited fields for matching
    In discussions with OCLC
    Will look at internal options if this does not work out
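Matching renewal records to catalog records on only a few fields might look something like the sketch below. The slides do not describe the actual matching method or record layout, so everything here is an assumption: the field names (`title`, `author`, `id`, `renewal_id`), the choice of normalized title plus author surname as the key, and the function names are all hypothetical.

```python
import re
import unicodedata


def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def match_key(title, author):
    """Build a key from the title and the surname (text before the first comma)."""
    surname = author.split(",")[0]
    return (normalize(title), normalize(surname))


def match_renewals(renewals, catalog):
    """Map each renewal id to the list of catalog ids sharing its key."""
    index = {}
    for rec in catalog:
        key = match_key(rec["title"], rec["author"])
        index.setdefault(key, []).append(rec["id"])
    return {
        r["renewal_id"]: index.get(match_key(r["title"], r["author"]), [])
        for r in renewals
    }
```

Returning a list of candidates rather than a single hit leaves room for ambiguous or missing matches, which are likely when only a few fields are available.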
Determinator – Benchmark Module
  Benchmark database results against manual search
    A sample set of >500 items from the Stanford catalog has been manually searched for status
    A subset of those items (100) is being checked by the LOC
  Need to demonstrate due diligence
Determinator – Legal Review
  Legal input on validity of results
    Initial contact has been made with a group of copyright experts
    Coordinating with General Counsel
Dissertations in Digital Format
  ProQuest (UMI) migrating to digital submission
    15% submitted electronically last year
    30% submitted electronically this year
    25 schools in the queue to switch to digital
    1.9 million in microfilm, 800,000 as PDFs
Online Submission Form
  http://dissertations.umi.com/
Urgent Need to Switch to Digital Submission
  ProQuest still scanning in B&W
  Tested color scanning but file size too large
  Today’s students are using color heavily
  I don’t want to have to loan a dissertation because the data in the ProQuest copy is useless
Questions?