Sharing Our Special Collections with the World: (A Few) Lessons Learned
Geoffrey Skinner, Cataloging and Metadata Librarian
Jon Haupt, Branch Manager, Healdsburg Regional Library and Sonoma County Wine Library
Sonoma County Library
May 5, 2016
Introduction: Thank you to Justine Withers for suggesting us.
Outline of talk:
  Brief tour of SCL Special Collections and Digital Projects (6 minutes)
  Lessons learned (14 minutes)
  Case study: IWRDB (10 minutes)
Sonoma County Special Collections
  Sonoma County Wine Library
  Local History & Genealogy Library
  Sonoma County Archives
  Petaluma History Room
Although SCL is a public library, we have extensive special collections: roughly 70,000 cataloged books, photographs and archival collections (and approximately 10-15 uncataloged items). These are spread across four main locations: we have a wine library, a Sonoma Co. local history and genealogy library, a collection devoted to Petaluma history, and we manage the County's archives. We also have materials of special interest elsewhere, such as local author books and local music recordings. Like most special collections, public access to most of these materials is limited. Our goal is to share as many of these collections as possible with the world via our digital projects.
SCL Digital Projects Universe
Diagram labels: Sonoma Heritage Collections; Community (Local Authors); Sonoma County Local History Index; Guide to the Archives; International Wine Research Database (IWRDB); Find; HistoryPin; Flickr Commons; Calisphere; OAC; Internet Archive; YouTube
We have created a digital universe to meet this goal.
Efforts began in the mid-2000s.
Most projects launched in 2011 and 2012.
Brief tour:
  Guide to the Archives
  Community website for local authors, with links from the catalog
  Sonoma County Local History Index (an indexing project stretching back to the 1960s)
  Sonoma Heritage Collections (SHC)
  IWRDB (newest project)
(See links at end and in slides on SlideShare.)
Sonoma Heritage Collections
http://heritage.sonomalibrary.org
  Hosted CONTENTdm site launched 2012
  38,000+ objects in 17 collections
  5 partnerships
  Digitization mostly outsourced
  Includes photos, video, audio, texts, maps and plans
  Modeled after Minnesota Reflections
Sonoma Heritage Collections (SHC) is a hosted CONTENTdm site launched in 2012. It currently includes nearly 38,000 items in 17 collections, mostly from SCL's own holdings, but we have also partnered with other organizations (SCBA and WSCHS) to share their materials.
SHC includes a variety of media, and all items are assigned one of 22 special topics, ranging from Agriculture, Rural Life and Fisheries to Wine and Winemaking.
Primary funding: Sonoma County Tourism Board (initial funding through the State Library's Local History Digital Resources Project (LHDRP) and a large Tourism Board grant).
Photo example in SHC
Entrance to the Santa Rosa Free Public Library on 4th Street, 1959 (SCL photo 3826)
SHC is primarily photos.
  Digitized by Luna Imaging, Backstage Library Works, or locally
  Hosted by OCLC/CONTENTdm
  Local archived copy
Text example in SHC: De Diversorum Vini Generum (1559)
SHC also holds digitized texts, including this 1559 book on wine, digitized for us by Internet Archive as part of a project to digitize all out-of-copyright books in the Wine Library.
Our digitized texts, like all media except images, are:
  Hosted by Internet Archive
  Embedded in CONTENTdm
  Kept as an archival digital copy at SCL
On the Diverse Types of Wine (De Diversorum Vini Generum) by Jacobus Praefectus (active 1536), published in Venice in 1559.
Audio + text example in SHC: Charles Scalione Oral History
Also audio: in this case, an oral history interview paired with the interview transcript.
Audio digitized locally
Video example in SHC: Bill Jacobs and His Trick Horse, Tops
Small number of videos; this one is part of the Sonoma County Fair Collection.
Digitized locally
CONTENTdm Metadata Prep Spreadsheet
Initial project:
  Brief MARC records originally in library catalog, some with links
  Upgraded records to full level and retrospectively digitized all photos
  Mass export of all photo MARC records
  Converted MARC to tabular data (closely matching CONTENTdm fields) with an in-house developed RubyGem
  Corrected and enhanced metadata using LibreOffice Calc
  Uploaded metadata and images to CONTENTdm
Ongoing:
  Create spreadsheets
  Upload metadata and images to CONTENTdm
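The MARC-to-tabular step above can be sketched roughly as follows. This is a minimal Python illustration, not the in-house RubyGem: each record is represented as a plain list of (tag, subfield, value) tuples, and the tag-to-column mapping `FIELD_MAP`, the column names, and the "; " separator for repeated fields are all assumptions chosen for illustration, not the project's actual CONTENTdm field list.

```python
import csv
import io

# Hypothetical mapping from MARC tag + subfield to a spreadsheet column.
FIELD_MAP = {
    ("245", "a"): "Title",
    ("260", "c"): "Date",
    ("520", "a"): "Description",
    ("650", "a"): "Subject",
}


def marc_to_row(record):
    """Flatten one record (a list of (tag, subfield, value) tuples) into a row.

    Repeated fields are joined with "; " (an assumed convention).
    Unmapped fields are silently dropped.
    """
    row = {column: [] for column in FIELD_MAP.values()}
    for tag, subfield, value in record:
        column = FIELD_MAP.get((tag, subfield))
        if column is not None:
            row[column].append(value.strip())
    return {column: "; ".join(values) for column, values in row.items()}


def records_to_tsv(records):
    """Return all records as tab-delimited text with one header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(FIELD_MAP.values()),
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(marc_to_row(record))
    return buffer.getvalue()


# Illustrative record only; the subject headings are made up.
sample = [[
    ("245", "a", "Entrance to the Santa Rosa Free Public Library"),
    ("260", "c", "1959"),
    ("650", "a", "Libraries"),
    ("650", "a", "Buildings"),
]]
tsv_text = records_to_tsv(sample)
```

Keeping the mapping in one dictionary makes it easy to see which MARC fields are carried over and which are dropped, which is the decision Lesson 3 later returns to.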
Photo Record Conversion Progress (SCL photos only)
Conversion and upload of existing SCL photos is still in process (SCL only, no partners).
Total SCL photo collection: 43,000 images
Percentages:
  Photos in SHC: 63% (31,000)
  Photos in Find only: 14% (6,500)
  Photos in Find still to be converted: 6% (3,000)
  Photos in process: 5% (2,500)
  Rejected: 12% (6,000) (more on this later)
And more photos are added to our collection all the time.
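A quick arithmetic check of these figures, as a sketch. It assumes the percentages are shares of all five categories together (including rejected records) rather than of the 43,000-image collection; that reading is an inference from the numbers, not something stated on the slide. Category names are shortened for illustration.

```python
# Counts as given on the slide.
counts = {
    "in SHC": 31_000,
    "in Find only": 6_500,
    "in Find, still to be converted": 3_000,
    "in process": 2_500,
    "rejected": 6_000,
}

total_with_rejected = sum(counts.values())
total_without_rejected = total_with_rejected - counts["rejected"]

# Share of each category, as a percentage of all five categories combined.
shares = {name: round(100 * n / total_with_rejected, 1)
          for name, n in counts.items()}

for name, share in shares.items():
    print(f"{name}: {counts[name]:,} ({share}%)")
```

Under this assumption the total excluding rejected records comes to 43,000, which matches the stated collection size, and each computed share lands within a point of the slide's rounded percentage.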
SCL Digital Projects Universe: Review
The SCL Digital Projects universe, extended. We also send our materials out to the world through:
  Flickr Commons
  Calisphere
  OAC
  HistoryPin
And we get the word out on social media:
  Blog
  Twitter
  Facebook
Also Find, the SCL catalog.
A Few Lessons
Classroom at Village Elementary School, Santa Rosa, California, 1957 (source: SCL photo 27963)
It's been a learning process. I'll admit I've learned some lessons numerous times and will probably learn them again in the future. While some lessons have been specific to our projects, here are a few that I think may be of broader interest.
Lesson 1: You Can't Take Hi-Res Pictures with Your Barbie Cam
Source: http://digicamhistory.com
Mattel Barbie Cam, introduced in 1998, capable of stunning 160 x 120 pixel images.
My first experience with a digital camera, circa 1999, was for a website; the limitations were quickly apparent.
Applies to:
  Record quality (brief records in catalog)
    Records were iteratively built prior to conversion, but with many opportunities for interpolation errors, etc.
  Image quality
    Volunteers scanned images (pre-2004) at low resolution, a policy decision to protect our IP
    Photocopies in collection: likely a cost-saving decision to photocopy rather than reprint
The photocopies and early photos are like the Barbie Cam photos: either not enough information was captured or too much is now lost, and no amount of technical wizardry will really improve them.
Lesson 2: Don't Get Lost in the Library of Everything
Source: https://libraryofbabel.info
Illustrated here by Borges' Library of Babel, taken from Jonathan Basile's Library of Babel project. We've had no shortage of possible digitization projects. What is doable? Does it fit with our mission and have local/regional/national value? If the answer is yes, do we have the resources to pull it off?
In our case, we've looked at:
  Photocopies in image collection: digitized, but not in SHC (mostly)
  8 x 10 reproductions of postcards (in SHC, but now rethinking)
  Posters, maps: very cool, high local interest, but unclear rights (on hold)
  Copyrighted material such as out-of-print but post-1922 wine books (how much work to get permissions?)
Lesson 3: You Really Want Everything With That??
Source: http://en.rocketnews24.com/
Like this Burger with Everything from Japanese fast-food chain Lotteria, it is tempting to make use of every bit of metadata.
How much metadata is truly useful? Find the sweet spot between the minimum (a title may be the only required field) and everything (126+ fields plus system-generated admin fields in CONTENTdm).
Our choices were driven by what was in the MARC records and what we could fit in CONTENTdm. We couldn't translate everything, but I made the choice to translate as much as possible, just because I didn't want to lose the richness of the MARC records. We currently use 99 fields across all our collections (admittedly, several have little or no metadata).
Cost: time-consuming data entry and management. We have no analytics on what visitors actually find useful.
Choices probably would have been different if we hadn't started with MARC. My spreadsheets have been close to overwhelming.
Japanese fast-food firm Lotteria's Burger with Everything: "The Burger with Everything on It (zenbunoseburger in Japanese) is both a dietician's and linguist's nightmare, as it manages to somehow be simultaneously enticing and intimidating. It's also a fever dream come to life, though, for big or indecisive eaters" (http://en.rocketnews24.com/2015/04/29/truth-in-advertising-lotterias-monstrous-burger-with-everything-on-it-is-exactly-that/)
Lesson 4: It's not bragging if you can back it up
Source: SCL Photo 37461
To paraphrase Muhammad Ali: It's not hopeless if you can back it up.
I've had images corrupted, images not transferred, and metadata corrupted. Mostly recoverable, but not always. Recently I scrambled a spreadsheet with over 4,000 lines and thought that I had a backup, but discovered way too late that my backup was also corrupted. I'm still cleaning up the resulting problems.
Jack Schofield, blogger for ZDNet: "Schofield's Second Law of Computing states that data doesn't really exist unless you have at least two copies of it. Ideally, you should have at least three copies of everything, preferably on different media. It is a good idea to store one copy in the cloud, as then you have data "off premise" -- buildings have been known to flood or burn down -- as long as it's not your only copy. Having three copies means you can do file comparisons and therefore check if one of them has been corrupted." (http://www.zdnet.com/article/follow-schofields-three-laws-of-computing-and-avoid-disasters/)
So:
  Back up working copies of digital objects, spreadsheets, whatever, and make sure you can identify versions
  Use Windows backup if applicable
  For digital objects, keep an archival backup: we have images and other files backed up on a local archival server (that will soon be moving to the cloud), and older images also on CD-ROM off-site
  Many platforms offer an archival backup option during the ingest process
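Schofield's point about file comparison can be sketched in a few lines. The example below is an illustration only, not a description of SCL's backup setup: it assumes three copies of the same file exist at paths the caller supplies, and uses SHA-256 checksums from Python's standard library to spot a copy that disagrees with the others. The function names are hypothetical.

```python
import hashlib
from collections import Counter
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_suspect_copies(paths):
    """Return the paths whose checksum differs from the majority checksum.

    With three copies, a single corrupted file is outvoted by the other two.
    Returns None when no checksum is shared by more than half of the copies,
    because then there is nothing trustworthy to compare against.
    """
    digests = {Path(p): sha256_of(p) for p in paths}
    tally = Counter(digests.values())
    majority_digest, votes = tally.most_common(1)[0]
    if votes <= len(paths) // 2:
        return None
    return [p for p, d in digests.items() if d != majority_digest]
```

Note that with only two copies a mismatch returns None: the check can tell that something is wrong but not which copy is the bad one, which is the practical argument for keeping three.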
Lesson 5: Use the Right Tool(s) for the Job
Source: WSCHS Item 11-358
The best tools for a particular job depend on that job. Needs change; tools change. Sometimes custom tools are called for (RubyGem for MARC conversion).
I have a lot of favorite free or low-cost tools, including my workhorse programs:
  LibreOffice Calc (works better than Excel for BIG spreadsheets and has RegEx support)
  Notepad++
  NoteTab Pro
  MarcEdit
Also use:
  GIMP
  Paint.NET
  Etc.
If you can't do a task with the tools at hand, someone has probably posted just the right tool to GitHub or elsewhere on the Web. Don't be afraid to explore and play.
Lesson 6: The Best Laid Plans of Mice and Metadata Wranglers Often Go Awry
Source: SCL Photo 33246
SHC launched with a custom JavaScript to present a corresponding StreetView (or map) for all photos with geocoordinates, which we spent a long time gathering and entering. Very cool for then-and-now displays! But in the meantime, Google changed their API, and displayed locations were suddenly 70 miles to the southeast.
Another example: we planned to update metadata in only one place, SHC, but I couldn't create a workflow to make that actually happen, so we update in both CONTENTdm and our Horizon catalog, and records aren't necessarily in sync.
Architectural plan for remodeling the house of Mrs. G. W. Connors of 742 Orchard Street, Santa Rosa prepared by J. C. Lindsay, ca. 1920 (SCL photo 33246)
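One way to at least detect the drift between the two systems is to compare exports from each. The sketch below is hypothetical: it assumes both CONTENTdm and the catalog can export records as rows keyed by a shared identifier, and the field names (`id`, `title`) and sample rows are placeholders rather than the project's actual schema or data.

```python
def find_out_of_sync(system_a, system_b, key="id", fields=("title",)):
    """Compare two lists of record dicts and report disagreements.

    Returns identifiers present in only one system, plus identifiers
    present in both whose compared fields differ (ignoring surrounding
    whitespace).
    """
    a_by_id = {rec[key]: rec for rec in system_a}
    b_by_id = {rec[key]: rec for rec in system_b}

    only_in_a = sorted(set(a_by_id) - set(b_by_id))
    only_in_b = sorted(set(b_by_id) - set(a_by_id))
    differing = sorted(
        rec_id
        for rec_id in set(a_by_id) & set(b_by_id)
        if any(a_by_id[rec_id].get(f, "").strip() != b_by_id[rec_id].get(f, "").strip()
               for f in fields)
    )
    return {"only_in_a": only_in_a, "only_in_b": only_in_b, "differing": differing}


# Illustrative data only; identifiers and titles are made up.
contentdm_rows = [{"id": "100", "title": "Main Street, 1920"},
                  {"id": "101", "title": "County fair parade"}]
catalog_rows = [{"id": "100", "title": "Main Street, ca. 1920"},
                {"id": "102", "title": "Railroad depot"}]
report = find_out_of_sync(contentdm_rows, catalog_rows)
```

A report like this does not fix the two-places-to-update problem, but it turns "records aren't necessarily in sync" into a list that can be worked through.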
Lesson 7: Flexibility is a Good Thing
Source: http://wikipedia.org
Pretty much applies to everything. Google StreetView was a cool part of SHC, but since I didn't have time to rewrite and test the script to use the new API, I turned it off for the time being.
I made quite a few decisions based on the current state of current platforms. Perhaps somewhat unavoidable, but my aim is to make metadata frameworks and content as reusable as possible. Metadata is currently based on the original AACR2 MARC records and structured for CONTENTdm, but now we're on to RDA and we're going to be moving to Islandora. Staying current demands flexibility; there is no way around it!
Lesson 8: Dancing is More Fun with the Right Partner(s)
Source: SCL photo 22514
By partners, I really mean everyone involved in our efforts: staff, funders, partner organizations, vendors. Even if we haven't yet achieved everything we originally envisioned, we wouldn't be here today without our partners. I've been the sole person slogging through our projects much of the time, juggling them with my regular cataloging work, but without the staff and colleagues who've helped at various times, we would have accomplished so much less. At least one former staff member is here in the audience (thank you, Justine!), and a talented former colleague who is now with Lyrasis, Mark Cooper, did essential programming for most of the projects.
We just got some very good news: our budget for the next FY was approved this week with money for a new professional cataloging position, and we also will get part of a new library specialist position. I think we'll be dancing up a storm very soon!
Mr. Palmieri and dance partner, Santa Rosa, California, 1961 (SCL photo 22514)
International Wine Research Database
Turning to a project developed and managed by Jon Haupt, Branch Manager.
What is IWRDB?
The International Wine Research Database (IWRDB) strives to be the most comprehensive bibliography of wine literature in the world.
  1970s-1999: Clippings file
  1999: LSTA grant -> Winefiles.org
  2013: Another LSTA grant -> IWRDB
What was Winefiles?
Local History + Index to Wine Archive Periodicals
Staffing: part-time librarian + volunteers
  Wine Librarian -> select articles
  Volunteers -> photocopy
  Volunteers -> enter data
  PT Assistant Librarian -> clean up + authority control
Strictly an SCWL project
One of many small-scale indexing projects underway among wine libraries and collections
Not comprehensive, but still the MOST comprehensive
Winefiles.org screenshot
Winefiles Data
SQL data:
INSERT INTO `article` (`a_id`,`author`,`title`,`language`,`publication_month`,`publication_day`,`publication_year`,`publisher`,`publication`,`volume`,`number`,`pages`,`item_type`,`fulltext_path`,`url`,`abstract`,`keywords`,`subject_folder`,`history_folder`,`company_folder`,`subject`,`business`,`business_contact`,`business_org`,`region`,`varietal`,`appellation`,`comments`,`username`,`entry_date`,`update_date`,`dbname`) VALUES(720,':Howie, Millie:','Wine Words: The Carneros difference','Eng','October','2','1992','','Healdsburg Tribune','','','','Article: Newspaper','','','The author describes the Carneros Quality Alliance and its efforts to promote the distinctive character of wines from the Carneros at an event held in the Stanford Court Hotel in San Francisco. The alliance was formed in 1985 and by the time of the article included 27 wineries and 60 growers in its membership. The event included panel discussions and a tasting of older vintages from the Carneros district. Details of speakers, topics, and wines tasted are included in the article.','lp wine words: the carneros . . . . [yada yada]
This data served its purpose, but it was not clearly structured and was difficult to transfer into another system. As a result, the limitations of the interface (circa 1999) became more and more troublesome.
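As an illustration of the cleanup such data needs before it can move to another system, the sketch below normalizes two of the fields visible in the sample row: the colon-wrapped author string and the date split across month, day, and year columns. The function names and the ISO date output are assumptions for illustration, and the idea that colons delimit multiple authors is a guess from the single sample value, not a documented rule of the Winefiles schema.

```python
from datetime import date

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def split_authors(raw):
    """Split a colon-wrapped author field such as ':Howie, Millie:' into names."""
    return [name.strip() for name in raw.split(":") if name.strip()]


def to_iso_date(month_name, day, year):
    """Combine separate month/day/year strings into an ISO 8601 date string.

    Returns None when any part is missing or does not form a valid date.
    """
    try:
        month = MONTHS.index(month_name.strip().capitalize()) + 1
        return date(int(year), month, int(day)).isoformat()
    except ValueError:
        return None


# Values taken from the sample INSERT statement.
authors = split_authors(":Howie, Millie:")
published = to_iso_date("October", "2", "1992")
```

Returning None for incomplete dates, rather than guessing, keeps records with an empty day or month visible for manual review.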
What Were Our Hopes and Dreams? Collaborative indexing
Not enough staff
Est. annual cost of indexing trade publications: $80,000
Our answer: distributed indexing
  Four institutions providing $20K/year
  Eight institutions providing $10K/year
  Sixteen institutions providing $5K/year
Build a global network of wine libraries, archives, and information centers: IAWL
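Each of the three funding scenarios covers the same estimated cost, as this short check shows. The variable names are illustrative; the dollar figures are taken from the slide.

```python
ANNUAL_COST = 80_000  # estimated annual cost of indexing trade publications

# (number of institutions, contribution per institution per year)
scenarios = [(4, 20_000), (8, 10_000), (16, 5_000)]

totals = [count * contribution for count, contribution in scenarios]

for (count, contribution), total in zip(scenarios, totals):
    covered = "covers" if total >= ANNUAL_COST else "falls short of"
    print(f"{count} institutions x ${contribution:,}/year = ${total:,} "
          f"({covered} the ${ANNUAL_COST:,} estimate)")
```

The scenarios differ only in how widely the cost is spread: halving each contribution requires doubling the number of participating institutions.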
What Were Our Hopes and Dreams? Data harvesting
  Freely available data (citations + abstracts)
  Publishers provide streamlined access to free data?
  Worst-case scenario: harvest directly from web sites
  Hope to provide links directly to content if your institution has a subscription
  We see a possible pool of new customers for publishers
IWRDB Search Interface screenshot
IWRDB Results Interface screenshot
IAWL screenshot
What Dreams Were Dashed? Data harvesting
  Publishers are mostly unresponsive
  They do not see $$ reasons for helping our project
  Some trade publications do not have indexing at all!
  They do appreciate that we are not directly competing
Long-term cultivation approach
  Snowballing: success may breed success
  Perhaps publishers will want to participate in IAWL
What Dreams Were Dashed? Collaborative indexing
  Supportive institutions, but noncommittal
  No bandwagon yet
  No clear organization yet to rely on for dissemination of information (IAWL is too new)
We'd hoped the harvesting would serve as a springboard
Now it looks like we need collaborative indexing first
We may need to seek out independent consultants
What's Next?
  Another new Wine Librarian at SCWL
  Tag-teaming to build IAWL and establish a workgroup
  Continue to develop other organizations' roles
  Continue review of our role in the project and how it aligns (or not) with our organization's Strategic Plan
Thank You (and stay on track)
Source: WSCHS Item 01-093
Thank you for allowing us to share our challenges and lessons learned in bringing our special collections to the world.
Stan Strout and two conductors in front of Petaluma & Santa Rosa RR car No. 55 at the Forestville Station, circa 1904 (WSCHS Collection)
Resource List: SCL Digital Projects
  Sonoma Heritage Collections: http://sclib.us/shc
  International Wine Research Database: http://iwrdb.org
  Community - Sonoma Co. Authors: http://sclib.us/community
  Sonoma Co. Local History Index: http://sclib.us/lhi
  Guide to the Archives: http://sclib.us/archives
Beyond the Library
  Flickr Commons: http://bit.ly/flickr-scl
  YouTube Channel: http://bit.ly/yt-scl
  Internet Archive: http://bit.ly/ia-scl
  Calisphere: http://bit.ly/cali-scl
  OAC: http://bit.ly/oac-scl
  Twitter: @scldigital
Charles Scallione clip