Sharing Our Special Collections with the World: Lessons Learned / Geoffrey Skinner and Jon Haupt,...

Sharing Our Special Collections with the World: (A few) Lessons Learned
Geoffrey Skinner, Cataloging and Metadata Librarian
Jon Haupt, Branch Manager, Healdsburg Regional Library and Sonoma County Wine Library
Sonoma County Library
May 5, 2016

Upload: northern-california-technical-processes-group

Post on 08-Feb-2017


TRANSCRIPT

Sharing Our Special Collections with the World

Geoffrey Skinner
Cataloging and Metadata Librarian

Jon Haupt
Branch Manager, Healdsburg Regional Library and Sonoma County Wine Library

Sonoma County Library
May 5, 2016
(A few) Lessons Learned

Sharing Our Collections with the World: A Few Lessons Learned

Introduction: Thank you to Justine Withers for suggesting us.

Outline of talk:
Brief tour of SCL Special Collections and Digital Projects (6 minutes)
Lessons learned (14 minutes)
Case study: IWRDB (10 minutes)

1

Sonoma County Special Collections
Sonoma County Wine Library
Local History & Genealogy Library
Sonoma County Archives
Petaluma History Room

Although the SCL is a public library, we have extensive special collections: roughly 70,000 cataloged books, photographs and archival collections (and approximately 10-15 uncataloged items). These are spread across four main locations: we have a wine library, a Sonoma Co. local history and genealogy library, a collection devoted to Petaluma history, and we manage the County's archives. We also have materials of special interest elsewhere, such as local author books and local music recordings. As with most special collections, public access to most of these materials is limited. Our goal is to share as many of these collections as possible with the world via our digital projects.

2

SCL Digital Projects Universe
[Diagram. Labels: SCL Digital Projects; Sonoma Heritage Collections; Community (Local Authors); Sonoma County Local History Index; Guide to the Archives; International Wine Research Database (IWRDB); Find; HistoryPin; Flickr Commons; Calisphere; OAC; Internet Archive; YouTube]

We have created a digital universe to meet this goal.
Efforts began in the mid-2000s.
Most projects launched in 2011 and 2012.
Brief tour:
Guide to the Archives
Community website for local authors, with links from the catalog
SC Local History Index (indexing project stretching back to the 1960s)
SHC
IWRDB (newest project)

See links at the end and in the slides on SlideShare.

3

Sonoma Heritage Collections
http://heritage.sonomalibrary.org

Hosted CONTENTdm site launched 2012
38,000+ objects in 17 collections
5 partnerships
Digitization mostly outsourced
Includes photos, video, audio, texts, maps and plans
Modeled after Minnesota Reflections

Hosted CONTENTdm site launched in 2012.
SHC currently includes nearly 38,000 items in 17 collections, mostly from SCL's own holdings, but we have also partnered with other organizations to share their materials (SCBA and WSCHS).
SHC includes a variety of media, and all items are assigned one of 22 special topics ranging from "Agriculture, Rural Life and Fisheries" to "Wine and Winemaking".
Primary funding: Sonoma County Tourism Board (initial funding through the State Library's Local History Digital Resources Project (LHDRP) and a large Tourism Board grant)

4

Entrance to the Santa Rosa Free Public Library on 4th Street, 1959 (SCL photo 3826)

Photo example in SHC

SHC is primarily photos.

Digitized by Luna Imaging, Backstage Library Works, or locally
Hosted by OCLC/CONTENTdm
Local archived copy

5

De Diversorum Vini Generum (1559)
Text example in SHC

Digitized texts, including this 1559 book on wine, digitized for us by Internet Archive as part of a project to digitize all out-of-copyright books in the Wine Library.

Our digitized texts, like all media except images, are:
Hosted by Internet Archive
Embedded in CONTENTdm
Kept as an archival digital copy in SCL

On the diverse types of wine (De Diversorum Vini Generum) by Jacobus Praefectus (active 1536), published in Venice in 1559.

6

Charles Scalione Oral History
Audio + Text Example in SHC

Also audio: in this case, an oral history interview paired with the interview transcript.
Audio digitized locally

7

Bill Jacobs and His Trick Horse, Tops

Video Example in SHC

Small number of videos; this one is part of the Sonoma County Fair Collection.
Digitized locally

8

CONTENTdm Metadata Prep Spreadsheet

Initial project:
Brief MARC records originally in library catalog, some with links
Upgraded to full level and retrospectively digitized all photos
Mass export of all photo MARC records
Converted MARC to tabular data (closely matching CONTENTdm fields) with an in-house developed RubyGem
Corrected and enhanced metadata using LibreOffice Calc
Uploaded metadata and images to CONTENTdm

Ongoing:
Create spreadsheets
Upload metadata and images to CONTENTdm
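
The MARC-to-tabular conversion step can be sketched roughly as follows. This is a minimal Python illustration, not the in-house RubyGem mentioned above: the tag-to-column mapping (FIELD_MAP), the column names, the "; " separator for repeated fields, the tab-delimited output, and the sample subject headings are all assumptions made for the example.

```python
import csv
import io

# Hypothetical mapping from (MARC tag, subfield code) to spreadsheet column.
# The real project mapped far more fields; these four are illustrative only.
FIELD_MAP = {
    ("245", "a"): "Title",
    ("100", "a"): "Creator",
    ("260", "c"): "Date",
    ("650", "a"): "Subject",
}


def record_to_row(record):
    """Flatten one record, given as (tag, subfield_code, value) triples,
    into a dict keyed by column name. Repeated fields are joined with '; '."""
    row = {column: [] for column in FIELD_MAP.values()}
    for tag, code, value in record:
        column = FIELD_MAP.get((tag, code))
        if column is not None:  # unmapped fields are silently dropped
            row[column].append(value.strip())
    return {column: "; ".join(values) for column, values in row.items()}


def records_to_tsv(records):
    """Return tab-delimited text: one header row, then one row per record."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(FIELD_MAP.values()),
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record_to_row(record))
    return buffer.getvalue()


# Sample record: title and date come from the photo example earlier in the
# talk; the two subject headings are invented for illustration.
sample = [[
    ("245", "a", "Entrance to the Santa Rosa Free Public Library on 4th Street"),
    ("260", "c", "1959"),
    ("650", "a", "Libraries"),
    ("650", "a", "Buildings"),
]]

print(records_to_tsv(sample))
```

Dropping unmapped fields silently is the simplest choice for a sketch; a real conversion would more likely log them so that nothing from the MARC record is lost without notice.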

9

Photo Record Conversion Progress (SCL photos only)

Conversion and upload of existing SCL photos is still in process (SCL only, no partners).
Total SCL photo collection: 43,000 images
Photos in SHC: 63% (31,000)
Photos in Find only: 14% (6,500)
Photos in Find still to be converted: 6% (3,000)
Photos in process: 5% (2,500)
Rejected: 12% (6,000) (more on this later)
And more photos are added to our collection all the time.
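
As a quick check of the arithmetic, the shares can be recomputed from the counts. This is a small Python sketch using only the rounded figures given above; the category labels are shortened for the example. The five counts sum to 49,000 rather than 43,000, and the stated percentages are close to shares of that sum, so the basis of 49,000 is an inference from the numbers, not something the slide states. Small differences from the slide are expected because the counts are rounded.

```python
# Rounded counts as given in the notes (SCL photos only).
counts = {
    "Photos in SHC": 31_000,
    "Photos in Find only": 6_500,
    "Photos in Find still to be converted": 3_000,
    "Photos in process": 2_500,
    "Rejected": 6_000,
}


def shares(category_counts):
    """Return each category's share of the summed counts, as a whole percent."""
    total = sum(category_counts.values())
    return {name: round(100 * n / total) for name, n in category_counts.items()}


total = sum(counts.values())
result = shares(counts)
for name, pct in result.items():
    print(f"{name}: {counts[name]:,} ({pct}%)")
print(f"Sum of categories: {total:,}")
```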

10

SCL Digital Projects Universe Review
[Diagram, repeated from the SCL Digital Projects Universe slide. Labels: SCL Digital Projects; Sonoma Heritage Collections; Community (Local Authors); Sonoma County Local History Index; Guide to the Archives; International Wine Research Database (IWRDB); Find; HistoryPin; Flickr Commons; Calisphere; OAC; Internet Archive; YouTube]

SCL Digital Projects Universe, extended:
We also send our materials out to the world through:
Flickr Commons
Calisphere
OAC
HistoryPin
And we get the word out on social media:
Blog
Twitter
Facebook

Also: Find, the SCL catalog

11

A Few Lessons
Classroom at Village Elementary School, Santa Rosa, California, 1957 (source: SCL photo 27963)

A Few Lessons: Intro

It's been a learning process. I'll admit I've learned some lessons numerous times and will probably learn them again in the future. While some lessons have been specific to our projects, here are a few that I think may be of broader interest.

12

Lesson 1: You Can't Take Hi-Res Pictures with Your Barbie Cam
Source: http://digicamhistory.com

Lesson 1

Mattel Barbie Cam, introduced in 1998, capable of stunning 160 x 120 pixel images

My first experience with a digital camera, circa 1999, was for a website; the limitations were quickly apparent.
Applies to:
Record quality (brief records in catalog)
  Records iteratively built prior to conversion, but with many opportunities for interpolation errors, etc.
Image quality
  Volunteer-scanned images (pre-2004) at low resolution, a policy decision to protect our IP
  Photocopies in collection, likely a cost-saving decision to photocopy rather than reprint

The photocopies and early photos are like the Barbie Cam photos: either not enough info was captured or too much is now lost. No amount of technical wizardry will really improve them.

13

Lesson 2: Don't Get Lost in the Library of Everything
Source: https://libraryofbabel.info

Lesson 2

Illustrated here by Borges's Library of Babel, taken from Jonathan Basile's Library of Babel project. We've had no shortage of possible digitization projects. What is doable? Does it fit with our mission and have local, regional, or national value? If the answer is yes, do we have the resources to pull it off?

In our case, we've looked at:
Photocopies in image collection: digitized, but not in SHC (mostly)
8 x 10 reproductions of postcards (in SHC, but now rethinking)
Posters, maps: very cool, high local interest, but unclear rights (on hold)
Copyrighted material such as out-of-print but post-1922 wine books (how much work to get permissions?)

14

Lesson 3: You Really Want Everything With That??
Source: http://en.rocketnews24.com/

Lesson 3

Like this Burger with Everything from Japanese fast-food chain Lotteria, it is tempting to make use of every bit of metadata.
How much metadata is truly useful? Find the sweet spot between the minimum (a title may be the only required field) and everything (126+ fields plus system-generated admin fields in CONTENTdm).
Our choices were driven by what was in the MARC records and what we could fit in CONTENTdm. We couldn't translate everything, but I made the choice to translate as much as possible. We currently use 99 fields across all our collections (admittedly, several have little or no metadata) just because I didn't want to lose the MARC record richness.
Cost: time-consuming data entry and management. We have no analytics on what visitors actually find useful.
Choices probably would have been different if we hadn't started with MARC. My spreadsheets have been close to overwhelming.

Japanese fast-food firm Lotteria's Burger with Everything: The Burger with Everything on It (zenbunoseburger in Japanese) is both a dietician's and linguist's nightmare, as it manages to somehow be simultaneously enticing and intimidating. It's also a fever dream come to life, though, for big or indecisive eaters (http://en.rocketnews24.com/2015/04/29/truth-in-advertising-lotterias-monstrous-burger-with-everything-on-it-is-exactly-that/)

15

Lesson 4: It's not bragging if you can back it up
[Slide text overlay: "hopeless"]
Source: SCL Photo 37461

Lesson 4

To paraphrase Muhammad Ali: It's not hopeless if you can back it up.

I've had images corrupted, images not transferred, and metadata corrupted. Mostly recoverable, but not always. Recently I scrambled a spreadsheet with over 4,000 lines and thought that I had a backup, but discovered way too late that my backup was also corrupted. I'm still cleaning up the resulting problems.

Jack Schofield, blogger for ZDNet: Schofield's Second Law of Computing states that data doesn't really exist unless you have at least two copies of it. Ideally, you should have at least three copies of everything, preferably on different media. It is a good idea to store one copy in the cloud, as then you have data "off premise" -- buildings have been known to flood or burn down -- as long as it's not your only copy. Having three copies means you can do file comparisons and therefore check if one of them has been corrupted. (http://www.zdnet.com/article/follow-schofields-three-laws-of-computing-and-avoid-disasters/)

So:
Back up working copies of digital objects, spreadsheets, whatever, and make sure you can identify versions.
Use Windows backup if applicable.

For digital objects, archival backup: we have images and other files backed up on a local archival server (that will soon be moving to the cloud), and older images also on CD-ROM off-site.
Many platforms offer an archival backup option during the ingest process.
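
The file comparison Schofield describes can be sketched as a checksum check across copies. This is a minimal Python illustration, not a description of our actual backup setup: the function names are hypothetical, and it assumes that when a strict majority of copies agree, the odd one out is the corrupted copy.

```python
import hashlib
from collections import Counter
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_suspect_copies(paths):
    """Compare copies of one file and return the paths whose checksum
    differs from the checksum shared by a strict majority of the copies.

    Raises ValueError when no strict majority exists (for example, two
    copies that disagree), because then there is no way to tell which
    copy is intact.
    """
    digests = {Path(p): sha256_of(p) for p in paths}
    majority_digest, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(digests) / 2:
        raise ValueError("no majority among copies; cannot tell which is intact")
    return [p for p, d in digests.items() if d != majority_digest]
```

Called with the working copy, the archival server copy, and the off-site copy of the same file, the function returns an empty list when all three match. With only two copies it can confirm that they match but raises an error when they differ, which is the practical argument for keeping a third copy.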

16

Lesson 5: Use the Right Tool(s) for the Job
Source: WSCHS Item 11-358

Lesson 5

The best tools for a particular job depend on that job.
Needs change, tools change. Sometimes custom tools are called for (RubyGem for MARC conversion).

I have a lot of favorite free or low-cost tools, including my workhorse programs:
LibreOffice Calc (works better than Excel for BIG spreadsheets and has RegEx support)
Notepad++
NoteTab Pro
MarcEdit

Also use:
GIMP
Paint.NET
Etc.

If you can't do a task with the tools at hand, someone has probably posted just the right tool to GitHub or elsewhere on the Web. Don't be afraid to explore and play.

17

Lesson 6: The Best Laid Plans of Mice and Metadata Wranglers Often Go Awry
Source: SCL Photo 33246

Lesson 6

SHC launched with custom JavaScript that presented a corresponding StreetView (or map) for every photo with geocoordinates, which we spent a long time gathering and entering. Very cool for then-and-now displays!
But in the meantime, Google changed their API and displayed locations were suddenly 70 miles to the southeast.

Another example: we planned to update metadata in only one place, SHC, but I couldn't create a workflow to make that actually happen, so we update in both CONTENTdm and our Horizon catalog, and records aren't necessarily in sync.

Architectural plan for remodeling the house of Mrs. G. W. Connors of 742 Orchard Street, Santa Rosa prepared by J. C. Lindsay, ca. 1920 (SCL photo 33246)

18

Lesson 7: Flexibility is a Good Thing
Source: http://wikipedia.org

Lesson 7

Pretty much applies to everything.
Google StreetView was a cool part of SHC, but since I didn't have time to rewrite and test the script to use the new API, I turned it off for the time being.

I made quite a few decisions based on the current state of the platforms. That is perhaps somewhat unavoidable, but my aim is to make metadata frameworks and content as reusable as possible. Metadata is currently based on the original AACR2 MARC records and structured for CONTENTdm, but now we're on to RDA and we're going to be moving to Islandora. Staying current demands flexibility; there is no way around it!

19

Lesson 8: Dancing is More Fun with the Right Partner(s)
Source: SCL photo 22514

Lesson 8

By partners, I really mean everyone involved in our efforts: staff, funders, partner organizations, vendors. Even if we haven't yet achieved everything we originally envisioned, without our partners we wouldn't be here today. I've been the sole person slogging through our projects much of the time, juggling them with my regular cataloging work, but without my staff and colleagues who've helped at various times, we would have accomplished so much less. At least one former staff member is here in the audience (thank you, Justine!), and a talented former colleague who is now with Lyrasis, Mark Cooper, did essential programming for most of the projects.

We just got some very good news: our budget for the next FY was approved this week with money for a new professional cataloging position, and we will also get part of a new library specialist position. I think we'll be dancing up a storm very soon!

Mr. Palmieri and dance partner, Santa Rosa, California, 1961 (SCL photo 22514)

20

International Wine Research Database

Turning to a project developed and managed by Jon Haupt, Branch Manager

21

What is IWRDB?

The International Wine Research Database (IWRDB) strives to be the most comprehensive bibliography of wine literature in the world.
1970s-1999: Clippings file
1999: LSTA grant, Winefiles.org
2013: Another LSTA grant, IWRDB

22

What was Winefiles?
Local History + Index to Wine Archive Periodicals

Staffing: Part-time librarian + volunteers
Wine Librarian -> select articles
Volunteers -> photocopy
Volunteers -> enter data
PT Assistant Librarian -> clean up + authority control

Strictly a SCWL project
One of many small-scale indexing projects underway among wine libraries and collections
Not comprehensive, but still the MOST comprehensive

23

Winefiles.org screenshot

24

Winefiles Data
SQL data:

INSERT INTO `article` (`a_id`,`author`,`title`,`language`,`publication_month`,`publication_day`,`publication_year`,`publisher`,`publication`,`volume`,`number`,`pages`,`item_type`,`fulltext_path`,`url`,`abstract`,`keywords`,`subject_folder`,`history_folder`,`company_folder`,`subject`,`business`,`business_contact`,`business_org`,`region`,`varietal`,`appellation`,`comments`,`username`,`entry_date`,`update_date`,`dbname`) VALUES(720,':Howie, Millie:','Wine Words: The Carneros difference','Eng','October','2','1992','','Healdsburg Tribune','','','','Article: Newspaper','','','The author describes the Carneros Quality Alliance and its efforts to promote the distinctive character of wines from the Carneros at an event held in the Stanford Court Hotel in San Francisco. The alliance was formed in 1985 and by the time of the article included 27 wineries and 60 growers in its membership. The event included panel discussions and a tasting of older vintages from the Carneros district. Details of speakers, topics, and wines tasted are included in the article.','lp wine words: the carneros . . . . [yada yada]

Basically, this data served its purpose OK, but it was not very clearly structured and so was difficult to transfer into another system. On top of that, the limitations of the interface (circa 1999) became more and more troublesome.
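
To illustrate why this flat layout is awkward to move into another system, here is a minimal Python sketch that normalizes a few columns of the sample row above. It is not the migration code actually used for IWRDB. The function names are hypothetical, and two things are assumptions read off the single sample row: that the colons in the author column wrap or separate multiple names, and that the month column always holds an English month name.

```python
from datetime import date

MONTHS = {
    name: number
    for number, name in enumerate(
        ["January", "February", "March", "April", "May", "June", "July",
         "August", "September", "October", "November", "December"],
        start=1,
    )
}


def split_multi(value):
    """Split a colon-wrapped value such as ':Howie, Millie:' into a list
    of non-empty, trimmed parts (assumes ':' is the value delimiter)."""
    return [part.strip() for part in value.split(":") if part.strip()]


def publication_date(month, day, year):
    """Combine the three text columns into an ISO 8601 date string.
    Returns None when a part is missing or cannot be parsed."""
    try:
        return date(int(year), MONTHS[month], int(day)).isoformat()
    except (KeyError, ValueError):
        return None


def normalize(row):
    """Turn one flat `article` row (a dict of strings) into a small
    structured record with a list of authors and a single date."""
    return {
        "authors": split_multi(row["author"]),
        "title": row["title"],
        "publication": row["publication"],
        "date": publication_date(
            row["publication_month"],
            row["publication_day"],
            row["publication_year"],
        ),
        "item_type": row["item_type"],
    }


# Values copied from the sample INSERT statement above.
sample_row = {
    "author": ":Howie, Millie:",
    "title": "Wine Words: The Carneros difference",
    "publication_month": "October",
    "publication_day": "2",
    "publication_year": "1992",
    "publication": "Healdsburg Tribune",
    "item_type": "Article: Newspaper",
}

print(normalize(sample_row))
```

Rows with an empty month or day simply get a date of None here; a real migration would need a policy for partial dates, which a three-column text layout allows but a single date field does not.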

25

What Were Our Hopes and Dreams?
Collaborative indexing
Not enough staff
Est. annual cost of indexing trade publications: $80,000
Our answer: Distributed indexing
  Four institutions providing $20K/year
  Eight institutions providing $10K/year
  Sixteen institutions providing $5K/year
Build a global network of wine libraries, archives, and information centers: IAWL

26

What Were Our Hopes and Dreams?
Data harvesting
Freely available data (citations + abstracts)
Publishers provide streamlined access to free data?
Worst-case scenario: harvest directly from web sites
Hope to provide links directly to content if your institution has a subscription
We see a possible pool of new customers for publishers

27

IWRDB Search Interface screenshot

28

IWRDB Results Interface screenshot

29

IAWL screenshot

30

What Dreams Were Dashed?
Data harvesting
Publishers are mostly unresponsive
They do not see $$ reasons for helping our project
Some trade publications do not have indexing at all!
They do appreciate we are not directly competing

Long-term cultivation approach
Snowballing: Success may breed success
Perhaps publishers will want to participate in IAWL

31

What Dreams Were Dashed?
Collaborative indexing
Supportive institutions, but noncommittal
No bandwagon yet
No clear organization yet to rely on for dissemination of information (IAWL is too new)

We'd hoped the harvesting would serve as a springboard.
Now it looks like we need collaborative indexing first.
We may need to seek out independent consultants.

32

What's Next?
Another new Wine Librarian at SCWL
Tag-teaming to build IAWL and establish a workgroup
Continue to develop other organizations' roles
Continue review of our role in the project and how it aligns (or not) with our organization's Strategic Plan

33

Thank You
and stay on track
Source: WSCHS Item 01-093

Thank you for allowing us to share our challenges and lessons learned in bringing our special collections to the world.

Stan Strout and two conductors in front of Petaluma & Santa Rosa RR car No. 55 at the Forestville Station, circa 1904 (WSCHS Collection)

34

Resource List: SCL Digital Projects
Sonoma Heritage Collections: http://sclib.us/shc
International Wine Research Database: http://iwrdb.org
Community (Sonoma Co. Authors): http://sclib.us/community
Sonoma Co. Local History Index: http://sclib.us/lhi
Guide to the Archives: http://sclib.us/archives

Beyond the Library
Flickr Commons: http://bit.ly/flickr-scl
YouTube Channel: http://bit.ly/yt-scl
Internet Archive: http://bit.ly/ia-scl
Calisphere: http://bit.ly/cali-scl
OAC: http://bit.ly/oac-scl
Twitter: @scldigital

35
