FINDING ISAAC LEESER Improving Access to Text Collections with TEI Markup Nicole Arbuckle, Backstage Library Works & David McKnight, University of Pennsylvania

Post on 17-Dec-2015

• Context
• Scope and Content
• Work Plan
• Collaboration

Part One

Context

Isaac Leeser (1806 – 1868)

• Scholar Bertram Korn described Isaac Leeser thus: “Practically every form of Jewish activity which supports Jewish life today was either established or envisaged by this one man.”

• Born in Westphalia; emigrated to Richmond, Virginia, in 1824; moved to Philadelphia in 1829, where he was hired to serve as cantor and reader of the prayer service of Congregation Mikveh Israel.

• Over the next forty years, Leeser emerged as a major Jewish religious figure; in addition, he was an author, translator, journal editor, correspondent and founder of the Jewish Publication Society.

Context

• 2006 – Plans to produce a full-text repository of the correspondence, as well as the contents of The Occident (periodical) and the other publications and translations of Isaac Leeser.

• 2007 – Judaica Curator Dr. Arthur Kiron identifies Leeser collectors who are actively collecting Leeser correspondence, particularly stolen Leeser letters that appear at auction or for sale.

• 2007 – 2010: Project team established to scan, transcribe and edit the Leeser correspondence, and to scan and process the entire contents of The Occident (1843 – 1868).

• Spin-off: Penn Libraries participates in the Lyrasis/Internet Archive scanning project, and Leeser printed works are scanned and mounted on the Internet Archive.

Context

• 2010: American Genizah Project formalized and funded by private donors. (The concept is borrowed from the Genizah manuscript fragments dating to the 11th century, applied here to the traces of dispersed Leeser correspondence and the notion of identifying and aggregating his letters in a digital repository.)

• 2010: First discussions with Backstage Library Works regarding possible collaboration.

Scope and Content

• Genres: manuscripts, periodicals, printed books and pamphlets
• 662 letters
• Twenty-five periodical volumes
• Digitized Leeser content mounted in three web locations

Ironically, project materials are available at three sites:

1. The Occident: Historic Jewish Periodical Project (Israel)
2. Pamphlets: Internet Archive
3. Correspondence and other Leeser materials: available through UPenn

Goal: to integrate all formats in a single web site.

Scope and Content

Work Plan

• Develop editorial guidelines
• Develop TEI spec (P5 Lite)
• Identify searchable entities: personal names, corporate names, geographical names and dates
• Scan items; process and QA images
• Transcribe correspondence
• Partner with Tel Aviv University, where The Occident will be processed and indexed using Olive Full-Text Software
• Partner with Backstage Library Works to develop TEI Header and encode correspondence
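To make the "searchable entities" step concrete, the following is a minimal sketch of how tagged names, places and dates could be pulled out of one encoded letter. It assumes the letters use the standard TEI P5 elements persName, orgName, placeName and date in the TEI namespace; the project's actual tag set is not given in these slides. The sample letter, its contents, and the function name extract_entities are invented for illustration.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample letter: names, place and date are illustrative only,
# not taken from the Leeser correspondence.
SAMPLE_LETTER = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample letter (illustrative)</title></titleStmt>
      <publicationStmt><p>Unpublished example</p></publicationStmt>
      <sourceDesc><p>Invented for this sketch</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <opener>
        <dateline><placeName>Philadelphia</placeName>,
          <date when="1850-01-15">January 15, 1850</date></dateline>
      </opener>
      <p><persName>Isaac Leeser</persName> writes on behalf of
         <orgName>Congregation Mikveh Israel</orgName>.</p>
    </body>
  </text>
</TEI>"""


def extract_entities(tei_xml):
    """Return the tagged entities found in the <text> of one TEI letter."""
    root = ET.fromstring(tei_xml)
    text = root.find("tei:text", TEI_NS)
    entities = {}
    for tag in ("persName", "orgName", "placeName", "date"):
        values = []
        for el in text.iterfind(f".//tei:{tag}", TEI_NS):
            # Collapse line breaks and runs of spaces inside the element.
            content = " ".join("".join(el.itertext()).split())
            # For dates, prefer the normalized @when value when present.
            values.append(el.get("when", content) if tag == "date" else content)
        entities[tag] = values
    return entities


if __name__ == "__main__":
    for kind, values in extract_entities(SAMPLE_LETTER).items():
        print(kind, values)
```

Reading dates from the when attribute rather than the transcribed text gives a sortable value regardless of how the writer spelled out the date, which is one reason to normalize at encoding time.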

Interactive Map

Map of United States

Collaboration

• Penn possessed the resources to scan and transcribe the correspondence and to build the web site

• Penn possessed funding either to develop an in-house TEI tagging and programming team or to outsource

• We opted for the latter. Why?
  • Opportunity to work with a vendor interested in developing a new service (from which Penn might benefit over time)
  • Timeliness
  • While the vendor could not host the content, Penn worked with Backstage to develop the TEI Header and tag documents

• 2011: Penn project manager hired; contract signed by Backstage and Penn; work begins to tag correspondence

• 2012: Programmer hired to transform TEI documents into a searchable website, including an interactive map

• January 2013: Site goes live

Collaboration

Part Two

Backstage Interest:

• We had minimal TEI experience, so we saw the project as an opportunity to improve our skills

• Redelivered XML files from pilot

• Proceeded to encode remainder

• QA ran concurrently with encoding

• Delivered 100-200 XML files per week until all letters were complete (a three-month period)
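The slides do not say what the QA pass checked. As one plausible illustration, a basic automated check on each delivered file might confirm that it is well-formed XML and has the expected TEI skeleton. The sketch below is an assumption along those lines, not Backstage's actual procedure; the function name qa_problems and the specific checks are hypothetical.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"


def qa_problems(xml_text):
    """Return a list of problems found in one delivered file (empty if none).

    Assumed checks only: well-formedness, a <TEI> root in the TEI
    namespace, and the presence of <teiHeader> and <text> children.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]

    problems = []
    if root.tag != TEI + "TEI":
        problems.append("root element is not <TEI> in the TEI namespace")
    if root.find(TEI + "teiHeader") is None:
        problems.append("missing <teiHeader>")
    if root.find(TEI + "text") is None:
        problems.append("missing <text>")
    return problems


if __name__ == "__main__":
    # Hypothetical batch: one acceptable file and one with an unclosed tag.
    batch = {
        "letter_001.xml": '<TEI xmlns="http://www.tei-c.org/ns/1.0">'
                          "<teiHeader/><text/></TEI>",
        "letter_002.xml": '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text>',
    }
    for name, content in batch.items():
        print(name, qa_problems(content) or "OK")
```

A check like this is cheap enough to run on every file in a weekly batch, which fits a workflow where QA runs alongside encoding rather than after it.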

Quality Assurance and Delivery