Raiders of the Lost Archive
INF 385R: Survey of Digitization
May 3, 2012
Franny Gaede
The Warehouse
The Secret Tapes
• Jun. 1967 - Dec. 1967
• 157 pp. of transcripts
• Service Set
  • Copies
  • Sometimes copies of copies of copies
• Correlate to 15 audio recordings
The Archive Raider
• Clamp
  • Model: Manfrotto 035RL Super Clamp with 2908 Standard Stud
• Articulated Arm
  • Model: Manfrotto 196B-2 143BKT 2-Section Single Articulated Arm with Camera Bracket
• Camera
  • Model: Canon PowerShot SX130IS
The Photographs
• Dimensions: 3k x 4k
• Camera Modes
  • Program (more options)
  • Automatic white balance ON
  • Macro
  • 2-second self-timer
  • No flash
  • Left zoom alone*
• Exported as TIFFs
  • ~36 MB each
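The ~36 MB figure is consistent with an uncompressed 8-bit RGB image at these dimensions. The sketch below is only a back-of-the-envelope check: it assumes "3k x 4k" means roughly 3000 x 4000 pixels at 3 bytes per pixel, and the function name is hypothetical.

```python
# Rough size of the pixel data in an uncompressed 8-bit RGB image.
# Assumption: "3k x 4k" means about 3000 x 4000 pixels.
WIDTH_PX = 3000
HEIGHT_PX = 4000
BYTES_PER_PIXEL = 3  # 8 bits each for red, green, and blue


def uncompressed_size_mb(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Return the pixel-data size in megabytes (1 MB = 1,000,000 bytes)."""
    return width * height * bytes_per_pixel / 1_000_000


size_mb = uncompressed_size_mb(WIDTH_PX, HEIGHT_PX)
print(f"{size_mb:.0f} MB per image")  # prints "36 MB per image"
```

Headers and metadata add a little on top of this, so real files may come out slightly larger.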
The Findings
• Set-up and image transfer were time-intensive
• Extremely quick image capture
• A remote for the camera would be ideal
  • Canon Hack Development Kit (CHDK)
• Excellent quality
  • Produced accurate OCR
The Test
Dragon vs. ABBYY
The Findings
• ABBYY
  • 1 hour training a custom user pattern
  • ~40 minutes to correct the OCR in ABBYY and the corresponding text file
• Dragon
  • About 5 minutes for initial training, plus 5 more minutes to teach it additional words
  • ~7 minutes per page; formatting is time-intensive
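To put the Dragon figures in perspective, the sketch below scales them to a larger job. It is only an estimate: it assumes the ~7 minutes per page holds steady and that all 157 transcript pages would be re-spoken, which the slides do not state. The function name is hypothetical.

```python
# Timing figures taken from the slides; the page count is an assumption.
DRAGON_TRAINING_MIN = 5 + 5  # initial training plus teaching additional words
DRAGON_MIN_PER_PAGE = 7      # approximate, formatting included
PAGES = 157                  # assumption: the whole transcript set is re-spoken


def dragon_total_minutes(pages, training=DRAGON_TRAINING_MIN,
                         per_page=DRAGON_MIN_PER_PAGE):
    """Estimate total minutes: one-time training plus per-page work."""
    return training + pages * per_page


total = dragon_total_minutes(PAGES)
print(f"{total} minutes, about {total / 60:.1f} hours")
```

The one-time training cost becomes negligible at this scale; the per-page time dominates the estimate.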
The Findings
• Conclusions
  • Dragon works best with minimal formatting
  • Re-speaking from audio was unsuccessful; transcription is preferred
  • Re-speaking summaries was very efficient
  • With some time spent on pattern training, ABBYY was superior at replicating more complicated formatting
  • Punctuation was a major issue, especially the full stop, despite extensive training
Image credits
• The Warehouse - http://commons.wikimedia.org/wiki/File:LBJ_Library_and_Museum_interior.png
• Archive Raider, Secret Tapes - Personal photos
• The Findings, Part I - http://cartoonwallpapersdownload.blogspot.com/2011/03/lego-indiana-jones-poster.html
• The Test - http://www.nuance.com/naturallyspeaking/resources/mediakit/DNS10_MediaKit_Images.html, http://www.abbyy.co.il/?langId=2
http://bit.ly/JyRXAU