Taking Your Genealogy Collection to the World
Loren Fantin
Walter Lewis
@OurDigitalWorld.org
OurDigitalWorld.org
Not-for-profit organization that evolved out of Knowledge Ontario
we come from library and archive backgrounds
What we do
We partner with organizations in managing, preserving and sharing our digital assets.
We make local community history globally accessible.
We advocate for digital cultural heritage as a public good: sharing, access, attribution, and reuse
How we do that
Discoverable, online access to over 4 million digital assets and community stories: videos, scrapbooks, images, newsletters, etc.
Offer the VITA toolkit for creating and managing multimedia collections, indexing and newspaper projects, including hosting, individual and regional sites
The largest Ontario newspaper archives (one of the largest free newspaper archives in the world)
Easy search and access to over 33,000 government documents from the Legislative Library collection
Trusted resource for digitization project planning
VITA toolkit
V(ideo)
I(mage)
T(ext)
A(udio)
and a bunch of other things that didn’t make a good acronym
Layers of Discovery
search.OurOntario.ca
news.OurOntario.ca
Regional sites
the four Public Libraries of Halton Region
Local
ink.ourdigitalworld.org
All from a single point of data entry
vitatoolkit.ca
Copyright
Life of author (likely the editor) + 50 years
Note: Although not based in law, some organizations have applied a 90-year rule to newspapers, so that newspapers more than 90 years old are considered to be in the public domain.
1926
Indexing doesn’t require copyright permission
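The 90-year rule of thumb mentioned above is simple date arithmetic. A minimal sketch, assuming a hypothetical helper named `passes_90_year_rule` and that only the publication year is known; it encodes the informal practice described on the slide, not a legal test.

```python
from datetime import date
from typing import Optional


def passes_90_year_rule(publication_year: int, today: Optional[date] = None) -> bool:
    """Return True if an issue is more than 90 years old.

    This mirrors the informal 90-year practice some organizations apply
    to newspapers. It is a rule of thumb, not a statement of law.
    """
    current_year = (today or date.today()).year
    return current_year - publication_year > 90


# Checked against a fixed reference date so the result is reproducible.
print(passes_90_year_rule(1857, date(2016, 1, 1)))  # True
print(passes_90_year_rule(1984, date(2016, 1, 1)))  # False
```

Passing an explicit reference date keeps the check repeatable; leaving it out uses the current year.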
What to digitize?
Indexes?
Clippings?
Microfilm?
Originals?
Card Indexes
Electronic Indexing
Source files …
Authex, dBase (III to IV), FoxPro, MS Access, SQL Server, MySQL, PostgreSQL, InMagic
MS Excel, MS Word, Dynix (Pick)
XML, HTML, CSV
the more structure the better, but transforming is cheaper than re-keying
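As an illustration of transforming rather than re-keying, here is a minimal sketch that reads a CSV export of an index and normalizes it into structured records. The column names (`surname`, `given`, `event`, `paper`, `date`, `page`), the day-month-year date format, and the sample rows are all invented for the example; a real export will have its own layout.

```python
import csv
import io
from datetime import datetime

# Invented sample rows standing in for a CSV export of a card index.
SAMPLE = """surname,given,event,paper,date,page
Doe,Jane,birth,Acton Free Press,18 Jan 1984,21
Roe,Richard,death,Whitby Chronicle,3 Sep 1857,3
"""


def transform(csv_text):
    """Turn raw index rows into dictionaries with ISO dates and integer pages."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumes English month abbreviations such as "Jan" and "Sep".
        issue_date = datetime.strptime(row["date"].strip(), "%d %b %Y").date()
        records.append({
            "name": f'{row["surname"].strip()}, {row["given"].strip()}',
            "event": row["event"].strip().lower(),
            "newspaper": row["paper"].strip(),
            "issue_date": issue_date.isoformat(),
            "page": int(row["page"]),
        })
    return records


for record in transform(SAMPLE):
    print(record)
```

Once dates and page numbers are typed values rather than free text, the same records can be loaded into whatever indexing tool is in use.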
Why bother?
Your electronic indexes are better than any generation of OCR (the exceptions are born-digital resources)
we can clean up (or at least identify) marginal records
dates in 2061 or 984 or missing
mastheads from “Goergetown”
linked open data (titles, place of publication, etc.) and other semantic web elements
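The kind of cleanup described above can be sketched as follows: flag records whose dates are missing or fall outside a plausible range, and suggest corrections for misspelled mastheads by fuzzy matching against a list of known places. The year range, the place list, and the field names are assumptions for the example.

```python
import difflib

KNOWN_PLACES = ["Georgetown", "Acton", "Whitby", "Kingston"]  # assumed authority list
PLAUSIBLE_YEARS = range(1800, 2025)  # assumed bounds for this collection


def review(record):
    """Return a list of human-readable problems found in one index record."""
    problems = []
    year = record.get("year")
    if year is None:
        problems.append("missing date")
    elif year not in PLAUSIBLE_YEARS:
        problems.append(f"implausible year {year}")
    place = record.get("place", "")
    if place not in KNOWN_PLACES:
        # Suggest the closest known spelling, if one is similar enough.
        guess = difflib.get_close_matches(place, KNOWN_PLACES, n=1, cutoff=0.8)
        if guess:
            problems.append(f"place '{place}' may be '{guess[0]}'")
        else:
            problems.append(f"unrecognized place '{place}'")
    return problems


sample = [
    {"year": 2061, "place": "Georgetown"},
    {"year": 984, "place": "Acton"},
    {"year": None, "place": "Whitby"},
    {"year": 1961, "place": "Goergetown"},
]
for rec in sample:
    print(rec, review(rec))
```

The function only reports problems; deciding whether 2061 should have been 1961 is left to a person who can check the source.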
Clippings
from the Vertical File…
Microfilm conversion
doesn’t require the originals
does require a good print (no scratches / good contrast)
machine processing is faster and cheaper
used vs unused
Microfilm mishaps
Originals
best source
risk of damage during digitization
more time; more expensive
generally, better OCR
Damaged originals
Whitby Chronicle, 3 Sep 1857, p. 3
Cheap Newsprint
Multiple Sources …
indexing or issues
black and white or colour
born digital or digitized
… browse or search and transitioning between them
OCR
HI 3 do LiOSl do Connor do 2 do Watson do 3 do do do
Cousin 1 do do do 8 Johnson Ho do do 1 do
Kavs do I do 2 do do
York 3 do 3 do Pearson do do
1 do I do
Richards do 1 do Wall do 1 do Ross do 1 do Rooney I do 1 do Larkin do 2 do Poirson 3 do A do
Ferguson 2 do 2 do Bailey 5 do 3 Jos Fcrg do do Smart 1 do 1 do Lost
Western Herald (Sandwich, CW), 22 July 1842, p. 2 quoting Courier (Montreal, CE)
<text backgroundColor="10921638"><par>
<line baseline="232" l="1182" t="209" r="1432" b="231"><formatting lang="EnglishUnitedStates" ff="Times New Roman"
fs="11." spacing="-23"><charParams l="1182" t="209" r="1201" b="230"
wordStart="true" wordFromDictionary="true" wordNormal="true" wordNumeric="false" wordIdentifier="false" wordPenalty="0" meanStrokeWidth="33" charConfidence="66" serifProbability="255">T</charParams>
<charParams l="1201" t="210" r="1220" b="230" wordStart="false" wordFromDictionary="true" wordNormal="true" wordNumeric="false" wordIdentifier="false" wordPenalty="0" meanStrokeWidth="33" charConfidence="85" serifProbability="100">H</charParams>
<charParams l="1220" t="209" r="1238" b="230" wordStart="false" wordFromDictionary="true" wordNormal="true" wordNumeric="false" wordIdentifier="false" wordPenalty="0" meanStrokeWidth="33" charConfidence="100" serifProbability="100">E</charParams>
<charParams l="1238" t="209" r="1243" b="231" suspicious="true"> </charParams>... </formatting>
</line></par>
</text>
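Character-level output like the sample above carries a `charConfidence` value and a `suspicious` flag for each character, which can be used to spot weak spots in the OCR. A minimal sketch using Python's standard `xml.etree.ElementTree`; the embedded XML is a trimmed copy of the sample with most attributes removed, and the threshold of 70 is an arbitrary assumption, not a recommended setting.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the sample above; only the attributes used here are kept.
SAMPLE = """<text><par><line>
<formatting lang="EnglishUnitedStates">
<charParams wordStart="true" charConfidence="66">T</charParams>
<charParams wordStart="false" charConfidence="85">H</charParams>
<charParams wordStart="false" charConfidence="100">E</charParams>
<charParams suspicious="true"> </charParams>
</formatting>
</line></par></text>"""


def low_confidence_chars(xml_text, threshold=70):
    """Return (character, confidence) pairs that fall below the threshold
    or that the OCR engine itself marked as suspicious."""
    flagged = []
    for cp in ET.fromstring(xml_text).iter("charParams"):
        confidence = cp.get("charConfidence")
        suspicious = cp.get("suspicious") == "true"
        if suspicious or (confidence is not None and int(confidence) < threshold):
            flagged.append((cp.text, None if confidence is None else int(confidence)))
    return flagged


print(low_confidence_chars(SAMPLE))
```

Characters without a `charConfidence` attribute are reported with `None` so they are not mistaken for a score of zero.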
<word x1="92" y1="1505" r="100" b="1516" fs="4." ff="ArialNormal">LEWIS<ends x2="156" y2="1513"/></word><word x1="158" y1="1506" r="171" b="1517" fs="4." ff="ArialNormal">Waller<ends x2="206" y2="1517"/></word><word x1="219" y1="1508" r="226" b="1517" fs="4." ff="ArialNormal">and<ends x2="245" y2="1517"/></word><word x1="311" y1="1509" r="318" b="1518" fs="4." ff="ArialNormal">are<ends x2="335" y2="1518"/></word><word x1="347" y1="1509" r="355" b="1521" fs="4." ff="ArialNormal">pleased<ends x2="402" y2="1520"/></word><word x1="91" y1="1522" r="98" b="1531" fs="4." ff="ArialNormal">announce<ends x2="158" y2="1532"/></word><word x1="168" y1="1521" r="174" b="1532" fs="4." ff="ArialNormal">the<ends x2="191" y2="1532"/></word><word x1="200" y1="1522" r="210" b="1533" fs="4." ff="ArialNormal">birth<ends x2="237" y2="1532"/></word><word x1="247" y1="1523" r="255" b="1534" fs="4." ff="ArialNormal">of<ends x2="260" y2="1533"/></word><word x1="313" y1="1523" r="322" b="1533" fs="4." ff="ArialNormal">daughter<ends x2="378" y2="1534"/></word><word x1="395" y1="1522" r="407" b="1534" fs="4." ff="ArialNormal">Erin<ends x2="428" y2="1534"/></word><word x1="92" y1="1536" r="101" b="1547" fs="4." ff="ArialNormal">Elizabeth<ends x2="158" y2="1547"/></word><word x1="171" y1="1539" r="178" b="1548" fs="4." ff="ArialNormal">on<ends x2="187" y2="1547"/></word><word x1="196" y1="1537" r="206" b="1548" fs="4." ff="ArialNormal">December<ends x2="268" y2="1548"/></word><word x1="302" y1="1538" r="306" b="1548" fs="4." ff="ArialNormal">1983<ends x2="329" y2="1550"/></word>
Acton Free Press (Acton, ON), 18 Jan 1984, p. 21
OCR: modern film
Searching
Neilson … in Halton
Robison … in Kingston
Making connections…
Good hunting!
Walter Lewis
search.ourontario.ca
news.ourontario.ca
ink.ourdigitalworld.org
ourdigitalworld.org + vitatoolkit.ca