Just for the record
Bibliographic Data – where we were, where we are, where we’re going
Huw Jones
libraries@cambridge
“Data about Data”
our metadata
Is it Newton?
NO
Is it Voyager?
NO
Databases!
• UL and Dependents
• Departments and Faculties A-E
• Departments and Faculties F-M
• Departments and Faculties O-Z
• Colleges A-N
• Colleges O-Z
• Affiliated Institutions
• Manuscripts
Hooke
Newton
Access Reports
Web Interfaces
Voyager
Where we were
8 databases
University Library: 4 M
Other libraries: 2.5 M
Data problems
Quality
Duplication
Quality – fullness
Of 2.5 M records in our databases, 1 M are short records
Quality – coding
Duplication
Effects
• Difficulty in resource discovery
• Patchy retrieval
• Lack of authority control
• Difficulty with standard deduplication
• Burden on staff time
• Ties us to multiple database model
Where we are now
• Record sharing
• Short record enrichment
• Automated MARC correction
• Authority control
Record sharing
• Departments and Faculties A-E and O-Z moved to a record sharing model
• Drawing up of guidelines for Cataloguing
• Automated tools to change the ownership of 825,000 records
• Legacy duplication of records
Duplicates lists
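The slides do not say how the duplicates lists were produced. As a minimal sketch of one common approach, records can be grouped by a match key, and any key shared by more than one record becomes a candidate duplicate list. Everything here is an assumption for illustration: the dictionary record layout, the `match_key` and `duplicate_lists` names, and the choice of ISBN (falling back to normalised title plus year) as the key.

```python
import re
from collections import defaultdict

def match_key(record):
    """Build a comparison key: ISBN if present, else normalised title + year.

    The key design is hypothetical; a real deduplication run would need
    rules agreed by cataloguers.
    """
    isbn = re.sub(r"[^0-9Xx]", "", record.get("isbn", ""))
    if isbn:
        return ("isbn", isbn.upper())
    title = re.sub(r"[^a-z0-9 ]", "", record.get("title", "").lower())
    return ("title", " ".join(title.split()), record.get("year", ""))

def duplicate_lists(records):
    """Return lists of bib ids that share a match key (candidate duplicates)."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record["bib_id"])
    return [ids for ids in groups.values() if len(ids) > 1]

# Made-up sample records; the bib ids are not real.
sample = [
    {"bib_id": 1, "isbn": "978-0-9617511-1-1", "title": "How to build a habitable planet"},
    {"bib_id": 2, "isbn": "9780961751111", "title": "How to build a habitable planet /"},
    {"bib_id": 3, "title": "Principia", "year": "1687"},
]
print(duplicate_lists(sample))  # [[1, 2]]
```

Candidate lists like these would still need checking by staff before any records are merged, since a shared key does not guarantee a true duplicate.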
Short record enrichment
Results
• Of 1M short records
• 200,000 records processed
• 106,175 records updated
• At this rate, could enrich about half of our short records (roughly 500,000)?
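The estimate of about half follows from the figures above. A small worked calculation, assuming the remaining short records behave like the 200,000 already processed (an assumption the slides do not confirm):

```python
processed = 200_000        # short records processed so far
updated = 106_175          # of those, records successfully enriched
short_total = 1_000_000    # all short records

rate = updated / processed                         # share of processed records updated
projected = updated * short_total // processed     # integer arithmetic avoids rounding noise

print(f"{rate:.1%}")   # 53.1%
print(projected)       # 530875
```

So the observed rate of roughly 53% would give a little over 500,000 enriched records if it held across the whole set.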
Automated MARC correction
• Corrects MARC coding errors where it can do so without ambiguity
• In testing, 70,000 records processed in 2 days
• Over 200,000 errors corrected
Automated MARC Correction
How to get from this …
• =LDR 00472nam\\2200157\a\4500
• =001 662002
• =005 20071205064734.0
• =008 071129s1985\\\\nyua\\\\\\\\\\001\0\eng\d
• =020 \\$a9780961751111
• =100 1\$aBroecker, W.S.,$d1931-
• =245 10$aHow to build a habitable planet ;$cBy Wallace S. Broecker.
• =260 \\$aNew York ;$bEldigio Press,$cc1985
• =300 \\$a291p $bill $c23cm
• =504 \\$aIncludes index.
• =650 \0$aAstronomy.
• =650 \0$aAstrophysics.
to this!
• =LDR 00453nam 2200157 a 4500
• =001 662002
• =005 20071205064734.0
• =008 071129s1985\\\\nyua\\\\\\\\\\001\0\eng\d
• =020 \\$a9780961751111
• =100 1\$aBroecker, W. S.,$d1931-
• =245 10$aHow to build a habitable planet /$cby Wallace S. Broecker.
• =260 \\$aNew York :$bEldigio Press,$cc1985.
• =300 \\$a291 p. :$bill. ;$c23 cm.
• =504 \\$aIncludes index.
• =650 \0$aAstronomy.
• =650 \0$aAstrophysics.
Output
• Bib id: 662002
• How to build a habitable planet ; By Wallace S. Broecker.
• 100: UPDATE: Spaces inserted between initials in subfield _a
• 245: UPDATE: By uncapitalised at start of subfield c
• 245: UPDATE: Space forward slash inserted before subfield _c
• 260: UPDATE: Full stop inserted at end of field
• 260: UPDATE: Space colon inserted before subfield _b
• 300: UPDATE: Full stop inserted after the p in pagination
• 300: UPDATE: Full stop inserted at end of field
• 300: UPDATE: Illustration abbreviation has been corrected
• 300: UPDATE: Space colon inserted before subfield _b
• 300: UPDATE: Space inserted between digits and cm
• 300: UPDATE: Space inserted between digits and p in pagination
• 300: UPDATE: Space semi-colon inserted before subfield c
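The corrections listed above are all rule-based punctuation fixes, so they can be illustrated with a few regular expressions. This is a minimal sketch, not the program described in the slides: the names `PRECEDING`, `FINAL_STOP`, `fix_value` and `correct_field` are hypothetical, only the four fields from the example are handled, and the input is assumed to be the field body that follows the tag and indicators.

```python
import re

# ISBD punctuation placed before a subfield, keyed by (tag, subfield code).
PRECEDING = {("245", "c"): " /", ("260", "b"): " :",
             ("300", "b"): " :", ("300", "c"): " ;"}
FINAL_STOP = {"260", "300"}  # fields that should end with a full stop

def fix_value(tag, code, value):
    """Apply the unambiguous fixes for a single subfield value."""
    if tag == "100" and code == "a":
        # "W.S." -> "W. S.": space between adjacent initials
        return re.sub(r"([A-Z]\.)(?=[A-Z]\.)", r"\1 ", value)
    if tag == "245" and code == "c" and value.startswith("By "):
        return "by " + value[3:]
    if tag == "300":
        core = value.strip()
        if code == "a":   # "291p" -> "291 p."
            return re.sub(r"^(\d+)\s*p\.?$", r"\1 p.", core)
        if code == "b" and core == "ill":
            return "ill."
        if code == "c":   # "23cm" -> "23 cm" (final stop added later)
            return re.sub(r"^(\d+)\s*cm\.?$", r"\1 cm", core)
    return value

def correct_field(tag, body):
    """Rebuild a field body with corrected values and punctuation."""
    out = ""
    for code, value in re.findall(r"\$(\w)([^$]*)", body):
        prefix = PRECEDING.get((tag, code))
        if prefix and out:
            # Replace whatever punctuation precedes this subfield.
            out = out.rstrip(" :;/") + prefix
        out += "$" + code + fix_value(tag, code, value)
    if tag in FINAL_STOP and not out.endswith("."):
        out += "."
    return out

print(correct_field("300", "$a291p $bill $c23cm"))
# $a291 p. :$bill. ;$c23 cm.
```

Fields with no rule pass through unchanged, which mirrors the stated principle of correcting errors only where there is no ambiguity. A fuller version would also record each change, as in the output log shown above.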
Authority Control
• No authority control in libraries@cambridge databases
• Script written to identify unauthorised headings
• Used program to correct headings
Results
• DepFacOZ – 2,243 name and subject headings changed, affecting 41,944 records
• DepFacAE – 620 subject headings corrected, affecting 6,841 records
• Authority check incorporated into Bib Check program
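The authority work above has two steps: identify headings that are not in the authority file, then correct those with a known authorised form. A minimal sketch of that idea follows. The tiny `AUTHORISED` set, the `VARIANTS` mapping and the function names are all made up for illustration; a real authority file is far larger and the actual script is not shown in the slides.

```python
# Hypothetical authority data: authorised forms and known variant forms.
AUTHORISED = {"Astronomy", "Astrophysics"}
VARIANTS = {"Astro-physics": "Astrophysics"}  # variant -> authorised form

def normalise(heading):
    """Comparison key that ignores case and extra whitespace."""
    return " ".join(heading.lower().split())

def check_headings(records):
    """Correct known variants and report headings that match nothing.

    `records` maps a bib id to a list of heading strings (an assumed layout).
    Returns (corrected_records, unresolved) where unresolved is a list of
    (bib_id, heading) pairs needing a cataloguer's attention.
    """
    authorised = {normalise(h): h for h in AUTHORISED}
    variants = {normalise(v): a for v, a in VARIANTS.items()}
    corrected, unresolved = {}, []
    for bib_id, headings in records.items():
        fixed = []
        for heading in headings:
            key = normalise(heading)
            if key in authorised:
                fixed.append(authorised[key])
            elif key in variants:
                fixed.append(variants[key])
            else:
                fixed.append(heading)              # left untouched
                unresolved.append((bib_id, heading))
        corrected[bib_id] = fixed
    return corrected, unresolved

corrected, unresolved = check_headings(
    {"662002": ["astronomy", "Astro-physics", "Planetology"]}
)
print(corrected)   # {'662002': ['Astronomy', 'Astrophysics', 'Planetology']}
print(unresolved)  # [('662002', 'Planetology')]
```

Headings that match neither list are reported rather than changed, leaving the decision to a cataloguer.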
Where we are
Fewer of these:
More of these:
Fewer records
Better records
Where are we going?
• One fully deduplicated database of full, well coded records?
• Catalogue will always be a work in progress
• Improvements to Catalogue important not only to solve current problems but also to support future developments
• Data exists independently of Voyager
• Future developments will rely on quality of data to work effectively
– Pushing data out, e.g. to discovery layers (Primo, AquaBrowser) and platforms (WorldCat, Talis Platform)
– Linking to data from outside, e.g. RSS feeds, reading lists
– FRBR
• Mixture of automated solutions and traditional cataloguing
• The Catalogue and the records that make it up are useful tools for the discovery, location and use of our resources
• We will be ‘Cataloguing’ for a long time to come!