Developments in catalogues and data sharing
DESCRIPTION
A talk given at the Bodleian Libraries' 'From cataloguing to metadata' event in November 2011. Personal opinions on changing trends in library metadata creation and consumption. Also considers the challenges and rewards associated with providing and licensing data for re-use by machines and the people who program them.
TRANSCRIPT
![Page 1: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/1.jpg)
• How our catalogues are evolving
• Opening and sharing the data within them
• Ed Chamberlain
• Systems Development Librarian – Cambridge University Library
![Page 2: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/2.jpg)
Systems Development Librarian at the other place
Data ‘munger’
Data consumer?
![Page 3: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/3.jpg)
Control over data creation
Control over data consumption
Control over data environment
Control over data technology
![Page 4: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/4.jpg)
![Page 5: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/5.jpg)
![Page 6: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/6.jpg)
No longer the single authority for content and data
Commercial, social and academic discovery mechanisms
Explosion of digital content
Illusion of ‘all on the web’
![Page 7: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/7.jpg)
![Page 8: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/8.jpg)
Studies into Google Generation / ‘Generation Y’ 1
Cambridge Arcadia IRIS report 2009 2
Preference for search engine over catalogue
Online over in-building
Trust tutors and peers over Librarian
Still respect the library ‘brand’
1) “The Google generation: the information behaviour of the researcher of the future”, Aslib Proceedings, vol. 60, issue 4, doi:10.1108/00012530810887953
2) Arcadia IRIS Project report - http://arcadiaproject.lib.cam.ac.uk/docs/Report_IRIS_final.pdf
![Page 9: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/9.jpg)
So far …
Evolution of catalogues
Changes in exposure of data
To come?
Greater sharing of data
Library data used in non-library environments
![Page 10: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/10.jpg)
Keyword based discovery services
New ways to exploit old data
Relevancy ranking
Rich faceting
Greater linking
Search is the new browse
Repositories and archives
Is the OPAC dead?
![Page 11: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/11.jpg)
![Page 12: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/12.jpg)
Citations
Abstracts
Table of Contents
![Page 13: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/13.jpg)
![Page 14: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/14.jpg)
Tags
Public lists
Reader reviews
Dramatic growth in access points
Input from true subject specialists
o Lack of structure
o No quality control
o Compromise of sanctity?
![Page 15: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/15.jpg)
Web scale - resource discovery concept taken further
Primo Central
Summon
EBSCO Discovery
WorldCat Local
HathiTrust data can be used for full text searching of print collections
![Page 16: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/16.jpg)
Catalogue data is now:
Consumed as keywords (not left anchored)
Faceted (not browsed)
Supplemented
Transformed
Merged
Amalgamated
![Page 17: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/17.jpg)
![Page 18: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/18.jpg)
![Page 19: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/19.jpg)
Our local catalogues
National / international aggregations
Joe Public
Teenage software developer / hacker
Booksellers
Web start-ups
Search engines
Wikipedia
Other libraries
Research group website
![Page 20: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/20.jpg)
Bibliographic data linked to many aspects of successful teaching and research
Citation lists – measure output
Shared bibliography – core of research group work
Reading lists – backbone of undergraduate teaching
High quality data needed for re-use
Not all possible whilst data resides in the library ‘silo’
![Page 21: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/21.jpg)
“Library catalogues have imposed on them librarian or supplier-made decisions about what can/can’t be searched and in what way. Some of these decisions are limited by current cataloguing rules, but not all; often the data is recorded, but not in a usable way, or is there but isn’t tapped by the interface. For example, in most catalogues you can limit by publication type to newspapers, but you can’t limit by frequency of the issues.”
“Releasing data means that people can start to use it in the way they want to.”
![Page 22: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/22.jpg)
Success of distributed access outside of cultural heritage
Single point of discovery?
Taxpayer generated – give it back!
Why not share?
![Page 23: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/23.jpg)
The past few years have seen a massive release of public data in the government and cultural heritage sectors
Open Government Data - http://data.gov.uk
Open Knowledge Foundation - http://okfn.org
EU Commission mandate to open data
Shared in ways for easy reuse and linking
![Page 24: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/24.jpg)
RLUK and JISC initiative
Galleries, libraries, archives, museums
The Discovery principles propose that:
'Open metadata creates the opportunity for enhancing impact through the release of descriptive data about library, archival and museum resources. It allows such data to be made freely available and innovatively reused to serve researchers, teachers, students, service providers and the wider community in the UK and internationally.'
http://discovery.ac.uk
![Page 25: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/25.jpg)
![Page 26: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/26.jpg)
Why not?
WorldCat has done this for years
Schema.org microdata – some semantic structure
Use case for catalogue data in an advertising environment?
Google has taken 10% (so far)
![Page 27: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/27.jpg)
<h1 itemprop="name">The Cambridge companion to Spenser edited by Andrew Hadfield. [electronic resource] /</h1>
<span style="display: none;" itemprop="publisher">Cambridge University Press,</span> <span style="display: none;" itemprop="datePublished">2001.</span>
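To show what a machine gains from the `itemprop` attributes above, here is a minimal sketch in Python that reads them back out of a page. The class and function names are illustrative and not part of any library service, and the parser only handles simple, properly closed markup like the slide's example.

```python
from html.parser import HTMLParser


class ItempropParser(HTMLParser):
    """Collect the text of every element that carries an itemprop attribute."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        # One entry per open element: its itemprop name, or None.
        # Assumption: every element is explicitly closed (no bare <br>).
        self._stack = []

    def handle_starttag(self, tag, attrs):
        self._stack.append(dict(attrs).get("itemprop"))

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Attribute the text to the innermost open element with an itemprop.
        for name in reversed(self._stack):
            if name:
                self.properties[name] = self.properties.get(name, "") + data
                break


def extract_itemprops(html):
    parser = ItempropParser()
    parser.feed(html)
    parser.close()
    return {name: text.strip() for name, text in parser.properties.items()}


# The markup from the slide, with straight quotes.
SAMPLE = (
    '<h1 itemprop="name">The Cambridge companion to Spenser edited by '
    'Andrew Hadfield. [electronic resource] /</h1>'
    '<span style="display: none;" itemprop="publisher">Cambridge University Press,</span> '
    '<span style="display: none;" itemprop="datePublished">2001.</span>'
)

if __name__ == "__main__":
    for name, value in extract_itemprops(SAMPLE).items():
        print(f"{name}: {value}")
```

The extracted values still carry cataloguing punctuation (the trailing `/`, `,` and `.`), so a consumer has to clean them up before using them.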
![Page 28: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/28.jpg)
Application Programming Interface (API)
Layered over LMS
Expose catalogue data feeds for developers
Anyone can use them
Simple request, simple response
http://www.lib.cam.ac.uk/api
![Page 29: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/29.jpg)
http://www.lib.cam.ac.uk/api/voyager/newtonSearch.cgi?searchArg=darwin&databases=depfacaedb
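The request above is just a base address plus two query parameters, so a developer can assemble it in a few lines. The sketch below rebuilds that URL in Python. The parameter names `searchArg` and `databases` come from the slide, while the function name is illustrative. The slides do not show the response format, and the service may no longer be running, so the sketch builds the URL without fetching it.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Base address taken from the example request on the slide.
BASE = "http://www.lib.cam.ac.uk/api/voyager/newtonSearch.cgi"


def build_search_url(search_arg, databases):
    """Return a search URL in the same shape as the slide's example."""
    query = urlencode({"searchArg": search_arg, "databases": databases})
    return f"{BASE}?{query}"


url = build_search_url("darwin", "depfacaedb")

if __name__ == "__main__":
    print(url)
    # Going the other way: recover the parameters from a URL.
    print(parse_qs(urlsplit(url).query))
```

Using `urlencode` rather than string concatenation means search terms containing spaces or punctuation are escaped correctly.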
![Page 30: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/30.jpg)
![Page 31: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/31.jpg)
COMET project
80% of CUL bib records converted to Resource Description Framework (RDF)
Enriched with direct links to the Library of Congress
Vocab in-line with British Library work
OCLC FAST and VIAF authority sources
http://data.lib.cam.ac.uk
![Page 32: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/32.jpg)
Marc21 …
001 1000346
245 $aEarly medieval history of Kashmir : $b[with special reference to the Loharas] A.D. 1003-1171 /
DC XML …
<dc:identifier>1000346</dc:identifier>
<dc:title>Early medieval history of Kashmir : [with special reference to the Loharas] A.D. 1003-1171</dc:title>
RDF triples …
<http://data.lib.cam.ac.uk/id/entry/cambrdgedb_1000346> <http://purl.org/dc/terms/title> "Early medieval history of Kashmir : [with special reference to the Loharas] A.D. 1003-1171"
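The three forms above hold the same record, and the step from one to the next is largely mechanical. The following Python sketch shows one way it could be done. It is not the COMET project's actual conversion code: the dictionary layout for the MARC record and the function names are assumptions made for illustration, and only the title is handled.

```python
from xml.sax.saxutils import escape

# Hypothetical in-memory form of the MARC record on the slide.
record = {
    "001": "1000346",
    "245": {
        "a": "Early medieval history of Kashmir :",
        "b": "[with special reference to the Loharas] A.D. 1003-1171 /",
    },
}

# URI pattern copied from the slide's RDF example.
ENTRY_BASE = "http://data.lib.cam.ac.uk/id/entry/cambrdgedb_"
DC_TITLE = "http://purl.org/dc/terms/title"


def title_of(rec):
    """Join 245 $a and $b and drop the trailing ISBD ' /' punctuation."""
    parts = [rec["245"].get(code, "").strip() for code in ("a", "b")]
    return " ".join(p for p in parts if p).rstrip(" /")


def to_dc_xml(rec):
    """Return the Dublin Core XML fragment for the identifier and title."""
    return (
        f"<dc:identifier>{escape(rec['001'])}</dc:identifier>"
        f"<dc:title>{escape(title_of(rec))}</dc:title>"
    )


def to_triple(rec):
    """Return one N-Triples statement giving the record's title."""
    literal = title_of(rec).replace("\\", "\\\\").replace('"', '\\"')
    return f'<{ENTRY_BASE}{rec["001"]}> <{DC_TITLE}> "{literal}" .'


if __name__ == "__main__":
    print(to_dc_xml(record))
    print(to_triple(record))
```

The work is in `title_of`: punctuation that MARC stores as part of the data has to be stripped before the value is useful elsewhere.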
![Page 33: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/33.jpg)
<http://data.lib.cam.ac.uk/id/entry/cambrdgedb_1000346> <http://purl.org/dc/terms/title> "Early medieval history of Kashmir : [with special reference to the Loharas] A.D. 1003-1171" .
<http://data.lib.cam.ac.uk/id/entry/cambrdgedb_1000346> <http://purl.org/dc/terms/type> <http://data.lib.cam.ac.uk/id/type/1cb251ec0d568de6a929b520c4aed8d1> .
<http://data.lib.cam.ac.uk/id/entry/cambrdgedb_1000346> <http://purl.org/dc/terms/type> <http://data.lib.cam.ac.uk/id/type/46657eb180382684090fda2b5670335d> .
<http://data.lib.cam.ac.uk/id/entry/cambrdgedb_1000346> <http://purl.org/dc/terms/identifier> "UkCU1000346" .
<http://data.lib.cam.ac.uk/id/entry/cambrdgedb_1000346> <http://purl.org/dc/terms/issued> "1981" .
<http://data.lib.cam.ac.uk/id/entry/cambrdgedb_1000346> <http://purl.org/dc/terms/creator> <http://data.lib.cam.ac.uk/id/entity/cambrdgedb_a5a6f7a184ff02e08b1befedc1b3a4d0> .
<http://data.lib.cam.ac.uk/id/entry/cambrdgedb_1000346> <http://purl.org/dc/terms/language> <http://id.loc.gov/vocabulary/iso639-2/eng> .
<http://data.lib.cam.ac.uk/id/entry/cambrdgedb_1000346> <http://RDVocab.info/Elements/placeOfPublication> <http://id.loc.gov/vocabulary/countries/ii> .
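Each statement has the same subject, predicate, object shape, so a developer can read the record with very little code. The Python sketch below uses a small regular expression to do so. It assumes one statement per line and copes only with plain URIs and simple quoted literals, so a real application would more likely use an RDF library. The function name is illustrative.

```python
import re
from collections import defaultdict

# Matches: <subject> <predicate> <object-uri>|"literal" .
# Simplified: no language tags, datatypes or blank nodes.
TRIPLE = re.compile(
    r'^<([^>]+)>\s+<([^>]+)>\s+(<[^>]+>|"(?:[^"\\]|\\.)*")\s*\.\s*$'
)

# Four of the statements from the slide, one per line.
NTRIPLES = """\
<http://data.lib.cam.ac.uk/id/entry/cambrdgedb_1000346> <http://purl.org/dc/terms/title> "Early medieval history of Kashmir : [with special reference to the Loharas] A.D. 1003-1171" .
<http://data.lib.cam.ac.uk/id/entry/cambrdgedb_1000346> <http://purl.org/dc/terms/identifier> "UkCU1000346" .
<http://data.lib.cam.ac.uk/id/entry/cambrdgedb_1000346> <http://purl.org/dc/terms/issued> "1981" .
<http://data.lib.cam.ac.uk/id/entry/cambrdgedb_1000346> <http://purl.org/dc/terms/language> <http://id.loc.gov/vocabulary/iso639-2/eng> .
"""


def group_by_predicate(text):
    """Return {predicate URI: [object values]} for every line that parses."""
    grouped = defaultdict(list)
    for line in text.splitlines():
        match = TRIPLE.match(line.strip())
        if not match:
            continue  # skip blank or unsupported lines
        _subject, predicate, obj = match.groups()
        # Strip the surrounding <> or "" from the object.
        grouped[predicate].append(obj[1:-1])
    return dict(grouped)


if __name__ == "__main__":
    for predicate, values in group_by_predicate(NTRIPLES).items():
        print(predicate, "->", values)
```

Because the language is a URI rather than a text code, following it leads to the Library of Congress description of that language, which is the linking the slides describe.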
![Page 34: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/34.jpg)
![Page 35: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/35.jpg)
The Linking Open Data cloud diagram - http://richard.cyganiak.de/2007/10/lod
![Page 36: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/36.jpg)
Wikipedia
Archives Hub
British Library BNB
British Museum
Library of Congress
LOD at Bibliothèque nationale de France
BBC Nature
University of Southampton
Open University
![Page 37: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/37.jpg)
More data out there for cataloguers to reuse
More access points in records
Better mechanisms for record enrichment
Scope for revised cataloguing workflows
Records have a permanent identity on the web
![Page 38: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/38.jpg)
Initial attempts with RDF
Newer lightweight formats and databases
Focus on citation metadata for the sciences
New ways for scientists to share and work with bibliography
http://openbiblio.net/
http://openbiblio.net/principles/
![Page 39: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/39.jpg)
If developers are now consumers of our data …
![Page 40: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/40.jpg)
Most Cambridge data could be released under a permissive license (PDDL)
Europeana Digital Library approves Creative Commons ‘Zero’ licensing of data
British Library BNB – Creative Commons ‘Zero’
OCLC looking at attribution only licensing
Move away from ‘non-commercial’ wording
Open Data Commons Public Domain Dedication and License (PDDL)
![Page 41: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/41.jpg)
No one wants OCLC to go under (partners on COMET)
Valued partners
Focus on sharing ‘non-marc21’ formats of greater use to the non-Librarian
Vendors aim to profit from services based on data rather than data for its own sake?
![Page 42: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/42.jpg)
Based on a 40-year-old format
Based on a need to print a human-readable card
Syntax, vocabulary, field names and content all intertwined
According to OCLC Research:
Only 10% of all Marc tags in WorldCat appear in 100% of all WorldCat records
65% of tags appear in less than 1% of records.
![Page 43: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/43.jpg)
AACR2 / MARC21 uses punctuation to denote content (100$d)
Mixed fields (text and numbers) (020$a)
Duplication of author name format
One hundred notes fields (or close enough)?
df100$aBradford, Gamaliel$d1863 - 1932.
<authorParsed><surname>Bradford</surname><restOfName> Gamaliel</restOfName><birthDate>1863</birthDate><birthDateNormalised>18630101</birthDateNormalised><deathDate>1932</deathDate><deathDateNormalised>19320101</deathDateNormalised></authorParsed>
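Getting from the punctuated 100 field to the parsed form above means taking the punctuation apart again. The Python sketch below illustrates that for the slide's example. The function name is hypothetical, and the rules (split the name on the first comma, treat four-digit numbers as years, normalise to 1 January) are assumptions inferred from this single example. It would not cope with dates such as "ca. 1500" or "fl. 1800". It also trims the leading space that the slide's `restOfName` retains.

```python
import re
from xml.etree.ElementTree import Element, SubElement, tostring


def parse_author(name, dates):
    """Build an authorParsed element from 100 $a and $d values.

    Assumes 'Surname, Rest of name' in $a and up to two four-digit
    years in $d (birth, then death).
    """
    surname, _, rest = name.partition(",")
    years = re.findall(r"\d{4}", dates)

    root = Element("authorParsed")
    SubElement(root, "surname").text = surname.strip()
    SubElement(root, "restOfName").text = rest.strip()
    for label, year in zip(("birthDate", "deathDate"), years):
        SubElement(root, label).text = year
        # Same convention as the slide: year followed by 0101.
        SubElement(root, label + "Normalised").text = year + "0101"
    return tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(parse_author("Bradford, Gamaliel", "1863 - 1932."))
```

Every consumer of the raw field has to repeat this guesswork, which is the slide's point about using punctuation to denote content.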
![Page 44: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/44.jpg)
Marc21 is binary encoded
Web-friendly standards are now the norm (XML/JSON)
Numbers for field names?
Bad character encoding allowed
![Page 45: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/45.jpg)
LOC Bibliographic Framework Transition declares a shift away from Marc21
Is the delay in introduction of RDA until we get a ‘better container’ ?
No system vendor is going forward with Marc21
Will take 10+ years
What is to come next?
![Page 46: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/46.jpg)
Steering for RDA and Marc replacement needs non-librarian input or ownership
Offer from NISO to take the work on
Karen Coyle criticises the Marc21 Bibliographic Framework Transition Initiative for not including museums, publishing, and IT professionals …
She argues that our data is not just for us to consume alone …
“The next data carrier for libraries needs to be developed as a truly open effort. It should be led by a neutral organization (possibly ad hoc) that can bring together the wide range of interested parties and make sure that all voices are heard. Technical development should be done by computer professionals with expertise in metadata design. The resulting system should be rigorous yet flexible enough to allow growth and specialization.”
http://kcoyle.blogspot.com/2011/08/bibliographic-framework-transition.html
![Page 47: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/47.jpg)
It becomes (even) easier to go to Amazon
Our status as authoritative data providers will be (further) eroded
No-one will want to play with us if we cannot learn to share
![Page 48: Developments in catalogues and data sharing](https://reader033.vdocuments.us/reader033/viewer/2022061118/5469d01daf7959ff128b52c9/html5/thumbnails/48.jpg)
http://www.discovery.ac.uk - Discovery
NGC4LIB mailing list
http://okfn.org - Open Knowledge Foundation
http://data.lib.cam.ac.uk