![Page 1: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/1.jpg)
Metadata is back!
Bernhard Haslhofer - Cornell University
JCDL 2011 - Semantic Web Technologies for Libraries and Readers Workshop
Ottawa, Canada
Thursday, June 16th 2011
![Page 2: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/2.jpg)
![Page 3: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/3.jpg)
![Page 4: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/4.jpg)
<img src="catcher-in-the-rye-book-cover.jpg" />
The Catcher in the Rye - Mass Market Paperback
by <a href="/author/jd_salinger.html">J.D. Salinger</a>
Price: $6.99
In Stock
Product details
224 pages
Publisher: Little, Brown, and Company - May 1, 1991
Language: English
ISBN-10: 0316769487
schema.org Book Example
![Page 5: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/5.jpg)
<div itemscope itemtype="http://schema.org/Book">
  <img itemprop="image" src="catcher-in-the-rye-book-cover.jpg"/>
  <span itemprop="name">The Catcher in the Rye</span> -
  <link itemprop="bookFormat" href="http://schema.org/Paperback">Mass Market Paperback
  by <a itemprop="author" href="/author/jd_salinger.html">J.D. Salinger</a>
  <div itemprop="offers" itemscope itemtype="http://schema.org/Offer">
    Price: <span itemprop="price">$6.99</span>
    <meta itemprop="priceCurrency" content="USD" />
    <link itemprop="availability" href="http://schema.org/InStock">In Stock
    <link itemprop="url" href="http://en.wikipedia.org/wiki/The_Catcher_in_the_Rye">
  </div>
...
</div>
schema.org Book Example
![Page 6: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/6.jpg)
The story so far...
![Page 7: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/7.jpg)
Library Catalogue
(c) Bill Steele/Cornell Chronicle
(c) Vienna University Library
(c) Vienna University Library
Identifier
Metadata
Controlled Vocabulary
![Page 8: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/8.jpg)
OPAC
Identifier
Metadata
Controlled Vocabulary
![Page 9: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/9.jpg)
WWW / Wikipedia / Search Engines
Metadata?
Identifier?
Controlled Vocabulary?
![Page 10: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/10.jpg)
getMetadata(Web): void
![Page 11: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/11.jpg)
Semantic Web - Early Vision
“The Semantic Web will bring structure to the meaningful content of Web pages, creating an environment where software agents roaming from page to page can readily carry out sophisticated tasks for users”
“For the semantic web to function, computers must have access to structured collections of information and sets of inference rules that they can use to conduct automated reasoning.”
“Mom needs to see a specialist and then has to have a series of physical therapy sessions. Biweekly or something. I'm going to have my agent set up the appointments.”
~2000 2011
![Page 12: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/12.jpg)
Semantic Web Technologies
URI Unicode
XML
Data Model: RDF
RDF-S
Rules: RIF
Ontology: OWL
Query: SPARQL
Unifying Logic
Proof
Crypto
Trust
User Interface & Applications
![Page 13: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/13.jpg)
RDFa & Microformats
• Mechanisms to embed structured metadata in Web pages
• Define and/or reuse (X)HTML attributes to augment information in Websites with machine-readable semantics
![Page 14: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/14.jpg)
RDFa Example
<div xmlns="http://www.w3.org/1999/xhtml"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
     xmlns:v="http://www.w3.org/2006/vcard/ns#">
  <div about="http://example.com/me/behas" typeof="v:VCard">
    <span property="v:fn">Bernhard Haslhofer</span>
    <span property="v:nickname">behas</span>
    <div rel="v:adr">
      <div typeof="v:Address v:Work">
        <span property="v:street-address">301 College Avenue</span>
        <span property="v:locality">Ithaca</span>,
        <span property="v:postal-code">14850</span>,
        <span property="v:country-name">United States</span>.
      </div>
    </div>
    <a rel="v:email" href="mailto:[email protected]">[email protected]</a>.
  </div>
</div>
![Page 15: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/15.jpg)
Microformats Example
<div class="vcard">
  <span class="fn">Bernhard Haslhofer</span>
  <div class="adr">
    <div class="street-address">301 College Avenue</div>
    <span class="locality">Ithaca</span>
    <span class="postal-code">14850</span>
    <span class="country-name">United States</span>
  </div>
  <a class="email" href="mailto:[email protected]">[email protected]</a>
</div>
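To illustrate how a consumer might read such class-based markup, here is a minimal Python sketch using only the standard library. The `HCardParser` class and the `HCARD_CLASSES` subset are hypothetical names introduced for illustration; the sketch assumes well-formed markup without void elements and covers only the name and address fields from the example, not the full hCard vocabulary.

```python
from html.parser import HTMLParser

# Assumption: only the class names used for name and address on the slide.
HCARD_CLASSES = {"fn", "street-address", "locality", "postal-code", "country-name"}


class HCardParser(HTMLParser):
    """Collects the text of elements whose class attribute names an hCard property."""

    def __init__(self):
        super().__init__()
        self.card = {}
        self._open = []  # one entry per open element: matched class name or None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        prop = next((c for c in classes if c in HCARD_CLASSES), None)
        self._open.append(prop)

    def handle_endtag(self, tag):
        # Assumes every start tag has a matching end tag (no void elements).
        if self._open:
            self._open.pop()

    def handle_data(self, data):
        prop = self._open[-1] if self._open else None
        text = data.strip()
        if prop and text:
            self.card[prop] = self.card.get(prop, "") + text


SAMPLE = """
<div class="vcard">
  <span class="fn">Bernhard Haslhofer</span>
  <div class="adr">
    <div class="street-address">301 College Avenue</div>
    <span class="locality">Ithaca</span>
    <span class="postal-code">14850</span>
    <span class="country-name">United States</span>
  </div>
</div>
"""

hcard_parser = HCardParser()
hcard_parser.feed(SAMPLE)
card = hcard_parser.card
print(card)
```

The point of the sketch is that Microformats reuse the existing `class` attribute, so a consumer has to know the fixed set of class names in advance.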
![Page 16: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/16.jpg)
• There is lots of information on the Web
• ... valuable information that can be (re-)used
• Problem
  • information is usually expressed in the form of HTML documents
  • the underlying raw data are locked in closed data silos (mostly DBMS)
Linked Data
![Page 17: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/17.jpg)
Why Linked Data?
• The Web is successful because it provides
  • Uniform encoding (HTML)
  • Uniform addressing (URI)
  • Uniform transportation (HTTP)
  for the exchange of documents.
• Why not apply the same mechanism to the underlying data?
![Page 18: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/18.jpg)
What is Linked Data?
• A pragmatic method to build a Web of Data
• Architectural style based on Semantic Web (SW) standards
• Intelligent agents are not the primary focus
Web
![Page 19: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/19.jpg)
Publishing Data
• Distinguish between non-information and information resources
• Sample non-information resource
  • http://dbpedia.org/resource/The_Catcher_in_the_Rye
• Sample information resources
  • http://dbpedia.org/page/The_Catcher_in_the_Rye - HTML
  • http://dbpedia.org/data/The_Catcher_in_the_Rye - RDF
![Page 20: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/20.jpg)
Retrieving Linked Data
![Page 21: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/21.jpg)
Microdata (HTML5)
• A very young HTML5 proposal that extends Microformats and addresses their shortcomings
• Items are created within an itemscope
• Every item is assigned an arbitrary number of properties (itemprop)
• Uses global identifiers for typing and naming items
![Page 22: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/22.jpg)
Microdata Example
<div itemscope itemtype="http://data-vocabulary.org/Person">
  <span itemprop="name">Bernhard Haslhofer</span>,
  <span itemprop="nickname">behas</span>.
  <div itemprop="address" itemscope itemtype="http://data-vocabulary.org/Address">
    <span itemprop="street-address">301 College Avenue</span>
    <span itemprop="locality">Ithaca</span>
    <span itemprop="country-name">United States</span>
  </div>
</div>
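As a rough illustration of how `itemscope`, `itemtype`, and `itemprop` combine into nested items, the following Python sketch extracts them with the standard library's `html.parser`. The `MicrodataParser` class, `VOID_TAGS`, and `VALUE_ATTRS` are hypothetical names for this sketch; it is a simplification that ignores parts of the Microdata proposal such as `itemref` and `itemid`.

```python
from html.parser import HTMLParser

# Elements that never have an end tag in HTML.
VOID_TAGS = {"meta", "link", "img", "br", "hr", "input"}
# Elements whose property value comes from an attribute rather than text.
VALUE_ATTRS = {"meta": "content", "link": "href", "a": "href", "img": "src"}


class MicrodataParser(HTMLParser):
    """Collects microdata items as nested dicts (simplified, not spec-complete)."""

    def __init__(self):
        super().__init__()
        self.items = []   # top-level items
        self._stack = []  # one frame per open, non-void element

    def _current_item(self):
        # The nearest enclosing element that carries an itemscope.
        for frame in reversed(self._stack):
            if frame["item"] is not None:
                return frame["item"]
        return None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        owner = self._current_item()
        frame = {"tag": tag, "item": None, "prop": None, "owner": owner, "text": []}
        prop = attrs.get("itemprop")
        if "itemscope" in attrs:
            item = {"type": attrs.get("itemtype"), "properties": {}}
            frame["item"] = item
            if prop and owner is not None:
                owner["properties"].setdefault(prop, []).append(item)
            else:
                self.items.append(item)
        elif prop and owner is not None:
            if tag in VALUE_ATTRS:
                value = attrs.get(VALUE_ATTRS[tag])
                owner["properties"].setdefault(prop, []).append(value)
            else:
                frame["prop"] = prop  # value is the element's text content
        if tag not in VOID_TAGS:
            self._stack.append(frame)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or tag not in [f["tag"] for f in self._stack]:
            return
        while self._stack:
            frame = self._stack.pop()
            if frame["prop"]:
                value = "".join(frame["text"]).strip()
                frame["owner"]["properties"].setdefault(frame["prop"], []).append(value)
            if frame["tag"] == tag:
                break

    def handle_data(self, data):
        for frame in self._stack:
            if frame["prop"]:
                frame["text"].append(data)


EXAMPLE = """
<div itemscope itemtype="http://data-vocabulary.org/Person">
  <span itemprop="name">Bernhard Haslhofer</span>,
  <span itemprop="nickname">behas</span>.
  <div itemprop="address" itemscope itemtype="http://data-vocabulary.org/Address">
    <span itemprop="street-address">301 College Avenue</span>
    <span itemprop="locality">Ithaca</span>
    <span itemprop="country-name">United States</span>
  </div>
</div>
"""

parser = MicrodataParser()
parser.feed(EXAMPLE)
person = parser.items[0]
address = person["properties"]["address"][0]
print(person["type"], person["properties"]["name"], address["properties"]["locality"])
```

Each property maps to a list because an item may repeat a property; the nested address is itself an item with its own global type identifier.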
![Page 23: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/23.jpg)
Google Rich Snippets / SEO
![Page 24: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/24.jpg)
![Page 25: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/25.jpg)
![Page 26: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/26.jpg)
schema.org
![Page 27: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/27.jpg)
[Figure: technical / conceptual complexity over time, ~2000 to 2011. Shown: Microformats, Microdata, RDFa, and the Semantic Web stack (URI, Unicode, XML, Data Model: RDF, RDF-S, Rules: RIF, Ontology: OWL, Query: SPARQL, Unifying Logic, Proof, Crypto, Trust, User Interface & Applications)]
![Page 28: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/28.jpg)
Where are we now?
![Page 29: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/29.jpg)
![Page 30: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/30.jpg)
![Page 31: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/31.jpg)
![Page 32: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/32.jpg)
(c) http://wiki.bib.uni-mannheim.de/dc-provenance/lib/exe/detail.php?id=europeana_example&media=europeana-ore.png
![Page 33: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/33.jpg)
What next?
![Page 34: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/34.jpg)
Deal with schema.org
• Ignore it?
• Adopt it?
• Align existing library models with schema.org?
• Schema.org provides an extension mechanism for
  • properties
  • classes
![Page 35: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/35.jpg)
![Page 36: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/36.jpg)
Data Quality / Resource Sync
• The Web is not static
• Resources and their representations might change or disappear over time
• Make sure that applications can
  • synchronize resources and learn about changes
  • go back in time
![Page 37: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/37.jpg)
Use Web Data in Apps
• Aggregate Web resources into special collections
• DBpedia provides resource descriptions translated into 90+ languages!!!
• Use URIs instead of labels for tagging
• Combine and mash up data
• Analyze data ...
![Page 38: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/38.jpg)
Summary
![Page 39: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/39.jpg)
Metadata is back
• Metadata was introduced in the 19th century to deal with information overload
• Cataloguing rules and workflows evolved over time
• The Web seemed to work pretty well without metadata (information retrieval, natural language processing)
• Now we have strong indicators that structured metadata on the Web will play an important role in the future
• Shouldn’t libraries / librarians be part of that?
![Page 40: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/40.jpg)
References
• Coyle, K.: Library Data in a Modern Context. In: Understanding the Semantic Web: Bibliographic Data and Metadata. Library Technology Reports. January 2010
• http://blog.mediaspaces.info/ (Linked Data in Libraries State-of-the-Art)
![Page 41: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/41.jpg)
BACKUP
![Page 42: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/42.jpg)
Metadata Building Blocks
Schema Definition Language: class, property, relationship
Metadata Schema: Title, Author, Genre
Metadata:
Title: The Catcher in the Rye
Author: Salinger, J.D.
Genre: Fiction
(Digital / Non-Digital) Information Object
![Page 43: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/43.jpg)
Google Rich Snippet Types
• Reviews
• People
• Products
• Businesses and organizations
• Recipes
• Events
![Page 44: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/44.jpg)
![Page 45: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/45.jpg)
http://tripletalk.wordpress.com/2011/01/25/rdfa-deployment-across-the-web/
![Page 46: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/46.jpg)
![Page 47: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/47.jpg)
cp.: http://evan.prodromou.name/RDFa_vs_microformats
| Microformats | RDFa |
| --- | --- |
| flat namespace | XML namespaces |
| support HTML4, XHTML 1.1, and HTML 5 | support for XHTML 1.1 |
| use latent HTML attributes | introduces new metadata attributes |
| vocabulary defined by one organization/community | open to any RDF-based vocabulary |
![Page 48: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/48.jpg)
Publishing Data
GET http://dbpedia.org/resource/The_Catcher_in_the_Rye
Accept: application/rdf+xml

303 See Other
Location: http://dbpedia.org/data/The_Catcher_in_the_Rye

GET http://dbpedia.org/data/The_Catcher_in_the_Rye
Accept: application/rdf+xml

200 OK
...
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF ...
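The redirect step in this exchange can be sketched as a small Python function. This is a simplified model of the server-side decision, not DBpedia's actual implementation: the `negotiate` function is a hypothetical name, it only distinguishes `application/rdf+xml` from everything else, and it ignores quality values in the `Accept` header.

```python
from typing import Tuple

RESOURCE_PREFIX = "http://dbpedia.org/resource/"
DATA_PREFIX = "http://dbpedia.org/data/"   # RDF representation
PAGE_PREFIX = "http://dbpedia.org/page/"   # HTML representation


def negotiate(uri: str, accept: str) -> Tuple[int, str]:
    """Return (status code, target URI) for a GET on `uri`.

    A non-information resource cannot be sent over HTTP itself, so the
    request is answered with 303 See Other pointing at an information
    resource that describes it.
    """
    if not uri.startswith(RESOURCE_PREFIX):
        return 200, uri  # already an information resource
    name = uri[len(RESOURCE_PREFIX):]
    # Simplification: strip parameters such as ";q=0.9" and ignore ordering.
    media_types = [part.split(";")[0].strip() for part in accept.split(",")]
    if "application/rdf+xml" in media_types:
        return 303, DATA_PREFIX + name
    return 303, PAGE_PREFIX + name


status, location = negotiate(
    "http://dbpedia.org/resource/The_Catcher_in_the_Rye", "application/rdf+xml"
)
print(status, location)
```

A client asking for `text/html` would be sent to the `/page/` URI instead, which is how one identifier serves both people and applications.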
![Page 49: Metadata is back!](https://reader034.vdocuments.us/reader034/viewer/2022052619/555069b1b4c90524138b465d/html5/thumbnails/49.jpg)