emtacl12 - NTNU
[Figure: domain model. Domains: music domain, broadcasting domain, text domain. Entities: Person, Band, Song, Concert, Radio show, TV show, Book, Newspaper. Relations: is member of, has created, has written, has played, has participated in, is interviewed in, is played in, is mentioned in, is reviewed in.]
[Figure: the black metal repository and other collections: Last FM, DBpedia, MusicBrainz, BBC Music, Deichman]
record level: registrator, registration standard (AACR2 etc.)
schema level: structure, semantics, mapping
repository level (linked data level): cross-collection retrieval
Chan, L. M., & Zeng, M. L. (2006). Metadata interoperability and standardization: A study of methodology part I. D-Lib Magazine, 12(6).
how to make an infrastructure for searching and browsing Norwegian black metal, based on existing library data?
problem I: incomplete metadata collections
problem II: heterogeneous metadata and metadata schemas, locally and externally
problem III: insufficient and inconsistent use of identifiers
solution: linked data (problem I) + mapping to RDF (problem II) + graph matching (problem II/III)
seed collection: the national discography (Nordisko)
retrieve MARC(XML) records (Z39.50/OAI/SRU) matching a list of preselected black metal bands
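The retrieval step could be sketched as follows for the SRU case. This is a minimal sketch, not the project's actual harvesting code: the endpoint address is a placeholder (the slides do not give one), and the `marcxml` schema name and the bare quoted CQL term are assumptions that depend on the server. It only builds the request URLs and does not contact any service.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the slides do not give the actual Nordisko address.
SRU_ENDPOINT = "https://sru.example.org/nordisko"

# Preselected bands, as listed on the slides.
BANDS = ["Burzum", "Darkthrone", "Emperor", "Gorgoroth",
         "Immortal", "Satyricon", "Thorn"]


def sru_url(band, max_records=50):
    """Build an SRU searchRetrieve URL asking for MARCXML records for one band."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        # CQL query: a bare quoted term searches the server's default index.
        "query": '"{}"'.format(band),
        # Assumed schema short name; servers may use a different identifier.
        "recordSchema": "marcxml",
        "maximumRecords": str(max_records),
    }
    return SRU_ENDPOINT + "?" + urlencode(params)


urls = [sru_url(band) for band in BANDS]
for url in urls:
    print(url)
```

Each URL can then be fetched with any HTTP client, and the returned records stored for conversion.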
RDF
make a simple "black metal" ontology based on existing vocabularies and convert the MARC records into RDF triples using XSLT
upload the triples to a Virtuoso triple store
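The conversion itself was done with XSLT. The same idea is sketched here in Python, purely as an illustration: the base URI pattern, the choice of properties and the `slug` helper are assumptions, not the project's ontology. The sketch reads band names from 710 $a and titles from 740 $a and emits subject, predicate, object tuples.

```python
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical URI pattern and property choices, for illustration only.
BASE = "http://blackmetal.no/"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
FOAF_MADE = "http://xmlns.com/foaf/0.1/made"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"

RECORD = """<marc:record xmlns:marc="http://www.loc.gov/MARC21/slim">
  <marc:datafield tag="710"><marc:subfield code="a">Mayhem</marc:subfield></marc:datafield>
  <marc:datafield tag="740"><marc:subfield code="a">Deathcrush</marc:subfield></marc:datafield>
  <marc:datafield tag="740"><marc:subfield code="a">Freezing moon</marc:subfield></marc:datafield>
</marc:record>"""


def slug(text):
    """Turn a name into a simple URI fragment (hypothetical helper)."""
    return text.lower().replace(" ", "-")


def subfield_values(record, tag, code):
    """Return the text of every subfield `code` in datafields with `tag`."""
    path = "marc:datafield[@tag='{}']/marc:subfield[@code='{}']".format(tag, code)
    return [el.text.strip() for el in record.findall(path, MARC_NS) if el.text]


def record_to_triples(xml_text):
    """Naively link every 710 band to every 740 title in the same record."""
    record = ET.fromstring(xml_text)
    triples = []
    titles = subfield_values(record, "740", "a")
    for band in subfield_values(record, "710", "a"):
        band_uri = BASE + slug(band)
        triples.append((band_uri, FOAF_NAME, band))
        for title in titles:
            work_uri = band_uri + "/" + slug(title)
            triples.append((band_uri, FOAF_MADE, work_uri))
            triples.append((work_uri, DC_TITLE, title))
    return triples


triples = record_to_triples(RECORD)
for triple in triples:
    print(triple)
```

Linking every band to every title is only safe for records with a single band. Compilation records with several 710 fields need the $t subfield to pair band and title.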
graph matching
use SPARQL (and PHP) to match clusters of nodes and edges in our seed data against similar clusters in rich target collections providing SPARQL endpoints
use matching data in target collection for cleaning up and enriching metadata in seed collection
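A query for candidate clusters in a target collection might look like the one built below. The project used SPARQL with PHP. This sketch uses Python only to assemble the query text, and it assumes the target models artists and works with foaf:name and foaf:made and stores names as plain literals. Real endpoints may use other properties or language-tagged literals. The function names are hypothetical.

```python
def escape_literal(text):
    """Escape backslashes and double quotes for use inside a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def candidate_query(artist_name, limit=20):
    """Build a SPARQL query returning artists with this name and the works they made."""
    return """PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?artist ?work ?title WHERE {{
  ?artist foaf:name "{name}" .
  ?artist foaf:made ?work .
  ?work foaf:name ?title .
}} LIMIT {limit}""".format(name=escape_literal(artist_name), limit=int(limit))


query = candidate_query("Mayhem")
print(query)
```

Each result row gives one artist candidate together with one of its works, so the rows for a single ?artist form the cluster that is compared against the seed data.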
Burzum
Darkthrone
Emperor
Gorgoroth
Immortal
Satyricon
Thorn
99 MARCXML records were retrieved from Nordisko in response to queries based on preselected black metal bands
-why black metal?
-complex interlinking!
<marc:datafield tag="700">
<marc:subfield code="a">Maniac</marc:subfield>
</marc:datafield>
<marc:datafield tag="700">
<marc:subfield code="a">Blasphemer</marc:subfield>
</marc:datafield>
<marc:datafield tag="700">
<marc:subfield code="a">Hellhammer</marc:subfield>
</marc:datafield>
<marc:datafield tag="700">
<marc:subfield code="a">Necrobutcher</marc:subfield>
</marc:datafield>
<marc:datafield tag="710">
<marc:subfield code="a">Mayhem</marc:subfield>
</marc:datafield>
<marc:datafield tag="740">
<marc:subfield code="a">Carnage</marc:subfield>
</marc:datafield>
<marc:datafield tag="740">
<marc:subfield code="a">Necrolust</marc:subfield>
</marc:datafield>
<marc:datafield tag="740">
<marc:subfield code="a">Deathcrush</marc:subfield>
</marc:datafield>
<marc:datafield tag="740">
<marc:subfield code="a">Ancient skin</marc:subfield>
</marc:datafield>
<marc:datafield tag="740">
<marc:subfield code="a">Freezing moon</marc:subfield>
</marc:datafield>
<marc:datafield tag="740">
<marc:subfield code="a">Fall of seraphs</marc:subfield>
</marc:datafield>
<marc:datafield tag="740">
<marc:subfield code="a">Chainsaw gutsfuck</marc:subfield>
</marc:datafield>
…
<marc:datafield tag="900">
<marc:subfield code="a">Eriksen, Rune</marc:subfield>
<marc:subfield code="z">Blasphemer</marc:subfield>
</marc:datafield>
<marc:datafield tag="900">
<marc:subfield code="a">Stubberud, Jørn</marc:subfield>
<marc:subfield code="z">Necrobutcher</marc:subfield>
</marc:datafield>
<marc:datafield tag="900">
<marc:subfield code="a">Kristiansen, Sven-Erik</marc:subfield>
<marc:subfield code="z">Maniac</marc:subfield>
</marc:datafield>
<marc:datafield tag="900">
<marc:subfield code="a">Blomberg, Jan Axel</marc:subfield>
<marc:subfield code="z">Hellhammer</marc:subfield>
</marc:datafield>
<marc:datafield tag="110">
<marc:subfield code="a">Kvikksølvguttene</marc:subfield>
</marc:datafield>
<marc:datafield tag="245">
<marc:subfield code="a">Krieg</marc:subfield>
<marc:subfield code="h">lydopptak</marc:subfield>
</marc:datafield>
…
<marc:datafield tag="505">
<marc:subfield code="a">Innhold: In den Arsch gefickt / Kvikksølvguttene. Torture/ Kvikksølvguttene, Vomit. Krieg / Kvikksølvguttene, Vomit (Ztalin, elgitar). More murder / Kvikksølvguttene (Ztalin, elgitar). Anger / Kvikksølvguttene. Ghoul / Kvikksølvguttene, Mayhem. Sluts / Kvikksølvguttene (Ztalin, elgitar). Violent death / Kvikksølvguttene. Fisted sisters / Kvikksølvguttene. Naglekamp / Kvikksølvguttene (Ztalin, elgitar)</marc:subfield>
</marc:datafield>
<marc:datafield tag="700">
<marc:subfield code="a">Necro</marc:subfield>
</marc:datafield>
<marc:datafield tag="700">
<marc:subfield code="a">Zathan</marc:subfield>
</marc:datafield>
<marc:datafield tag="700">
<marc:subfield code="a">Ztalin</marc:subfield>
</marc:datafield>
<marc:datafield tag="700">
<marc:subfield code="a">H.M.P.D.K.</marc:subfield>
</marc:datafield>
<marc:datafield tag="700">
<marc:subfield code="a">Andreassen, Ole Petter</marc:subfield>
</marc:datafield>
<marc:datafield tag="710">
<marc:subfield code="a">Vomit</marc:subfield>
<marc:subfield code="t">Krieg</marc:subfield>
</marc:datafield>
<marc:datafield tag="710">
<marc:subfield code="a">Mayhem</marc:subfield>
<marc:subfield code="t">Ghoul</marc:subfield>
</marc:datafield>
<marc:datafield tag="710">
<marc:subfield code="a">Vomit</marc:subfield>
<marc:subfield code="t">Torture</marc:subfield>
</marc:datafield>
<marc:datafield tag="710">
<marc:subfield code="a">Kvikksølvguttene</marc:subfield>
<marc:subfield code="t">Krieg</marc:subfield>
</marc:datafield>
<marc:datafield tag="710">
<marc:subfield code="a">Kvikksølvguttene</marc:subfield>
<marc:subfield code="t">Ghoul</marc:subfield>
</marc:datafield>
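The 900 fields in the records above pair a personal name ($a) with the stage name used in the 700 fields ($z). A minimal sketch of collecting those pairs is given below. That reading of $a and $z is taken from the records shown and is otherwise an assumption. The `alias_map` function is hypothetical, and the fragment is a cut-down copy of the first record wrapped in a record element so that it parses on its own.

```python
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

FRAGMENT = """<marc:record xmlns:marc="http://www.loc.gov/MARC21/slim">
  <marc:datafield tag="700"><marc:subfield code="a">Blasphemer</marc:subfield></marc:datafield>
  <marc:datafield tag="700"><marc:subfield code="a">Hellhammer</marc:subfield></marc:datafield>
  <marc:datafield tag="900">
    <marc:subfield code="a">Eriksen, Rune</marc:subfield>
    <marc:subfield code="z">Blasphemer</marc:subfield>
  </marc:datafield>
  <marc:datafield tag="900">
    <marc:subfield code="a">Blomberg, Jan Axel</marc:subfield>
    <marc:subfield code="z">Hellhammer</marc:subfield>
  </marc:datafield>
</marc:record>"""


def alias_map(xml_text):
    """Map each stage name (900 $z) to the personal names (900 $a) that refer to it."""
    record = ET.fromstring(xml_text)
    aliases = {}
    for field in record.findall("marc:datafield[@tag='900']", MARC_NS):
        name = field.findtext("marc:subfield[@code='a']", namespaces=MARC_NS)
        heading = field.findtext("marc:subfield[@code='z']", namespaces=MARC_NS)
        if name and heading:
            aliases.setdefault(heading.strip(), []).append(name.strip())
    return aliases


aliases = alias_map(FRAGMENT)
print(aliases)
```

The second record has no 900 fields at all, so names such as Ztalin or H.M.P.D.K. cannot be resolved this way from the local data alone.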
challenges
«Dauði Baldrs» «Hermoðr á Helferð» «Bálferð Baldrs» «Í Heimr Heljar» «Illa Tiðandi» «Móti Ragnarokum»
Erickson, Rune Eriksen, Rune Espedal, Kristian Eivind Euronymous Fachtal, Arataus Faust Fenris Fenriz Finstad, Børge Frost Gaahl Garbarek, Anja Goat Goatpervertor Greifi Grishnack Greishnackh, Greifi Grim Grishnackh, Greifi Grutle, Kjetil H.M. Daiomonion H.M.P.D.K. Haraldstad, Kjell Vidar Haraldstad, Kjetil Vidar
ambiguity from resource, registrator, registration standard, metadata structure, ontology or transformation?
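Name variants like those in the list above can be flagged for review with a simple string comparison. The sketch below uses `difflib.SequenceMatcher` from the Python standard library. The 0.8 threshold is an arbitrary assumption, and a high score only marks a pair as a candidate: it does not show that two names denote the same person.

```python
from difflib import SequenceMatcher
from itertools import combinations

# A few entries taken from the name list on the slide.
NAMES = ["Fenris", "Fenriz", "Faust", "Frost",
         "Greishnackh, Greifi", "Grishnackh, Greifi",
         "Haraldstad, Kjell Vidar", "Haraldstad, Kjetil Vidar"]


def name_similarity(a, b):
    """Similarity ratio in [0, 1] between two names, ignoring case."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def suspicious_pairs(names, threshold=0.8):
    """Return all pairs of distinct names whose similarity reaches the threshold."""
    return [(a, b) for a, b in combinations(names, 2)
            if name_similarity(a, b) >= threshold]


pairs = suspicious_pairs(NAMES)
for a, b in pairs:
    print(a, "<->", b)
```

String similarity cannot tell whether a pair stems from a typing error by the registrator, a transliteration choice, or two different people, which is why the graph context is needed.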
comparing graph structures/ontologies
pattern recognition
semantic correspondences
Raimond, Y., Sutton, C., & Sandler, M. (2008). Automatic interlinking of music datasets on the semantic web. Linked Data on the Web (LDOW2008).
[Figure: collection A (black metal repository) with resources r, collection B (MusicBrainz) with resources s]
rx = http://blackmetal.no/mayhem
sy = http://musicbrainz.org/artist/c5f9e699-7b0d-4030-86dd-7acc8250d147
rx owl:sameAs sy ?
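Once a correspondence between the two resources has been accepted, the link can be written down as a single owl:sameAs statement. The sketch below serializes it as one N-Triples line. The function name is hypothetical, and deciding whether the link holds is left to the matching step.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_triple(local_uri, target_uri):
    """Serialize an owl:sameAs link between two URIs as one N-Triples line."""
    for uri in (local_uri, target_uri):
        # Reject characters that would break the <...> URI syntax.
        if any(ch in uri for ch in '<> "'):
            raise ValueError("not a usable URI: {!r}".format(uri))
    return "<{}> <{}> <{}> .".format(local_uri, OWL_SAME_AS, target_uri)


line = same_as_triple(
    "http://blackmetal.no/mayhem",
    "http://musicbrainz.org/artist/c5f9e699-7b0d-4030-86dd-7acc8250d147",
)
print(line)
```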
Problem
G1: node 1, Artist, foaf:name "Mayhem"
G2: node 4, Artist, foaf:name "Mayhem"
G3: node 7, Artist, foaf:name "Mayhem"
G1 = G2 = G3 ?
Graph matching
matching literals
comparing literals directly:
n   n   literal1                       literal2                      similarity
1   4   "Mayhem"                       "Mayhem"                      1
2   5   "Deathcrush"                   "Deathcrush"                  1
3   6   "De Mysteriis Dom Sathanas"    "De Mysteris Dom Sathanas"    0.9
1   7   "Mayhem"                       "Mayhem"                      1
2   8   "Deathcrush"                   "Gentle murder"               0.2
3   9   "De Mysteriis Dom Sathanas"    "Pulling Puppet Strings"      0.1
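The slides do not say which string measure produced the similarity values above. As one possible stand-in, the sketch below uses a normalized Levenshtein (edit) distance. Its scores will not reproduce the slide's numbers exactly, for instance it gives 0.96 rather than 0.9 for the misspelled album title.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]


def literal_similarity(a, b):
    """Similarity in [0, 1]: 1 minus the edit distance over the longer length."""
    a, b = a.casefold(), b.casefold()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


print(literal_similarity("Mayhem", "Mayhem"))
print(literal_similarity("De Mysteriis Dom Sathanas", "De Mysteris Dom Sathanas"))
print(literal_similarity("Deathcrush", "Gentle murder"))
```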
[Figure: the three graphs being compared]
G1: node 1 Artist "Mayhem", node 2 Album "Deathcrush", node 3 Album "De Mysteriis Dom Sathanas" (properties: dc:creator, foaf:name)
G2: node 4 Artist "Mayhem", node 5 Album "Deathcrush", node 6 Album "De Mysteris Dom Sathanas" (properties: foaf:made, foaf:name)
G3: node 7 Artist "Mayhem", node 8 Album "Gentle murder", node 9 Album "Pulling Puppet Strings" (properties: foaf:made, foaf:name)
black metal repository (collection A), MusicBrainz (collection B)
basic similarity measure for graphs:
graphs    matching                            similarity
G1, G2    MG1:G2a = (1, 4), (2, 5), (3, 6)    (1 + 1 + 0.9)/3 ≈ 0.97
G1, G2    MG1:G2b = (1, 4), (2, 6), (3, 5)    (1 + 0.2 + 0.2)/3 ≈ 0.47
G1, G3    MG1:G3a = (1, 7), (2, 8), (3, 9)    (1 + 0.2 + 0.1)/3 ≈ 0.43
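The measure above is the mean of the literal similarities over the node pairs in a mapping. A minimal sketch of that calculation follows, using only the pairwise values given on the slides. The variable and function names are hypothetical, and the prototype's actual implementation is not shown in the slides.

```python
# Pairwise literal similarities as given on the slides, keyed by (node, node).
SIMILARITY = {
    (1, 4): 1.0, (2, 5): 1.0, (3, 6): 0.9,
    (2, 6): 0.2, (3, 5): 0.2,
    (1, 7): 1.0, (2, 8): 0.2, (3, 9): 0.1,
}

# Candidate mappings between G1 and the target graphs G2 and G3.
MAPPINGS = {
    "MG1:G2a": [(1, 4), (2, 5), (3, 6)],
    "MG1:G2b": [(1, 4), (2, 6), (3, 5)],
    "MG1:G3a": [(1, 7), (2, 8), (3, 9)],
}


def mapping_score(mapping, similarity):
    """Mean literal similarity over all node pairs in one mapping."""
    return sum(similarity[pair] for pair in mapping) / len(mapping)


scores = {name: mapping_score(pairs, SIMILARITY)
          for name, pairs in MAPPINGS.items()}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(name, round(score, 2))
print("best mapping:", best)
```

The artist name alone scores 1 against both G2 and G3. Only the surrounding album nodes separate the two candidates.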
proof of concept/prototype: http://bibin.hioa.no/blackmetal/graph/
existing interoperability problems at different levels
linked data + graph matching provides:
-disambiguation both locally and externally
-tool for cleaning up local metadata
-automatic interlinking
-extended local data collection
thank you! on behalf of Kim Tallerås ([email protected]), Nils Pharo, Jørn-Helge Dahl, David Massey