agora: putting museum objects into their art-historic context
Post on 15-Jan-2015
EURECOM July 2012
Agora: putting museum objects into their art-historic context
Marieke van Erp
marieke@cs.vu.nl
Introduction
• BA, MA & PhD Computational Linguistics/Information Extraction @Tilburg University
• Since 2009: SemWeb group @VU University Amsterdam
Overview
• The Agora Project
• Digital Hermeneutics
• Building an Event Thesaurus for Dutch
• Experiments & Results
• Outlook
Image src: http://www.artrage.com.au/dreamgirl/filesend/223/EarthFromAbove_EXPOTVDC212_prog.jpg
The Agora Project
• Collaboration VU CS & History departments, Netherlands Institute for Sound and Vision and Rijksmuseum Amsterdam
• Facilitate and investigate digitally mediated public history
Digitising Heritage
• Galleries, libraries, archives and museums (GLAMs) are digitising their data and presenting it online
• This changes the role of GLAMs from information interpreters to information providers
• In the online setting, objects can easily start to lead their own lives
Image source: http://terracebay.library.on.ca/wp-content/uploads/2011/04/clip_image002.jpg
Digital Hermeneutics
• An object on its own has no meaning; event descriptions provide historical context
• A single event only gives part of the historical context; chains of events (narratives) provide a more complete overview
Image src: http://3.bp.blogspot.com/-7nXcVdW0_wc/Th0JDRIT1GI/AAAAAAAAIEk/IoPReKrojkY/s1600/42st.jpg
[Figure: RDF graph relating a museum object to events. Classes: sem:Event, sem:Actor, sem:Place. Nodes: Painting: Three Fighter Aircraft in the Sky; The Attack on Yogyakarta; Mohammed Toha Paints "Three Fighter Aircraft in the Sky"; Mohammed Toha; Netherlands; Yogyakarta; 19/12/1948. Properties: rdf:type, sem:hasActor, sem:hasPlace, sem:hasBeginTimeStamp, agora:depictsEvent, agora:createsEvent, rma:maker, rma:creationDate, rma:creationPlace.]
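The graph labels above can be read as subject-predicate-object triples. The sketch below is one plausible reading, written as plain Python tuples rather than with an RDF library. Which node each property connects is an assumption, since the figure layout did not survive in this transcript; agora:createsEvent and the rma:creationDate and rma:creationPlace properties are left out because their direction and values cannot be recovered. The helper `objects` is hypothetical.

```python
# Node labels taken from the slide; the edges between them are an assumed reading.
PAINTING = "Painting: Three Fighter Aircraft in the Sky"
ATTACK = "The Attack on Yogyakarta"
CREATION = 'Mohammed Toha Paints "Three Fighter Aircraft in the Sky"'

triples = [
    # The depicted event and its context (assumed attachments).
    (ATTACK, "rdf:type", "sem:Event"),
    (ATTACK, "sem:hasActor", "Netherlands"),
    (ATTACK, "sem:hasPlace", "Yogyakarta"),
    (ATTACK, "sem:hasBeginTimeStamp", "19/12/1948"),
    ("Netherlands", "rdf:type", "sem:Actor"),
    ("Yogyakarta", "rdf:type", "sem:Place"),
    # The museum object points at the event it depicts.
    (PAINTING, "agora:depictsEvent", ATTACK),
    # The creation of the object is itself modelled as an event.
    (CREATION, "rdf:type", "sem:Event"),
    (CREATION, "sem:hasActor", "Mohammed Toha"),
    ("Mohammed Toha", "rdf:type", "sem:Actor"),
    (PAINTING, "rma:maker", "Mohammed Toha"),
]


def objects(subject, predicate):
    """Return all objects of triples matching the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


# Follow the link from the object to its historical context.
for event in objects(PAINTING, "agora:depictsEvent"):
    print(event, "took place in", objects(event, "sem:hasPlace"))
```

Following agora:depictsEvent from the painting and then the sem: properties of the event is the kind of traversal that lets an object be shown together with its event context.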
Event Dimension
[Figure: three event descriptions, each with sem:hasActor, sem:hasPlace, sem:hasTimeStamp and sem:eventType; the label agora:hasBiographicalRelation appears twice.]
• The Attack on Yogyakarta: actor KNIL, place Indonesia, time 1945 - 1946, type Armed Conflict
• Operation Crow: actor KNIL, place Sumatra, time 19/12/1948 - 31/12/1948, type Armed Conflict
• The Attack on Yogyakarta: actor KNIL, place Yogyakarta, time 01/03/1949, type Attack
Narratives
Event-driven Browsing
Building an Event Thesaurus
• There are no extensive structured event descriptions
• Rijksmuseum Amsterdam has a flat list of 1,693 ‘events’: only names and very much focused on 17th century Holland
• Our goal:
• create a list of historically relevant events
• provide actors, locations, times & types for each event
Image src: http://www.collinsdictionary.com/static/graphics/default.png
First Attempt
• Pattern based event-name extraction
• In Dutch Wikipedia we found 2,444 event candidates
• 1209 (56.3%) correct
• 169 (13.9%) partially correct
• Off-the-shelf named entity recognition (P/R/F1)
• Person 77/77/77
• Location 75/58/66
• Organisation 32/37/34
Image src: http://www.spaceg.com/multimedia/collection/explosions/atomic%20explosion%205.jpg
First Attempt
• Co-occurrence based event-relation finder
• only actor, location and/or date found for 392 events
• 49.6% actor is correct
• 41.1% location is correct
• 51.5% date is correct
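A co-occurrence based relation finder of the kind named above can be sketched as follows. This is a minimal illustration, not the system used in the project: it assumes sentence-level co-occurrence, plain substring matching, and that the most frequent entity of each type wins. The function name `cooccurrence_relations` and the toy sentences are invented for the example.

```python
from collections import Counter, defaultdict


def cooccurrence_relations(sentences, events, entities):
    """Pick, per event and entity type, the entity that co-occurs most often.

    sentences: list of strings
    events:    list of event names
    entities:  dict mapping entity name -> type ("actor", "location", ...)
    """
    counts = defaultdict(Counter)
    for sentence in sentences:
        for event in events:
            if event not in sentence:
                continue
            for name, etype in entities.items():
                if name in sentence:
                    counts[event][(etype, name)] += 1

    best = {}
    for event, counter in counts.items():
        per_type = {}
        # most_common() is sorted by descending count, so the first
        # entity seen for each type is the most frequent one.
        for (etype, name), _ in counter.most_common():
            per_type.setdefault(etype, name)
        best[event] = per_type
    return best


# Toy sentences invented for illustration only.
sentences = [
    "Operation Crow started on Sumatra with the KNIL .",
    "The KNIL led Operation Crow on Sumatra .",
    "Operation Crow also reached Yogyakarta .",
]
entities = {"KNIL": "actor", "Sumatra": "location", "Yogyakarta": "location"}
relations = cooccurrence_relations(sentences, ["Operation Crow"], entities)
print(relations)
```

Because the choice depends purely on counts, an event mentioned only once or twice gives the method very little to work with.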
First Attempt
• Problems with event element recognition:
• Shallow grammatical processing (post-war rebuilding and during the North Sea flood recognised as 1 event)
• Missing locations (Battle of LOC pattern fails)
• No distinction between entities and action nouns (German Occupation vs German Occupants look the same to the approach)
• Named Entity Recogniser not suited for domain
First Attempt
• Problems with the event relation finder:
• Relies on redundancy in the data, only works for ‘popular’ events
• Too coarse-grained (who were the actors/locations in WWII?)
• Evaluation is hard!
Back to the drawing board...
• Analysis of event names
• Combinations of sortal nouns with a PP and a named entity, e.g., Battle of Stalingrad, Death of John Lennon
• Combinations of nominalized verbs with a PP and a named entity, e.g., Excavation of Troy, Election of Obama
• Combinations of a referential adjective with an event type and named entity, e.g., the American invasion of Iraq
• Transparent proper names: Great War
• Opaque proper names: event names that cannot be decomposed on morphological grounds, e.g., Holocaust, Spanish Fury
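The first two classes in the analysis above (a sortal noun or nominalized verb, followed by a PP with a named entity) are regular enough to sketch as a pattern. The example below is a hedged illustration using the English glosses from the slide; the project targets Dutch, and the word lists `SORTAL` and `NOMINALIZED` as well as the capitalisation-based stand-in for a named entity are assumptions made for the sketch.

```python
import re

# Hypothetical trigger lists; a real system would need far larger ones.
SORTAL = r"(?:Battle|Death|War|Siege)"
NOMINALIZED = r"(?:Excavation|Election|Invasion)"

# Crude stand-in for a named entity: one or more capitalised words.
NE = r"(?:[A-Z][\w'-]*(?:\s+[A-Z][\w'-]*)*)"

EVENT_NAME = re.compile(rf"\b(?:{SORTAL}|{NOMINALIZED})\s+of\s+{NE}")


def find_event_names(text):
    """Return all 'TRIGGER of NAMED-ENTITY' spans found in the text."""
    return [m.group(0) for m in EVENT_NAME.finditer(text)]


sample = "The Battle of Stalingrad ended long before the Election of Obama."
print(find_event_names(sample))
```

The remaining classes are not covered: referential adjectives need an extra slot before the trigger, and opaque names such as Holocaust have no internal structure for a pattern to match, so they would need a list or a learned model.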
Image src: http://www.northescambia.com/wp-content/uploads/2010/01/molinotrashfire10.jpg
Back to the drawing board...
• Improve Named Entity Recognition
• Add gazetteers for historical names
• Post-processing for titles and improved NE boundaries
Back to the drawing board...
• Finding Event Relations
• Use structure Wikipedia/DBpedia
• Shallow parsing
• Hierarchies of actors & locations
Current Work
Named entity recognition results (P/R/F1)
              Spotlight           Stanford            Freire
Person        54.05/7.52/13.20    58.60/34.46/43.40   79.17/71.16/74.95
Location      64.52/30.77/41.67   67.19/66.15/66.67   80.00/61.54/69.57
Organisation  0/0/0               9.78/25.71/14.17    89.66/74.29/81.25
• Still some work to be done, but Freire et al. (2012) shows that smart features can work with small amounts of training data
• Combine classifiers
• Add post-processing
• MISC class remains to be done...
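The third figure in each P/R/F1 triple above is the harmonic mean of precision and recall. As a small worked check, the sketch below recomputes F1 from the reported precision and recall values; the function name `f1` is chosen for the example, and returning 0 when both inputs are 0 is a convention assumed here to match the 0/0/0 entry.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (both given as percentages)."""
    if precision + recall == 0:
        return 0.0  # assumed convention for the 0/0/0 case
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs for the Person class, as reported in the table.
person = {
    "Spotlight": (54.05, 7.52),
    "Stanford": (58.60, 34.46),
    "Freire": (79.17, 71.16),
}

for system, (p, r) in person.items():
    print(f"{system}: F1 = {f1(p, r):.2f}")
```

The harmonic mean is dominated by the lower of the two values, which is why a precision of 54.05 combined with a recall of 7.52 still yields an F1 close to 13.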
Current Work
Word POS CHUNK NER
U.N. NNP I-NP I-ORG
official NN I-NP O
Ekeus NNP I-NP I-PER
heads VBZ I-VP O
for IN I-PP O
Baghdad NNP I-NP I-LOC
. . O O
[CoNLL2003]

focus,minthree,mintwo,minone,plusone,plustwo,fnfreq,lnfreq,ncfreq,orgfreq,geo,n,v,a,adv,pn,cap,allcaps,beg,end,length,capfreq,class
"is","wood",")","and","painted","dark",0,0,0,2.45253198865684,0,0,0,1,0,0,0,0,0,0,2,0,"O"
"painted",")","and","is","dark","grey",0,0,0,0,0,0,0,0,1,0,0,0,0,0,7,0,"O"
"dark","and","is","painted","grey",".",0,0,0,0.493875418347986,0,0,1,0,1,0,0,0,0,0,4,0,"O"
"grey","is","painted","dark",".","William",0,0,0,0.0768052510316108,0,1,1,1,1,0,0,0,0,0,4,0,"O"
".","painted","dark","grey","William","Herschel",0,0,0,2.36647279037729,0,0,0,0,0,0,0,1,0,0,1,0,"O"
"William","dark","grey",".","Herschel","made",8.2034429051892,3.27892030900003,0,4.67158565874127,0,0,0,0,0,0,1,0,0,0,7,0,"B-PER"
"Herschel","grey",".","William","made","many",2.36726761611533,2.39936346938848,0,0.443930767784,0,1,1,0,0,0,1,0,0,0,8,0,"I-PER"
"made",".","William","Herschel","many","telescopes",0,0,0,0.493875418347986,0,0,0,1,1,0,0,0,0,0,4,0,"O"
"many","William","Herschel","made","telescopes","of",0,0,0,0.0768052510316108,0,0,0,0,1,0,0,0,0,0,4,0,"O"
"telescopes","Herschel","made","many","of","this",0,0,0,0,0,0,0,0,0,0,0,0,0,0,10,0,"O"
[Freire et al. 2012]
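The second sample above describes each token by itself, a window of three preceding and two following tokens, and a set of numeric features. The sketch below rebuilds only the columns whose meaning is evident from the header (focus, minthree, mintwo, minone, plusone, plustwo, cap, length). The frequency, gazetteer and part-of-speech columns depend on external resources and are not reproduced; the padding symbol and the function name `window_features` are assumptions.

```python
def window_features(tokens, i, pad="_"):
    """Build the context-window part of a feature row for tokens[i]."""

    def at(j):
        # Positions outside the token list get a padding symbol (assumed).
        return tokens[j] if 0 <= j < len(tokens) else pad

    token = tokens[i]
    return {
        "focus": token,
        "minthree": at(i - 3),
        "mintwo": at(i - 2),
        "minone": at(i - 1),
        "plusone": at(i + 1),
        "plustwo": at(i + 2),
        "cap": int(token[:1].isupper()),  # first character is upper case
        "length": len(token),
    }


# Token sequence reconstructed from the rows of the sample.
tokens = ("wood ) and is painted dark grey . "
          "William Herschel made many telescopes of this").split()

row = window_features(tokens, tokens.index("William"))
print(row)
```

For the token William this yields the same window, capitalisation flag and length as the B-PER row in the sample.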
Current Work
• Build smarter extractors for event names
• First focus on ‘regular’ event names (e.g., Battle of LOC, War of YEAR)
• Use knowledge about action nouns vs static nouns (WordNet)
The Story So Far
• It takes time to learn to communicate in an interdisciplinary project
• Don’t try to solve too much in one go
• Cycles of error analysis
• Domain adaptation is difficult: optimise for precision
Outlook
• Redesign of Agora demo (new version autumn/winter)
• Include different perspectives (together with Semantics of History)
• Ship model use case
• Historical Named Entity Recognition for English & Dutch
• 2nd round user studies (spring 2013)
Questions?
Image Source: http://www.amichelleblakeley.com/storage/question%20marks.jpg?__SQUARESPACE_CACHEVERSION=1295297003883
marieke@cs.vu.nl
http://www.cs.vu.nl/~marieke
Image src: http://www.rijksmuseum.nl/collectie/SK-A-2963/portret-van-don-ram%C3%B3n-satu%C3%A9-1765-1824