Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al.
DESCRIPTION
Agnes Thomas, Francesco Mambrini & Matteo Romanello (DAI, Berlin), 'Insights in the World of Thucydides: The Hellespont Project as a research environment for Digital History'. Digital Classicist London & Institute of Classical Studies seminar 2013, Friday August 9th.
The Hellespont Project (German Archaeological Institute and Tufts University) aims to integrate two of the largest online collections for the study of Antiquity, the Perseus Digital Library and the Arachne archaeological database, in a dynamic digital research environment. Historians will have access to heterogeneous materials and resources, such as ancient texts, archaeological evidence, historical background, and modern scholarly literature, while the documents related to each historical event identified in the textual evidence will be interconnected through the CIDOC-CRM model. As a case study, Hellespont focuses on a limited historical period: the 50 years in the history of Athens between the end of the Persian Wars (479 BCE) and the outbreak of the Peloponnesian War (431 BCE). It follows the narrative of the most important written source, chapters 1.89-118 of the Histories of Thucydides, who was a contemporary of some of the events.
One of the points of departure for the project is the annotation of Thucydides' text with multiple layers of linguistic information. Our goal is to create a "digital sourcebook" that includes a large amount of machine-actionable information, where historians can find references to sources as well as tools that support linguistic analysis of the original texts. Documents are bridged using the event-based CIDOC-CRM. We are working with two different concepts of events. In the CIDOC ontology, events encompass all changes of state in cultural systems; they are identified by reference to historical scholarship. In Ancient History, where event reconstruction is mostly based on the interpretation of written sources, this definition is insufficient. We are therefore implementing a data-driven approach, based on the semantic and syntactic strategies through which language expresses change in the external world. We aim to identify such strategies through a fine-grained semantic annotation of the ancient written texts.
We are going to present the digitally analysed text of Thucydides, together with different kinds of additional information, in a single Virtual Research Environment (VRE). The interface, which is still being implemented, is based on the same idea as GapVis: a visual interface for reading texts that provides the user with multiple views on the same passage. In the presentation we will show the most important parts of the different views the user will access in the interface.
TRANSCRIPT
Introduction | The GapVis Interface | Event annotation | Secondary Literature
Insights in the World of Thucydides
The Hellespont Project as a research environment for Digital History
A. Thomas (a, b), F. Mambrini (b), M. Romanello (b, c)
(a) Universität zu Köln
(b) Deutsches Archäologisches Institut, Berlin
(c) King’s College, London
August 9, 2013
Thomas, Mambrini, Romanello The Hellespont Project
Outline
1 Introduction
2 The GapVis Interface
3 Event annotation
  Manual event annotation
  Linguistic annotation
4 Secondary Literature
The Hellespont Project: Integrating Arachne and Perseus
October 2010 - September 2013
http://arachne.uni-koeln.de/drupal/?q=de/node/231
Cooperating Institutions and Persons
German Archaeological Institute Berlin: Ortwin Dally, Reinhard Förtsch, Francesco Mambrini, Matteo Romanello, Wolfgang Schmidle
The Perseus Project: Bridget Almas, Alison Babeu, Lisa Cerrato, Gregory Crane
Cologne Digital Archaeology Laboratory: Carina Berning, Robert Kummer, Alexander Recht, Marcel Riedel, Karen Schwane, Agnes Thomas
GapVis for Hellespont
Named entities, linguistic information, event annotation, and bibliography connected in one interface:
A case study on Thuc. 1.89-118
Different formats (TEI, CIDOC-CRM, AGDT, PML, ...)
User interface based on GapVis:
http://nrabinowitz.github.io/gapvis
Book Summary
Entity Detail
Arachne Topography
Related Entities
Reading View
Manual event annotation | Linguistic annotation
Going through secondary literature
Event List
Oinophyta Event
Myronides as a general
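The event slides above (Event List, Oinophyta Event, Myronides as a general) rely on CIDOC-CRM events to connect persons, places, and text passages. The following is a minimal sketch of that idea, not the project's actual data model: the class and property labels (E5 Event, P7, P11, P70) come from the CIDOC-CRM standard, while the identifiers, the triple layout, and the helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One subject-predicate-object statement."""
    subject: str
    predicate: str
    obj: str


def describe_event(event_id, label, participants, place, passage):
    """Return CIDOC-CRM-style triples describing a single event.

    All identifiers passed in are illustrative placeholders.
    """
    triples = [
        Triple(event_id, "rdf:type", "crm:E5_Event"),
        Triple(event_id, "rdfs:label", label),
        Triple(event_id, "crm:P7_took_place_at", place),
        # The passage documents the event (P70 documents).
        Triple(passage, "crm:P70_documents", event_id),
    ]
    for person in participants:
        triples.append(Triple(event_id, "crm:P11_had_participant", person))
    return triples


def events_for(entity, triples):
    """List the events in which an entity appears as object of a statement."""
    return sorted(
        {t.subject for t in triples
         if t.obj == entity and t.subject.startswith("event:")}
    )


triples = describe_event(
    event_id="event:oinophyta",          # hypothetical identifier
    label="Oinophyta Event",
    participants=["person:myronides"],   # hypothetical identifier
    place="place:oinophyta",             # hypothetical identifier
    passage="text:thuc-1.108",           # hypothetical identifier
)
print(events_for("person:myronides", triples))
```

Keeping the event as the hub means that a person, a place, and a passage never need a direct link to each other; they are related only through the events they share.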
Natural language and events
Thuc. 1.102.2
μάλιστα δ᾿ αὐτοὺς ἐπεκαλέσαντο ὅτι τειχομαχεῖν ἐδόκουν δυνατοὶ εἶναι, τοῖς δὲ πολιορκίας μακρᾶς καθεστηκυίας τούτου ἐνδεᾶ ἐφαίνετο: βίᾳ γὰρ ἂν εἷλον τὸ χωρίον.
Translation
[The siege of Ithome proved tedious, and the Lacedaemonians called in, among other allies, the Athenians ...]
[They] invited them especially because [they] considered [them] particularly skilled in siege operations, while, since the siege for them was dragging on, [their] own deficiency in that sort of warfare was clear: for otherwise [they] would have taken the place by force.
NLP Pipeline
Tokenization
POS-Tagging
Syntactic Parsing
Thematic Roles
Information Structure
Coreference Resolution
NLP Pipeline
NLP Process Ancient Greek?
Chunking
Lemmatization
POS-tagging
Syntactic parsing
Word-sense disambiguation
Co-reference resolution
Semantic role annotation
Using and Enhancing the available resources
The Ancient Greek Dependency Treebank
AGDT: treebank with word-by-word morphological and dependency-based syntactical description
a step forward: semantic information
Analytical Level: “Surface” syntax
a-a-1999.01.0199_book1-chapter89_3  AuxS
οἱ  Atr
γὰρ  AuxY
Ἀθηναῖοι  Sb
τρόπῳ  Adv
τοιῷδε  Atr
ἦλθον  Pred
ἐπὶ  AuxP
τὰ  Atr
πράγματα  Obj
ἐν  AuxP
οἷς  Adv
ηὐξήθησαν  Atr
.  AuxK
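The analytical layer above pairs each word form with a syntactic relation label. As a small illustration of how such labels can be queried, the sketch below groups the forms by label. The (form, label) pairs are copied from the slide; the head links of the tree are not recoverable from the transcript and are therefore left out, and the function name is hypothetical.

```python
from collections import defaultdict

# (form, relation) pairs as shown on the slide, in sentence order.
TOKENS = [
    ("οἱ", "Atr"), ("γὰρ", "AuxY"), ("Ἀθηναῖοι", "Sb"), ("τρόπῳ", "Adv"),
    ("τοιῷδε", "Atr"), ("ἦλθον", "Pred"), ("ἐπὶ", "AuxP"), ("τὰ", "Atr"),
    ("πράγματα", "Obj"), ("ἐν", "AuxP"), ("οἷς", "Adv"),
    ("ηὐξήθησαν", "Atr"), (".", "AuxK"),
]


def group_by_relation(tokens):
    """Map each relation label to the word forms that carry it."""
    groups = defaultdict(list)
    for form, relation in tokens:
        groups[relation].append(form)
    return dict(groups)


groups = group_by_relation(TOKENS)
# Subject and main predicate of the sentence, by label only.
print(groups["Sb"], groups["Pred"])
```

Labels alone already answer simple questions (which form is the subject, which the predicate); anything that depends on attachment, such as which noun an attribute modifies, needs the head links from the treebank itself.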
Valency
“The verbal node expresses a little drama. As a drama, it implies a process and, most of the times, actors and circumstances”
L. Tesnière
Tectogrammatical annotation
t-t_tree-grc-s1-root  root
γάρ1  PREC  atom
Ἀθηναῖος1  ACT  n.denot
ἔρχομαι1  enunc  PRED  v
πρᾶγμα1  DIR3  state  n.denot
ὅς1  ACMP  circ  n.denot
#PersPron  ACT  n.denot
αὐξάνω1  RSTR  v
τρόπος1  MANN  n.denot
τοιόσδε1  RSTR  adj.pron.def.demon
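The tectogrammatical layer labels each node with a functor (ACT, DIR3, MANN, ...), which is what makes a verb readable as an event with participants. The sketch below turns the tree on this slide into a flat event frame. The lemmas and functors are taken from the slide; the attachment of nodes to their parents is an assumption reconstructed from the sense of the sentence, and the `TNode` class, `FRAME_FUNCTORS`, and `event_frame` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class TNode:
    """A tectogrammatical node: lemma, functor, and dependent nodes."""
    lemma: str
    functor: str
    children: list = field(default_factory=list)


# Attachment below is assumed; only lemmas and functors come from the slide.
predicate = TNode("ἔρχομαι", "PRED", [
    TNode("γάρ", "PREC"),
    TNode("Ἀθηναῖος", "ACT"),
    TNode("πρᾶγμα", "DIR3", [
        TNode("αὐξάνω", "RSTR", [
            TNode("ὅς", "ACMP"),
            TNode("#PersPron", "ACT"),
        ]),
    ]),
    TNode("τρόπος", "MANN", [TNode("τοιόσδε", "RSTR")]),
])

# Functors kept in the frame: an illustrative selection of participants
# and circumstances, not an exhaustive inventory.
FRAME_FUNCTORS = {"ACT", "PAT", "ADDR", "ORIG", "EFF", "DIR1", "DIR3", "MANN"}


def event_frame(verb):
    """Collect the direct dependents of a verb that fill a frame slot."""
    frame = {"predicate": verb.lemma}
    for child in verb.children:
        if child.functor in FRAME_FUNCTORS:
            frame[child.functor] = child.lemma
    return frame


frame = event_frame(predicate)
print(frame)
```

A frame of this kind (who acts, in which direction, in what manner) is one plausible unit for an event database, since it can be compared across sentences regardless of surface word order.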
From treebanks to event data-bases
What can you do with multi-layer trees?
“Meaningful” relations between NEs
[The Athenians] ... brought the territories of Boeotia and Phocis under their obedience, and withal razed the walls of Tanagra and took of the wealthiest of the Locrians of Opus a hundred hostages, and finished also at the same time their long walls at home (1.108.3)
Maps with semantically relevant relations
E.g. travels by sea
πλέω (sail)
Actor
DIR3 (to)
DIR1 (from)
The Athenians
Other NEs
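The diagram above suggests reading a verb such as πλέω as a route: an actor, an origin (DIR1), and a destination (DIR3). A minimal sketch of that extraction follows, assuming event frames are available as plain dictionaries keyed by functor. The frame layout, the function name, and the place names ("Place A", "Place B", "Place C") are placeholders, not data from the project.

```python
def travel_edges(frames, verb="πλέω"):
    """Return (actor, origin, destination) for frames of the given verb.

    Frames lacking either an origin (DIR1) or a destination (DIR3)
    are skipped, since they cannot be drawn as a line on a map.
    """
    edges = []
    for frame in frames:
        if frame.get("predicate") != verb:
            continue
        origin = frame.get("DIR1")
        destination = frame.get("DIR3")
        if origin and destination:
            edges.append((frame.get("ACT"), origin, destination))
    return edges


# Placeholder frames for illustration only.
frames = [
    {"predicate": "πλέω", "ACT": "The Athenians",
     "DIR1": "Place A", "DIR3": "Place B"},
    {"predicate": "πλέω", "ACT": "The Athenians", "DIR3": "Place C"},
    {"predicate": "ἔρχομαι", "ACT": "The Athenians", "DIR3": "Place B"},
]

edges = travel_edges(frames)
print(edges)
```

Each resulting tuple can be drawn as a directed edge between two named entities, which is what distinguishes a semantically relevant relation from mere co-occurrence of two place names in one passage.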
What can you do with multi-layer trees?
Extraction and analysis of events
What actions do the Athenians perform?
What can you do with multi-layer trees?
Extraction and analysis of events
What actions do the Spartans perform?
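Questions such as "what actions do the Athenians perform?" amount to grouping extracted events by actor. The sketch below shows one way to do that with a counter per actor. The record layout and function name are hypothetical; the sample actions are English glosses of the passages quoted earlier in these slides (1.108.3 and 1.102.2), not output of the project's annotation.

```python
from collections import Counter, defaultdict

# Illustrative records, glossed from the translations quoted in the slides.
FRAMES = [
    {"actor": "Athenians", "action": "raze", "passage": "1.108.3"},
    {"actor": "Athenians", "action": "take", "passage": "1.108.3"},
    {"actor": "Athenians", "action": "finish", "passage": "1.108.3"},
    {"actor": "Lacedaemonians", "action": "invite", "passage": "1.102.2"},
]


def actions_by_actor(frames):
    """Count how often each actor performs each action."""
    counts = defaultdict(Counter)
    for frame in frames:
        counts[frame["actor"]][frame["action"]] += 1
    return counts


counts = actions_by_actor(FRAMES)
for actor in sorted(counts):
    print(actor, dict(counts[actor]))
```

With a fully annotated text, the same grouping would allow the action profiles of the two sides to be compared directly.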
Related Secondary Literature (from JSTOR)
Figure: http://tiny.cc/GapVis-SecLit
Mining JSTOR: Where is Thuc. “hiding”?
A meaningful subsample
mining citations from all ~171k journal articles is not the best approach
curated bibliography (2009), compiled before the project started (CiteULike)
articles in JSTOR related to Thuc. 1.89-118
  343 articles, 62 journals
  journals from bibliography as “seeds”
  sample: ~73k articles (out of ~171k)
top-down vs bottom-up bibliographic approach
Pros and Cons
comprehensive coverage; > 2 centuries; multilingual
data not openly licensed
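The sampling step above can be read as follows: the journals that appear in the curated bibliography act as seeds, and every article from those journals enters the sample. The sketch below illustrates that reading only; the record layout, identifiers, and journal names are hypothetical and do not reflect real JSTOR metadata.

```python
def seed_sample(articles, curated_ids):
    """Select seed journals from a curated list, then sample their articles.

    articles    -- iterable of dicts with "id" and "journal" keys
    curated_ids -- ids of articles in the curated bibliography
    """
    seeds = {a["journal"] for a in articles if a["id"] in curated_ids}
    sample = [a for a in articles if a["journal"] in seeds]
    return seeds, sample


# Placeholder records for illustration only.
ARTICLES = [
    {"id": "a1", "journal": "Journal A"},
    {"id": "a2", "journal": "Journal A"},
    {"id": "a3", "journal": "Journal B"},
    {"id": "a4", "journal": "Journal C"},
]
CURATED = {"a1", "a3"}

seeds, sample = seed_sample(ARTICLES, CURATED)
print(sorted(seeds), [a["id"] for a in sample])
```

The sample is larger than the curated list (it picks up "a2" here) but far smaller than the full collection, which mirrors the reduction from ~171k to ~73k articles reported on the slide.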
Extracting Citations
NLP Pipeline
Extracting Citations: Challenges
sentence segmentation
  sentence = sensible unit of context
  both for extraction and data analysis (co-citation)
dirty OCR
  invalid character sequences (e.g. \n)
“inconsistent” use of punctuation
  1, 110-15 ; 1.89.1, 1.90 ; I 1, 102, 1
  solution: reason based on domain knowledge
similar references, surface similarity
  fragments, papyri, inscriptions
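The punctuation variants listed above can be normalized with a small amount of domain knowledge: Thucydides has eight books, the citation hierarchy is book, chapter, section, and a range such as "110-15" abbreviates "110-115". The sketch below is one possible normalizer under those assumptions, not the extraction method used in the project. The Roman-numeral variant "I 102, 1" is an assumed form, and the string "I 1, 102, 1" as printed on the slide is deliberately left unparsed because its intended reading is unclear.

```python
import re

ROMAN = {"I": 1, "II": 2, "III": 3, "IV": 4,
         "V": 5, "VI": 6, "VII": 7, "VIII": 8}

REF = re.compile(
    r"""^\s*
    (?P<book>[1-8]|VIII|VII|VI|IV|V|III|II|I)  # book, Arabic or Roman
    [.,\s]\s*                                  # separator after the book
    (?P<chapter>\d+)
    (?:-(?P<chapter_end>\d+))?                 # optional chapter range
    (?:[.,]\s*(?P<section>\d+))?               # optional section
    \s*$""",
    re.VERBOSE,
)


def expand_range_end(start, end):
    """Expand an abbreviated range end, e.g. ('110', '15') gives 115."""
    if len(end) < len(start):
        end = start[: len(start) - len(end)] + end
    return int(end)


def normalize(ref):
    """Return a reference as 'book.chapter[.section]' or None if unparsed."""
    m = REF.match(ref)
    if m is None:
        return None
    book = m["book"]
    book_no = ROMAN[book] if book in ROMAN else int(book)
    chapter = m["chapter"]
    out = f"{book_no}.{int(chapter)}"
    if m["section"]:
        out += f".{int(m['section'])}"
    if m["chapter_end"]:
        out += f"-{book_no}.{expand_range_end(chapter, m['chapter_end'])}"
    return out


for raw in ["1, 110-15", "1.89.1", "1.90", "I 102, 1", "I 1, 102, 1"]:
    print(repr(raw), "->", normalize(raw))
```

Returning `None` for anything that does not fit the expected hierarchy keeps doubtful matches out of the data, which matters when references to fragments, papyri, or inscriptions look similar on the surface.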
Thank you!
Our contacts and temporary development server
[email protected], [email protected]
www.tiny.cc/GapVis-Hellespont