

Abstracts DHBenelux 2017 conference
Wednesday 5 July 2017

Session A

1. TRACING TEXT TYPES IN BIBLICAL HEBREW
Wido van Peursen
Eep Talstra Centre for Bible and Computer (ETCBC), Vrije Universiteit Amsterdam

In the NWO-funded project “Tracing Syntactic Diversity in Biblical Hebrew Texts”, the Eep Talstra Centre for Bible and Computer investigates the various variables that account for the linguistic variation that can be observed in the Hebrew Bible, a collection of writings that were composed over a period of about a millennium. One of the parameters taken into account is text type.

Various types of communication show different usages of the language. The language of narratives is different from that of legal or sapiential texts. However, for a linguistic analysis a classification based on genre (“story”, “laws”) is insufficient. Genre may suggest a certain text type (e.g. a fairy tale is “narrative”), but within one text various text types may occur: in a story, the characters may use discursive text in quoted speech; in psalms or direct speech sections, a short story may develop. Sometimes the narrator addresses the listeners/readers directly and switches from narrative to discursive speech. This happens, e.g., when a fairy tale ends with Und wenn sie nicht gestorben sind, so leben sie heute noch (“And if they have not died, they are still alive today”).¹

In Biblical Hebrew, there is a “narrative tense” (wayyiqtol) similar to, e.g., the French passé simple, which Harald Weinrich used for distinguishing two Tempusregister: besprechen and erzählen (corresponding more or less to Émile Benveniste’s discours and histoire, respectively). Weinrich’s focus was on Romance languages. Through the work of Hans-Jakob Polotsky, a scholar of Semitic and Egyptian, his approach found its way into Egyptian and Semitic studies and, through Ariel Shisha-Halevy, one of Polotsky’s students, also into Celtic studies. Wolfgang Schneider² introduced Weinrich’s insights into Biblical studies, where they were further developed by Eep Talstra,³ Alviero Niccacci,⁴ Gino Kalkman⁵ and others.

Building on the work of these scholars, we use the category “text type” as a feature in our linguistic database of the Hebrew Bible. We distinguish between Narrative (N), Quotation (Q) and Discursive (D). We consider text type a feature of a clause, rather than of a larger literary unit, so that we can easily handle text type shifts within one and the same literary unit. We assign the labels on the basis of syntax, rather than literary considerations. E.g.: a clause containing wayyiqtol is assigned the text type label N; a clause containing a vocative or a 1st or 2nd person reference the label Q; the so-called Hebrew Imperfect interrupting a narrative yields the label D, indicating those cases where the narrator directly addresses the readers.

1 Harald Weinrich, Tempus: Besprochene und erzählte Welt (München: H.C. Beck, 1964).
2 Wolfgang Schneider, Grammatik des biblischen Hebräisch (München: Claudius, 1974).
3 E.g., Eep Talstra, “Text Grammar and Hebrew Bible I”, Bibliotheca Orientalis 35 (1978), 168–175.
4 E.g., Alviero Niccacci, The Syntax of the Verb in Classical Hebrew Prose (Sheffield: Sheffield Academic Press, 1990).
5 G.J. Kalkman, Verbal Forms in Biblical Hebrew Poetry: Poetic Freedom or Linguistic System? (PhD dissertation, Vrije Universiteit Amsterdam, 2015).


The use of one text type within another results in accumulative labels. Thus, a direct speech within a narrative domain receives the label NQ. Sometimes a direct speech may be quoted in another direct speech section, resulting in the label NQQ. Within a direct speech a narrative text type may appear, which introduces a short story in the mouth of one of the characters, resulting in the label NQN. Such a Sprosserzählung (Schneider) may again contain a direct speech, resulting in an NQNQ text type. Cf. 2 Kings 1:6.

N        They [the messengers] said (wayyiqtol) to him [the king]
NQ       “There came a man to meet us,
NQN      and said (wayyiqtol) to us
NQNQ     “Go back to the king who sent you, and say to him,
NQNQQ    “Thus says the Lord
NQNQQQ   “Is it because there is no God in Israel that you are sending to inquire of Ba’al-zebub, the god of Ekron?
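
The way labels accumulate in the 2 Kings 1:6 example can be sketched in a few lines of Python. This is only an illustration, not the ETCBC algorithm: it assumes that each clause already comes with an embedding depth and a local text type (N, Q or D), whereas the project derives these from syntactic features. The function name and the input format are hypothetical.

```python
from typing import List, Tuple


def cumulative_labels(clauses: List[Tuple[int, str]]) -> List[str]:
    """Build cumulative text type labels from (depth, local type) pairs.

    depth 0 is the outermost domain; the label of a clause is the chain of
    local types from the outermost domain down to the clause itself.
    The (depth, type) input format is an assumption made for this sketch.
    """
    stack: List[str] = []
    labels: List[str] = []
    for depth, local_type in clauses:
        if depth < 0 or depth > len(stack):
            raise ValueError("a clause cannot skip an embedding level")
        # Leaving embedded domains: drop everything at this depth or deeper.
        del stack[depth:]
        stack.append(local_type)
        labels.append("".join(stack))
    return labels


# 2 Kings 1:6, with depths assigned by hand for illustration.
kings = [(0, "N"), (1, "Q"), (2, "N"), (3, "Q"), (4, "Q"), (5, "Q")]
print(cumulative_labels(kings))
# ['N', 'NQ', 'NQN', 'NQNQ', 'NQNQQ', 'NQNQQQ']
```

A stack is a natural fit here because returning to the narrator's level simply discards the embedded domains, so a later clause at depth 0 is labelled N again.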

In an earlier phase of the ETCBC, the text types were assigned by human researchers in interactive procedures. Recently we changed this workflow and developed algorithms for automatically assigning text types based on the above-mentioned syntactic observations. It is instructive to see the differences between the former assignment of text types in the human-computer interaction and the current automatic assignment. One such case where the automatic text type assignment leads to a different analysis occurs in Isaiah 3:14–15, where we find Q without an explicit direct speech introduction: “The Lord will bring his charge against the elders and officers of His people: “It is you …”.” Here the program assigned a Q on the basis of the 2nd person form. This had escaped the human researcher.

The automatic assignment of text types has several advantages:

• The well-defined formal criteria render the text type assignments traceable and repeatable.
• The strict application of formal criteria reveals phenomena that escape human intuition (cf. example from Isaiah 3:14–15).
• In the above-mentioned NWO project it proved to be a good starting point for investigating to what extent text type accounts for linguistic variation in the Bible.
• It makes complex textual layers and embedding visible (cf. 2 Kings 1:6).

However, there are also some challenges:

• There is the risk of circular argument based on interrelated labels such as “narrative tense” and “narrative text type”.
• The text type assignments may lead to counter-intuitive complex labels such as QNDNDN (Psalm 78:45) or NDNDN (Isaiah 9:19).
• This approach, which we inherited from the start of the ETCBC 40 years ago (when the creation of our Hebrew database started), runs the risk of being somewhat idiosyncratic, relying too much on one single study from the 1960s.

We try to parry these challenges by

• Analysing text types in interaction with other parameters, such as genre. In our statistical analysis of the distribution of linguistic phenomena in R, we take into account all other kinds of variables, as well as their possible interdependence.
• Investigating if and how we can harmonize Weinrich’s useful, but also somewhat provocative and outdated views (e.g. his denial of the expression of tense and aspect as functions of the verb) with current insights about tense and aspect in Biblical Hebrew,⁶ and more recent studies on “Discourse Modes” and their linguistic correlates.⁷

This research thus provides an interesting case study of the interaction between linguistic theory and textual analysis, of the confrontation between research traditions within a certain discipline and a data-driven approach, and of the potential and limitations of automated analysis of ancient texts.

2. Automating genre classification of historical newspaper articles. Mapping the development of journalism’s modes of expression
Frank Harbers – University of Groningen
Juliette Lonij – Dutch National Library (KB)

This paper discusses a machine learning approach to automate the genre classification of Dutch historical newspaper articles and reflects on its challenges and value. First, we discuss how we used an existing set of metadata to create a training set for the genre classifier and the challenges we faced in connecting the metadata to the original digitized historical newspaper articles. Subsequently, the paper outlines a machine learning approach to predict the genre of newspaper articles, discussing and evaluating the different tools that were tested in the process.⁸ Finally, it reflects on the way a traditional rule-based approach to determining genre relates to a machine learning approach.

Examining genre
Defined as “language use in a conventionalized communicative setting in order to give expression to a specific set of communicative goals of a disciplinary or social institution, which give rise to stable structural forms” (Handford 2010), genre can elucidate the underlying goals, norms and practices of journalism as a discourse. Examining journalistic genres from a historical perspective therefore elucidates how newspapers’ conception of journalism developed. Yet, this type of longitudinal textual research is highly time consuming and still scarce. Moreover, the few attempts to systematically examine newspaper material, using social scientific methods such as quantitative content analysis, still only cover a fraction of the available material (Broersma, 2011; Harbers, 2014).

Automating such content analyses would be highly beneficial for research into the discursive development of newspaper journalism. This paper therefore critically discusses an approach to automate genre classification. This is a daunting task as genres are dynamic and can change or fade away over time while new ones can emerge. Moreover, genres are ideal types, which means the textual manifestations do not always match all the characteristics perfectly, nor can they always be clearly delineated from other genres.

6 E.g., Jan Joosten, The Verbal System of Biblical Hebrew. A New Synthesis Elaborated on the Basis of Classical Prose (Jerusalem Biblical Studies 10; Jerusalem: Simor, 2012).
7 E.g., Carlota S. Smith, Modes of Discourse. The Local Structure of Texts (Cambridge Studies in Linguistics 103; Cambridge: Cambridge University Press, 2003).
8 The source code for training the classifier and applying it to new examples is available on GitHub (https://github.com/jlonij/genre-classifier) and everybody can experiment with the classifier through a graphical web interface created at http://www.kbresearch.nl/genre

This dataset was the result of a large-scale research project into the historical development of European newspapers with the title ‘Reporting at the boundaries of the public sphere. Form, Style and Strategy of European Journalism, 1880-2005’.


A machine learning approach to automatically classify genre
Building on an existing set of manually coded metadata, describing several textual characteristics, such as genre, of a large sample (33,000) of historical newspaper articles,² this paper outlines a machine learning approach to automate the genre classification of historical newspaper articles. This dataset thus provided us with metadata about a number of historical articles that was used to train and formally evaluate a classifier that is able to automatically predict the genre of additional samples of historical newspaper articles. Yet, the existing metadata needed to be linked to the corresponding digitized articles in the digital newspaper archives of the KB.

The paper will first discuss this linking process. We first selected the most promising candidate links for each item in the original dataset, based on the position of the article on the page, its size, and the presence of images and quotes. A simple classifier was then trained to select the best link from the candidate set, if any, based on more precise features such as the size difference between the article and the candidate, as well as author mentions and subject matter. By only accepting links predicted with a relatively high confidence value, approximately 50% of all articles could be automatically linked, with an error rate of 0.5%.
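
The final step of the linking process, accepting a link only when the classifier is confident enough, can be illustrated as follows. This is a minimal sketch rather than the authors' code: the candidate identifiers, the scores and the threshold of 0.9 are invented for illustration, and the abstract does not state which confidence value was actually used.

```python
from typing import List, Optional, Tuple


def select_link(candidates: List[Tuple[str, float]],
                threshold: float = 0.9) -> Optional[str]:
    """Return the id of the best-scoring candidate, or None.

    `candidates` holds (candidate id, classifier confidence) pairs.
    The default threshold is a hypothetical value, not one from the paper.
    """
    if not candidates:
        return None
    best_id, best_score = max(candidates, key=lambda pair: pair[1])
    # Reject the best candidate too if the classifier is not confident enough.
    return best_id if best_score >= threshold else None


print(select_link([("candidate-a", 0.62), ("candidate-b", 0.95)]))  # candidate-b
print(select_link([("candidate-a", 0.62)]))                         # None
```

Raising the threshold lowers the share of articles that get linked but also lowers the error rate, which is the trade-off behind the reported figures of roughly 50% coverage at 0.5% errors.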

Subsequently, we will outline and discuss how the resulting dataset was used to train the actual genre classifier. After the articles were pre-processed with the Natural Language Processing suite ‘Frog’, the annotated texts were examined for their textual features, including the length of the article, the number of direct quotes, the number of adjectives, various types of pronouns, and the number and position of named entities in the text. The selection of these features is based on the genre definitions of the codebook of the manual content analysis.

These features were used to train a classifier to choose one of eight possible genres for each article, ranging from news report to opinion article. We evaluated the performance through 10-fold cross-validation, using stratified sampling to create relevant subsets. A linear SVM classifier was chosen after comparison of various evaluation metrics with a number of other options (Naïve Bayes, non-linear SVMs and some simple neural networks), yielding the best results with an accuracy of 65%. It is important to note here that human coders do not always agree on what the right genre is. The intercoder agreement for genre in the manual content analysis was around 80% (Krippendorff’s alpha, taking into account chance, was between 0.7 and 0.8 in different groups of coders). As such, 65% is considered a very promising result.
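
Stratified sampling for cross-validation means that every fold keeps roughly the same genre proportions as the full dataset. The sketch below shows the idea with the Python standard library only. It is not the authors' evaluation code: the function name is hypothetical, the toy labels are invented, and a real experiment would shuffle the items first and would normally rely on an established machine learning library.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def stratified_folds(labels: Sequence[str], k: int = 10) -> List[List[int]]:
    """Split item indices into k folds that preserve label proportions."""
    by_label: Dict[str, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    folds: List[List[int]] = [[] for _ in range(k)]
    position = 0
    for label in sorted(by_label):
        # Deal the items of each genre round-robin over the folds.
        for index in by_label[label]:
            folds[position % k].append(index)
            position += 1
    return folds


# Toy data: 60 news reports and 40 opinion articles.
genres = ["news report"] * 60 + ["opinion article"] * 40
folds = stratified_folds(genres, k=10)
first = [genres[i] for i in folds[0]]
print(len(first), first.count("news report"), first.count("opinion article"))
# 10 6 4
```

In each of the ten rounds one fold serves as the test set and the other nine as training data, so every article is used for testing exactly once.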

Finally, we reflect on the relation between a rule-based and a machine learning approach to the classification of genre. We will discuss the significance of individual features in the machine learning process and show how the ‘confusion matrix’ provides valuable information about the common mistakes of the classifier and which genres are most difficult to predict. Moreover, as the probability for the predicted genre as well as for the other genres is known, we will discuss how these numbers offer insights into the dynamic nature of journalistic genres.
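
How a confusion matrix exposes the classifier's typical mistakes can be shown with a small example. The sketch assumes two parallel lists of manually coded and predicted genres; the labels and counts are invented toy data, not results from the study, and the function names are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def confusion_counts(gold: List[str], predicted: List[str]) -> Counter:
    """Count how often each (gold genre, predicted genre) pair occurs."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    return Counter(zip(gold, predicted))


def most_confused(counts: Counter, n: int = 3) -> List[Tuple[Tuple[str, str], int]]:
    """Return the n most frequent misclassifications (off-diagonal cells)."""
    errors = Counter({pair: c for pair, c in counts.items() if pair[0] != pair[1]})
    return errors.most_common(n)


# Invented toy labels, for illustration only.
gold = ["news", "news", "reportage", "opinion", "reportage"]
pred = ["news", "reportage", "news", "opinion", "news"]
print(most_confused(confusion_counts(gold, pred), n=1))
# [(('reportage', 'news'), 2)]
```

Cells on the diagonal are correct predictions; the largest off-diagonal cells point to the genre pairs that the classifier finds hardest to tell apart.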

Bibliography
- Broersma, M. (2011). ‘Nooit meer bladeren. Digitale krantenarchieven als bron’. In: Tijdschrift voor Mediageschiedenis 14(2): 29-55.
- Handford, M. (2010). ‘What can a corpus tell us about specialist genres’. In: O’Keeffe, A. & McCarthy, M. (eds.), The Routledge Handbook of Corpus Linguistics. New York: Routledge.
- Harbers, F. (2014). Between Personal Experience and Detached Information. The Development of Reporting and the Reportage in Great Britain, the Netherlands and France, 1880-2005. PhD University of Groningen.


3. Generating Interactive Narratives from Wikipedia Articles
Keywords: Computational Creativity, Interactive Narratives, Wikipedia, Chatbots

Ben Burtenshaw
Computational Linguistics & Psycholinguistics Research Center
Universiteit Antwerpen
benjamin.burtenshaw@uantwerpen.be

Tom De Smedt
Experimental Media Research Group
Sint Lucas Antwerpen School of Arts
info@emrg.be

Mike Kestemont
Computational Linguistics & Psycholinguistics Research Center
Universiteit Antwerpen
mike.kestemont@uantwerpen.be

Stories play a vital role in the lives of children. The alternative worlds they produce encourage imagination and creativity, but also transform knowledge into structures that children can understand and relate to. We present an interactive story system that creates narratives from Wikipedia articles, and reveals them through dialogue with a user. Using state-of-the-art narrative generation tools and a chatbot dialogue system, information from Wikipedia is revealed to the child based on their input. Generating narratives from any text has long been a goal of Artificial Intelligence researchers because narrative structures are useful for learners of all ages. However, many of the existing story generation systems have relied on hand-written techniques that cannot meet the scale of data online. In recent years search-based systems have been able to incorporate broader topics, but they have sacrificed continuity, producing fragmented narrative. Here we propose a search-based system that scours Wikipedia for articles relating to user input, and then restricts its generation material to that article; in doing so, the system utilises the defined topic and chronology of the article to retain context.

Narrative generation has been a central topic within the field of computational creativity for decades. One of the first examples is Tale-Spin, a system that generates Aesop's Fables guided by the user’s input (Meehan 1977). Tale-Spin offers an innovative form of interface; however, it struggles to deal with undirected user input. Universe by Michael Lebowitz draws on a database of character definitions, plot outlines, and dialogues to weave together new stories; however, the system has a tendency to become repetitive due to its limited content. Callaway and Lester produced Storybook, a narrative prose generator that identifies the temporal markers in a text, and uses them to generate a new narrative text (Callaway and Lester 2002). Systems of this kind tend to reproduce narrative tropes, and ultimately become boring (Swanson and Gordon 2008). McIntyre and Lapata developed one of the first search-based narrative systems, which uses user input to search a database for related phrases (McIntyre and Lapata 2009). This architecture produces interesting relationships between phrases, but is too sporadic to create a meaningful narrative. To counter these semantic fluctuations, Riedl and Bulitko use dialogue to guide the generated text (Riedl and Bulitko 2012). However, by their own admission, this becomes monotonous to the user, with little room for surprise.

We propose a dialogue system that takes input from the user and generates a story in relation to a relevant Wikipedia article. This uses a search-based retrieval approach similar to previous systems (Riedl and Bulitko 2012; McIntyre and Lapata 2009); however, we take advantage of Wikipedia for topic and contextual grounding. For example, by taking the proper nouns in a template sentence, and replacing them with those from a Wikipedia article, the system produces a story featuring places and characters from history. The user might ask “Why did the Egyptians have pyramids?”; the system would use key points in the Wikipedia article ‘Pyramids of Ancient Egypt’ to assemble a set of narrative concepts, and query the user to see if they are interested in those topics.

We have trained the system on children’s stories, which means it uses a machine learning approach to select responses that are most similar to those stories. A dialogue is started by inputting a topic, which is used to retrieve a complete Wikipedia page. The page structure guides the dialogue sequence, and the language is used to create responses. First the system defines a set of narrative concepts based on the sections of the Wikipedia page, then it uses grammar-based techniques to create phrases for each concept. The narrative concepts are vector-based representations of each page section, which are used to compare the most important strings from a section. Dialogue begins once the system has built a database of possible responses. As the dialogue progresses, core phrases are added and removed, in effect acting as a narrative context. Dialogue is facilitated by a Markov Decision Process that matches the user’s input to possible responses based on a reward function from training. Dialogue history is also added to the database, which means the system learns from the dialogue itself, and also avoids repetition.
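
To make the idea of vector-based section representations more concrete, the following sketch matches a user question to the most similar page section. It is a deliberately simplified stand-in and not the system described here: it uses plain bag-of-words vectors with cosine similarity instead of a trained Markov Decision Process, and the section titles and texts are invented toy strings rather than actual Wikipedia content.

```python
import math
import re
from collections import Counter
from typing import Dict


def bag_of_words(text: str) -> Counter:
    """Represent a text as lower-cased word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def best_section(user_input: str, sections: Dict[str, str]) -> str:
    """Return the title of the section most similar to the user's input."""
    query = bag_of_words(user_input)
    return max(sections, key=lambda title: cosine(query, bag_of_words(sections[title])))


# Invented toy sections, for illustration only.
sections = {
    "Construction": "workers cut stone blocks and built the pyramids with ramps",
    "Purpose": "the pyramids were tombs for the pharaohs",
}
print(best_section("Why did the Egyptians have pyramids as tombs?", sections))
# Purpose
```

A dialogue manager could use such a score as one input when deciding which narrative concept to offer the child next.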

Using named entity recognition to add names and places grounds the narrative of the story. This practice draws on pedagogical theories of inquiry, which propose that children learn efficiently through self-initiated requests for information (Conle 2000; Mcquiggan et al. 2008). Children are able to construct this information into their own story, which in turn fortifies that knowledge. In practice, a narrative system like this would need to be contextualised for the child; this could be a fictional characterisation of the system as naive and in need of guidance, for example a forgetful robot that needs help to explain its story.

Narratives are a proven part of education, and generating them autonomously is a long-term aim of artificial intelligence. Here we speculate upon a contained usage of narrative generating technology, which uses the structure of Wikipedia to rearticulate text in a form suitable for children. Over the next four years our aim is to develop a general story generation system for children. At DHBenelux we will present a working prototype of this system.

References
Callaway, C., and J. Lester. 2002. ‘Narrative Prose Generation’. Artificial Intelligence 139 (2): 213–252.

Conle, Carola. 2000. ‘Narrative Inquiry: Research Tool and Medium for Professional Development’. European Journal of Teacher Education 23 (1): 49–63. doi:10.1080/713667262.

McIntyre, Neil, and Mirella Lapata. 2009. ‘Learning to Tell Tales: A Data-Driven Approach to Story Generation’. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1, 217–25. Association for Computational Linguistics.

Mcquiggan, Scott W, Jonathan P Rowe, Sunyoung Lee, and James C Lester. 2008. ‘Story-Based Learning: The Impact of Narrative on Learning Experiences and Outcomes’. In International Conference on Intelligent Tutoring Systems, 530–539. Springer Berlin Heidelberg.

Meehan. 1977. ‘TALE-SPIN, An Interactive Program That Writes Stories’. In 5th International Joint Conference on Artificial Intelligence, 91–98.

Riedl, Mark O., and Vadim Bulitko. 2012. ‘Interactive Narrative: an Intelligent Systems Approach’. AI Magazine 34 (1): 67.

Swanson, R. and Gordon, A. 2008. ‘Say Anything: A Massively Collaborative Open Domain Story Writing Companion’. In Proceedings of the 1st International Conference on Interactive Digital Storytelling. Lecture Notes in Computer Science. Vol. 5334. Berlin: Springer.


Session B

1. Linking multi-disciplinary data sources for a historical research platform
Kalliopi Zervanou¹, Wouter Klein², Peter van den Hooff², Frans Wiering¹ and Toine Pieters²
¹ Information & Computing Sciences Department, Utrecht University
² Freudenthal Institute, History & Philosophy of Science, Utrecht University

The problem of information access is a challenge in making digitised data sources available. Historians need to identify information in digital material pools, scattered across collections and often lacking semantic links to a topic of interest. This problem is addressed by the development of various collection-specific metadata schemas, such as MARC 21 (Library of Congress, 2010), and generic metadata schemas, such as the Dublin Core Metadata Initiative (DCMI, 2011). Moreover, diverse metadata schemas are mapped to each other (Bountouri and Gergatsoulis, 2009), or to custom (Liao et al., 2010) or standard ontologies (Lourdi et al., 2009), such as the CIDOC Conceptual Reference Model (CIDOC, 2006). A dominant trend in recent approaches is the linked-data approach (Berners-Lee, 2006; Bizer et al. 2009). Besides information access, the amount and the complexity of information accessible gives rise to an information presentation challenge, whereby data overviews should highlight interesting data aspects requiring detailed inspection. Additionally, for digital methods to support collaborative research, the problem of information validation and sharing must be addressed. This issue calls for transparency of data and methods and reproducibility of results, or verification of the arguments made. It also entails validation of computational results and algorithmic processes determining information access, in such a way that the eventual data or system limitations and biases are known and the processes are trustworthy and verifiable.

In our work, we address these challenges of information access, presentation, validation and sharing from a twofold research perspective:

I. Integration and semantic linking of existing, multidisciplinary data sources;
II. Development of a research platform that supports access, presentation, validation and sharing of complex, interlinked data.

Our particular domain of application relates to the history of botanical drug components from the New World in the early modern period (17th–18th century). More specifically, it concerns highlighting phenomena denoting developmental processes of remedies or drug trajectories, such as the evolution of economic importance, ethical attitudes, scientific interests, trade and knowledge circulation (Gijswijt-Hofstra et al., 2002; Pieters, 2004; Friedrich and Müller-Jahncke, 2009; Klein & Pieters, 2016).

For this purpose, we integrate sources comprising pharmaceutical data, such as the Pharmaceutical Historical Thesaurus (Klein & van den Hooff, 2013), archaeobotanical data, such as RADAR (van Haaster & Brinkkemper, 1995; RCE, 2013), botanical data, such as the National Herbarium of the Netherlands (Creuwels, 2014), the Economic Botany database (Hoffman, 2011) and the Snippendaal Catalogue database (van Reenen, 2007), colonial trade data, such as the database of the accounting books (Boekhouder-Generaal) of the Dutch East India Company (Schooneveld-Oosterling et al., 2013) and linguistic dictionaries, such as the Chronological Dictionary of Dutch (van der Sijs, 2001).

A notable recent approach to the issue of digital formats integration is the one adopted in the Timbuctoo infrastructure (Andersen et al., 2013). Most approaches opt for conversion to a recommended metadata schema, such as SKOS (Miles & Bechhofer, 2009), or a common data model such as the Europeana Data Model (EDM, 2016). However, apart from the diversity of digital formats, an important aspect in integration lies in the reuse and re-purposing of resources originally built for a different audience and purpose.

In our approach, integration entails concept mapping, not only across disciplines, but also in time. Thus, data source integration must accommodate the way science, from the 16th century onwards, has re-classified and re-defined concepts. Additionally, it entails dealing with phenomena of historical term variation and ambiguity which gradually give way to spelling standardisation and current nomenclature conventions in e.g. botany and biology. Furthermore, we account for under-specificity and ambiguity of information found in historical sources while maintaining associations with potentially related concepts and context. Most importantly, we provide references for information provenance tracing and validation. For these purposes, we resort to designing our own ontology, where e.g. ambiguous terms are connected to multiple concepts, temporal periods and reference sources, and where mappings are provided across essential historical and current taxonomies. Our data sources are semi-automatically enriched with additional information, such as geographical coordinates and named entities. Moreover, inconsistencies within and across datasets are semi-automatically identified and normalised. Finally, data sources are integrated following a linked data approach allowing for extensions to other linked open data and eventually capitalising on techniques such as reasoning, which may extend explicit information in our datasets with implicitly inferred information.
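
The idea of connecting one ambiguous term to several concepts, each with its own temporal period and reference source, can be pictured with a small data structure. This is a hypothetical sketch and does not reproduce the project's ontology: the class names, fields and placeholder values (`term-x`, `concept-a`, `source-1`, the year ranges) are all invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Sense:
    """One possible meaning of a historical term (hypothetical model)."""
    concept: str     # identifier of a concept in the ontology
    start_year: int  # first year in which this meaning is attested
    end_year: int    # last year in which this meaning is attested
    source: str      # reference that supports the mapping (provenance)


@dataclass
class Term:
    """A historical term that may point to several concepts."""
    label: str
    senses: List[Sense] = field(default_factory=list)

    def concepts_in(self, year: int) -> List[str]:
        """Return every concept the term may denote in the given year."""
        return [s.concept for s in self.senses
                if s.start_year <= year <= s.end_year]


# Placeholder data only; no historical claim is intended.
term = Term("term-x", [
    Sense("concept-a", 1600, 1700, "source-1"),
    Sense("concept-b", 1650, 1800, "source-2"),
])
print(term.concepts_in(1675))  # ['concept-a', 'concept-b']
print(term.concepts_in(1750))  # ['concept-b']
```

Keeping the source next to each sense is what makes provenance tracing possible: every mapping can be traced back to the reference that justifies it.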

Our Time Capsule research platform⁹ implements our solutions to information access, presentation and validation challenges. It is a scalable working platform currently querying more than 55 million RDF triples. It is often difficult for a non-expert user to perform queries, either because they are unfamiliar with the required terminology, or because they are unfamiliar with the underlying data model. Our solution to this issue lies in providing two querying strategies, one that supports a faceted, exploratory, guided search and browsing of information by means of links, photos, and keyword auto-completion suggestions and one that supports the creation of ad hoc queries. Our exploratory search mode is intended to engage a wider audience and reveal to both expert and non-expert users the underlying data content and structure. Ad hoc queries are in essence ad hoc RDF SPARQL queries (Prud'hommeaux & Seaborne, 2008) to our data. However, given that most users are neither familiar with SPARQL, nor with the content and structure of our data store, a query wizard is provided that assists users in forming natural language queries, such as “Which drug component(s) are made out of the plant Acorus calamus L. and which parts of the plant were used?”
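
A query wizard of this kind ultimately has to turn the user's choices into a SPARQL query. The sketch below shows what such a translation might look like for the example question. It is an assumption-laden illustration: the `tc:` prefix, its namespace URI and the property names `scientificName`, `madeFrom` and `usesPlantPart` are invented here and are not taken from the Time Capsule data model.

```python
def drug_component_query(plant_name: str) -> str:
    """Build a SPARQL SELECT query for the components made from a plant.

    The prefix and property names are hypothetical placeholders.
    """
    # Escape backslashes and double quotes for a SPARQL string literal.
    escaped = plant_name.replace("\\", "\\\\").replace('"', '\\"')
    return f"""PREFIX tc: <http://example.org/timecapsule#>
SELECT ?component ?part WHERE {{
  ?plant tc:scientificName "{escaped}" .
  ?component tc:madeFrom ?plant ;
             tc:usesPlantPart ?part .
}}"""


print(drug_component_query("Acorus calamus L."))
```

Generating the query from a fixed template keeps the user away from the SPARQL syntax and the data model, while the resulting query text can still be shown to expert users who want to verify what was asked.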

Search results are presented as an overview of all available information on the query topic and users may "zoom in" on specific information by following links that provide more detailed geographical, temporal and concept relation visualisations. Such visualisations are mainly intended to provide overviews of the evolution of phenomena related to drug trajectories, such as, for instance, change in a plant part used as medical ingredient, trade routes of botanical products, or geographical distributions in time of known concepts in Latin/scientific terms vs. lay terms, the latter indicating public knowledge and familiarity with a given plant or drug.

References
Andersen, J.A., Filarski, G.J., Haentjens Dekker, R., Maas, M. & Ravenek, W. (2013). Timbuctoo data repository infrastructure (version 1.0), Huygens ING – ICT, Amsterdam, The Netherlands.

Berners-Lee, T. (2006). Linked Data. Document version: June 2009. In: Design Issues, W3C. Available online at: https://www.w3.org/DesignIssues/LinkedData.html

9 Time Capsule system: http://timecapsule.science.uu.nl/timecapsule/#/login Logging in as a Guest allows full access to the system functionality except saving your search results.


Bizer, C., Heath, T. and Berners-Lee, T. (2009). Linked Data - The Story So Far. International Journal on Semantic Web and Information Systems, vol. 5(3), pp. 1-22. DOI: 10.4018/jswis.2009081901

Bountouri, L. and Gergatsoulis, M. (2009). Interoperability between archival and bibliographic metadata: An EAD to MODS crosswalk. Journal of Library Metadata, 9(1-2): 98–133.

CIDOC (2006). The CIDOC Conceptual Reference Model. CIDOC Documentation Standards Working Group, International Documentation Committee, International Council of Museums. Available online at: http://www.cidoc-crm.org/.

Creuwels, J. (2014). The National Herbarium of the Netherlands. Naturalis Biodiversity Center, Leiden. Available online at: http://herbarium.naturalis.nl/

DCMI (2011). The Dublin Core Metadata Initiative. Available online at: http://dublincore.org/.

EDM (2016). Europeana Data Model – Mapping Guidelines v2.3, 18 November 2016, Europeana Network Association. Available online at: http://pro.europeana.eu/page/edm-documentation

Friedrich, C. and Müller-Jahncke, W.-D. (eds.) (2009). Arzneimittelkarrieren: zur wechselvollen Geschichte ausgewählter Medikamente: die Vorträge der Pharmaziehistorischen Biennale in Husum vom 25-28. April 2008, Stuttgart: Wissenschaftliche Verlagsgesellschaft.

Gijswijt-Hofstra, M., Van Heteren, G.M. and Tansey, E.M. (eds.) (2002). Biographies of remedies: drugs, medicines and contraceptives in Dutch and Anglo-American healing cultures. Clio medica 66, Amsterdam: Rodopi.

van Haaster, H. and Brinkkemper, O. (1995). RADAR, a Relational Archaeobotanical Database for Advanced Research. Vegetation History and Archaeobotany, vol. 4(2), pp. 117-125, Springer.

Hoffman, B. (2011). The Naturalis Economic Botany database. Naturalis Biodiversity Center, Leiden.

Klein, W. and Pieters, T. (2016). The Hidden History of a Famous Drug: Tracing the Medical and Public Acculturation of Peruvian Bark in Early Modern Western Europe (c. 1650–1720). Journal of the History of Medicine and Allied Sciences, Vol. 71(4), pp. 400–421. DOI: 10.1093/jhmas/jrw004

Klein, W. and van den Hooff, P.C. (2013). Farmaceutische Historische Thesaurus. National Museum for the History of Pharmacy, Utrecht.

Liao, S.-H., Huang, H.-C., and Chen, Y.-N. (2010). A semantic web approach to heterogeneous metadata integration. In: Proceedings of ICCCI ’10, LNCS vol. 6421, pp. 205–214, Kaohsiung, Taiwan. Springer.

Library of Congress (2010). MARC standards. Network Development and MARC Standards Office, Library of Congress, USA. Available online at: http://www.loc.gov/marc/index.html.

Lourdi, I., Papatheodorou, C., and Doerr, M. (2009). Semantic integration of collection description: Combining CIDOC/CRM and Dublin Core collections application profile. D-Lib Magazine, 15(7/8).

Miles, A. and Bechhofer, S. (eds) (2009). SKOS Simple Knowledge Organization System Reference. W3C Recommendation, 18 August 2009. Available online at: http://www.w3.org/TR/skos-reference

Pieters, T. (2004). Historische trajecten in de farmacie: medicijnen tussen confectie en maatwerk. Inaugural lecture – Hilversum.

Prud'hommeaux, E. and Seaborne, A. (eds.) (2008). SPARQL Query Language for RDF. W3C Recommendation, 15 January 2008. Available online at: https://www.w3.org/TR/rdf-sparql-query/

2. A Linked Data Approach to Disclose Handwritten Biodiversity Heritage Collections

Lise Stork, Leiden Institute of Advanced Computer Science (LIACS), Leiden University, Niels Bohrweg 1, 2333 CA Leiden

Andreas Weber, Department of Science, Technology and Policy Studies (STePS), University of Twente, PO Box 217, 7500 AE Enschede

Over the last decade, natural history museums in and beyond the Netherlands have heavily invested in digitizing and extracting biodiversity information from manuscript and specimen collections (Heerlien et al. 2015; Pethers and Huertas 2015; Svensson 2015). In particular, handwritten field notes describing occurrences of species in nature (see illustration) form an important but often neglected starting point for researchers interested in long-term habitat developments of a specific area and the history of scientific ordering, writing and collecting practices (Blair 2010; Bourguet 2010; Eddy 2016). In order to disclose handwritten descriptions of flora and fauna and related specimen and drawings collections, natural history museums usually resort to manual enrichment methods such as full-text transcription or keyword tagging (Ridge 2014; Franzoni et al. 2014). Often these methods rely on crowdsourcing, where online volunteers annotate pages with unstructured textual labels (Field Book Project 2016). More recently, curators of archives, data scientists and historians have started to experiment with semi-automatic annotation systems for historical manuscript collections such as the MONK system (Schomaker et al. 2016). Since MONK is a supervised learning system, a large amount of properly recognized textual labels is necessary to safeguard the system’s recognition abilities.

Thus, although such practices have the potential to yield high-quality data, merely annotating pages with unstructured textual labels raises two problems. First, without suggestions driven by semantic knowledge, it will be hard for volunteers or a machine to start annotating handwritten pages. Not only in the context of our case study, which deals with field notes written in early nineteenth-century insular Southeast Asia, but also in the context of other manuscript collections, one needs a thorough knowledge of paleography, and historical and taxonomic background information (Causer and Terras 2014). Semantics can aid the annotation process when dealing with ambiguity or provide suggestions in cases where words are hard to read and too few example instances are available. For instance, when a field note describes an expedition in East Java, a species of frog from West Celebes can be ruled out. Second, unstructured textual annotation will eventually result in an inefficient search process on the side of the user. Traditional keyword-based search leads to many irrelevant results or requires specific prior knowledge regarding the content. To answer more general and expressive queries, semantic relations between annotations need to be considered as well (Elbassuoni et al. 2010).
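The pruning step in the East Java example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the taxon names, the regional ranges and the helper `candidate_species` are hypothetical placeholders, not data from the field notes or part of the proposed system. The sketch shows how a known collecting locality narrows the candidate labels for a word that is hard to read.

```python
# Hypothetical toy knowledge: the regions each taxon is known from.
# Names and ranges are illustrative placeholders, not real occurrence data.
KNOWN_RANGES = {
    "frog species A": {"East Java", "West Java"},
    "frog species B": {"West Celebes"},
    "frog species C": {"East Java"},
}


def candidate_species(locality, ranges=KNOWN_RANGES):
    """Return the taxa whose known range includes the collecting locality."""
    return sorted(name for name, regions in ranges.items() if locality in regions)
```

Calling `candidate_species("East Java")` keeps only the taxa recorded for that region, so an annotator or a recognizer has fewer labels to choose from.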

In order to help solve such problems, this paper argues for the development and application of a semantic model for semi-automatic semantic annotation. The model aggregates existing metadata standards and ontologies, following the Linked Data principles, and prepares them for semantically annotating and interpreting the Named Entities (NEs) in the field notes of digitized natural history collections.10

The case study of this paper is a collection of 8000 field notes gathered by the Committee for Natural History of the Netherlands Indies (Natuurkundige Commissie voor Nederlandsch-Indië, further referred to by the acronym NC). In the first half of the nineteenth century, naturalists of the NC charted the natural and economic state of the Indonesian Archipelago and returned a wealth of scientific observations which are now stored in the archives and depot of Naturalis Biodiversity Center in Leiden (Mees 1994; Klaver 2007). An in-depth historical analysis reveals that Heinrich Kuhl (1797-1821), Johan Coenraad van Hasselt (1797-1823) and other travelers of the NC use the following NEs to structure their field notes (see illustration displaying a bundle of NC field notes) while traveling in insular Southeast Asia: collecting localities, dates, collectors’ names, taxonomic names, and references to other printed or handwritten sources. Kuhl and Van Hasselt, for instance, regularly use the illustrations of printed works such as the Voyage de découvertes aux terres australes (1807-1816) by M. F. Péron as a visual point of reference for their field note descriptions. While links to published resources can be easily established by linking them to domain-specific repositories of digitized books such as the Biodiversity Heritage Library (BHL), collection localities, taxonomic names and collectors’ names are more difficult to process.

10 The project Semantic Blumenbach pursues a similar direction, but with a focus on published material (Wettlaufer et al. 2015).

In order to be able to identify, annotate and interlink such NEs in a semi-automatic way, this paper proposes the implementation of a Knowledge Base (KB). The KB has two goals. First, the underlying data structure of the KB enables cross-matching of resources within and across field note collections. In order to realize this function, a lightweight application ontology written in RDF11 and OWL12 is suggested that serves as a schema to semantically structure the KB. It expresses species observations, ensures their provenance in relation to the digitized field notes and builds on existing metadata and ontology standards. Entities in turn are described using uniform resource identifiers (URIs). This allows for an integration of the field note annotations into the web of Linked Data (LD) and ensures interoperability with other digital collections (Hallo et al. 2016). Second, the logical characteristics of the properties in the ontology enable a reasoner system to suggest possible NEs. In order to provide possible labels regarding these NEs, the KB is prepopulated with lists extracted from thesauri, gazetteers, and taxonomies. As regards collection localities we, for instance, draw upon the GEOnet Names Server (GNS), a large semantically structured database containing historical and present-day geographical locations in insular Southeast Asia. Biological species names can be drawn from the Linnaean taxonomy of species, which was already well established at the time of the NC (Farber 2000; Beckman 2012). As regards person names we rely on the database Cyclopedia of Malaysian Collectors, which M. J. van Steenis-Kruseman compiled in the 1960s and 1970s.13 Taken together, by prompting users to annotate with terms from the KB, a semantic network of annotations is formed that is able to improve the quality of the annotations and bootstrap the annotation process. The ontology and an implementation of the KB based on our case study, together with possibilities regarding supported querying and reasoning techniques, will be discussed in more detail during the presentation.
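The two roles of the KB described above, suggesting labels from prepopulated lists and recording an observation with its provenance as URI-based statements, can be sketched as follows. Everything named here is an assumption made for illustration: the `http://example.org/nc/` namespace, the property names and the short label lists stand in for the actual ontology, gazetteer and collector lists, none of which are specified in the abstract. The sketch uses plain tuples rather than an RDF library.

```python
import difflib

# Hypothetical namespace; the real ontology URIs are not given in the abstract.
EX = "http://example.org/nc/"

# Illustrative label lists standing in for gazetteer and collector lists.
KB_LABELS = {
    "locality": ["Buitenzorg", "Bantam", "Batavia"],
    "collector": ["Kuhl", "Van Hasselt"],
}


def suggest(partial_reading, kind, cutoff=0.5):
    """Suggest KB labels of the given kind that resemble a partly legible word."""
    return difflib.get_close_matches(partial_reading, KB_LABELS[kind], n=3, cutoff=cutoff)


def observation_triples(obs_id, page_id, locality, collector):
    """Express one species observation as (subject, predicate, object) triples."""
    subject = EX + "observation/" + obs_id
    return [
        # Provenance: the digitized page on which the observation was written.
        (subject, EX + "recordedOn", EX + "page/" + page_id),
        (subject, EX + "hasLocality", EX + "locality/" + locality),
        (subject, EX + "hasCollector", EX + "person/" + collector.replace(" ", "_")),
    ]
```

A misread word such as "Buitenzorq" would be matched against the locality list by `suggest`, and the accepted label would then be stored with `observation_triples`, so that every annotation stays linked to its source page.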

Bibliography

Beckman, J. “The Swedish Taxonomy Initiative: Managing the Boundaries of ‘Sweden’ and ‘Taxonomy’.” In Scientists and Scholars in the Field: Studies in the History of Fieldwork and Expeditions, edited by K. H. Nielsen, H. Harbsmeier, and Ch. J. Ries, 395–414. Aarhus: Aarhus University Press, 2012.

Bourguet, M.-N. “A Portable World: The Notebooks of European Travellers (Eighteenth to Nineteenth Centuries).” Intellectual History Review 20, no. 3 (2010): 377–400.

Causer, T. and M. Terras. “‘Many Hands Make Light Work. Many Hands Together Make Merry Work’: Transcribe Bentham and Crowdsourcing Manuscript Collections.” In Crowdsourcing Our Cultural Heritage, 57–88. Surrey: Ashgate, 2014.

Eddy, M. D. “The Interactive Notebook: How Students Learned to Keep Notes during the Scottish Enlightenment.” Book History 19, no. 1 (2016): 86–131.

Elbassuoni, S., M. Ramanath, R. Schenkel, and G. Weikum. “Searching RDF Graphs with SPARQL and Keywords.” IEEE Data Eng. Bull. 33, no. 1 (2010): 16–24.

Farber, P. L. Finding Order in Nature: The Naturalist Tradition from Linnaeus to E. O. Wilson. Baltimore, Md.: Johns Hopkins University Press, 2000.

Field Book Project, Smithsonian National Museum of Natural History: http://naturalhistory.si.edu/fieldbooks/ [accessed 15 February 2017].

Franzoni, Ch. and H. Sauermann. “Crowd Science: The Organization of Scientific Research in Open Collaborative Projects.” Research Policy 43, no. 1 (2014): 1–20.

11 https://www.w3.org/RDF/ [accessed February 15, 2017].

12 https://www.w3.org/OWL/ [accessed February 15, 2017].

13 The database is available online: http://www.nationaalherbarium.nl/FMCollectors/ [accessed February 15, 2017].

GEOnet Names Server, http://geonames.nga.mil/gns/html/ [accessed February 15, 2017].

Hallo, M., et al. “Current State of Linked Data in Digital Libraries.” Journal of Information Science 42, no. 2 (2016): 117–127.

Heerlien, M., J. Van Leusen, S. Schnörr, S. De Jong-Kole, N. Raes, and K. Van Hulsen. “The Natural History Production Line: An Industrial Approach to the Digitization of Scientific Collections.” J. Comput. Cult. Herit. 8, no. 1 (February 2015): 3:1–3:11.

Klaver, Ch. J. J. Inseparable Friends in Life and Death: The Life and Work of Heinrich Kuhl (1797-1821) and Johan Conrad van Hasselt (1797-1823), Students of Prof. Theodorus van Swinderen. Groningen: Barkhuis, 2007.

Mees, G. F. and C. van Achterberg. “Vogelkundig onderzoek op Nieuw Guinea in 1828: terugblik op de ornithologische resultaten van de reis van Zr. Ms. Korvet Triton naar de zuidwestkust van Nieuw-Guinea.” Zoologische Bijdragen 40 (1994): 3–64.

Péron, F., N. Baudin, L. C. Desaulses de Freycinet, Ch. Alexandre Lesueur, and N.-M. Petit. Voyage de Découvertes Aux Terres Australes. Paris: De l’Imprimerie impériale, 1807.

Pethers, H. and B. Huertas. “The Dollmann Collection: A Case Study of Linking Library and Historical Specimen Collections at the Natural History Museum, London.” The Linnean 31, no. 2 (2015): 18–22.

Ridge, M. (ed.). Crowdsourcing Our Cultural Heritage. Farnham: Ashgate, 2014.

Schomaker, L., A. Weber, M. Thijssen, M. Heerlien, A. Plaat, S. Nijssen, et al. “Making Sense of Illustrated Handwritten Archives.” In Book of Abstracts, Digital Humanities Conference 2016 Krakow, 764–66, 2016.

Svensson, A. “Global Plants and Digital Letters: Epistemological Implications of Digitising the Directors’ Correspondence at the Royal Botanic Gardens, Kew.” Environmental Humanities 6 (2015): 73–102.

Wettlaufer, J., Ch. Johnson, M. Scholz, M. Fichtner, and S. Ganesh Thotempudi. “Semantic Blumenbach: Exploration of Text–Object Relationships with Semantic Web Technology in the History of Science.” Digital Scholarship in the Humanities 30, Suppl. 1 (December 1, 2015): 187–98.

3. Linked cultural events: Digitizing past events and its implications for analyzing and theorizing the ‘creative city’

Harm Nijboer (Huygens ING)
Claartje Rasterhoff (University of Amsterdam)

Introduction

This paper introduces ‘linked cultural events’ (LCE) as a novel methodological framework that allows for the systematic analysis of cultural expressions in their urban context. The events-based approach is inspired by datasets developed in the research program CREATE: Creative Amsterdam: An E-Humanities Perspective (University of Amsterdam, 2014-present).14 In this program, the cultural sectors of the performing arts take up a particularly prominent position, as data on, for instance, music, theatre and cinema programming is available in various formats. In terms of methodology, the data on performing arts allows us to move beyond biographical data on producers, and to develop a methodological framework in which different data types can be studied in conjunction. The framework of LCE has two main characteristics: 1) it posits cultural events as analytical units with structural properties and linkages to actors, institutions and urban properties (linked cultural events); and 2) it is connected to a data structure which allows for querying the connections between these units of analysis (linked data). In this paper, we discuss how the concept of ‘linked events’ can be used to map and analyse urban cultural life.

14 www.create.humanities.uva.nl.

Events and the city

Studies in the social sciences and humanities offer valuable insights into conditions and mechanisms favorable to creativity and innovation, emphasizing for instance the role of agglomeration, and labour mobility and diversity.15 Little to no attention is being paid to what actually makes cities come to life: the cultural expressions themselves, and in particular events such as exhibitions, concerts, plays, and publications. Recent historical research has, furthermore, emphasized the limits of such generalizations on sources of creativity, stressing the importance of time- and place-specific characteristics and circumstances.16

The events-based approach may help to address some of these issues. Much has been written about how events should be conceptualized and about the role of events in studying and writing history.17 Moreover, theoretical and conceptual thinking about events is not limited to historiography but extends to the fields of action theory in philosophy and social theory.18 Events also feature as devices in structuring heritage data and as building blocks for online reconstructions of historical narratives.19 Data sets of events over time have, moreover, been used in the broader field of data analytics, for instance in event-based network analyses, to add temporality and dynamism to otherwise static information systems.20 Building on insights from these different lines of research, we emphasize that networks of events should also be considered as units of analysis.

Linked cultural events

A large number of contemporary social theorists reject the notion of a cultural act or event as an expression (or representation) of a given culture. Instead, culture should be understood as a collection of performative acts or events.21 By performativity we mean that an event calls or recalls something (a piece of art, a cultural code or trait) into being. A play, for instance, must be performed (staged, read, remembered) to be there. Events, moreover, do not occur in isolation. Each event involves the actions and/or presence of a number of entities. These entities can be human agents (e.g. performers and spectators), non-human agents (organizations), material objects (places, artifacts, etc.) or immaterial objects (concepts, code). And these entities are in their turn likely to be involved in other events as well. Already in 1964 the ethnolinguist Dell Hymes defined ‘communities’ as ‘systems of communicative events’.22 In order to operationalize this interpretation for historical research, we conceptualize cultural communities as webs of linked cultural events (LCEs).

15 Cf. Marjatta Hietala and Peter Clark, ‘Creative Cities’, in: The Oxford Handbook of Cities in World History, edited by Peter Clark. Oxford: Oxford University Press 2013.

16 Ilja Van Damme and Bert De Munck (eds.), Creative Cities 1500-2000. The Historical Fabrication of Cities as Agents of Economic Innovation and Creativity, London: Routledge forthcoming.

17 Cf. Ryan Shaw, ‘A Semantic Tool for Historical Events’, Proceedings of the 1st Workshop on EVENTS: Definition, Detection, Coreference, and Representation, Atlanta, Georgia, 14 June 2013: 38–46; W. H. Sewell, ‘Historical events as transformations of structures: Inventing revolution at the Bastille’, Theory and Society 1996, 25: 841-881.

18 Roberto Casati & Achille Varzi, "Events", in: Edward N. Zalta (ed.), The Stanford Encyclopedia of Philosophy (Winter 2015 Edition), http://plato.stanford.edu/archives/win2015/entries/events.

19 Cf. Victor de Boer, Johan Oomen, Oana Inel, Lora Aroyo, Elco van Staveren, Werner Helmich, Dennis de Beurs, ‘DIVE into the event-based browsing of linked historical media’, Journal of Web Semantics, 35/3: 152-158. DOI: 10.1016/j.websem.2015.06.003. See also: http://www.ehumanities.nl/events-working-group.

20 E.g. Joshua O'Madadhain, Jon Hutchins, Padhraic Smyth (2005), ‘Prediction and Ranking Algorithms for Event-based Network Data’, SIGKDD Explor. Newsl., 7(2), pp. 23-30, doi:10.1145/1117454.1117458.

21 Peter Dirksmeier & Ilse Helbrecht (2008), ‘Time, Non-representational Theory and the "Performative Turn"—Towards a New Methodology in Qualitative Social Research’, Forum: Qualitative Social Research 9(2), pp. 1-24. http://www.qualitative-research.net/index.php/fqs/article/view/385/839.

LCEs thus form an infrastructure for combining, analyzing, and visualizing existing cultural datasets in a network that exposes their relations and interdependencies, and that allows for quantitative analysis. On the level of advanced data handling, the LCE approach has a strong affinity with Semantic Web technology and the associated Linked Open Data paradigm, which have evolved into leading principles in the handling of historical and cultural heritage data in recent years. Notwithstanding the limitations and complexities of Semantic Web technology, the great practical advantage of this technology is that it enables us to connect single-resource data to external resources. This is not only important for our understanding of cultures as webs of LCEs, but also proves to be inherent to the data on, for instance, theatre and concert programs.
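The idea of a web of events that are linked through the entities they share can be made concrete with a small sketch. The event and entity identifiers below are hypothetical placeholders, not records from ONSTAGE or FELIX, and the choice to weight a link by the number of shared entities is an assumption made for illustration rather than the authors' data model.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical events with the entities (works, venues, people) involved in each.
EVENTS = {
    "performance_1": {"play_X", "theatre_A", "actor_P"},
    "performance_2": {"play_X", "theatre_A", "actor_Q"},
    "concert_1": {"hall_B", "composer_R"},
}


def entity_index(events):
    """Map each entity to the set of events it takes part in."""
    index = defaultdict(set)
    for event, entities in events.items():
        for entity in entities:
            index[entity].add(event)
    return index


def event_links(events):
    """Link two events when they share an entity; the weight counts shared entities."""
    links = defaultdict(int)
    for shared_events in entity_index(events).values():
        for a, b in combinations(sorted(shared_events), 2):
            links[(a, b)] += 1
    return dict(links)
```

The resulting weighted edge list is one simple form in which relations between events become available for quantitative analysis, for instance as input to a network analysis tool.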

Visualising and analyzing LCEs

In the final section of the paper we will use two recently developed datasets to present analyses of linked cultural events. ONSTAGE (Online Datasystem of Theatre in Amsterdam in the Golden Age) contains information on the repertoire, performances, popularity and revenues of the cultural program in Amsterdam’s public theatre during the period 1637-1772.23 The FELIX: Felix Meritis Programming Database stores and links data on concerts held in the famous Amsterdam concert hall Felix Meritis between 1832 and 1888.24 In these datasets linkages have been created to, for instance, genre characteristics and biographical data in external international resources such as VIAF (Virtual International Authority File) and the data branches of the Wikipedia family (DBpedia and Wikidata). By linking data on plays and concerts to these resources, a wealth of external data on artefacts and actors enriches our local resources, and we make our local data available in a global context.

Visualizing webs of LCEs over time requires techniques that go beyond the standard features of visualization tools and libraries. The challenge in visualizing such networks is that we have to account for both multimodality and time. In our paper, we explore the possibilities and limitations of visualizations by looking at the networks behind the reception of the French playwright Molière in Dutch theatre in the 17th and 18th centuries. This treatment of the international linkages of local theatrical performances effectively shows how operationalizing cultural life through the concept of, and data on, linked cultural events may assist researchers in 1) mapping cultural life in both quantitative and qualitative ways, and 2) analysing the organisation of cultural life beyond a single event or fixed network of local actors.

22 Dell Hymes (1964), ‘Introduction: Toward ethnographies of communication’, American Anthropologist 66 (6-II), pp. 1-34, p. 13. https://www.jstor.org/stable/668159.

23 http://www.vondel.humanities.uva.nl/onstage. Kim Jautze, Frans Blom, Leonor Álvarez Francés (2016), ‘Spaans theater in de Amsterdamse Schouwburg (1638-1672). Kwantitatieve en kwalitatieve analyse van de creatieve industrie van het vertalen’, De Zeventiende Eeuw. Cultuur in de Nederlanden in interdisciplinair perspectief 32 (1), pp. 12–39. DOI: http://doi.org/10.18352/dze.100006; Kim Jautze, ‘ONSTAGE! Presentation at the Conference for “Werkgroep voor de Zeventiende Eeuw” in Nijmegen, 29 August 2015’, EMagazine eHumanities Royal Netherlands Academy of Arts and Sciences 6, http://ehumanities.leasepress.com/emagazine-6/recent-events/onstage.

24 Mascha van Nieuwkerk and Harm Nijboer, ‘Nineteenth century concert programs in a digital research environment; the case of Felix Meritis’. Poster presentation at DHBenelux Belval, 9-10 June 2016. http://www.dhbenelux.org/wp-content/uploads/2016/05/60_nieuwkerk_nijboer_FinalAbstract_poster.pdf

Session C

1. The Quest for Questions in Digital History: A Comparative View on Werner- and Delors Report on Economic and Monetary Union

Florentina Armaselu and Elena Danescu

1. Introduction

In The Formation of the Scientific Mind, Bachelard (2002: 25) considers “the sense of the problem” as the core of the construction of scientific knowledge: “all knowledge is an answer to a question”. Within the framework of Digital Humanities and text analysis, Ramsay (2003: 171, 173) proposes the term “algorithmic criticism”, implying a way to assess, beyond hypothesis validation, “how successful the algorithms were in provoking thought and allowing insight”. In the context of Digital History (Seefeldt and Thomas, 2009) and language study in history (Bertrand et al., 2011), this proposal deals less with how textual analysis confirms or disconfirms previous hypotheses and more with how digital tools help articulate research questions and foster new paths for interpretation. The analysed texts, the Werner and Delors reports, were selected for their contrastive, comparative potential and their importance in the history of Economic and Monetary Union (EMU).

2. Thematic snapshot: Werner- and Delors Report

At the Hague Summit (December 1969), an expert committee chaired by Pierre Werner (Prime Minister of Luxembourg) was set up to explore the progress towards EMU in the European Community (EC). The result was the Werner report (1970), which offered a full definition of EMU (3 stages over 1971–80). Its goals were to achieve irreversible convertibility between the Member States’ currencies, the complete liberalization of capital movements, the irrevocability of exchange rates, and even a single European currency. Two main principles underpinned this report: gradual realization of EMU and parallelism between economic and monetary convergence. In 1974 the Werner report was suspended. In 1988 a committee charged with the study of EMU was set up, chaired by Jacques Delors (President of the European Commission). The result was the Delors report (1989), which appropriated the overall philosophy and structure of the Werner report.

3. Methodology

The quest for questions started as a comparison of the documents, using the corpus analysis framework TXM. TXM was chosen for its contrastive potential via the specificities feature, which highlights what properties are specific, as overuse or deficit (Lafon, 1981), to a part versus the rest of a corpus.

The corpus contains the reports in txt format, as a whole and as fragments (numbered parts and sections) in separate files. The files were imported into TXM (TXT+CSV) and tagged via TreeTagger (French). Partitions were created for the entire reports and for the parts/sections. The analysis was based on:

• lexical table and specificities, noun-adjective query: [frpos="NOM.*"][frpos="ADJ.*"] (Vmax=500, Edit=frlemma);

• specificities, part of speech (frpos).
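To show what the noun-adjective query selects, the following sketch applies the same two-token pattern to a short list of tagged tokens outside TXM. The sample tokens are invented for illustration and the tags only imitate the TreeTagger French tagset; the function `noun_adjective_pairs` is a hypothetical helper, not part of TXM. Each tag pattern is matched against the whole tag value, and the lemma pairs are returned, corresponding to the frlemma property chosen for the lexical table.

```python
import re

# Tokens as (word, pos, lemma) triples; illustrative sample, not taken from the corpus.
TOKENS = [
    ("l'", "DET:ART", "le"),
    ("union", "NOM", "union"),
    ("monétaire", "ADJ", "monétaire"),
    ("et", "KON", "et"),
    ("la", "DET:ART", "le"),
    ("politique", "NOM", "politique"),
    ("économique", "ADJ", "économique"),
]


def noun_adjective_pairs(tokens):
    """Return lemma pairs for every noun token immediately followed by an adjective."""
    pairs = []
    for (_, pos1, lemma1), (_, pos2, lemma2) in zip(tokens, tokens[1:]):
        # Each pattern must match the complete part-of-speech tag.
        if re.fullmatch(r"NOM.*", pos1) and re.fullmatch(r"ADJ.*", pos2):
            pairs.append((lemma1, lemma2))
    return pairs
```

Counting these pairs per report or per section would give the rows of a lexical table on which specificity scores can then be computed.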

The selection of properties was driven by specificity scores higher than the TXM default banality threshold (+/- 2.0) and by relevance. The derived diagrams were used to formulate research questions.
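How a specificity score relates to the banality threshold can be illustrated with a minimal sketch of the hypergeometric model commonly associated with Lafon's specificities. This is an assumption-laden illustration, not TXM's implementation: the function names are hypothetical, the score is taken here as the signed base-10 order of magnitude of the cumulative probability, and TXM may differ in detail.

```python
import math


def specificity(f, F, t, T):
    """Signed specificity score of a word in one part of a corpus.

    f: occurrences in the part, F: occurrences in the whole corpus,
    t: size of the part, T: size of the corpus (all counted in tokens).
    """
    expected = F * t / T
    if f > expected:
        ks = range(f, min(F, t) + 1)  # overuse: probability of f or more
        sign = 1
    else:
        ks = range(0, f + 1)          # deficit: probability of f or fewer
        sign = -1
    # Hypergeometric counts kept as exact integers to avoid float underflow.
    favourable = sum(math.comb(F, k) * math.comb(T - F, t - k) for k in ks)
    log10_p = math.log10(favourable) - math.log10(math.comb(T, t))
    return -sign * log10_p


def is_specific(score, threshold=2.0):
    """True when the score lies outside the banality band around zero."""
    return abs(score) > threshold
```

Under these assumptions, a word whose occurrences all fall in one half of a corpus receives a positive score above the threshold, while a word spread evenly over both halves stays within the banality band and would not be selected.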

4. TXM Analysis and Questions Formulation

Figure 1 shows a selection of concepts defining the “monetary” aspect of EMU as reflected by the Werner and Delors reports. From the specificity scores and diagram, the following question was formulated:

Q1: How do monetary matters (currency, budgetary and fiscal topics) differ in the Werner and Delors reports?

Despite the apparent similarity between the two reports, the diagram shows a contrast within several notions (inter-Community margins, Community currencies versus monetary union).

Fig. 1. Specificities: EMU “monetary” aspect, Werner-Delors report (RWe-RDe) (whole view)

More details on the differences are provided by TXM co-occurrences: “monnaie communautaire” with “fluctuation”, “intervention”, “marge(s)”, focusing on the monetary stabilisation process (RWe); “union monétaire” with “convertibilité totale et irreversible”, emphasising the “monnaie unique” objective (RDe).

Figure 2 illustrates other oppositions related to the “economic” aspect of EMU and the question:

Q2: How do economic matters (economic policy, market) differ in the Werner and Delors reports?

Fig. 2. Specificities: EMU “economic” aspect, RWe-RDe (whole view)

The process/objective distinction may be further observed via TXM co-occurrences: “politique économique” with “convergence”, “coordination”, “centre de decision” (RWe); “marché intérieur/unique” with “programme d’achèvement”, and “déséquilibre économique” with “corriger” (RDe).

A similar analysis applied to the parts/sections corpus provided further incentives for enquiries on terminological and “actors”-related matters (Q3, Q4).

Q3: Can we speak of an evolution of the EMU terminology between 1970 and 1989 as reflected by the structure of the two documents?

Q4: What influence did the structure of the Werner Committee membership (mainly politicians) and of the Delors Committee (mainly central bankers) have upon the construction of EMU? What terminology for what people at what moment?

In the European integration process, many concepts evolved from hypotheses to reality between 1970 and 1989. This is why some terms (central bank, European system, intra-Community margins, monetary union) are over- or under-represented in certain sections. The structure of RWe reflects the distribution of roles: politicians designed the scope, elements and stages of the EMU process (main part); central bankers (appendix 5) set up the technicalities of the European currency and the architecture of the ESCB (European System of Central Banks) (Fig. 3, 4).

Fig. 3. Specificities: EMU “monetary” aspect, RWe-RDe (structure view)

Q5: Were the Monetary Union and the Economic Union processes really designed on a symmetrical and simultaneous basis? A process granularity analysis.

In RDe, the economic union process is described in less detail than the monetary union process. This may be inferred from the document sections showing high specificity scores for these terms (Fig. 3, 4).

Fig. 4. Specificities: EMU “economic” aspect, RWe-RDe (structure view)

The analysis of specificities computed according to the part of speech (frpos) revealed other salient oppositions related to the use of verbal forms, adjectives, pronouns and citation marks (Fig. 5, 6).

Fig. 5. Specificities: part of speech, RWe-RDe (whole view)

Fig. 6. Specificities: part of speech, RWe-RDe (structure view)

One can observe a dominance of the future verbal form in RWe versus conditionals in RDe, leading to:

Q6: What is behind the range of verbal forms in the Werner and Delors reports? Decoding hidden political meanings and national interests.

Q7: What is the degree of certainty and inter-conditionality between the single market and EMU?

RWe defined a ten-year projection for the EMU process, while RDe built upon its first-stage achievements but in an uncertain environment. This may prompt further investigation of the usage of verbal forms (Q6, Q7).

5. Conclusions

The proposal revisits traditional methodologies in contemporary history and DH from an epistemological perspective: the use of comparative textual analysis to formulate research questions.

The first experiments with two crucial EMU documents suggest that digital tools may serve as validators of hypotheses or conclusions but also as means of discovering exploration paths in the construction of new knowledge.

References

Bachelard, Gaston. The Formation of the Scientific Mind: A Contribution to a Psychoanalysis of Objective Knowledge, Clinamen Press, 2002.

Bertrand, Jean-Marie. Boilley, Pierre. Genet, Jean-Philippe. Schmitt-Pantel, Pauline (éditeurs). Langue et Histoire, Paris, Publication de la Sorbonne, 2011.

Lafon, Pierre. 1980. “Sur la variabilité de la fréquence des formes dans un corpus”, In Mots. Saussure, Zipf, Lagado, des méthodes, des calculs, des doutes et le vocabulaire de quelques textes politiques, N°1, pp. 127-165. http://www.persee.fr/web/revues/home/prescript/article/mots_0243-6450_1980_num_1_1_1008.

Ramsay, Stephen. “Special Section: Reconceiving Text Analysis: Toward an Algorithmic Criticism”, Lit Linguist Computing (2003) 18 (2): 167-174. DOI: https://doi.org/10.1093/llc/18.2.167. Published: 01 June 2003.

Seefeldt, Douglas. Thomas, William G. “Intersections: History and New Media. What Is Digital History?”, In Perspectives on History, The Newsmagazine of the American Historical Association, May 2009.

TXM, Textométrie, http://textometrie.ens-lyon.fr/?lang=en.

Sources

Rapport au Conseil et à la Commission concernant la réalisation par étapes de l’Union économique et monétaire dans la Communauté (rapport Werner). Luxembourg: 8 octobre 1970, document L6.956/II/70-D. In Journal officiel des Communautés européennes, n° C 136, supplément au Bulletin 11/1970, Luxembourg, 11 novembre 1970.

Rapport sur l'Union économique et monétaire dans la Communauté européenne (rapport Delors). 12 avril 1989. In Europe Documents. Bruxelles, 20 avril 1989, n° 1550/1551.

2. Transparency as Rupture: Open Data and the Datafied Society of Hong Kong

Rolien Hoyng, Lingnan University

This paper deals with Open Data and the datafication of governance in Hong Kong. It addresses contestations over “transparency” as a techno-political construction that is embodied in, and performed by, the infrastructures and techniques of data-driven governance. Transparency is a site of negotiating distributions of cognition and perception in the context of transformations of citizenship and governance in the datafied society. I specifically inquire into the infrastructures, protocols, techniques, and practices of Open Data, which promises to simultaneously enhance government accountability and stimulate data-driven “smart” governance. Accordingly, I look at techno-political organizations of digital data and data infrastructures that support particular modes and distributions of cognition and perception (Halpern 2014; Hayles 2014; Kitchin 2014). I distinguish two data regimes, respectively revolving around “representation” and “prediction.” I review these issues in relation to the question of globalization. The case study of Hong Kong suggests that datafication does not result in globally homogeneous cybernetic control. Rather, the process of adapting Open Data is (structurally) incomplete, disruptive, and disrupted in the encounter with residual rationalities of statecraft, which means it opens up a field of struggle and contestation. In this paper, “disruption” functions as a methodological device to explore the politics of Open Data and datafication at large. Rather than appropriating disruption as a revelatory moment undoing the “black-boxing” of technology per se, my aim is to rethink the politics of transparency and secrecy in more complex terms and inquire into the possibility of activism and intervention (Birchall 2015).

I deploy mixed methods including interviews with actors and analyses of policy documents and technical literature as well as material architectures, formats, protocols, interfaces, and data visualizations. On the basis of examples including the data.gov.hk website and fintech apps, I argue that the two data regimes of “representation” and “prediction” enact particular “fields” of visibility: organized articulations of strategies, techniques, and discourses (Halpern 2014). First, the data regime of “representation” provides cognition and perception in terms of oversight and transparency. Ordering data (capturing, aggregating, and organizing) forms part and parcel of ordering society. Data forms evidence for what exists “out there” and affords referential, descriptive capability. Hence, it is supposed to assist in the production of knowledge and truth. Second, the data regime of prediction, which is afforded by digital data processing techniques and infrastructure, orients perception and cognition toward the diagnosis of potential and the prediction of tendencies. Data is generated without a specific question or purpose in mind. Rather than depicting the world, at stake is modeling the world for tactical interventions in shifting patterns and trends (Andrejevic 2013). Distribution of this mode of perception and cognition induces society’s mediation by algorithmic data processing techniques.

Rather than recognizing data regimes in an ideal-typical fashion, my main question addresses the processes of adaptation and the contradictions that emerge due to the intersecting of data expediency. This focus underscores Open Data’s paradox of promising fortified transparency and accountability, while simultaneously advancing covert forms of modulation, control, dataveillance, and concentrations of cognition. For instance, citizen-consumers as users of apps are interpellated into positions that seemingly democratize predictive perception and cognition, yet they are simultaneously subjected to dataveillance and algorithmic governance. However, the datafied society does not present itself as a fait accompli, in other words, fully operational and all-encompassing. Rather, it is marked by (experienced) failure, disruption, and deferment; it generates contradictions, interferences, and articulations between co-existing data regimes and multifarious political rationalities (cf. Chan 2013). These moments might offer possibilities for imagining more radical notions of transparency and secrecy.

If transparency and secrecy are co-constituted, the question is what escapes the particular constructions of transparency in Open Data (Birchall 2015). For instance, if the government opens up certain datasets, does this enable public scrutiny of statecraft or does it merely benefit the expansion of what Easterling (2015) calls extrastatecraft – now by means of data-driven apparatuses belonging to institutions that do not open their own proprietary datasets? How do data and data infrastructures mediate citizens’ relation to private-public governance? To what extent are Open Data activists able to not just reclaim public scrutiny over statecraft but radicalize transparency, for instance by introducing uncontrollable data motility and reversible transitions between data and information? Following a more speculative turn, should transparency always be the goal, or does secrecy have its merits too in order to intervene in the effects of predictive perception and cognition on society?


References

Andrejevic, Marc. 2013. Infoglut: How Too Much Information Is Changing the Way We Think. New York: Routledge.

Birchall, Clare. 2015. “‘Data.gov-in-a-box’: Delimiting Transparency.” European Journal of Social Theory 18(2):

Bratton, Benjamin. 2015. The Stack: On Software and Sovereignty. Cambridge MA: MIT Press.

Chan, Anita Say. 2013. Networking Peripheries: Technological Futures and the Myth of Digital Universalism. Cambridge MA: MIT Press.

Easterling, Keller. 2015. Extrastatecraft: The Power of Infrastructure Space. London: Verso.

Halpern, Orit. 2014. Beautiful Data: A History of Vision and Reason since 1945. Durham: Duke University Press.

Hayles, N. Katherine. 2014. “Cognition Everywhere: The Rise of the Cognitive Nonconscious and the Costs of Consciousness.” New Literary History 45(2): 199-220.

Kitchin, Rob. 2014. The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences. Los Angeles: Sage.

Ong, Aihwa. 2006. Neoliberalism as Exception. Durham: Duke University Press.

3. Oral history online – User perspectives and behavior in a transforming WW2 memory culture

Dr. Susan Hogervorst | Open Universiteit Nederland / Erasmus University Rotterdam

Since the 1980s, Second World War (WW2) memory culture has been increasingly characterized by the impending disappearance of the eyewitness generations. One way in which this problem has been addressed is by recording eyewitness testimonies. By now, multiple oral history collections have been created throughout the western world, in which tens of thousands of interviews have been preserved on audio and video (Apostolous and Pagenstecher 2013; Keilbach 2013). Currently, we see a shift from collecting and preserving testimonies to disclosing them for wider audiences (Scagliola and F. de Jong 2014; S. de Jong 2013; Bothe and Lücke 2013). This is partly due to technological developments, but also to the dynamics of WW2 memory culture, of which transmitting eyewitness memories to younger generations has become a key feature (Wieviorka 2006; Erll and Rigney 2009; Hogervorst 2010; Sabrow and Frei 2012). How will the disappearance of the eyewitness generations affect the transmission of eyewitness memories, and what role do online interview collections play in this process?

Methods and data

The aim of my postdoc research project is to provide substantiated data on this matter, partly acquired by analyzing content, use, and users of an online video interview collection: the Dutch web portal ‘Getuigenverhalen.nl’. This portal gives access to circa 500 video interviews with eyewitnesses about different WW2 related topics. I deploy this interview collection as a digital barometer of current, transforming memory culture: through the web statistics, an online questionnaire, focus group interviews with (student) history teachers, and screen recordings of their interaction with the website while selecting interview fragments they would use in a lesson about WW2. This innovative approach not only underlines the value of incorporating digital sources and methods into the field. It also enables foregrounding the user and his/her agency in the analysis of the rather top-down, officially supported process of WW2 memory transmission, while users often remain elusive in traditional memory studies (and historical) research (Kansteiner 2002; Erll and Rigney 2009).

Findings and discussion

The findings indicate both continuity and change regarding the use of WW2 video interviews compared to live testimonies in classrooms. The portal is first and foremost an online archive; it is not explicitly meant as an educational tool. Inquiries in the educational field in the Netherlands point to an interest in, but also an unfamiliarity with, such collections and their didactical possibilities. This is confirmed by both the questionnaire and the web statistics. Only a few respondents identify themselves as teacher or student, and relatively many do not find what they were looking for. Site search is not used often, and the most watched interviews are the ones highlighted on the homepage. Two focus group interviews with an international group of student history teachers offered an opportunity to get a closer and more in-depth view on the portal’s use, and – to a scholar of cultural memory more importantly – on users’ selection criteria of relevant material out of the quite abundant reservoir of eyewitness testimonies available. First, according to the participants, a suitable interview fragment should ‘bring the past closer’ (which was to be achieved in different manners). Indeed, the participants could quite easily find fragments that suited these purposes. Second, suitable fragments should (according to the participants) confirm existing historical knowledge. The latter indicates a more fundamental, both ethical and epistemological view on the position and value of eyewitnesses and their functioning as sources of historical knowledge.

Both aspects correspond to the way participants would use live testimonies in their lessons. The plenary evaluation of the selected fragments and the criteria used pointed to an important difference: the distance through the screen. This distance enabled raising critical questions about the nature and value (reliability) of oral testimonies, which is rather uncommon in settings in which eyewitnesses are physically present. Another characteristic of searchable interview collections, that they enable comparing different testimonies and experiences, and therewith supplement or challenge (besides confirm and illustrate) existing historical knowledge and perspectives, was not mentioned by the participants. This might point to the fact that working with digital testimonies is still in an early stage in the Netherlands. Currently, in Germany, Austria, and the United States, educational projects are developed around WW2 video testimonies. Besides recommendations for improvements of the Dutch portal website, this study contributes to a more critical and didactically more relevant interaction with testimonies in Dutch history education, as well as to a better understanding of current transforming memory culture.

References

Apostolous, N. and C. Pagenstecher (eds.), Erinnern an Zwangsarbeit. Zeitzeugen-Interviews in der digitalen Welt (Berlin 2013).

Bothe, A. and M. Lücke, ‘Shoah und historisches Lernen mit virtuellen Zeugnissen’, P. Gautschi et al. (eds.), Shoa und Schule. Lehren und Lernen im 21. Jahrhundert (Zürich 2013) 55-74.

Erll, A. and A. Rigney, Mediation, remediation, and the dynamics of cultural memory (Berlin/New York 2009).

Hirsch, M. and L. Spitzer, ‘The witness in the archive. Holocaust studies/memory studies’, Memory Studies vol. 2 (2009) 2, 151-170.

Hogervorst, S., Onwrikbare herinnering. Herinneringsculturen van Ravensbrück in Europa, 1945-2010 (Hilversum 2010).

Huijgen, T., & Holthuis, P. (2016). Dutch voices: exploring the role of oral history in Dutch secondary history teaching. In D. Trškan (Ed.), Oral history education: dialogue with the past. (1 ed., Vol. 1, pp. 43-58). Ljubljana: Slovenian National Commission for UNESCO.

Jong, S. de, ‘Im Spiegel der Geschichten: Objekte und Zeitzeugenvideos in Museen des Holocaust und des Zweiten Weltkrieges’, WerkstattGeschichte 62 (2013) 19-41.

Kansteiner, W., ‘Finding meaning in memory. A methodological critique of collective memory studies’, History and Theory 41 (2002) 179-197.

Keilbach, J., ‘Collecting, Indexing and Digitizing Survivors. Holocaust Testimonies in the Digital Age’, A. Bangert et al. (eds.), Holocaust Intersections. Genocide and Visual Culture at the New Millennium (London 2013) 46-63.

Scagliola, S. and F. de Jong, ‘Clio’s talkative daughter goes digital’, R. Bod et al. (eds.), The Making of the Humanities, Volume III: The Modern Humanities (Amsterdam 2013) 511-526.

Sabrow, M. and N. Frei (eds.), Die Geburt des Zeitzeugen nach 1945 (Göttingen 2012).

Wieviorka, A., The era of the witness (Ithaca 2006).


Session D

1. Collections as networks, Uncovering information exchanges and information networks in the collections of the Meertens Institute (KNAW)

Douwe A. Zeldenrust, Meertens Institute (KNAW)

This paper is about uncovering information exchanges and information networks in humanities research collections. Most humanities researchers focus on obtaining data from research collections, without realizing that those collections can also be seen as the results of epistemological experiments. That is: every collection is the outcome of the process of gathering information and therefore interconnected with the presuppositions, foundations and the activities that have led to the knowledge it contains. Charles Jeurgens (2012) states about this connection that: ‘(…) understanding that bond has to precede understanding the records’ (p. 51). The ‘bond’ Jeurgens writes about is not only dictated by the goal(s) forming the collection, but is also determined by (among other things) the cultural, administrative, scientific and social climate. Moreover, it is dependent on the individuals who were collecting, their scientific experience, their interests and personalities.

The paper will reflect on the issues of extracting, visualising and processing this context information, using the concept of ‘deep networks’. Charles van den Heuvel introduced this concept in his article ‘Mapping knowledge exchange in early modern Europe: intellectual and technological geographies and network representation’ (2015). The concept allows the contextualisation of networks and the visualisation of uncertainty while creating layers of historical sources in multiple perspectives. Furthermore, it combines pattern recognition in textual and visual big data with traditional hermeneutic methods. The vast collections of the Meertens Institute (Royal Netherlands Academy of Arts and Sciences) will be used as a use case in order to make the first steps in realizing these networks within the framework of archival studies (Meertens, 2016).

The collections of the Meertens Institute have been accumulated in a period of over 80 years and concentrate on the diversity in language and culture in the Netherlands (Jongenburger, 2013). Access to the more than 15 terabytes of data, 6000 hours of (digital) audio and 2 kilometres of archival material is provided by a recordkeeping system containing data about, amongst other things, the researchers involved in collecting. The information captured in this recordkeeping system creates the first layer of the network. A second layer of information, regarding the provenance of the collections, is extracted from the annual reports of the Meertens Institute. Those reports contain information about, for instance, the acquisition of the collections. And a third layer of information is obtained from the Biographical Portal of the Netherlands (Biografisch portaal, 2016). This online reference work contains short descriptions of the lives of persons (amongst them prominent scholars and influential managers of research institutes) who distinguished themselves or played a role of some significance in the past in the Netherlands.
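As an illustration only, the three layers described above could be combined into a single edge list along the following lines. This is a minimal sketch under stated assumptions, not the project's actual workflow: the sample names, the layer keys and the helper functions `build_edges` and `edges_to_csv` are hypothetical, and treating a delimited table as a convenient hand-off format for a visualisation tool is an assumption as well. Recording which layers attest an edge gives a crude handle on uncertainty, since a relation found in only one source is weaker evidence than one found in several.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data: every name below is invented for illustration.
LAYERS = {
    "recordkeeping_system": [
        ("Researcher A", "Collection X"),
        ("Researcher B", "Collection X"),
    ],
    "annual_reports": [
        ("Researcher A", "Collection X"),
        ("Researcher A", "Collection Y"),
    ],
    "biographical_portal": [
        ("Researcher A", "Researcher B"),
    ],
}


def build_edges(layers):
    """Merge all layers into one mapping: (source, target) -> set of attesting layers."""
    edges = defaultdict(set)
    for layer_name, pairs in layers.items():
        for source, target in pairs:
            edges[(source, target)].add(layer_name)
    return dict(edges)


def edges_to_csv(edges):
    """Serialise the edges as a delimited table, one row per edge."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["source", "target", "attested_in", "layer_count"])
    for (source, target), layer_names in sorted(edges.items()):
        writer.writerow(
            [source, target, ";".join(sorted(layer_names)), len(layer_names)]
        )
    return buffer.getvalue()


edges = build_edges(LAYERS)
table = edges_to_csv(edges)
print(table)
```

The `layer_count` column is one possible way to carry the number of attesting sources into a visualisation, for instance as edge weight.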

The objective of this research is threefold: first, it should demonstrate that building such a network is feasible. The web-based software platform Palladio will be used for processing the data and visualizing the network (Palladio, 2016). As this research is ongoing, experiments with other, more versatile network analysis tools, such as Nodegoat, will be considered (Nodegoat, 2016).²⁵ The network will consist of hundreds of nodes and thousands of (potential) edges in order to include the relations among the most prominent persons involved in collecting the information. Second, this method can, with local modifications, be reused by other humanities researchers to generate networks for archival studies. And third, the outcomes will be incorporated in my PhD research, which is about the history of the collections of the Meertens Institute. As this PhD research started in January 2016 and is ongoing, this paper will show the first results.

25 Various data visualization platforms and network architectures have been developed. For visualizing data, Palladio is one of the platforms that is advised as a network visualization tool for the humanities (Düring, 2016). Nodegoat was used in the project ‘Circulation of Knowledge and Learned Practices in the 17th-century Dutch Republic’ (Huygens, 2016).

References:

Düring, M. (2016). On Dilettantes and Dialogues in Digital History. European Social Science History Conference 2016.

Heuvel, C. van den (2015). Mapping Knowledge Exchange in Early Modern Europe. International Journal of Humanities and Arts Computing, 9(1), 95-114.

Jeurgens, K.J.P.F.M. (2012). Information on the move. Colonial archives: pillars of past global information exchange. Colonial Legacy in South East Asia. The Dutch Archives. Eds. K.J.P.F.M. Jeurgens, A.C.M. Kappelhof & M. Karabinos. 's-Gravenhage: Stichting Archiefpublicaties. 45-65.

Jongenburger, W., A.W.H. Jansen & D.A. Zeldenrust (2013). Collectieplan Meertens Instituut, 2013-2018. Amsterdam: Meertens Instituut.

Websites:

http://ckcc.huygens.knaw.nl (Accessed December 08, 2016)

https://nodegoat.net (Accessed December 0817, 2016)

http://palladio.designhumanities.org (Accessed December 08, 2016)

http://www.biografischportaal.nl (Accessed December 08, 2016)

http://www.meertens.knaw.nl (Accessed December 08, 2016)

2. Mapping Controversies of Digital Curation

Dana Mustata, University of Groningen, NL

The emergence of digital technologies and digitized data in humanities research – a phenomenon that has been shaping the contours and the incentives behind the organization of digital humanities as a field of study – has raised more questions than it has answered. Are digital technologies changing our research practices and if so, how? Do they incite new research questions? Are established fields of study in the humanities drastically altering in the face of these new phenomena that have technology at their centre? What is new and what is old in the way we do research in digital environments? If there is any underlying assumption traversing all these questions, it is that digital humanities is a ‘transformative practice’ (Svensson, 2009). The difficulty has primarily been to describe and explain what it is that is changing in our research practices. This paper tackles this particular concern.

Has there been a shift in our traditional research practices rooted in the analogue era? What does this shift consist of? How do we redefine ourselves from traditionally analogue researchers into digital humanities researchers? These questions contribute to furthering the definition of digital humanities as a field of study and, more specifically, to redefining the practice of doing scholarship with digital technologies and digitized data. Without advocating for essentialist, stable and fixed definitions of digital humanities as an arena in which scholarship is produced, I am particularly interested in what Bruno Latour (1991) calls the ‘socio-logics’ characterizing digital humanities as a transformative field of study. Socio-logics refer to how knowledge is mobilised, constructed and accumulated in the face of a ‘controversy’.

“The word “controversy” refers here to every bit of science and technology which is not yet stabilized, closed or “black boxed” ... we use it as a general term to describe shared uncertainty.” (Macospol, 2007: 6, cited in Venturini, 2010: 260).

In other words, as Tommaso Venturini explains:

“controversies are situations where actors disagree (or better, agree on their disagreement). The notion of disagreement is to be taken in the widest sense: controversies begin when actors discover that they cannot ignore each other and controversies end when actors manage to work out a solid compromise to live together. Anything between these two extremes can be called a controversy.” (Venturini, 2010: 261).

The starting point of my paper is thus approaching digital humanities as a controversial arena, one in which researchers, tools, tool developers and data providers – humans and non-humans, actors and actants in Latour’s terms – collide; an arena in which scientific and technological clashes play out. It is through the chains of association between researchers, tools, tool developers and data providers as well as through the transformations prompted by the clashes between these actors and actants that the socio-logic of this new field of study is rendered visible.

The paper will ‘map the controversies’ of curating digital objects in a virtual research environment. These controversies relate to translating academic knowledge into tool design and implementation, translating historical narratives into user functionalities, finding a shared work language and collaborating at the intersection of different fields of expertise and fields of knowledge production.

Mapping controversies is a hands-on method rooted in the actor-network theory (ANT) tradition of thought, which explores, describes, visualizes and makes sense of issues that emerge at the intersection of collaborative work done between – in this particular case – researchers, tools, designers, tool developers and digital data providers. This particular method provides insights into practices of working with digital tools and digitised data and the subsequent process of knowledge production.

The paper will map controversies through the practices of designing, researching and curating the virtual exhibitions (VEs) on the online platform www.euscreen.eu. This is a platform that makes freely accessible thousands of audiovisual items originating from 21 content providers in Europe. The VEs were curated by researchers in collaboration with 1) content providers (CPs) consisting of audiovisual archives and responsible for co-selecting and uploading their content to the ‘Special Collections’ on the platform; 2) tool developers in charge of developing the VE builder and all the user functionalities around it; 3) designers responsible for the design of the front end of the exhibition, the design of the user experience as well as for mediating the common grounds between the researcher’s needs and the potentials of tool development; 4) the VE builder which allowed the researchers to curate their exhibitions. The researchers drafted the content selection strategy for the CPs; viewed, researched, further selected and then bookmarked the content uploaded to the Special Collections; defined and advised on the development and design of the VE builder through wireframing, paper prototyping and joint work sessions with the designers and tool developers; and last but not least, curated their virtual exhibitions as an end result of all these collaborative work practices.


Taking a practice-oriented anthropological approach to the researchers’ journey through curating the VEs, the paper will explore digital curation through what Latour (1991) called the ‘chain of associations’ and the ‘series of transformations’ undergone by the actors and actants involved as well as through the ‘translations’ that took place throughout the collaborative work process, which saw the initial enunciations of the VEs turn into the final products that were published online.

By mapping the associations that researchers entered into, the transformations they underwent as part of these associations and the translations that took place from their first VE ideas to the final curated objects online, this paper tries to pin down what it is that changes in the practice of doing humanities research when knowledge is (co)produced with digital tools, digitized data and at the intersection of different fields of expertise.

Making sense of digital humanities as a transformative practice of (co-)producing knowledge is a fertile ground to come to terms with this emerging field of study. It helps us understand ‘digital practices’ in terms of what humanities scholars do with digital tools in digital environments, to paraphrase Couldry’s (2010) understanding of ‘media practices’. I argue, thus, that understanding the production of knowledge in digital humanities becomes an archaeological act of retracing associations and transformations through different spaces of expertise, different actors and actants, lending itself to what Foucault called ‘principles of discontinuity’. This may help bridge the gaps between digital technologies and (analogue) researchers, technicians and scientists that are at the core of controversies in the field.

References

Nick Couldry, ‘Theorising Media as Practice’ in Social Semiotics, Vol. 14, Issue 2, 2004, pp. 115-132. Published online: 13 Oct 2010, http://dx.doi.org/10.1080/1035033042000238295

Michel Foucault, Archaeology of Knowledge, Routledge, 1972

Bruno Latour, ‘Technology is Society Made Durable’ in: J. Law, ed., A Sociology of Monsters: Essays on Power, Technology and Domination, Sociological Review Monograph N°38, pp. 103-132, 1991

Patrik Svensson, ‘Humanities Computing as Digital Humanities’, Digital Humanities Quarterly, 3(3), 2009

Tommaso Venturini, ‘Diving in magma: how to explore controversies with actor-network theory’ in: Public Understanding of Science, 19(3), 2010, pp. 258–273

3. Research opportunities for the archived web in the Benelux

Sally Chambers, Ghent Centre for Digital Humanities, Ghent University

Peter Mechant, Media, Innovation and Communication Technologies (MICT), Ghent University

Kees Teszelszky, National Library of the Netherlands

Yves Maurer, National Library of Luxembourg

Web-archiving, or collecting portions of the web to ensure the information is preserved in an archive, began in 1996 with the Internet Archive initiative²⁶ and its well-known digital archive ‘The Wayback Machine’²⁷. Others have followed, from national and state libraries and archives to museums and nonprofits such as ‘Common Crawl’, whose corpus contains more than 3.14 billion web pages and about 250 TB of uncompressed content²⁸.

26 https://archive.org/

27 https://archive.org/web/

Although most researchers in the humanities still need to begin to explore the potential of these archives, some projects have already investigated their potential, e.g. the BUDDAH project²⁹ (Big UK Domain Data for the Arts and Humanities), which awarded bursaries to researchers to carry out research in their subject area using the UK web archive, or the RESAW network³⁰ (Research Infrastructure for the Study of Archived Web Materials), which aims at promoting a collaborative European research infrastructure for the study of archived web materials. Despite such initiatives, researchers in the humanities still struggle with the computational turn of their field on theoretical, methodological (e.g. developing theoretical and methodological frameworks within which to study web archives) and practical levels (e.g. they lack expertise and knowledge to use web archives and to apply digital methods or big data techniques to their corpus).

Although the three Benelux countries are geographically very close, (the history of) national web archiving is very different in each of them:

In the Netherlands, the National Library already started in 1992 with mapping the Dutch web by compiling web directories or web lists. A first web archiving pilot was conducted in 2003 and web-archiving started as a regular activity in 2007 using a selective harvesting strategy based on a selection of the existing web directory (governmental, cultural and academic websites, sites that mirror trends on the web, and ‘endangered’ websites which are considered as Dutch digital cultural heritage). As of January 2017, 12,000 websites have been harvested with Heritrix³¹ (26 TB of compressed .arc files, 211 million URLs). A linguistic analysis of the collection has not been done yet, but 368 Frisian websites are included. The Dutch web archive is available in the reading rooms of the National Library of the Netherlands, or researchers can request access to the data for specific projects³².

In Luxembourg, a pilot project for web-archiving was undertaken in 2005 and subsequently the legal deposit law was extended in 2009 to also cover content published on the web. Due to funding issues, the regular harvests for the .lu domain and other websites hosted in Luxembourg only started in August 2016. A second crawl finished in January 2017. These crawls were supplemented with data from a number of targeted crawls of governmental sites. Currently, the archive contains 15 TB of compressed WARC files (250 million URLs) with around 40% of the hosts in the ccTLD .lu and 35% in .com. Similar to the Dutch archive, a detailed linguistic examination of the Luxembourg collection has not been done yet, but a basic linguistic analysis shows the presence of 30% English, 30% French, 15% German, 5% Luxembourgish and a ‘long tail’ of other languages. The Luxembourg web archive will be available in the reading rooms of the national library and researchers can be granted access to the underlying dataset on a case-by-case basis.
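Figures such as the share of hosts per top-level domain reported above can in principle be derived from a plain list of harvested URLs. The following is a minimal sketch, assuming such a list is available; the function name `tld_shares` and the sample URLs are invented for illustration, and nothing here reflects how the national libraries actually compute their statistics.

```python
from collections import Counter
from urllib.parse import urlsplit


def tld_shares(urls):
    """Return the share of distinct hosts per top-level domain."""
    hosts = {urlsplit(url).hostname for url in urls}
    hosts.discard(None)  # URLs without a host part are ignored
    counts = Counter(host.rsplit(".", 1)[-1] for host in hosts)
    total = sum(counts.values())
    return {tld: count / total for tld, count in counts.items()}


# Invented sample URLs; a real run would read them from the archive's index.
sample_urls = [
    "http://example-a.lu/page1",
    "http://example-a.lu/page2",
    "https://example-b.lu/",
    "http://example-c.com/",
    "http://example-d.de/index.html",
]
shares = tld_shares(sample_urls)
print(shares)
```

Counting distinct hosts rather than URLs keeps a single large site from dominating the distribution; counting URLs instead would answer a different question, namely where the bulk of the archived pages sits.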

Although the .be domain was introduced in June 1988³³, the Belgian web is currently not systematically archived. As of February 2017, 1,561,932 domains are registered by DNS Belgium³⁴. Without a Belgian web archive, the content of these websites will not be preserved for future generations and a significant portion of Belgian history will be lost forever. In December 2016, a pilot web-archiving project called PROMISE (PReserving Online Multiple Information: towards a Belgian StratEgy) was funded. The aim of the project is to (i) identify current best practices in web-archiving and apply them to the Belgian context, (ii) pilot Belgian web-archiving, (iii) pilot access (and use) of the pilot Belgian web archive for scientific research, and (iv) make recommendations for a sustainable web-archiving service for Belgium. This pilot project is considered as a first step towards implementing a long-term web archiving strategy for Belgium. Similar to Luxembourg, Belgium has various official languages³⁵ that will need to be considered during the pilot phase.

28 http://commoncrawl.org/

29 http://buddah.projects.history.ac.uk/

30 http://resaw.eu/

31 https://webarchive.jira.com/wiki/display/Heritrix

32 https://www.kb.nl/en/organisation/research-expertise/long-term-usability-of-digital-resources/web-archiving

33 History of the Belgian web: https://www.dnsbelgium.be/en/history

34 DNS Belgium: https://www.dnsbelgium.be/en

From a web-archivist perspective, a key challenge is how to collaborate on joint web-archiving initiatives to enable ‘trans-national’ research opportunities, for example, by taking either a site-, topic-, or domain-centric archiving approach, or by unifying methodological approaches for discovery, acquisition and description of web content. From the viewpoint of a researcher in the humanities, web-archives are rich ‘born-digital’ resources, which can be analysed alongside other digitised and analogue sources in a wide range of humanities subject areas.

For researching the archived web in the Benelux, possible ‘tri-national’ research questions could include a linguistic analysis of the ‘Benelux web’, or a geo-spatial analysis of the geographic distribution of web-domains across the Benelux region³⁶. Similarly, research questions could focus on just two of the Benelux countries, such as the geographic distribution of Dutch-language websites in the Netherlands and Flanders, or the German-language websites in Belgium and Luxembourg. Furthermore, as many European Union institutions (with websites in the .eu domain) are located within the Benelux region, this also offers a further wealth of opportunities for humanities researchers.

While the increased availability of such (big) born-digital datasets opens up opportunities for using computational research methods, it also points to the need to take up new skills. It will therefore be important to establish a range of standard tools and methods which are widely accepted for archived web research³⁷. Despite these challenges, the archived web offers substantial opportunities for digital humanities researchers, both in the Benelux and beyond.

This paper a) discusses the potential of web archives for digital humanities researchers, b) introduces the web-archives in the Netherlands, Luxembourg and Belgium and c) presents the possibilities for trans-national research that collaboration between the Benelux web-archives could enable.

35 The three official languages of Belgium are Dutch, French and German, with English also being widely used. Official languages of Belgium: https://en.wikipedia.org/wiki/Languages_of_Belgium

36 DNS Belgium has mapped the geographic distribution of web-domains by local authority across Belgium, see: https://www.dnsbelgium.be/whois/stats. It could be useful to extend this mapping to the whole Benelux region.

37 For examples, see: Truman, Gail. 2016. Web Archiving Environmental Scan. Harvard Library Report: https://dash.harvard.edu/handle/1/25658314


Session E

Be FAIR or be square: Stakeholders’ perspectives on data quality in the Digital Humanities

Reinier de Valk, Data Archiving and Networked Services (DANS)

Vanessa Hannesschläger, ÖAW-ACDH (Austrian Centre for Digital Humanities)

Klaus Illmayer, ÖAW-ACDH (Austrian Centre for Digital Humanities)

Francesca Morselli, Data Archiving and Networked Services (DANS)

Emily Thomas, Data Archiving and Networked Services (DANS)

Introduction

Digital data is created every day. Not only have cultural and research institutes been massively digitising their analogue content over the past decades (digitised objects), but research institutes and individual researchers are also constantly producing new digital data (born-digital objects). This is not a new revelation: within the natural sciences, researchers have been using and producing structured (and, more recently, machine-readable) data for centuries. However, over the last decades the research landscape has been changing: within the social sciences and humanities (SSH) disciplines, too, the use of existing digital data and the production of new digital data has increased enormously [2, 12, 13, 16]. This entails several issues that must be addressed [5, 9, 18, 23].

The necessity to preserve and ensure reusability for the increasing quantity of this data, which tends to be quite heterogeneous, has made the issue of data quality a particularly urgent one [1, 15, 17]. For research data to be deposited in a trusted repository, it needs to meet a minimum set of quality criteria, such as completeness, reliability, and correct formal structure by means of the implementation of interoperable or discipline-specific standards [3, 22].

Moreover, new actors as well as new relationships among them have emerged – a consequence of the reuse and sharing of research data among researchers and institutions [20]. Not only researchers and research institutions, but also cultural heritage institutions, research infrastructures and European projects – all of which can be referred to as stakeholders – are now heavily involved in data exchange processes, with the aim of increasing data interoperability and visibility.

Against such a complex background, however, it is difficult to develop and mutually agree on a truly shared vision of what high-quality data is, and what is required to achieve it. One innovative approach to reach common ground is applying the FAIR principles.

The FAIR principles

Following a life sciences workshop held in Leiden in 2014, entitled Jointly designing a data FAIRPORT [4], a minimal set of community-agreed guiding principles was formulated by a diverse group of stakeholders sharing an interest in scientific data publication and reuse. The aim was to make data more easily discoverable, accessible, appropriately integrated and reusable, and adequately citable for both machines and people. The principles formulated there are now well known as the FAIR principles [6, 7, 8, 24], and act as a guide to data publishers and stewards rather than being a standard or specification. Although these principles were conceived within a life sciences context, the social sciences and humanities face similar issues as they become more digitised, which makes the topic of FAIR data management applicable to these fields as well. In simpler words, the FAIR principles provide a set of mileposts for data producers and publishers to help ensure that all data will be Findable (defined by a persistent identifier and detailed metadata), Accessible (well-defined license and access conditions), Interoperable (ready to be combined with other data by humans and machines: standardised formats and vocabulary) and Reusable (ready to be reused in future research and processed using computational methods).
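The four mileposts described above can be illustrated with a small checklist. The following Python sketch is not an official FAIR assessment: the record fields, the list of "standardised" formats and the threshold of three metadata fields for "detailed metadata" are all assumptions made for the sake of the example.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical description of a deposited dataset (field names are illustrative)."""
    persistent_id: str = ""          # e.g. a DOI or handle
    metadata: dict = field(default_factory=dict)
    license: str = ""
    access_conditions: str = ""
    file_format: str = ""
    vocabulary: str = ""             # controlled vocabulary used, if any
    documentation: str = ""


# Assumed set of formats treated as "standardised" in this sketch.
OPEN_FORMATS = {"csv", "xml", "json", "txt"}


def fair_report(rec: DatasetRecord) -> dict:
    """Return a rough yes/no indication for each of the four FAIR mileposts."""
    return {
        # Findable: persistent identifier plus (assumed) at least three metadata fields
        "findable": bool(rec.persistent_id) and len(rec.metadata) >= 3,
        # Accessible: license and access conditions are both stated
        "accessible": bool(rec.license) and bool(rec.access_conditions),
        # Interoperable: standardised format and a named vocabulary
        "interoperable": rec.file_format.lower() in OPEN_FORMATS and bool(rec.vocabulary),
        # Reusable: documented and licensed for reuse
        "reusable": bool(rec.documentation) and bool(rec.license),
    }
```

A repository might use a check of this kind as a first screening step, while a funder might only ask for the resulting report; either way, the actual criteria would have to be agreed among the stakeholders themselves.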

Stakeholders

When it comes to repository and data quality, the main factor shaping individual needs and requirements is not the discipline the data comes from, but rather the type of stakeholder, which is crucial for the perspective on the data and the necessities that come with it. In order to motivate stakeholders to commit to the FAIR principles, the different types first have to be identified and their specific interests have to be investigated. Examples of stakeholders are research communities, funders, data archives, research infrastructures, projects, and cultural heritage institutions. It is important that these groups are brought together to align their different perspectives on data as producers, consumers and providers. For instance, findability from a broader perspective might also mean having a user-friendly interface for a researcher to find datasets, whilst for a research infrastructure, the availability of substantial metadata would be the core interest. Therefore, different strategies of communicating and implementing the FAIR principles will be necessary to reach the various types of stakeholders.

Moreover, bringing together different stakeholders opens up discussion of collaboration in implementing the FAIR principles and allows experiences from ongoing efforts to be shared. Especially when it comes to data quality and data exchange, a discussion of the general framework of the FAIR principles can help to better coordinate the different approaches and interests of stakeholders.

The proposed panel

We propose a panel discussion with representatives of different stakeholders, both current and potential future FAIR implementers. Their discussion will focus on the application of the FAIR principles to improve data quality, on formulating FAIR data management requirements (e.g., by funders) and on assessing the quality of datasets (e.g., by repositories). This will help determine common approaches as well as variations in perspective. The focus will be on the exchange between those who have already implemented (or started to implement) the FAIR principles and those who have not yet done so; this will help analyse which goals are attainable by the various stakeholders. We plan to invite representatives of the following types of stakeholders:

• researchers
• research institutes
• cultural heritage institutions
• research infrastructures
• projects
• funders.

The following (deliberately slightly controversial) seven statements are intended to guide the discussion:

1. Data quality is compromised by changing research methods and increased collaboration among stakeholders.

2. As a consequence, stakeholders do not sufficiently address the challenge of guaranteeing data quality due to changing research methods and techniques.

3. Data quality (including implementations of FAIR) is not high enough on the agenda of the various stakeholder groups.

4. Therefore, stakeholders should implement the FAIR principles, even if this means that current approaches have to be adapted.

5. Stakeholders should raise awareness about using the FAIR principles.

6. Data producers are responsible for ensuring the FAIRness of their data.

7. The implementation of FAIR principles should be monitored in institutions and/or among different stakeholders.

Panelists (to be confirmed shortly)

Selected bibliography

[1] Batini, C., and Scannapieco, M. (2006). Data quality: Concepts, methodologies and techniques. Berlin: Springer.

[2] Borgman, C. L. (2015). Big data, little data, no data: Scholarship in the networked world. Cambridge, MA: MIT Press.

[3] Brown, A. (2008). Selecting File Formats for Long-Term Preservation. https://www.nationalarchives.gov.uk/documents/selecting-file-formats.pdf

[4] Data FAIRport. Data FAIRport conference: Jointly designing a data FAIRport. http://www.datafairport.org/component/content/article/8-news/9-item1

[5] Dix, A., Cowgill, R., Bashford, C., McVeigh, S., and Ridgewell, R. (2014). Authority and judgement in the digital archive. In 1st Digital Libraries for Musicology Workshop, London, UK.

[6] Dutch Techcentre for Life Sciences. FAIR Data. http://www.dtls.nl/fair-data/

[7] Dutch Techcentre for Life Sciences. GO-FAIR initiative. http://www.dtls.nl/go-fair/

[8] Force11. Guiding principles for Findable, Accessible, Interoperable and Re-usable data publishing version b1.0. https://www.force11.org/fairprinciples

[9] Giaretta, D. (2011). Advanced digital preservation. Berlin: Springer.

[10] Griffin, G., and Hayler, M., eds. (2016). Research methods for reading digital data in the Digital Humanities. Edinburgh: Edinburgh University Press.

[11] Hayler, M., and Griffin, G., eds. (2016). Research methods for creating and curating data in the Digital Humanities. Edinburgh: Edinburgh University Press.

[12] Kaplan, F. (2015). A map for big data research in digital humanities. Frontiers in Digital Humanities 2(1): 1-7.

[13] Lane, R. J. (2016). The big humanities: Digital Humanities/digital laboratories. London: Routledge.

[14] LERU (2013). LERU roadmap for research data. http://www.leru.org/files/publications/AP14_LERU_Roadmap_for_Research_data_final.pdf

[15] NISO (2007). A framework of guidance for building good digital collections. 3rd ed. http://www.niso.org/publications/rp/framework3.pdf

[16] Owens, T. (2011). Defining data for humanists: Text, artifact, information or evidence? Journal of Digital Humanities 1(1): n.p. http://journalofdigitalhumanities.org/1-1/defining-data-for-humanists-by-trevor-owens/

[17] Peer, L., Green, A., and Stephenson, E. (2014). Committing to data quality review. International Journal of Digital Curation 9(1): 263-291.

[18] Pryor, G., ed. (2012). Managing research data. London: Facet Publishing.

[19] Purdy, J. P., and Walker, J. R. (2010). Valuing digital scholarship: Exploring the changing realities of intellectual work. Profession 1: 177-195.

[20] Quan-Haase, A., Suarez, J. L., and Brown, D. M. (2014). Collaborating, connecting, and clustering in the humanities: A case study of networked scholarship in an interdisciplinary, dispersed team. American Behavioral Scientist 59(5): 565-581.

[21] Terras, M. (2010). Digital curiosities: Resource creation via amateur digitization. Literary and Linguistic Computing 25(4): 425-438.

[22] Tjalsma, H., and Rombouts, J. (2011). Selection of research data: Guidelines for appraising and selecting research data. The Hague: DANS.

[23] Van Zundert, J. (2012). If you build it, will we come? Large scale digital infrastructures as a dead end for Digital Humanities. Historical Social Research 37(3): 165-186.

[24] Wilkinson, M. D., et al. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data 3(160018): 1-9.

Session F

A Pragmatic Approach to Understanding and Utilizing Events in Cultural Heritage
Lora Aroyo¹, Marnix van Berchum⁴, Lizzy Jongma³, Willem Robert van Hage⁵, Gerard Kuys⁶, Susan Legene¹, Annelies Van Nispen³, Jacco van Ossenbruggen², Lodewijk Petram⁴ and Piek Vossen¹
¹ Vrije Universiteit Amsterdam, The Netherlands
² Centrum Wiskunde & Informatica (CWI), The Netherlands
³ NIOD Instituut voor Oorlogs-, Genocide- en Holocauststudies, The Netherlands
⁴ Huygens ING, The Netherlands
⁵ Netherlands eScience Center, The Netherlands
⁶ Nationaal Archief, The Netherlands

Introduction

Cultural heritage institutions are continuously rethinking access to their collections, so that the public as well as scholars and professionals can interpret and contribute to them. These collections are challenged by the advancement of the Web. They need to be presented online in a sustainable way, and to be instantly searchable and understandable for experts and lay audiences [1]. Hermeneutics is the humanities’ theory of interpretation. It is currently being extended to digital hermeneutics, which forms an appropriate context for thinking about access to and interpretation of online cultural heritage collections [2].

‘Historic events’, whose meaning keeps being rediscovered and reinterpreted in light of modern discussions, play an important role in the interpretation of cultural heritage collections. History changes over time, and with the presence of the social Web it is under continuous evolution. “It is not only ‘grand’ historical events that are subject to changes in interpretation. Single words, concepts, ideas and books can also have different meanings across time, space and social groups.” [3]. Automatic text analysis techniques provide the means to mine large amounts of unstructured data and give scholars access to ‘big data’. To understand this ‘big data’ better, we observe a shift towards deeper data mining focussed on the retrieval of meaningful units, e.g. answers, entities, events, discussions, and perspectives. Additionally, we observe a push towards the automatic creation of knowledge graphs populated with rich semantic units (e.g. entities, relations, activities, events), which make it possible to dive into more detail and address more complex questions. All this comes as a response to the need to understand ‘events’ and their semantic structure better, and thus to help, on the one hand, heritage institutions assign meaning and value to online collection objects and, on the other hand, humanities scholars in the exploration and contextualization of their tasks [3].

Methodology

This work is motivated by (1) the demand for facilitating a deeper understanding of online cultural heritage collections, and (2) the fact that events have emerged as a key element in the representation of data in areas such as history, cultural heritage, and multimedia. We bring together computer scientists, computational linguists and humanities and social sciences scholars in order to build upon and expand the results of existing research communities (e.g. NLP, Information Retrieval, Semantic Web, Social Web Analytics, Multimedia analysis), and to provide structure and deeper understanding in history, media, journalism and cultural heritage research, with a specific focus on how events are used as a key concept for representing knowledge and organising media in online web collections. The ultimate goal is to distill research and application roadmaps for events in Cultural Heritage, e.g. achieving a social consensus on processes, identifying practical standards and protocols, and defining the infrastructure needed.

Our approach is two-fold, following two parallel tracks. On the one hand, we work top-down to provide a comprehensive analysis of the state of the art around events and their pivotal role in enriching the content of collections in these areas. In this context, we study their added value in enabling new meaningful interactions with online multimedia collections for humanities scholars, heritage professionals and lay audiences [4]. We also study the various aspects and potential benefits of using events in the representation and organisation of knowledge and media [5]. For this, we explore methods and techniques to support (1) detection, modeling and representation of events in online collections; and (2) searching, exploration and interpretation of online collections enriched with events. For example, we assess the utility of existing event models to support users in deriving useful facts.

On the other hand, we carry out a bottom-up analysis of concrete use cases and datasets. We guide our explorations through event detection and analysis performed by machine [6, 7] and human [8, 9] computation on different collections in the context of concrete use cases. We identify four groups of research questions related to (1) event identity and definitions, (2) event detection and extraction, (3) event modelling and representation, and (4) event relationships and interactions with applications.

In the context of studying event identity and definitions, we are interested in understanding better what the internal structure of an event is; what the differences are between events, actions and states; when two or more events are the same; and what the different points of view and interpretations of the same event are.

To continuously improve methods and tools for event detection and extraction, the research needs to be guided by a deeper understanding of how events can be recognised in different media types; how we can assign a notion of novelty and veracity to events; how we can assign a level of granularity to events; and how different events are related.

To model events and represent knowledge about events across different domains, we seek a deeper understanding of shared, open or proprietary knowledge structures, such as vocabularies, taxonomies and thesauri, that can form the backbone of such models. We further study how we can achieve interoperability of event structures, and what the event representation requirements are for different types of events, e.g. historical, cultural and personal events. It is also interesting to know how the Social Web influences or contributes to the understanding of events.

In this context, we move beyond the typical philosophical-level discussions about events and map the landscape of the different points of view and schools of thought on the matter. To facilitate a shared and pragmatic approach to dealing with events, we focus on existing models, such as the Simple Event Model³⁸, LODE³⁹, EVENT, Schema.org and Wikidata. Each of them has been developed to make use of existing vocabularies and data sources that describe events, where events refer to everything that happens, even fictional events.

Finally, we aim to understand the diversity of event relationships and their interactions with applications and data, i.e. how events can be represented to support collection browsing, serendipitous exploration and narrative building; what useful tools exist for event annotation by experts and lay crowds; what efficient ways there are of crowdsourcing event annotations; and what successful methods exist for event visualisation and interaction.
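To make the questions about event structure and event identity more concrete, the following Python sketch represents an event by a label, its actors, a place and a time span, loosely inspired by the kind of core notions found in event models such as those listed above. The class, its field names and the matching heuristic are illustrative assumptions, not part of any of those models or of the authors' method.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical minimal event record: what happened, who, where, when."""
    label: str
    actors: FrozenSet[str] = frozenset()
    place: Optional[str] = None
    years: Optional[Tuple[int, int]] = None  # inclusive (start, end)


def may_be_same_event(a: Event, b: Event) -> bool:
    """Crude identity heuristic: same place, overlapping time, a shared actor.

    This only flags candidates; deciding whether two descriptions really
    refer to the same event remains a matter of interpretation.
    """
    if a.place is None or b.place is None or a.place != b.place:
        return False
    if a.years is None or b.years is None:
        return False
    overlap = a.years[0] <= b.years[1] and b.years[0] <= a.years[1]
    return overlap and bool(a.actors & b.actors)
```

Even this small example shows why granularity and perspective matter: two collection records may describe the same occurrence under different labels, and a purely structural comparison can only suggest, not establish, that they coincide.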

38 http://semanticweb.cs.vu.nl/2009/11/sem/
39 http://linkedevents.org/ontology/

References

[1] C Van Den Akker, A van Nuland, L van der Meij, L Aroyo (2013). From information delivery to interpretation support: evaluating cultural heritage access on the web.

[2] Capurro, R. (2010). Digital Hermeneutics: An Outline. AI & Society 2010, 35(1), 35-42.

[3] Wyatt, S., Millen, D. (Eds.) Meaning and Perspectives in the Digital Humanities. A White Paper for the establishment of a Center for Humanities and Technology (CHAT), KNAW, 2014.

[4] V De Boer, J Oomen, O Inel, L Aroyo, E Van Staveren (2015). DIVE into the event-based browsing of linked historical media.

[5] M van Erp, J Oomen, R Segers, C van den Akker, L Aroyo, G Jacobs (2011). Automatic heritage metadata enrichment with historic events. Archives & Museum Informatics, Toronto.

[6] Sprugnoli, R., Tonelli, S. (2016). ‘One, no one and one hundred thousand events: Defining and processing events in an interdisciplinary perspective’, Natural Language Engineering, pp. 1–22.

[7] T Ploeger, M Kruijt, L Aroyo, F de Bakker, I Hellsten (2013). Extracting activist events from news articles using existing NLP tools and services.

[8] A Dumitrache, O Inel, B Timmermans, L Aroyo, RJ Sips (2015). CrowdTruth: Machine-Human Computation Framework for Harnessing Disagreement in Gathering Annotated Data.

[9] L Aroyo, C Welty (2013). Harnessing disagreement for event semantics. In the proceedings of Detection, Representation, and Exploitation of Events in the Semantic Web.

Session G

Panel: Strategies for integrating Digital Humanities skills and practices in the Humanities Curriculum
Susan Aasman (University of Groningen)⁴⁰
Stefania Scagliola (University of Luxembourg)⁴¹

In this panel we intend to evaluate policies and strategies that have been applied to integrate both DH skills and practices in humanities curricula. Most teaching on DH is offered in separate Minor and/or Master programs and is scarcely integrated in the regular curriculum. This reflects the skepticism with regard to the status of digital approaches to humanities research: are they supposed to gradually merge into the regular humanities curriculum, or is Digital Humanities going to remain a distinctive field? (Reid, 2012). Overall, a lack of consensus on the level of expertise that should be taught can be discerned. Should students be trained to make an educated choice, in their future job, about which kind of technological expertise they should seek? Or should the goal be for them to use the tools themselves and be able to customize them to their specific needs? Even teaching students the very basic skills of searching, accessing, processing, analysing and creating information with digital tools requires literally more space and time than is available within a traditional subject such as ‘methods of research’. Nor does the introduction to the services of the Library that is offered yearly at the start of a humanities bachelor suffice to cover all necessary skills (Ferrari et al., 2014; Clement, 2012). This knowledge gap is remarkable, considering the widely shared belief that DH skills are an important asset for increasing the chances of students on the job market (Clement, 2012; Scagliola et al., 2014). It is clear that the future of Digital Humanities teaching faces a number of institutional, political, logistical and pedagogical challenges.

Our intent is to offer an alternative to the usual ideal-typical agendas on what should be done to solve this problem. We intend to gather best practices through the active involvement of the attendees of the workshop. We will start with a short overview of existing DH teaching courses within the Benelux, which can be retrieved through the DARIAH/CLARIAH web-based Course Registry. Following the short overview, three exemplary use cases from our own teaching practice will be introduced.

Case 1: Integration in curriculum: Master program Digital Humanities in a faculty of Arts
• Context: designing a Master program for a Faculty of Arts (History, Art History, Journalism and Media Studies, Literature, Film, European Languages and Culture, Communication Science, Information Science and Archeology), which is open to all students with a BA in one of the Arts
• Goal: offering an all-round program that combines theoretical reflection on Digital Humanities and the role of digital data in contemporary culture and society (including Art) with skill courses (coding for Humanities, creating a database) and data handling (creating/analyzing/visualizing)
• Credits: 60 EC program, no entry requirements
• Obstacle: making the shift in just one year from a regular Bachelor program based on the traditional hermeneutic framework to more quantitative approaches requiring new skills and new methods, while keeping close to the disciplinary background. This poses dilemmas on what to leave out and include.

40 Susan Aasman is a media historian and works at the Department for Media Studies at the University of Groningen. She also coordinates the Master program Digital Humanities and is Director of the Groninger Centre for Digital Humanities.

41 Stefania Scagliola is a historian and works as a postdoc at C2DH, the Centre for Contemporary and Digital History of the University of Luxembourg. She is developing a platform for teaching Digital Source Criticism.

The second case is a course that is in preparation.

Case 2: Integration in curriculum: subject Digital Source Criticism in a traditional faculty of History
• Context: designing a platform for Digital Source Criticism for bachelor and master students to teach digital history, within a traditional faculty of history.
• Goal: teaching students the practical and theoretical implications of historical sources in digital form, and teaching them how to create a digital object/exhibit/publication.
• Credits: to be decided
• Obstacle: the amount of time needed to train skills and to create a digital object is not available within the existing curriculum. DH is method oriented, whereas most history classes are thematic.

Case 3: Integration of the Digital Humanities Course Registry in the educational resources of a humanities curriculum
• Context: a search environment has been designed that offers an overview of DH courses that can be taken in the Benelux
• Goal: to offer students and lecturers the opportunity to get an overview of the courses that are taught. Students can orientate themselves and choose a bachelor, master or subject; lecturers with an interest in taking up DH in their teaching can orientate themselves with regard to content and approach by drawing on the efforts of their peers.
• Credits: not applicable
• Obstacle: the resource has been created, but is not integrated into the standard educational resources that are offered to students and lecturers to orientate themselves. Reaching out to the intended audience is problematic.

After a short introduction, the participants will be divided into three groups, and each group will be given a case study with an assignment related to the challenges that the case poses. They will be requested to brainstorm on possible solutions and document their suggestions in a collective online document. This will form the basis for a broader collective document that can be crowdsourced within the teaching community, turned into a publication and presented at a next Benelux DH gathering. After the brainstorm, each group presents its findings.

Our expectation is that the focus on a concrete teaching practice by scholars directly involved in the field will yield useful insights into their best practices and strategies for expanding the interest in DH. One of the central issues remains the question whether Digital Humanities should be considered as an entity in itself, competing with regular subjects, or whether it should be integrated into the regular curriculum and become a standard practice.

Literature

Anusca Ferrari, Barbara Neža Brečko, Yves Punie, ‘DIGCOMP: A Framework for Developing and Understanding Digital Competence in Europe’, in: eLearning Papers 38, May 2014 – www.openeducationeurope.eu/en/elearning_papersn.38

T. Clement (2012), ‘Multiliteracies in the Undergraduate Digital Humanities Curriculum: Skills, Principles, and Habits of Mind’. In Hirsch, B. (ed), Digital Humanities Pedagogy: Practices, Principles and Politics. Cambridge, U.K.: Open Book Publishers. 365-388.

R. Reid (2012), ‘Graduate education and the ethics of the Digital Humanities’, in: Matthew K. Gold (ed), Debates in Digital Humanities, Minnesota, USA: University of Minnesota.

S. Scagliola, F. Maas, E. Stronks, ‘The Teething Troubles of Teaching Digital Humanities: Sharing knowledge and mapping challenges’, presentation at the DHBenelux 2014.

Session H

1. Is the Europe of Knowledge the talk of the town? Exploring the potential of digital data on MEP speeches in the European Parliament
Martina Vukasovic¹,ᵃ, Julie M. Birkholz¹,ᵇ, Jelena Brankovic¹,²,ᶜ
¹ Centre for Higher Education Governance Ghent (CHEGG), Department of Sociology, Faculty of Political and Social Sciences, Ghent University, Korte Meer 5, 9000 Gent, Belgium
² Universität Bielefeld, Faculty of Sociology, Gebäude X C2-201, Bielefeld, Nordrhein-Westfalen, DE 33501
ᵃ Corresponding author: [email protected], [email protected], [email protected], +49-521-106-12978

Abstract

We explore whether and how the increasing competences of the European Parliament (EP) across policy areas have affected its approach to higher education (HE). Using a new digital dataset containing more than 10,000 speeches delivered in the EP plenary between 2000 and 2014, we find that the total number of speeches did increase over time, particularly during the adoption of action programmes in the area of HE and related budgetary decisions. HE was referred to in EP speeches less as a stand-alone issue than in relation to other policy areas in which the EU has strong jurisdiction. We also provide tentative evidence that the variance in whether a member of the EP (MEP) speaks about HE is linked more to the MEP’s country of origin than to party affiliation. Promises and pitfalls of digital data analysis and possible avenues for further research are also discussed.

Keywords: higher education; policy; European Parliament; Europe of Knowledge; digital data collection; semi-automatic content analysis

Acknowledgements

We would like to thank colleagues at the CHEGG (in particular Marco Seeber) and participants at the CHER 2015 conference. Any remaining errors are our own. Acknowledgements to Talk of Europe staff, editors and reviewers to be added.

Funding

This work was supported by the Research Foundation – Flanders (FWO), under Grant G.OC42.13N.

Introduction

Since 2000, the European Union (EU) has put knowledge at the centre of its strategic endeavours. The aim of the Lisbon Strategy was for Europe to become the most advanced knowledge-based economy in the world by 2010. As a result, during the 2000s, the European Commission put forward several communications focusing on the role of universities in this process and the necessity for a university modernization strategy (e.g. European Commission, 2006), culminating in the Europe 2020 strategy, in which knowledge is essential for ensuring smart, inclusive and sustainable growth (European Commission, 2010). Throughout this period, knowledge has been ‘exported’ to other policy areas as a policy solution (Elken, Gornitzka, Maassen, & Vukasović, 2011), while the funding of EU programmes fostering cooperation in this area has been increasing despite the financial crisis: for the 2014-2020 period, for example, there is a 30% increase in funds allocated to research cooperation, and a 40% increase for education.

Although these developments have been the focus of many studies,⁴² most of them are concerned with the creation of specific institutions (e.g. the European Institute of Technology) or the Bologna Process and its relationship with the EU initiatives, whereby they typically highlight the individual policy entrepreneurs or the role of the European Commission (and sometimes also the European Court of Justice, ECJ). Other EU institutions, in particular the European Parliament (EP) and its involvement in policy coordination in this area, have received far less attention thus far, which reflects neither the importance of HE for the whole European project nor the increasing importance of the Parliament in EU decision-making (see below).

With this in mind, the present study focuses on the extent to which and the manner in which HE has been discussed in the EP since 2000, exploring a new digital dataset containing MEP speeches delivered during the period studied. We start by outlining the changes in how the EU approaches the topic of HE and the overall role of the EP in EU decision-making. From this, we derive a set of expectations concerning how HE is discussed in the EP, which we investigate through an exploratory research design. Namely, we apply digital data collection methods and semi-automatic content analysis coding to a set of more than 10,000 speeches given in the EP between January 2000 and December 2014, identified through a set of search terms using ‘The Talk of Europe’ dataset.⁴³ We analyse the data and identify patterns concerning when, by whom and how HE is spoken about in the EP. We then discuss our findings, as well as the promises and pitfalls of digital data analysis as a method, and offer some directions for future research.
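The selection step described above (identifying speeches through search terms and then looking for patterns by year, country or party) can be sketched as follows. This Python example is only an illustration under stated assumptions: the search terms, the record fields (`date`, `country`, `party`, `text`) and the simple substring matching are hypothetical and do not reproduce the authors' actual query or the structure of the Talk of Europe data.

```python
from collections import Counter
from datetime import date

# Hypothetical search terms; the terms actually used in the study are not given here.
SEARCH_TERMS = ("higher education", "university", "universities", "erasmus")


def mentions_he(text, terms=SEARCH_TERMS):
    """Return True if the speech text contains any search term (case-insensitive)."""
    lowered = text.lower()
    return any(term in lowered for term in terms)


def count_matching(speeches, key, terms=SEARCH_TERMS):
    """Count speeches that mention HE, grouped by key(speech).

    `speeches` is an iterable of dicts with (assumed) fields
    'date', 'country', 'party' and 'text'.
    """
    return Counter(key(s) for s in speeches if mentions_he(s["text"], terms))


# Tiny invented sample, used only to show how the functions are called.
sample = [
    {"date": date(2003, 5, 1), "country": "BE", "party": "EPP",
     "text": "The Erasmus programme matters."},
    {"date": date(2003, 6, 1), "country": "NL", "party": "S&D",
     "text": "On fisheries quotas."},
    {"date": date(2013, 11, 19), "country": "BE", "party": "ALDE",
     "text": "Higher education needs funding."},
]

per_year = count_matching(sample, lambda s: s["date"].year)
per_country = count_matching(sample, lambda s: s["country"])
```

Grouping the same filtered set by year, by country and by party corresponds to the "when" and "who" questions; the "how" question requires the content analysis coding, which a keyword filter of this kind cannot replace.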

Context and analytical patterns of interest

HE and the EU

In the EU context, HE has largely been considered a specialised policy area steered by national ministerial administrations and strongly influenced by expert committees and local sectoral interests. Education in general has long remained an area of national competence (Gornitzka, 2009), meaning that the legislative bodies of the EU (the EP and the Council) do not have regulative competences in the area of HE. Prior to the Treaty of Lisbon, this was reinforced by the principle of subsidiarity: decisions were taken at the lowest possible governance level, in this case by the national authorities. From 1 December 2009 onwards, when the Treaty of Lisbon came into force, education has been considered a supporting EU competence, allowing ‘the Union to carry out actions to support, coordinate or supplement Member States’ actions’ in this area. This change potentially provides more leeway to the EU in this domain, although the EU still cannot engage in these actions on its own. However, a number of caveats need to be addressed.

First, interest in European level policy coordination in the area of HE has existed since the early days of the European project. As Corbett (2005) states, ever since the European Coal and Steel Community, European level policy entrepreneurs in various EU institutions have been pushing for different European initiatives targeting education. Their efforts eventually laid the ground for the Erasmus programme and a number of pilot projects focusing on policy coordination, such as cooperation in the area of quality assurance (ENQA 2010; EU Council 1998). This trend has been strengthened by several rulings of the ECJ concerning recognition of qualifications (see e.g. Corbett 2005 on the Gravier decision), as well as regulation concerning recognition of qualifications, in particular for regulated professions (Beerkens, 2008). There are also indications that HE (and research) may increasingly become subject to EU primary law (i.e. EU level regulation) concerning competition (an exclusive competence of the EU after the Treaty of Lisbon), due to the blurring of the distinction between public and private aspects of HE (Gideon, 2015).

42 See e.g. Amaral et al. (2009), Chou and Gornitzka (2014), Corbett (2005), Huisman and de Jong (2014), Maassen and Olsen (2007).

43 http://www.talkofeurope.eu/data/ (accessed 22 January 2017).

Second, (higher) education, research and innovation comprise the so-called knowledge triangle, which has been the cornerstone of the EU’s strategic documents since the Lisbon Summit in 2000. The focus on developing the EU as a Europe of Knowledge or Innovation Europe has remained strong. Thus, one could argue that integration in this area can be considered a sine qua non of European integration as such (European Commission, 2010). HE is being ‘exported’ to other policy areas (economic competitiveness, social cohesion, environment, security, foreign relations etc.) as a policy solution, and modernization of HE is seen as a key ingredient of political, social, economic and cultural development (Elken et al., 2011). Due to this functional ‘spill-over’ from areas in which the EU does have formal regulative competences, HE is becoming a topic of growing interest for EU institutions.

Third, in the area of HE the EU has been employing the so-called Open Method of Coordination (hereinafter: OMC). OMC relies on voluntary setting of standards and benchmarks, and includes development of procedures designed to monitor progress. While this approach may, at first glance, seem rather soft given its voluntary nature and ample room for window-dressing, evidence suggests that the possibility inherent in the OMC to ‘name and shame’ laggards can actually be a powerful instrument leading to significant changes on both the national and institutional level (Gornitzka, 2014). The fact that these changes do not necessarily result in clear and deep convergence is less an indication of OMC’s softness and more an indication of the complexity of implementation processes in HE (Musselin, 2005).

In sum, the EU has been increasingly focusing on policy coordination in the area of HE, either on its own or due to spill-over from other policy areas in which it has explicit competences. While most of the activities in this area have been led by the executive branch, the European Commission (EC), other EU institutions have focused on HE as well, including the EP which, amongst other things, is tasked with oversight of the EC.

The role of the European Parliament in the EU decision-making

Overall, in the case of EU decision-making, the distribution of power is assessed as rather complex (Börzel, 2010). While the executive, judicial and legislative powers in the EU are shared, the basic distinction between government, courts and the parliament that exists on national levels does not exist in the same way on the European level. This is in particular the case for legislative competences, which are currently shared between the EP and the Council, a set-up referred to as ‘co-decision’.

The specification and division of tasks between the different EU institutions has been evolving since the very beginning of the European integration project, and this is in particular true for the “legislative powers of the EP [which] have grown sequentially” (Pollack, 2010, p. 31). The seeds of the EP’s empowerment can already be found in the treaties of 1970 and 1975, which gave the EP some control over the EU budget and also introduced the Court of Auditors. The Single European Act of 1986 also gave the EP increased legislative power and expanded the overall EU policy scope, extending and deepening the EU’s competences in more areas (Wallace, Pollack, & Young, 2010). Co-decision between the EP and the Council, implying that a decision needs to be accepted by both bodies, was first introduced in the 1992 Treaty on EU (Maastricht), while the 1997 Treaty of Amsterdam introduced strong requirements for the EP’s assent on enlargement and appointments of the Commission. Overall, the EP’s involvement in EU decision-making has evolved from a non-binding consultation procedure to a co-decision procedure with the Council in the 1990s, only to be further strengthened in the 2000s by establishing co-decision as the standard operating procedure used for the majority of policy areas (Pollack, 2010).

Furthermore, the EP is tasked with approving the EU budget and discharging the accounts of the previous year (Laffan & Lindner, 2010). Concerning budget approval, these decisions are important because they are highly visible to MEPs’ constituents and are possibly contentious, given that potential winners and losers can be clearly identified: “since it has been granted budgetary powers in 1975, the EP has regarded the EU finances as one of its key channels of influence vis-à-vis the Council” (Laffan & Lindner, 2010, p. 214). This is where the EP tries to influence decisions at both the macro level, concerning multi-annual funding frameworks, and the micro level, concerning specific programmes and projects. An example of the former concerns the strong focus on research and technology in the discussion of the Financial Perspective for 2007-2013 (reflecting the findings of the so-called Sapir report), where the EP also supported the EC’s proposal to strengthen expenditure for public goods, effectively positioning itself against some Member States (Laffan & Lindner, 2010). Examples of the latter are the decision concerning the Erasmus Mundus budget in 2003 (Corbett, 2005) and the EP’s concern over the Juncker Commission’s plan to use 2/3 of the Horizon 2020 funding for the EU’s investment fund.⁴⁴ Given that the multi-annual budget plan actually has the status of a law, binding for several years, the EP’s deliberations and decisions on budget issues have become even more important.

The EP also plays an important role in appointing the Commission: it approves the Commission President and has the power to hold the Commission accountable. For example, it was effectively the EP which forced the Santer Commission to resign in the late 1990s, following claims of insufficient transparency in the spending of EU funds. The EP also delayed the endorsement of the 2004 Commission, due to its proposed composition, as well as the endorsement of the 2009 Commission President. For the Juncker Commission, the MEPs put forth a number of requests concerning the portfolios of the different Commissioners and the proportion of female Commissioners before approving the overall composition.

Overall, the EP is currently in a position to constrain the agenda-setting activities of the EC, and it can also explicitly ask the EC to deal with specific issues (Young, 2010). Given its role in the co-decision procedure, it can effectively act as a veto player and block decision-making (Finke, 2010). However, it has almost no involvement in the process of implementation and policy evaluation (Young, 2010). In general, since the early days of the European integration project, the EP has increased its influence on European-level decision-making, while the EC and the Council have arguably lost some influence in exchange. However, a detailed analysis of inter-institutional power relationships would need to take into account that none of these institutions is a unitary actor and that their internal dynamics are important as well.

What goes on inside the EP and why is it important?
The EP is composed of MEPs, the vast majority of whom are organized into European party families, while the rest are ‘non-attached’ MEPs. Candidates run in nationally organized elections, where the number of MEPs to be elected from each state depends on the country’s population. However, once elected, the MEPs are grouped in the EP not according to their countries but in accordance with their partisan affiliation. The composition of the EP and the total number of MEPs per parliamentary term of interest for this study are presented in Table 1.

Given that it is a supranational legislature, the EP’s connection to the electorate is considered by some to be “notably weak” (Young, 2010, p. 58), although it should be acknowledged that the EP is effectively the only EU institution whose members are directly elected. Evidence suggests that, once in the EP, the decisions of the MEPs are determined more by the generic left-right political cleavages between the different European party families, and by their positions concerning the scope and level of appropriate European integration, than by the MEPs’ country affiliations (Finke, 2010; Pollack, 2010).

44 See e.g. https://euobserver.com/economic/128867 (page accessed 1 March 2017).

Table 1 – Number of MEPs across party families and parliamentary terms. Source: EP website.

Number of MEPs (per party family)                                       5th term (1999-2004)  6th term (2004-2009)  7th term (2009-2014) 45
European United Left/Nordic Green Left (GUE-NGL)                        42                    41                    35
Progressive Alliance of Socialists and Democrats (S&D); formerly PES    180                   200                   184
Greens/European Free Alliance (Greens/EFA)                              48                    42                    55
Alliance of Liberals and Democrats (ALDE); previously ELDR              50                    88                    84
European People’s Party (EPP), formerly EPP-ED                          233                   268                   265
Europe of Freedom and Direct Democracy (EFDD), formerly IND/DEM or EFD  16                    37                    32
European Conservatives and Reformists (ECR), formerly UEN               30                    27                    54
Non-attached (NA)                                                       9                     29                    27
Total                                                                   626                   732                   736

While preparatory work is carried out in specialist committees of the EP (Wallace, 2010), plenary sessions taking place every month in Strasbourg serve as an opportunity for MEPs to address each other, as well as other EU institutions and the public (Proksch & Slapin, 2010; Slapin & Proksch, 2010). The speeches during these sessions serve several purposes: (a) arguing in favour of or against a legislative proposal, (b) scrutinizing other actors, in particular those over which the EP has oversight (e.g. the EC), and sending signals to (c) national constituents, (d) other members of the party group or (e) other members of the EP (Slapin & Proksch, 2010). The sessions are sometimes structured around an opening statement or a proposal by the EC, followed by a rapporteur of the relevant EP committee (Proksch & Slapin, 2010). Rapporteurs play a particularly important role: they are the ones steering negotiations within the committees and working to ensure support across different political groups (Kohler, 2014). While this ‘behind-the-scenes’ work potentially limits the possibilities for debate and conflict between the different party families in the plenary session (Kohler, 2014), rapporteurs’ speeches in the plenary are nevertheless important as indicators of the outcomes of negotiations within the committees.46 After the rapporteur, speaking time is allocated to party families, with MEPs of the largest family speaking first. Allocation of time between the MEPs within one family is done internally, and an individual speech cannot last more than three minutes. At the end of the debate and before the vote, the EC representatives may reply and indicate the EC’s position on the proposal (Proksch & Slapin, 2010). Importantly, as the EP also has the power to put forward issues on its own, and not only to follow the EC’s agenda, MEPs can speak on a wide range of topics, both those in which the EP has explicit competences concerning regulation adoption (the so-called ‘hard law’) and those subject to softer policy coordination. Effectively, MEPs use plenary sessions as an opportunity to give speeches both to communicate their own positions towards the general public and their own constituents and to coordinate with other actors, relying on discursive practices as instruments of change (Schmidt, 2010).

45 The number of MEPs in the Seventh Parliamentary term changed twice, first due to the Lisbon Treaty entering into force in December 2009 (to 754 MEPs) and then due to Croatia joining in July 2013 (to 766 MEPs).

46 The ‘Talk of Europe’ dataset includes only speeches made in the plenary session.

Expectations
Given that (a) HE has become increasingly important for the overall EU strategic development, that (b) HE has been exported to other policy areas as a policy solution, that (c) despite the lack of strong regulative competences, there is significant HE policy coordination at the EU level, that (d) there has been a gradual empowerment of the EP with regard to EU-level decision-making, in particular when it comes to budget decisions, and that (e) the behaviour of MEPs in general seems to be determined more by their party affiliation than by their country affiliation, the following patterns with regard to how HE is considered in the EP can be expected:

1. The total number of MEP speeches referring to HE increases over time. The most significant increase is expected in relation to the adoption of EU action programmes and related budgetary decisions.

2. HE is more often referred to in the EP speeches in relation to other policy areas in which the EU has regulative competences than as a stand-alone issue.

3. Whether or not an MEP makes a speech addressing HE is more strongly linked to his/her party family affiliation than to the country of origin.

Data and method
To investigate the role of HE in the EP, we studied speeches delivered in the EP plenary using ‘Talk of Europe’, a linked open data infrastructure (van Aggelen, Hollink, Kemman, Kleppe, & Beunders, 2016), which comprises speeches given in the EP from 1999–2014 (translated into English) and related data available through the European Data Portal.47 ‘Talk of Europe’ allows the use of semantic queries to retrieve data stored in the Resource Description Framework (RDF), a computer data language (Juric, Hollink, & Houben, 2012). By formally linking traditionally distributed datasets, queries can be implemented to identify specific artefacts and related meta-data. These digital provisions offer a number of advantages to researchers wishing to investigate this data. First, one can automatically identify a large number of documents in a straightforward manner, as opposed to querying one database of MEPs’ speeches, querying another to retrieve data about dates, agenda items, MEPs, etc., and then merging them. In addition, instead of manually inferring possible connections (e.g. that John Smith in one database is the same person as John Smith in the other), the data is automatically linked and compiled as one artefact, which saves substantial time. Depending on the query size, such queries may run in minutes, if not seconds.

However, with the advent of these tools come challenges associated with designing and conducting research and interpreting its results (Bar-Ilan, 2001). As the data is highly sensitive to the query commands, the specifics of the query influence the data returned. Thus the researcher has to be acquainted with the database, its possibilities and the nature of the request, so as not to jeopardize the validity of the design. For example, the researcher must know whether the database is continuously being updated, what characteristics are available for the artefacts being queried, and how the data needs to be structured to conduct the appropriate analysis given the specific research design. Given the nature of such data and the focus on developing a query to identify specific, carefully crafted valid samples within these relatively large datasets, a total description of the entire dataset is rarely feasible or conducted. This implies that normalization (comparing an entire dataset to the selected sample) for the purposes of verifying the representativeness of the sample (a standard in quantitative research) is in this study conducted using other means (see ‘Results and discussion’). Although this may call into question some assumptions of specific designs, for example procedures and characteristics necessary for inferential statistics, we would argue that such issues do not warrant disregarding such data, but rather require that these specificities are transparently presented and openly discussed, as we work towards building a methodological toolkit fit for analysing such digital data. Despite these pitfalls, we contend that exploring such data offers valuable and perhaps unique insights about the phenomena studied.

47 https://www.europeandataportal.eu/ (page accessed 1 March 2017).

In analysing these data we take an exploratory approach. We started our research by developing a set of terms related to HE, which were then reviewed by a number of HE researchers (see Appendix for the list of all terms queried). In order to identify speeches, a query using these keywords was developed by the second author, with the assistance of the ‘Talk of Europe’ team. This returned the speeches in a text format, as well as the related meta-data (if available): title of the speech, date of the speech, URL to the original speech, identification of the speaker, speaker’s country affiliation, and the speaker’s party affiliation (if known or applicable). Querying these words resulted in a set of 10,180 unique speeches (all including at least one HE term from our list, duplicates removed) and related meta-data representing all potential discussions on HE in the EP since 1999. The meta-data, which are essentially textual data, were coded in order to allow for a systematic analysis of temporal, topical and country/party affiliation patterns. Importantly, the query developed focused on identifying specific terms used in speeches, not on describing the entire dataset of speeches. This approach mimics techniques implemented in other studies using this dataset (van Aggelen et al., 2016). In addition to this data, we used publicly available information on the number of MEPs per country or per party family during the period studied. The treatment of the data is presented in Table 2.
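The retrieval step described above (keep every speech that contains at least one HE term, then drop duplicates) can be illustrated with a minimal Python sketch. It is not the query run against the ‘Talk of Europe’ infrastructure: the record fields, the sample speeches and the shortened term list are hypothetical, and whole-word, case-insensitive matching is an assumption about how terms were matched.

```python
import re

# Hypothetical subset of the HE term list; the full list is in the Appendix.
HE_TERMS = ["higher education", "university", "Erasmus Mundus", "student", "students"]

def build_pattern(terms):
    """Compile one case-insensitive pattern matching any term as a whole word or phrase."""
    escaped = sorted((re.escape(t) for t in terms), key=len, reverse=True)
    return re.compile(r"(?<!\w)(?:" + "|".join(escaped) + r")(?!\w)", re.IGNORECASE)

def select_speeches(speeches, terms):
    """Return speeches whose text contains at least one term, skipping duplicate texts."""
    pattern = build_pattern(terms)
    seen = set()
    selected = []
    for speech in speeches:
        text = speech["text"]
        if text in seen:
            continue  # identical text already selected
        if pattern.search(text):
            seen.add(text)
            selected.append(speech)
    return selected

# Toy records with assumed field names (title, date, country, party, text).
speeches = [
    {"title": "Erasmus Mundus II", "date": "2008-10-20", "country": "NL", "party": "ALDE",
     "text": "The Erasmus Mundus programme supports student mobility."},
    {"title": "Fisheries", "date": "2008-10-21", "country": "ES", "party": "EPP",
     "text": "Quotas must reflect the state of fish stocks."},
    {"title": "Erasmus Mundus II", "date": "2008-10-20", "country": "NL", "party": "ALDE",
     "text": "The Erasmus Mundus programme supports student mobility."},
    {"title": "Vote", "date": "2009-01-15", "country": "DE", "party": "PES",
     "text": "I voted in favour because every university needs stable funding."},
]
selected = select_speeches(speeches, HE_TERMS)
```

Under whole-word matching the plural ‘universities’ would not match ‘university’, which is one concrete way in which the data returned depends on the specifics of the query.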

Table 2 – Variables and treatment. Source: Authors.

Variable                               Type of data  Treatment
Title of the speech                    Textual data  Manually coded data in relation to the topics, see Table 3.
Date of the speech                     Date          n/a (not treated here)
Unique ID of the speaker, given by EP  Nominal       n/a (not treated here)
Speaker’s country of affiliation       Textual data  Coded to nominal data
Speaker’s party affiliation, if known  Textual data  Coded to nominal data

To efficiently identify topics according to keywords in the speech titles, as presented in Table 3, speeches were coded semi-automatically, using both manual and computational coding (Lewis, Zamith, & Hermida, 2013). This resulted in one code per speech, following a hierarchical schema: a title containing one of the HE terms identified earlier takes primacy, followed by a non-HE topic (e.g. a geographical determinant, a demographic determinant or a reference to other policy sectors). For example, if the title refers to Roma or the Danube Region but the speech mentions an HE term, it constitutes a non-HE topic where HE has been discussed in relation to another policy topic. Speeches whose titles included ‘vote’, ‘budget’ or a reference to a procedural matter were also coded into separate categories (see Table 3 for details). Voting and budget formally represent two different activities: the one is possibly a specific discussion of the budget, the other a discussion of the voting itself as a decision-making process of the EP. We acknowledge that in a small number of cases these may overlap, given that the EP often votes on budgets. All other formal activities related specifically to procedure and the organization of the EU in general are considered procedural topics.

In order to explore these patterns, the coded data, together with the above-mentioned meta-data of text origin (string variables), were transformed into nominal categorical variables. The dataset was then used to explore a) the temporal patterns of the use of HE terms in speeches over time, b) the topical patterns of the use of HE terms in the different types of speeches, and c) the role of country and party in explaining the use of these terms over time and in specific topics.

Table 3 - Coding scheme. Source: Authors.

Code          Description
HE            Speeches with a title that included one of our keywords and addressed HE as the specific topic
Non-HE        Speeches with the mention of a geographical place in the title (e.g. country, city or region), or a specific group of people in the title (e.g. women, youth, disabled, elderly, Roma), or an issue that is not specifically related to HE (e.g. economy, human rights, employment, labour, resources, security, environment, defence, transportation, and so forth)
Vote          Speeches with the title Vote or Votes
Budget        Speeches with the mention of the word budget in the title
Procedural    Speeches with a mention in the title of procedural matters of the EU itself (e.g. review of EC notes, announcements, and so forth)
Unidentified  Speeches that are not attributable to a topic given the lack of detail in the title
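The computational half of this coding can be sketched as a small rule cascade that assigns exactly one code per title. The sketch is illustrative only: the keyword lists are hypothetical and heavily shortened, and the source specifies only that HE terms take primacy, so the order of the remaining rules (Vote, Budget, Procedural, Non-HE) is an assumption. Titles that fall through to ‘Unidentified’ would be candidates for manual coding.

```python
import re

# Hypothetical, abbreviated keyword lists; the lists used in the study are longer.
HE_TERMS = ["higher education", "university", "erasmus", "student"]
PROCEDURAL_TERMS = ["announcement", "agenda", "minutes", "order of business"]
NON_HE_TERMS = ["roma", "danube", "women", "youth", "employment", "security", "transport"]

def contains_any(title, terms):
    """True if any term occurs in the title as a whole word or phrase (case-insensitive)."""
    return any(
        re.search(r"(?<!\w)" + re.escape(term) + r"(?!\w)", title, re.IGNORECASE)
        for term in terms
    )

def code_title(title):
    """Assign one code per title; HE takes primacy, the order of later rules is assumed."""
    if contains_any(title, HE_TERMS):
        return "HE"
    if contains_any(title, ["vote", "votes"]):
        return "Vote"
    if contains_any(title, ["budget"]):
        return "Budget"
    if contains_any(title, PROCEDURAL_TERMS):
        return "Procedural"
    if contains_any(title, NON_HE_TERMS):
        return "Non-HE"
    return "Unidentified"

codes = [code_title(t) for t in ["Erasmus Mundus II budget", "Votes", "Situation of the Roma"]]
```

Because the first matching rule wins, a title such as ‘Erasmus Mundus II budget’ is coded HE rather than Budget, which mirrors the primacy of HE terms described in the text.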

We acknowledge a number of limitations to our design. The reliability of the public data depends on the accuracy of the EU Open Data Portal and the ‘Talk of Europe’ infrastructure in both publishing and accurately linking related data. Given the bigger nature of this data, it is expected that less significant ‘bugs’ may occur, but that this ‘noise’ would be systemic and thus would not significantly influence results. In this respect, we encountered an unprecedented number of unattributed party affiliations in the 7th EP session, which reflected missing data. Thus, to ensure validity, we have not analysed the 7th term when considering the extent to which an MEP’s speaking on issues related to HE is determined by his/her country of origin or party family affiliation. Within the available data we were not able to confirm whether all speakers were MEPs or guest speakers, although we can safely assume that the number of non-MEPs speaking in the EP plenary sessions is very low and thus not significant in a way that could distort our findings.

Results and discussion
As previously indicated, our query retrieved a total of 10,180 speeches containing one or more of the terms in our ‘dictionary’ (see Appendix). Given that the output of the query does not contain a list of the terms that were found in a particular speech, it was not possible to systematically measure the co-occurrences of the terms across all of the 10,180 speeches and to use such data to test the sensitivity of the query to the content of the ‘dictionary’.

Given these limitations, we have devised an alternative approach. We focused on the potentially most problematic terms in the ‘dictionary’, i.e. terms that may appear in speeches with no linkage to HE whatsoever: innovation, mobility, science, technology, and training. We have queried the ‘Talk of Europe’ infrastructure for these five terms separately and analysed the overlap between speeches retrieved this way and speeches retrieved when querying for two terms definitely linked to HE: ‘higher education’ and ‘university’. The results are presented in Table 4.


Table 4 – Number of speeches containing one or more of the selected terms. Source: Authors.

Number of speeches ... (X =)                                  innovation  mobility  science  technology  training
A: containing one of the terms (X)                            923         752       534      973         1011
B: containing X AND 'higher education' (Y)                    262         269       180      264         281
C: containing X AND 'university' (Z)                          370         322       258      421         403
containing (X AND Y) OR (X AND Z) = B + C                     632         591       438      685         684
D: containing Y AND Z                                         221         221       221      221         221
containing (X AND NOT Y) OR (X AND NOT Z) = A - (B + C - D)   512         382       317      509         548

Thus, there are 2,268 speeches (22.2% of the total number of speeches in our dataset) that contain at least one of the five problematic terms (X) but do not contain ‘higher education’ (Y) or ‘university’ (Z); i.e. potentially 22.2% of the speeches in the dataset should not be there. However, we need to stress that this is actually the maximum possible value, for two reasons: (1) we only explored the co-occurrence of the problematic terms (X) with two other terms (‘higher education’ and ‘university’) and not with other terms in the ‘dictionary’ which may also be closely linked to HE (e.g. student, academic); and (2) we ignored the possibility that different Xs may co-occur in the same speech (e.g. ‘innovation’ and ‘technology’) and merely added up the numbers in the last row of Table 4. Notwithstanding that the actual proportion of speeches that do not belong in the dataset is very likely lower than 22.2%, we will proceed with our exploration of the data taking this into account.
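The arithmetic behind this upper bound can be re-traced with a short Python sketch using the counts reported in Table 4. The variable names are ours, and the per-term formula A - (B + C - D) is the one that reproduces the figures in the last row of the table; the sketch only re-computes reported numbers and adds no new data.

```python
TERMS = ["innovation", "mobility", "science", "technology", "training"]
A = [923, 752, 534, 973, 1011]  # speeches containing X
B = [262, 269, 180, 264, 281]   # speeches containing X AND 'higher education' (Y)
C = [370, 322, 258, 421, 403]   # speeches containing X AND 'university' (Z)
D = 221                         # speeches containing Y AND Z
TOTAL_SPEECHES = 10180          # size of the retrieved dataset

# Speeches with a problematic term that are not accounted for by Y or Z.
residual = {term: a - (b + c - D) for term, a, b, c in zip(TERMS, A, B, C)}

# Summing over terms ignores co-occurrence of different Xs, so this is an upper bound.
upper_bound = sum(residual.values())
share = upper_bound / TOTAL_SPEECHES
```

The sum over the five terms gives the 2,268 speeches cited in the text, which is roughly 22% of the 10,180 retrieved speeches.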

In relation to our expectation that the total number of MEP speeches mentioning HE increases over time, Figure 1 presents the frequency of such speeches for the 5th, 6th and 7th term (aggregated for a four-month period).


Fig. 1 - Speeches referring to HE, over time. Source: Authors.

As Figure 1 shows, the number of speeches containing at least one of the terms we identified as related to HE increases over time. The figure also helps us identify moments of increased frequency, such as the end of 2008, parts of 2011 or the end of 2013.
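The four-month aggregation underlying such a frequency plot can be sketched in Python as follows. This is a minimal illustration, not the authors' procedure: the period boundaries (January-April, May-August, September-December) are an assumption, and the sample dates are made up.

```python
from collections import Counter
from datetime import date

def four_month_bin(d):
    """Map a date to (year, period): 1 = Jan-Apr, 2 = May-Aug, 3 = Sep-Dec (assumed bins)."""
    return (d.year, (d.month - 1) // 4 + 1)

def count_per_period(speech_dates):
    """Count speeches in each four-month period, returned in chronological order."""
    counts = Counter(four_month_bin(d) for d in speech_dates)
    return dict(sorted(counts.items()))

# Made-up speech dates for illustration.
dates = [date(2008, 10, 20), date(2008, 10, 20), date(2008, 12, 16),
         date(2011, 5, 10), date(2013, 11, 19)]
counts = count_per_period(dates)
```

Periods without any speeches are simply absent from the result; a plot covering the full 1999-2014 range would need those periods filled in with zero.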

A closer look into the dataset reveals that these increases are related to the activity around the adoption of specific programmes and decisions concerning HE, such as:

• the ‘Erasmus Mundus II’ programme – 31 speeches on this topic on 20 October 2008 alone;
• European Quality Assurance Reference Framework for VET – 38 in December 2008;
• the report on the ‘Youth on the Move’ (which also includes student mobility programmes, such as, currently, Erasmus+) – 57 speeches in May 2011;
• Agenda for new skills and jobs – 53 speeches in October 2011;
• Modernising Europe’s Higher Education Systems – 38 in April 2012;
• a debate titled ‘Is Erasmus in danger?’ – 42 speeches in October 2012;
• ‘Erasmus+’ programme under ‘Erasmus for All’ item on the agenda – 142 speeches in November 2013.


This also means that the increased frequency cannot be due to the 22.2% potentially problematic speeches in our dataset. At the same time, moments of lowest frequency pertain to the transition between different parliamentary terms.

Concerning the context in which HE is referred to, in the majority of speeches HE is not discussed as a stand-alone issue but rather in relation to other policy issues or in relation to vote explanations (Table 5). These other policy issues include areas that could be considered closely related to HE, such as general education, youth issues or recognition of professional qualifications, but also areas that can be considered rather distant from HE, e.g. visa issues, arms sales, maritime policy, etc. The high proportion of speeches categorized under the ‘Vote’ topic indicates that HE-related terms are also referred to during explanations of voting procedures as well as discussions concerning implications of the votes.

Table 5 – Distribution of speeches referring to HE in relation to their main topic. Source: Authors.

Main topic  Number of speeches  % in relation to all speeches including HE terms
Non-HE      3,844               37.76%
Vote        3,131               30.76%
HE          1,358               13.34%
Procedure   1,122               11.02%
Budget      713                 7.00%
NI          12                  0.12%

While the structure of our dataset does not allow for a more refined analysis with regard to how HE is referred to in relation to other policy issues or voting, it is clear that HE does not feature prominently as a stand-alone issue but is most often referred to in relation to other policy issues in which the EU has regulatory competences, even when taking into account that potentially 22.2% of the speeches, all of which would be on topics other than HE, perhaps should not be in our dataset.

Concerning our third expectation, we focused only on the 5th and 6th term, for which we had clear party family affiliation for each MEP, and restructured the dataset so that the MEP (and not an individual speech) is the data unit. We then calculated for each MEP the proportion of speeches that had HE as their main topic in relation to the total number of speeches given by said MEP (hereinafter: HE speeches), and on this basis explored the variance in the proportion of HE speeches in SPSS with a two-way ANOVA using country and party as fixed factors (Field, 2009). The results show that a statistically significant difference in the proportion of MEP speeches that are on HE exists only for country affiliation (and only at the p < 0.05 level of significance), while the difference for party affiliation is not significant. This can be read as a suggestion that the country of origin is more strongly linked to the variance in the proportion of speeches an MEP makes that have HE as their main topic. The suggestion is a tentative one, given the potential that a certain number of speeches (likely less than 22.2%, because this analysis concerns only the 5th and 6th term) should not be considered in this dataset.
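The restructuring step, from one row per speech to one row per MEP with the share of HE speeches, can be sketched in Python as below. The field names and toy records are hypothetical, and the ANOVA itself (run by the authors in SPSS) is not reproduced here; the sketch only shows how the dependent variable could be derived from coded speeches.

```python
from collections import defaultdict

def he_proportion_per_mep(speeches):
    """Collapse speech-level records to one row per MEP with the share of HE-coded speeches."""
    totals = defaultdict(int)
    he_counts = defaultdict(int)
    attributes = {}
    for s in speeches:
        mep = s["speaker_id"]
        totals[mep] += 1
        if s["code"] == "HE":
            he_counts[mep] += 1
        attributes[mep] = (s["country"], s["party"])  # assumed constant per MEP
    return [
        {"speaker_id": mep, "country": attributes[mep][0], "party": attributes[mep][1],
         "n_speeches": totals[mep], "he_share": he_counts[mep] / totals[mep]}
        for mep in sorted(totals)
    ]

# Toy speech-level records with assumed field names and codes as in Table 3.
speeches = [
    {"speaker_id": 101, "country": "NL", "party": "ALDE", "code": "HE"},
    {"speaker_id": 101, "country": "NL", "party": "ALDE", "code": "Vote"},
    {"speaker_id": 101, "country": "NL", "party": "ALDE", "code": "Non-HE"},
    {"speaker_id": 101, "country": "NL", "party": "ALDE", "code": "HE"},
    {"speaker_id": 202, "country": "PL", "party": "EPP", "code": "Budget"},
]
rows = he_proportion_per_mep(speeches)
```

The resulting rows, with `country` and `party` as nominal factors and `he_share` as the outcome, have the shape a two-way ANOVA expects.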

Conclusions
The findings presented in this paper are the result of the first exploration of the ‘Talk of Europe’ dataset. They suggest that the Europe of Knowledge is becoming the talk of the town in the European Parliament. The total number of MEP speeches, either specifically dedicated to HE or mentioning HE in speeches dedicated to other issues, appears to have increased over time, particularly during the adoption of EU action programmes in the area of HE and related budgetary decisions. Moreover, over the period analysed, HE was referred to in the EP speeches less as a stand-alone issue than in relation to other policy areas in which the EU has strong jurisdiction. Finally, the tentative findings indicate that the variance in whether an MEP speaks about HE is linked more to country of origin than to party affiliation.

More generally, these findings attest to the increasing role of the EP in HE policy making, which has been largely overlooked in studies on European-level dynamics in HE. As this study demonstrates, a closer look at the EP potentially offers a wealth of information on how HE, both in relation to other areas and as a policy issue in its own right, is considered and decided upon by the only directly elected institution of the European Union. Available databases, such as ‘Talk of Europe’, but also the rich repositories of publicly available information on the European Union websites, therefore offer a promise of better insight into these matters.

In this study we have also tried to explore the potentials and limitations of using big(ger) digital data. We believe we have shown how such methods and analysis can be useful in policy research, while also trying to highlight some of their shortcomings. Regarding the latter, the choice of research design did not allow for traditionally accepted methods of analysis to be employed, together with the assumptions necessary to conduct those analyses. An example of this is the difficulty of developing a valid query from a frequently changing dataset, which in turn prevents normalization of data. We argue, and have shown in this research, that such difficulties should not automatically mean that such data is of no use, but rather that methods distinct to digital data need to be employed and transparency needs to be ensured. Specifically to this project, we acknowledge that the developed query was limited in that it did not individually identify terms, so that co-occurrence could not be assessed. In developing future queries using the ‘Talk of Europe’ infrastructure, one should attempt to build a dataset that would allow a query to consider a number of additional characteristics, at least: (1) the total number of speeches at the date of data collection (to enable normalization), and (2) data on co-occurrence of different terms of interest. These two additions would allow for further quantitative analysis, but also for more confidence in claiming that certain mechanism(s) and relationships are at play, which we can now provide only as tentative findings.

Taking into account the above-mentioned measures concerning the data infrastructure, a number of possible avenues for further research open up. First, in-depth analysis of the textual data contained in selected speeches would allow for further exploration of the content of these speeches, e.g. what preferences and positions MEPs are putting forward and how these may change over time. Moreover, relationships between the EP and other EU institutions, such as the European Commission and the Council of the EU, can be analysed by, for example, analysing the extent to which MEPs refer to HE when responding to initiatives of other EU institutions compared to speaking about HE without an external prompt. The initial explorations presented in this paper can thus serve as the backdrop for further in-depth analysis of MEP behaviour concerning higher education using digital data.

References
Bar-Ilan, J. (2001). Data collection methods on the Web for informetric purposes - A review and analysis. Scientometrics, 50(1), 7-32. doi:10.1023/a:1005682102768

Beerkens, E. (2008). The Emergence and Institutionalisation of the European Higher Education and Research Area. European Journal of Education, 43(4), 407-425. doi:10.1111/j.1465-3435.2008.00371.x

Börzel, T. A. (2010). European Governance: Negotiation and Competition in the Shadow of Hierarchy. JCMS: Journal of Common Market Studies, 48(2), 191-219. doi:10.1111/j.1468-5965.2009.02049.x

Chou, M.-H., & Gornitzka, Å. (Eds.). (2014). Building the knowledge economy in Europe: New constellations in European research and higher education governance. Cheltenham: Edward Elgar.


Corbett, A. (2005). Universities and the Europe of knowledge: Ideas, institutions and policy entrepreneurship in European Union Higher Education Policy, 1955–2005. Basingstoke: Palgrave MacMillan.

Elken, M., Gornitzka, Å., Maassen, P., & Vukasović, M. (2011). European integration and the transformation of higher education. Oslo: University of Oslo.

European Commission. (2006). Delivering on the modernization agenda for universities: Education, research and innovation. Brussels.

European Commission. (2010). Europe 2020: A strategy for smart, sustainable and inclusive growth. (COM(2010) 2020 final). Brussels: EC.

Field, A. (2009). Discovering statistics using SPSS: (and sex and drugs and rock 'n' roll). Los Angeles: SAGE.

Finke, D. (2010). European integration and its limits: intergovernmental conflicts and their domestic origins. Colchester: ECPR Press.

Gideon, A. (2015). The Position of Higher Education Institutions in a Changing European Context: An EU Law Perspective. JCMS: Journal of Common Market Studies, n/a-n/a. doi:10.1111/jcms.12235

Gornitzka, Å. (2009). Networking Administration in Areas of National Sensitivity: The Commission and European Higher Education. In A. Amaral, G. Neave, C. Musselin, & P. Maassen (Eds.), European Integration and the Governance of Higher Education and Research (Vol. 26, pp. 109-131): Springer Netherlands.

Gornitzka, Å. (2014). How strong are the European Union's soft modes of governance? The use of the Open Method of Coordination in national policy-making in the knowledge policy domain. In M.-H. Chou & Å. Gornitzka (Eds.), Building the Knowledge Economy in Europe: New constellations in European Research and Higher Education Governance (pp. 160-187). Cheltenham: Edward Elgar.

Huisman, J., & de Jong, D. (2014). The Construction of the European Institute of Innovation and Technology: The Realisation of an Ambiguous Policy Idea. Journal of European Integration, 36(4), 357-374. doi:10.1080/07036337.2013.845179

Juric, D., Hollink, L., & Houben, G. J. (2012). Bringing parliamentary debates to the Semantic Web. Paper presented at the 11th International Semantic Web Conference, workshop on Detection, Representation and Exploitation of Events in the Semantic Web (DeRIVE 2012), Boston.

Kohler, M. (2014). European Governance and the European Parliament: From Talking Shop to Legislative Powerhouse. JCMS: Journal of Common Market Studies, 52(3), 600-615. doi:10.1111/jcms.12095

Laffan, B., & Lindner, J. (2010). The Budget. Who Gets What, When, and How? In H. Wallace, M. A. Pollack, & A. R. Young (Eds.), Policy-Making in the European Union (pp. 208-228). Oxford: Oxford University Press.

Lewis, S. C., Zamith, R., & Hermida, A. (2013). Content Analysis in an Era of Big Data: A Hybrid Approach to Computational and Manual Methods. Journal of Broadcasting & Electronic Media, 57(1), 34-52. doi:10.1080/08838151.2012.761702

Maassen, P., & Olsen, J. P. (Eds.). (2007). University Dynamics and European integration. Dordrecht: Springer.

Musselin, C. (2005). Change or Continuity in Higher Education Governance? In I. Bleiklie & M. Henkel (Eds.), Governing knowledge: A study of continuity and change in higher education (Vol. 9, pp. 65-79). Dordrecht: Springer Netherlands.


Pollack, M. A. (2010). Theorizing EU policy-making. In H. Wallace, M. A. Pollack, & A. R. Young (Eds.), Policy-Making in the European Union (pp. 15-44). Oxford: Oxford University Press.

Proksch, S.-O., & Slapin, J. B. (2010). Position Taking in European Parliament Speeches. British Journal of Political Science, 40(03), 587-611. doi:10.1017/S0007123409990299

Schmidt, V. A. (2010). Taking ideas and discourse seriously: explaining change through discursive institutionalism as the fourth ‘new institutionalism’. European Political Science Review, 2(01), 1-25. doi:10.1017/S175577390999021X

Slapin, J. B., & Proksch, S.-O. (2010). Look who’s talking: Parliamentary debate in the European Union. European Union Politics, 11(3), 333-357. doi:10.1177/1465116510369266

van Aggelen, A., Hollink, L., Kemman, M., Kleppe, M., & Beunders, H. (2016). The debates of the European Parliament as Linked Open Data. Semantic Web (Preprint), 1-10.

Wallace, H. (2010). An Institutional Anatomy and Five Policy Modes. In H. Wallace, M. A. Pollack, & A. R. Young (Eds.), Policy-Making in the European Union (pp. 69-104). Oxford: Oxford University Press.

Wallace, H., Pollack, M. A., & Young, A. R. (2010). An Overview. In H. Wallace, M. A. Pollack, & A. R. Young (Eds.), Policy-Making in the European Union (pp. 3-13). Oxford: Oxford University Press.

Young, A. R. (2010). The European Policy Process in Comparative Perspective. In H. Wallace, M. A. Pollack, & A. R. Young (Eds.), Policy-Making in the European Union (pp. 45-68). Oxford: Oxford University Press.


Appendix – HE terms queried (Note: the ‘Talk of Europe’ database can query terms consisting of one or two words)

Academia
Academic, academics
bachelor, bachelors
Bologna Process
Copenhagen process
COST
Curriculum
diploma supplement, diploma supplements
ECTS
EHEA
employability
Erasmus
Erasmus Mundus
Erasmus+
European Institute (to identify European Institute of Technology)
European Standards (to identify European Standards and Guidelines)
European University
Framework Programme
Graduate, graduates
higher education
Horizon 2020
innovation
knowledge (to identify knowledge-based economy)
learning (to identify lifelong learning issues and Lifelong Learning Programme)
LLP
master, master's
Mobility
polytechnic
Quality assurance
science
Skill and skills
Socrates
STEM
Student and students
technology
Tempus
tertiary education
Training
university
VET
vocational training


2. Mining meaning. From network analysis to algorithmic semantic data mining
Paul Verhaar and Mirko Tobias Schäfer

Introduction

Bringing digital methods to linguistics, our paper uses a dataset consisting of tweets about the refugee crisis as a baseline for semantic analysis. This paper describes how the analysis of a corpus of status messages can lead to the definition of linguistic fingerprints for detecting ideological positions. We use a network analysis of retweets to reveal the different positions of participants within the highly polarized debate. Mapping the network returns two opposing clusters in the debate: one expresses a mildly positive stance towards refugees, considering refugees to be victims, and is to a certain extent welcoming them. The other one is opposed to refugees, portraying them as criminals or profiteers. The clusters can also roughly be divided by political preference: left wing versus right wing. As the political differences are obvious, we use this as a baseline for further analysis based on the content of tweets. Mining the timelines provides insights into the distinctive use of language by the participants of the opposed clusters. Our paper describes a general method for analysing a corpus of status updates in order to identify ‘linguistic fingerprints’ revealing ideological positions.

Linguistics meets digital methods

Our approach combines digital methods for Twitter analysis with linguistic methods. This means that we analyse the structure of the debate and consider how language plays a role in carrying keys for identifying ideological positions within the debate. Analysing Twitter or social media messages is not new. Previous research has focussed on email analysis (Groh and Hauffa, 2011), network analysis on retweet behaviour (Paßmann et al., 2014), abstracting personality from social media (Schwartz et al., 2013) and style accommodation (Danescu-Niculescu-Mizil et al., 2011). We are not aware of a study combining digital methods for network analysis with linguistic analysis to establish a connection with a solid baseline. Many concepts in new media studies and linguistics can be combined. Register is an important part of our everyday conversation, useful for research on the online domain. People tend to convey their messages in different ways and forms depending on the person they share information with (Danescu-Niculescu-Mizil et al., 2011). Register is a strong and important aspect in this, as proposed by Pennebaker (2011), who claims that words can be a “window to the soul”. The linguistic characteristics that are used can be seen as markers within distinct groups, as shown on Wikipedia (Danescu-Niculescu-Mizil et al., 2012). Language coordination is strongly dependent on the accommodation theory and power differences within social groups. On Twitter this is done by posting, replying and retweeting. These interactions rely on linguistic style markers, such as the use of content and function words and certain keywords (Anger and Kittl, 2011). All in all, not only the network a person moves in, but also the language is important for analysing online social formations on social media platforms.

The data

Data for this study consist of Dutch Twitter messages from January 2015 to October 2015: in total 561,179 tweets, of which 363,079 were unique tweets and 198,100 retweets. Selection was based on two relevant terms in the refugee debate, both subject to different forms of connotation and representation: “vluchteling” (refugee) and “gelukszoeker” (literally happiness seeker, or fortune seeker or economic refugee). For the purpose of building a corpus for semantic mining the completeness of the dataset was of lesser importance than its representation of distinct clusters.

Findings

Our findings address two aspects in political debates on Twitter:

Structure of participants and debate: network analysis confirmed the polarisation within the refugee debate into two opposing clusters, the dynamic of media outlets and opinion leaders in shaping the debate and the interaction of the various participants.

Semantic analysis: a quantitative analysis of language use revealed significant distinctions between the two opposing groups: e.g. the right wing uses more adjectives, needs more words to convey a message and uses more six-letter words than the opposing cluster.
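As an illustration of the kind of surface features named in the semantic analysis, the following is a minimal sketch, not the authors' pipeline. It computes the mean number of words per tweet and the share of long words for a group of tweets. The threshold `min_len=6`, the function names and the toy tweets are assumptions made for this example; the abstract does not specify how "six-letter words" were counted, and counting adjectives would additionally require a part-of-speech tagger, which is left out here.

```python
import re
from statistics import mean

# Runs of letters only (digits, underscores and punctuation are excluded).
WORD_RE = re.compile(r"[^\W\d_]+", re.UNICODE)


def tweet_features(text, min_len=6):
    """Return simple surface features for one tweet.

    min_len is an assumed threshold for what counts as a "long" word.
    """
    words = WORD_RE.findall(text.lower())
    long_words = sum(1 for w in words if len(w) >= min_len)
    return {
        "n_words": len(words),
        "long_share": long_words / len(words) if words else 0.0,
    }


def cluster_profile(tweets, min_len=6):
    """Average the per-tweet features over all tweets of one cluster."""
    feats = [tweet_features(t, min_len) for t in tweets]
    return {
        "mean_words": mean(f["n_words"] for f in feats),
        "mean_long_share": mean(f["long_share"] for f in feats),
    }


if __name__ == "__main__":
    # Invented toy tweets, not taken from the corpus described above.
    cluster_a = ["Help mensen in nood", "Welkom"]
    cluster_b = ["Weer zoveel gelukszoekers die hier komen profiteren"]
    print(cluster_profile(cluster_a))
    print(cluster_profile(cluster_b))
```

Comparing such profiles between the two retweet clusters gives a first, crude indication of whether the groups differ in message length and word length; any real comparison would need significance testing on the full corpus.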

This paper focuses on the distinctions in language use, which open new possibilities for automatically mining Twitter or any corpus for meaning. With regard to recent efforts to create models and ways of algorithmic analysis of social media content (Burnap and Williams 2015), our paper indicates the possibility to move from network analysis to semantic analysis of large corpora. While Ranganath et al. (2016) propose a model for predicting protest tweets, our concept suggests the detection of extreme political positions in social media debates. The limitation in Ranganath et al. is their dependence on the network structure and history of social interaction of the various participants. In our example, the network provides merely the baseline for further linguistic analysis. Developing this further would entail creating an algorithm based on the findings from our initial corpus. However, this raises issues about the quality of public political debate, freedom of expression and privacy. Data retention and social media metrics provide the means for a coherent analysis of political expressions and deliver powerful tools for security authorities to monitor political expression online.

References

Anger, I., & Kittl, C. (2011). Measuring influence on Twitter. Proceedings of the 11th International Conference on Knowledge Management and Knowledge Technologies.

Burnap, P., & Williams, M. (2015). Cyber Hate Speech on Twitter: An Application of Machine Classification and Statistical Modeling for Policy and Decision Making. Policy & Internet, 7(2), 223–242.

Conover, M. D., Ratkiewicz, J., Francisco, M., Goncalves, B., Flammini, A., & Menczer, F. (2011). Political Polarization on Twitter. Proceedings of the 5th International AAAI Conference on Weblogs and Social Media.

Danescu-Niculescu-Mizil, C., Lee, L., Pang, B., & Kleinberg, J. (2012). Echoes of power: Language effects and power differences in social interaction. Proceedings of the 21st International Conference on World Wide Web.

Danescu-Niculescu-Mizil, C., Gamon, M., & Dumais, S. (2011). Mark my words! Linguistic Style Accommodation in Social Media. WWW, 78(11).

Gilbert, E. (2012). Predicting tie strength in a new medium. Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work.

Groh, G., & Hauffa, J. (2011). Characterizing Social Relations Via NLP-based Sentiment Analysis. Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media, 502–505.

Paßmann, J., Boeschoten, T., & Schäfer, M. T. (2014). The Gift of the Gab: Retweet Cartels and Gift Economies on Twitter. Twitter and Society, 331–344.

Pennebaker, J. W. (2011). Your use of pronouns reveals your personality. Harvard Business Review, 89(December), 32–3.

Ranganath, S., Morstatter, F., Hu, X., Tang, J., Wang, S., & Liu, H. (2016). Predicting Online Protest Participation of Social Media Users. Association for the Advancement of Artificial Intelligence.

Schwartz, H. A., Eichstaedt, J. C., Kern, M. L., Dziurzynski, L., Ramones, S. M., et al. (2013). Personality, Gender, and Age in the Language of Social Media: The Open-Vocabulary Approach. PLoS ONE, 8(9).

3. Learning complementary alternative medicine socially? Topic modeling health consciousness with big online discussion forum data
Marjoriikka Ylisiurua, Consumer Society Research Centre, University of Helsinki

Introduction

Individuals have varying abilities to understand and retain health-related information. This health literacy forms the basis of individuals’ knowledge in making the decisions concerning their health (Chinn, 2011; Sorensen et al., 2012; Walsh & Elhadad, 2014). The concept of health literacy is widely employed in healthcare research as a measurable, rational skill.

Nevertheless, individuals with health literacy levels that standard instruments score as “adequate” often rely on biomedically controversial Complementary and Alternative Medicine (CAM) treatments (Bains & Egede, 2011; Stoneman, Sturgis, & Allum, 2012). One potential explanation comes from Chinn (Chinn, 2011) and Puuronen (Puuronen, 2015, in Finnish), both of whom highlight the effects of social community on what information individuals accept as relevant. Puuronen (2015) refers to “extended” health literacy as health consciousness, which includes (sub-)cultural codes and socially constructed meanings. To complement earlier research, this paper analyses a large online dataset to study how online communities shape health consciousness in the field of CAM.

The material for this research consists of discussions on the CAM field of homeopathy. Online discussion forums spread “traditional biomedical” knowledge, health experiences and peer advice, making social media a field where health literacy is acquired and required (Centola, 2013; Cline & Haynes, 2001). The task is to analyze discussion topics (DiMaggio, Nag, & Blei, 2013) and to investigate how writers learn health consciousness socially, while expressing health literacy capabilities. To that end, the study employed both the topic modeling algorithm LDA (Blei, Ng, & Jordan, 2003), and close reading.

Materials & methods

Suomi24.fi ("Finland24.fi") is the largest and one of the oldest Finnish discussion forums in which readers and contributors may either register or use a temporary alias. Various discussion subfora consist of discussion threads that engage contributors in conversations which occasionally last as long as several years.

A dataset covering Suomi24 activity from years 2001 to 2015 is available for academic use at The Finnish Language Bank Fin-Clarin database⁴⁸. It consists of over 55 million discussion comments and their metadata, e.g. timestamps and contributor nicknames. From this database, the full Homeopathy subforum dataset was acquired in CSV format. Each row included an original discussion sentence, its lemmatized sentence with its stopwords removed, and sentence metadata. The data file totals 26 MB (52,729 sentences, or 9,326 comments).

48 Resource description: http://urn.fi/urn:nbn:fi:lb-2017021503

As the first step, an LDA model built with the Python Gensim package was run with a varying number of topics. After algorithmic modeling, comments and their surrounding discussion threads were analyzed with close reading. The original texts were then sampled using topic model keywords as CSV/Excel search keywords. This resulted in a 15-topic model, combined into frames as follows:

• Fields of controversy: Both “proponents” and “opponents” of homeopathy discussing scientific evidence for/against homeopathy and its areas of application. (topics #A)

• Historical context: Mostly proponents describing the lengths of their personal experiences, as well as the history of homeopathy. (2 topics)

• Celebrity discussion: Mostly proponents discussing a physician who is a public proponent of homeopathy. (1 topic)

• Help gained from homeopathy: Contributors employed this frame to describe their experiences with homeopathy. (topic #B)

• Asking questions: Primarily “new initiates” describing their condition and asking for help. (topic #C)

After recognizing the discussion frames algorithmically, analysis continued with further close reading, including sampled, original online conversations.
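The step of sampling original texts with topic keywords can be sketched as follows. This is a minimal illustration using only the Python standard library, not the procedure actually used in the study: the column names (`comment_id`, `sentence`, `lemmas`), the function name and the three toy rows are hypothetical, chosen to mirror the row layout described in the materials (original sentence, lemmatized sentence, metadata).

```python
import csv
import io


def sample_by_keywords(csv_file, keywords, text_column="lemmas"):
    """Return all rows whose lemmatized text contains at least one keyword.

    csv_file is any file-like object with a header row; text_column is the
    (assumed) name of the column holding space-separated lemmas.
    """
    wanted = {k.lower() for k in keywords}
    hits = []
    for row in csv.DictReader(csv_file):
        tokens = set(row[text_column].lower().split())
        if tokens & wanted:
            hits.append(row)
    return hits


if __name__ == "__main__":
    # Invented toy rows standing in for the forum export.
    demo = io.StringIO(
        "comment_id,sentence,lemmas\n"
        "1,Which remedy helps with otitis?,remedy help otitis\n"
        "2,There is no scientific evidence.,scientific evidence\n"
        "3,Thanks for the tip!,thanks tip\n"
    )
    # Keywords as they might come out of one topic of the topic model.
    for hit in sample_by_keywords(demo, ["evidence", "study"]):
        print(hit["comment_id"], hit["sentence"])
```

Matching on the lemmatized column rather than the raw sentence keeps the search consistent with the vocabulary the topic model was trained on; the matched rows can then be read in their original wording.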

Results

Typically, homeopathy “proponents” see the treatment as complementary to traditional biomedicine. The proponents’ positioning of homeopathy in relation to biomedical medicine is revealed in the frames on homeopathic history and celebrities. In this paper however, the focus is on the three remaining frames (#A, #B, #C) to observe health consciousness and health literacy expressions in dialogues between proponents and opponents of homeopathy.

Self-professed “initiates” to homeopathy (topic #C) often seek peer experiences on how homeopathy handles certain conditions. In response (topic #B), some proponents underline the importance of finding the right homeopathic practitioner. In contrast, some proponents describe the use of homeopathic products that they administer independently, without the homeopaths’ advice. Furthermore, some authors describe homeopathy’s failure to cure their condition, whereas “opponents” promote reliance on biomedicine.

Experience and opinion sharing elicits dialogue, which often turns heated. Especially the self-professed, experienced proponents sought to actively defend their individual experiences against attacks from opponents of homeopathy (topic #B), as highlighted below.

Personal experience with my child’s chronic otitis: The child couldn’t stomach many antibiotics due to allergies and a sensitivity to medicines. Medicines would often cause severe symptoms so alternatives were needed, and homeopathy provided the answer. How could a small child under 3 years old pretend that a substance is helpful, they wouldn’t know. And yet, homeopathy was the only effective measure. The later check-ups confirmed that the infection was remedied. Our pets, too, have had success with homeopathy, and they can’t pretend either. A good homeopath can choose a suitable substance. It needs to be the correct one to be effective.⁴⁹

The homeopathic community supports its proponents facing opponent scrutiny. Some strategies involve capabilities and lexis that suggest high health literacy. For example, opponents may defend biomedical treatments or accuse homeopathy of lacking scientific evidence, using biomedical terms like “Pediatric Neurotransmitter Disorders”⁵⁰. The proponents then express their capability in counterattacks, citing the commercial nature of big pharma or the downsides of excessive use of antibiotics, using biomedical terms like “Hospital-Acquired Infections” (topics #A).

49 Author ”ArnicaD”, 31.8.2014, http://keskustelu.suomi24.fi/t/12228806/homeopatiaa-kokeilleiden-kokemuksia-kaivataan!

50 http://keskustelu.suomi24.fi/t/2428127/tarkkaavaisuushairio

This study suggests an inter-group conflict mechanism for social health consciousness learning. The Suomi24 homeopathic community exhibits traces of health literacy, yet positions CAM treatments differently than its opponents. The material and methods obviously do not allow concluding that all comments reflect the authors’ actual experiences and opinions. However, forum discussions may be an important learning environment for contributors and non-contributing visitors alike. To understand the social process of health consciousness in the field of homeopathy, this study should be complemented with interviews and ethnographic research.

References

Bains, S. S., & Egede, L. E. (2011). Association of Health Literacy with Complementary and Alternative Medicine Use: A Cross-Sectional Study in Adult Primary Care Patients. BMC Complementary and Alternative Medicine, 11(138), 7. http://doi.org/10.1186/1472-6882-11-138

Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet Allocation. Journal of Machine Learning Research, 3, 993–1022.

Centola, D. (2013). Social media and the science of health behavior. Circulation, 127(21), 2135–2144. http://doi.org/10.1161/CIRCULATIONAHA.112.101816

Chinn, D. (2011). Critical health literacy: a review and critical analysis. Social Science & Medicine, 73(1), 60–67. http://doi.org/10.1016/j.socscimed.2011.04.004

Cline, R., & Haynes, K. (2001). Consumer health information seeking on the Internet: the state of the art. Health Education Research, 16(6), 671–92.

DiMaggio, P., Nag, M., & Blei, D. (2013). Exploiting affinities between topic modeling and the sociological perspective on culture: Application to newspaper coverage of U.S. government arts funding. Poetics, 41(6), 570–606. http://doi.org/10.1016/j.poetic.2013.08.004

LexisNexis. (2007). How Many Pages in a Gigabyte? Retrieved from https://www.lexisnexis.com/applieddiscovery/lawlibrary/whitePapers/ADI_FS_PagesInAGigabyte.pdf

Puuronen, A. (Ed.). (2015). Terveystaju: Nuoret, politiikka ja käytäntö. Helsinki, Finland: Nuorisotutkimusverkosto.

Sorensen, K., Van Den Broucke, S., Fullam, J., Doyle, G., Pelikan, J., Slonska, Z., … European Health Literacy Project Consortium (HLS-EU). (2012). Health literacy and public health: a systematic review and integration of definitions and models. BMC Public Health, 12(1), 80. http://doi.org/10.1186/1471-2458-12-80

Stoneman, P., Sturgis, P., & Allum, N. (2012). Understanding support for complementary and alternative medicine in general populations: Use and perceived efficacy. Health, 17(5), 512–529. http://doi.org/10.1177/1363459312465973

Walsh, C., & Elhadad, N. (2014). Modeling Clinical Context: Rediscovering the Social History and Evaluating Language from the Clinic to the Wards., 224–231.


4. Web data extraction allows independent evaluation of Global Absolute Poverty⁵¹
Michail Moatsos, Utrecht University

The widely applied “dollar-a-day” methodology identifies global absolute poverty as declining precipitously since the early 80's throughout the developing world. The methodological underpinnings of the dollar-a-day approach have been questioned in terms of adequately representing equivalent welfare conditions in different countries and years [Reddy and Pogge, 2010; Deaton, 2010; Srinivasan, 2010; Aten and Heston, 2010; Subramanian, 2015; Moatsos, 2015]. If empirically substantiated, such criticism directly questions the validity of the dollar-a-day methodology, since in international poverty measurement “the first-order issue is to demand welfare consistency" [Ravallion, 2015, p. 4].

However, an independent examination of the levels and trends of global poverty is a very demanding task. For the most part this is due to the restricted access to the national distributions of income or consumption that are essential for the calculations. The easiest way to use those distributions is the PovcalNet website offered by the World Bank. Unfortunately, the Bank does not make the underlying distributional data available. Instead it only conditionally allows direct calculations of global poverty, the condition being that the independent researcher accepts the validity of the dollar-a-day approach that the World Bank follows rather religiously. Thus a serious problem appears when one seeks to evaluate global poverty using a different methodology, or when one tries to quantify the aforementioned concerns against the World Bank methodology.

The solution to this conundrum is the use of IT methods that scrape the underlying data from the PovcalNet website, so that they can be used independently of any conditions. This has been done in a first step by Dykstra et al. [2014], but with several issues of discrepancy between the data offered by PovcalNet and those made available by the authors. Based on their work I simplify the process one step further by allowing automated evaluation of poverty in bulk based on independently calculated poverty lines. This approach has the advantage of querying the PovcalNet service “as it is” without discrepancies.

The alternative to the dollar-a-day methodological approach is to estimate absolute poverty on a global level using appropriately defined consumption baskets for each country and year separately. Allen [2001] defines the barebones baskets (BBBs) for use in the historical real wages literature, and de Zwart et al. [2014] apply them on a global scale. Table 1 contains the overview. The BBBs are constructed so as to represent bare minimum absolute poverty levels in consumption terms. However, the absolute poverty yardstick can be expanded to account for other essential elements of life and wellbeing, such as education and health, as both the Copenhagen Declaration and the Universal Declaration of Human Rights stipulate. Table 1 also describes one such BBB derivative that allows for considerably higher welfare levels compared to the basic BBB.

51 This paper is largely based on a forthcoming article in the Journal of Globalization and Development entitled “Global Absolute Poverty: Behind the Veil of Dollars”.

Table 1: The composition of barebones baskets in real wages and the two derivatives applied here.

Item                      Unit/Year  Real Wages Basket  BBB              BCS
Energy target             kcal       1455/2100          MDER             MDER
Minimization                         -                  cheapest bundle  mean of 3 cheapest bundles
Main staple               kg         155-413*           based on kcal/protein target**
Beans or peas             kg         -/20/45            LP               40 at minimum
Meat or fish              kg         3 or 6             3 or 6           12 or 24
Butter or oil or ghee     kg         3                  3                12
Sugar                     kg         -/2                2                8
Linen (applied)           share      8%                 8%±2%            WBGC
Lamp oil                  liter      1.3                1.3              WBGC
Soap                      kg         1.3                1.3              WBGC
Candles                   kg         1.3                1.3              WBGC
Fuel                      mbtu       3                  f(T in °C)       WBGC
Cooking                   mbtu       -                  MDER             WBGC
Housing                   mark-up    5%                 5%±2%            WBGC
Health, Education, Water  %          -                  -                WBGC
Additional shares         %          -                  -                WBGC

Note: The Barebones basket with Consumption Shares (dubbed BCS) uses the average of the three cheapest bundles, and a four times larger meat/fish, butter and sugar allowance. In addition, an allowance covering health, education, and water is included using the consumption budget shares from the World Bank Global Consumption dataset (noted as WBGC in the table). Consumption budget shares are also used for energy, housing, and clothing, and allowances for Personal Care, ICT, Financial Services, and Others are included in the additional shares.

*: depending on the country and main staple

**: To avoid inflating the price of the consumption bundle, priority in linear programming is given to the kcal target, and the protein target is allowed to overshoot by 200% at maximum if necessary. Only for the Dominican Republic does this cap increase the bundle price by more than 20%, and for Belarus by more than 10%, compared to allowing for unlimited protein overshooting. For all other countries the increase, if any, is restricted to only a few percentage points.


The results (figure 1) show that, in terms of levels, on the one hand the target of alleviating absolute poverty is not as far off as was thought of, but on the other hand, absolute BBB poverty has shown remarkable persistence throughout the period. The difference with the PovcalNet estimates is enormous throughout. Comparing the 1990 and 2014 estimates leaves little room for celebrations over the achievement of halving absolute global poverty between 1990 and 2015⁵². Using the BBB poverty lines the point estimate for global poverty in 1990 is 5.6% and for 2014 3%. Using the BCS poverty lines, the corresponding rates are 62% and 33%. In turn, this shows that the conclusion about the questionable MDG1 success does not result from the very low welfare level that the BBB poverty lines encapsulate.
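The arithmetic behind the halving comparison can be checked directly from the four point estimates quoted above. The sketch below only restates those figures; the function name is introduced for illustration, and the comparison uses 2014 as the end year because that is the last estimate reported here, whereas the MDG target year is 2015.

```python
def relative_change(start, end):
    """Relative change from start to end, e.g. -0.5 means a halving."""
    return (end - start) / start


# Poverty rates in percent for 1990 and 2014, as quoted in the text.
estimates = {"BBB": (5.6, 3.0), "BCS": (62.0, 33.0)}

for name, (rate_1990, rate_2014) in estimates.items():
    change = relative_change(rate_1990, rate_2014)
    halved = rate_2014 <= rate_1990 / 2
    print(f"{name}: {change:.1%} change, halved: {halved}")
```

Under both poverty lines the rate falls by a little under half (about 46 to 47 percent), so neither series reaches the halving threshold by 2014 on these point estimates.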

Figure 1: Evolution of poverty in the Developing World, 1983-2014. PCN 2005/11 refer to the World Bank global poverty estimates based on the 2005 or 2011 so-called ICP rounds.

The vast differences between the BBB welfare level and the International Poverty Line (iPL) can be attributed to two elements: first, the much lower costs of barebones subsistence compared to the $1.9 value for the vast majority of the countries and years; and second, the differential between the consumer price index and the BBB price index. The also very large differences between the iPL and the BCS, especially in the later years of the period, are attributable to the inability of the iPL to encapsulate expenses that are necessary in escaping absolute poverty as described in international treaties and conventions.

52 Millennium Development Goal 1: “Target 1.A: Halve, between 1990 and 2015, the proportion of people whose income is less than $1.25 a day”. The World Bank has announced that this goal has been achieved as early as 2010, five years ahead of schedule.

This research demonstrates that the use of digital techniques for scraping online data that are not explicitly provided for downloading can provide answers to big questions such as the level and trends of global poverty. It is important that this work can be performed independently of the custodian institutions for monitoring and reducing poverty (the World Bank). Institutionally based decisions, evaluations and calculations are not necessarily beyond dispute; and must not be. The next step in this project is to elaborate on the proper accounting of uncertainties in the estimates using the Monte Carlo method for pseudo-experiments. This computationally very demanding task would allow for a more appropriate comparison of the poverty estimates between the target years of MDG1.
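What such Monte Carlo pseudo-experiments could look like in their simplest form is sketched below. This is an illustration under strong assumptions and not the planned method: it treats only the poverty line as uncertain, draws it from a normal distribution with an assumed mean and standard deviation, and recomputes the headcount ratio for each draw. The consumption sample is synthetic, and all names and parameter values are hypothetical.

```python
import random
from statistics import mean


def headcount_ratio(consumption, poverty_line):
    """Share of observations with consumption strictly below the line."""
    return sum(1 for c in consumption if c < poverty_line) / len(consumption)


def monte_carlo_headcount(consumption, line_mean, line_sd, n_draws=2000, seed=0):
    """Propagate poverty-line uncertainty into the headcount ratio.

    Each pseudo-experiment draws one poverty line from N(line_mean, line_sd)
    and recomputes the headcount ratio. Returns the mean over all draws and
    an approximate 95% percentile interval.
    """
    rng = random.Random(seed)
    draws = sorted(
        headcount_ratio(consumption, rng.gauss(line_mean, line_sd))
        for _ in range(n_draws)
    )
    lower = draws[int(0.025 * n_draws)]
    upper = draws[int(0.975 * n_draws) - 1]
    return mean(draws), (lower, upper)


if __name__ == "__main__":
    # Synthetic daily consumption values; not real survey data.
    data_rng = random.Random(1)
    consumption = [data_rng.lognormvariate(0.5, 0.6) for _ in range(500)]
    estimate, (lower, upper) = monte_carlo_headcount(consumption, 1.0, 0.1)
    print(f"headcount {estimate:.3f}, interval [{lower:.3f}, {upper:.3f}]")
```

With intervals of this kind attached to each year's estimate, two years can be compared by checking whether the intervals overlap rather than by comparing point estimates alone. A fuller treatment would also have to sample the uncertainty in the distributions and prices themselves, which is what makes the task computationally demanding.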

References

Allen, R. C. (2001). The Great Divergence in European Wages and Prices from the Middle Ages to the First World War. Explorations in Economic History, 38:411-447.

Aten, B. and Heston, A. (2010). Use of Country Purchasing Power Parities for International Comparisons of Poverty Levels: Potential and Limitations. In Anand, S., Segal, P., and Stiglitz, J. E., editors, Debates on the Measurement of Global Poverty. Oxford University Press, Oxford.

de Zwart, P., van Leeuwen, B., and van Leeuwen-Li, J. (2014). Real Wages. In van Zanden, J. L., Baten, J., D'Ercole, M. M., Rijpma, A., and Timmer, M. P., editors, How Was Life? Global Well-being since 1820, chapter 4, pages 73-86. OECD Publishing, Paris.

Deaton, A. (2010). Measuring Poverty in a Growing World (or Measuring Growth in a Poor World). In Anand, S., Segal, P., and Stiglitz, J. E., editors, Debates on the Measurement of Global Poverty, pages 187-222. Oxford University Press.

Dykstra, S., Dykstra, B., and Sandefur, J. (2014). We Just Ran Twenty-Three Million Queries of the World Bank's Website.

Moatsos, M. (2015). Global Absolute Poverty: Behind the Veil of Dollars. CGEH Working Paper Series, (77).

Ravallion, M. (2015). Toward Better Global Poverty Measures. Center for Global Development, Working Paper (417).

Reddy, S. G. and Pogge, T. (2010). How not to count the poor. In Anand, S., Segal, P., and Stiglitz, J. E., editors, Debates on the Measurement of Global Poverty, pages 42-51. Oxford University Press.

Srinivasan, T. N. (2010). Irrelevance of the $1 a Day Poverty Line. In Anand, S., Segal, P., and Stiglitz, J. E., editors, Debates on the Measurement of Global Poverty, pages 143-151. Oxford University Press, New York.

Subramanian, S. (2015). Once more unto the breach. Economic & Political Weekly, L(45):35-40.


Session I

1. Sociology of French Video Game Magazines
Björn-Olav Dozo

The first video game magazine in French, Tilt, was published by Editions Mondiales in September 1982, just a few months after the first release of Computer and Video Games (UK, November 1981) and Electronic Games (US, November 1981). It established a model for future French-speaking video game magazines, with a stable structure (news, previews, tests) present in any magazine until the early 2000’s.

Graph of the relations between magazines of the corpus.

The 1990’s were a very profitable decade for these magazines, as the editorial field was structured to support game developers, with a pro-Nintendo pole and a pro-Sega pole. While magazine titles stood in rhetorical opposition (Super Power pro-Nintendo vs. Mega Force pro-Sega, about 120 000 monthly copies each), they shared the same editorial boards: the same journalists wrote in different magazines of one publisher, but under different pseudonyms. At times, they simulated competition between the various editorial boards, giving the readers the feeling of belonging to a community. This kind of strategy was common until 1996, but when a new challenger (Sony) came into the dance, some magazines chose to merge with old competitors of the same press group in order to survive.

In 2003, “Future France” bought almost all the video game magazine titles available on the French market. This hegemonic strategy, however, has not proven to be profitable in the long term: a lot of these titles, even long-running magazines with faithful audiences, discontinued their publication in the years following the buyout. My talk will question the context of these cessations of activities. Different reasons could be given: the explosion of video game information websites on the internet, the weakness of the economic model of the paper press or the demotivation of journalists. Other initiatives emerged at this time, such as Canard PC and Gaming, proposing a different business model (independent press). After this first stage, I will further analyse the career paths of these specialized journalists with a social network analysis, following their path between different editorial staffs in this very small world. The database that I use is compiled from the examination of about 80 titles of French-speaking video game magazines over 30 years. With these data, I will show the evolution of the field, with the migration of some journalists between different publications, sometimes on the basis of a kind of “mercato” of local writing stars.
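To make the idea of following journalists between editorial staffs concrete, the sketch below derives a directed, weighted "migration" network from a list of career records. It is only an illustration of one possible data model, not the database or method used in the talk: the record layout (journalist, magazine, year), the function name and the placeholder names are all hypothetical.

```python
from collections import Counter, defaultdict


def migration_edges(records):
    """Count moves between magazines.

    records is an iterable of (journalist, magazine, year) tuples. For each
    journalist the stints are ordered by year, and every change of magazine
    between two consecutive stints counts as one move from source to target.
    """
    careers = defaultdict(list)
    for journalist, magazine, year in records:
        careers[journalist].append((year, magazine))

    edges = Counter()
    for stints in careers.values():
        stints.sort()
        for (_, source), (_, target) in zip(stints, stints[1:]):
            if source != target:
                edges[(source, target)] += 1
    return edges


if __name__ == "__main__":
    # Placeholder names; not taken from the corpus.
    records = [
        ("Journalist A", "Magazine X", 1990),
        ("Journalist A", "Magazine Y", 1994),
        ("Journalist B", "Magazine X", 1991),
        ("Journalist B", "Magazine Y", 1995),
        ("Journalist B", "Magazine Z", 1999),
    ]
    for (source, target), weight in migration_edges(records).items():
        print(f"{source} -> {target}: {weight}")
```

Because the same journalists wrote under different pseudonyms, any such analysis depends on first resolving pseudonyms to persons; the sketch assumes that step has already been done.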


2. Musical networks – Networks of music
Marnix van Berchum, Utrecht University

Network models are widely used in Digital Humanities for understanding relational structures in information. The available mathematical tools used in network science allow scholars to analyse their material in a quantitative manner, and for example find relative centrality measures for certain network entities or discover community structures. Visualisation tools assist in discovering large scale patterns in the network, pointing to areas where a more thorough, qualitative analysis is needed. The presence of network related contributions in the programmes of Digital Humanities conferences, the ongoing emergence of new tools for building and visualising networks, and the many humanities projects making use of these publications and tools attest to this popularity. With differing levels of intensity, most Humanities disciplines make use of network methodologies. There is for example a strong community of historians working with/on networks, demonstrated by the extended bibliography and overview of tools at http://historicalnetworkresearch.org. The sessions of the ‘Arts, Humanities and Complex Networks’ satellites at the yearly NetSci meetings show a variety of disciplines – including art history, film history, literary history and musicology – making use of the methods and tools of network science.


In recent years a growing number of publications have appeared that combine networks and music. The subjects are wide ranging, from social networks between seventeenth-century composers (Smith and Taylor 2014), to co-occurrence networks of composers on CD recordings (Park et al. 2015; Bae et al. 2016), to the “Computer Music Network” of the late 70s/early 80s (Gresham-Lancaster 2014). Similar to the range of subjects, a variety of network methodologies applied to music is discernible in these publications. I will compare the approaches of the selected publications and answer questions on how they relate to the more ‘traditional’ musicological discourse. The paper will discuss the biases present in the data used in the publications and how these affect the musicological conclusions made. It touches upon the tension between the quantitative (‘distant’) character of network science and the qualitative (‘close’) character of musicology research.

The aim of my own PhD research is to bridge this gap. In my research I use a network approach to shed light on the dissemination of polyphonic music in the sixteenth century, the age of the emergence of printed music. Primary musical sources and the compositions they contain are the two entities that form the network. Since one source contains several compositions, and one specific composition may be present in multiple sources, a bipartite network of sources and compositions comes into existence. Both extensive musicological studies of these sources and compositions and high level network structures are used to formulate a model for the dissemination of music. In this paper I will compare and relate my own experience with the evaluation of the selected publications, concluding with an insight into what networks, social network analysis and related methods and tools offer – and may offer in the future – to the field of Musicology.
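A bipartite network of sources and compositions can be represented very simply, and one common way to inspect it is to project it onto the sources: two sources are linked when they have at least one composition in common. The sketch below shows that projection with the Python standard library. It is an illustration, not the author's implementation; the function name and the manuscript and composition labels are invented placeholders.

```python
from collections import defaultdict
from itertools import combinations


def project_sources(contents):
    """Project a source/composition bipartite network onto the sources.

    contents maps each source to the set of compositions it contains.
    Returns a dict mapping each pair of sources (a, b), with a < b, to the
    set of compositions they share; pairs without shared compositions are
    omitted.
    """
    sources_by_composition = defaultdict(set)
    for source, compositions in contents.items():
        for composition in compositions:
            sources_by_composition[composition].add(source)

    shared = defaultdict(set)
    for composition, sources in sources_by_composition.items():
        for a, b in combinations(sorted(sources), 2):
            shared[(a, b)].add(composition)
    return dict(shared)


if __name__ == "__main__":
    # Placeholder labels; not real manuscripts or works.
    contents = {
        "MS A": {"Mass 1", "Motet 2"},
        "MS B": {"Motet 2", "Motet 3"},
        "MS C": {"Mass 1", "Motet 2"},
        "MS D": {"Chanson 4"},
    }
    for (a, b), works in sorted(project_sources(contents).items()):
        print(f"{a} -- {b}: {len(works)} shared ({', '.join(sorted(works))})")
```

Keeping the set of shared compositions on each edge, rather than only a yes/no link, preserves the information needed to move back from the network view to the close study of individual works.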

Illustration taken from Bae et al. 2016, showing the co-occurrence of musicians (composers and performing artists) on CD recordings; data taken from ArkivMusic.

Illustration from the author’s PhD research, showing the network of manuscripts from the Alamire scriptorium; each node represents a manuscript, an edge represents at least one musical composition two manuscripts have in common.

Bibliography

Bae, Arram, Doheum Park, Yong-Yeol Ahn, and Juyong Park. 2016. ‘The Multi-Scale Network Landscape of Collaboration’. PLOS ONE 11 (3): e0151784. doi:10.1371/journal.pone.0151784.

Falk, Casey. 2014. ‘Using Network Analysis on the Du Chemin Music Dataset to Reconstruct Missing Music’ [unpublished paper].

Giannetti, Francesca. 2016. ‘A Review of Network Approaches in Music Studies’. Music Reference Services Quarterly 19 (2): 156–63. doi:10.1080/10588167.2016.1166842.

Gresham-Lancaster, Scot. 2014. ‘Computer Music Network’. Leonardo 47 (3): 266–67. doi:10.1162/LEON_a_00771.

Park, Doheum, Arram Bae, Maximilian Schich, and Juyong Park. 2015. ‘Topology and Evolution of the Network of Western Classical Music Composers’. EPJ Data Science 4 (1). doi:10.1140/epjds/s13688-015-0039-z.

Piekut, Benjamin. 2014. ‘Actor-Networks in Music History: Clarifications and Critiques’. Twentieth-Century Music 11 (02): 191–215. doi:10.1017/S147857221400005X.

Smith, David J, and Rachelle Taylor. 2014. Networks of Music and Culture in the Late Sixteenth and Early Seventeenth Centuries: A Collection of Essays in Celebration of Peter Philips’s 450th Anniversary.

3.The“FrameGenerator”.AnalternativemethodforapproximatingcoremeaningsintextsJorisvanEijnatten(UniversiteitUtrecht)JulietteLonij(KoninklijkeBibliotheek)

Tracingsemanticpatternsovertimeonthebasisoftextsisstillinitsinfancy.Mostapproachesbuildonalinguisticprinciplewhichstatesthatthemeaningsofwordsaredetermined‘bythecompanytheykeep’.Inotherwords,meaningsarisefromcontextsdefinedasdistributionsofwords,whichsuggeststhatwecantracemeaningsovertimebyexaminingchangingcontexts.Topicmodellingisatthismomenttheonlytechniquebasedontheprincipleofworddistributionsthathasgonebeyondanexperimentalstageandhasprovenitsvaluebyachievingresultsthatdomainexperts(inthiscasehistoriansnotnecessarilyinvolvedincomputer-assistedresearch)recognize.

Thispaperdiscussesanewtool,dubbedthe‘FrameGenerator’,aimedatmeaningfullyreducingasetof(possiblythousandsof)Dutchtextstowordpatternsthatcutacrossthedistributionsgeneratedbytopicmodelling,thusprovidingadditionalinsightintothecontentofthedataset.Themethodimplementedbuildsontopicmodellingbycombiningitwithtwootherproventechniques:(1)theautomaticextractionofkeywordsand(2)theidentificationofcollocates.ThePythonsourcecodeofthetool,offeringacommandlineinterface,isavailablefordownloadonGitHub(https://github.com/jlonij/frame-generator).Anonlinedemowithagraphicaluserinterface,showcasingthetool’smainfunctionalityforasmalldataset,canbefoundathttp://kbresearch.nl/frames/.

The Frame Generator was developed to assist in the investigation of popular perspectives on the concept of ‘Europe’ arising from the KB collection of Dutch historical newspapers. To this end, a dataset was prepared of articles that mentioned the word ‘Europe’ at least once. One subset of articles was then selected on the basis of (Dutch-language) synonyms for the words ‘unity’ and ‘unification’ (such as ‘integration’, ‘agreement’, ‘settlement’, ‘consensus’, ‘treaty’, ‘harmony’, etc.). This subset was assumed to contain news articles that discuss Europe as a unified political/cultural/economic entity, or as an entity involved in a process of unification. A second subset was based on synonyms for competitions (such as ‘match’, ‘prize’, ‘winner’, ‘cup’, etc.); this subset was assumed to contain articles on sports and other competitions.

The Frame Generator process of analyzing these datasets consists of four stages. The first stage concerns the pre-processing of the dataset. During this stage the dataset is cleaned by normalizing spelling variations and correcting OCR errors on the basis of user-provided lists of regular expressions and their replacements. In addition, the dataset is tokenized, lemmatized and part-of-speech tagged with the Natural Language Processing suite Frog (https://languagemachines.github.io/frog/). The user has the option of splitting larger documents into smaller units of analysis by specifying the maximum number of sentences to be contained in each unit.
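The cleaning and splitting steps of this first stage can be illustrated with a minimal Python sketch. It is not taken from the Frame Generator source code: the rule list, the function names `clean` and `split_into_units`, and the example rules are assumptions made for illustration only, and tokenization, lemmatization and tagging with Frog are left out.

```python
import re

# Hypothetical user-provided rules: (regular expression, replacement).
RULES = [
    (r"\bmensch\b", "mens"),  # older Dutch spelling -> modern spelling
    (r"\bvvas\b", "was"),     # a typical OCR confusion ("vv" read for "w")
    (r"\s+", " "),            # collapse runs of whitespace
]


def clean(text, rules=RULES):
    """Apply each (pattern, replacement) pair in order; return the cleaned text."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text.strip()


def split_into_units(sentences, max_sentences):
    """Group a list of sentences into units of at most max_sentences each."""
    return [sentences[i:i + max_sentences]
            for i in range(0, len(sentences), max_sentences)]


cleaned = clean("De  mensch vvas  hier")
units = split_into_units(["s1", "s2", "s3"], max_sentences=2)
```

Because the rules are applied in order, a later rule sees the output of an earlier one; a user-provided list would therefore need to be ordered with some care.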

The second stage in the process is topic modelling, which generates specific, substantive themes or topics based on frequently recurring distributions of words. The Frame Generator offers two methods of topic modelling: one based on Mallet (http://mallet.cs.umass.edu), the other on the Gensim topic modelling library. The user is able to control the number of topics generated and the number of words making up each topic by means of various command line arguments. This stage also involves the manual, hermeneutic interpretation of the topics based on historical domain knowledge.

The third stage focuses on the extraction of a single, ranked list of keywords from the set of topics resulting from the previous stage. The relevance of each word occurring in the set of topics is determined by taking the sum of the probability scores for the word over all topics in which it occurs. A word is accorded the status of keyword if its score reaches a certain threshold, set at the discretion of the researcher. The Frame Generator can also produce a keyword list on the basis of tf-idf scores, thus allowing the researcher to compare the results of different approaches. The option is available to restrict the candidates for the keyword list to words with specific part-of-speech tags. The keywords thus obtained may be regarded as core elements in a series of thematically uniform texts; their significance arises from the frequency of their occurrence within as well as across topics.
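The scoring rule described for this stage (sum a word's probability scores over all topics in which it occurs, then apply a threshold) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the tool's own code: the function name `extract_keywords`, the representation of a topic as a word-to-probability dictionary, and the toy probabilities are all hypothetical.

```python
from collections import defaultdict


def extract_keywords(topics, threshold):
    """Rank words by the sum of their probability scores over all topics.

    topics: list of dicts mapping word -> probability within that topic.
    Returns (word, score) pairs with score >= threshold, highest score first
    (ties broken alphabetically).
    """
    scores = defaultdict(float)
    for topic in topics:
        for word, probability in topic.items():
            scores[word] += probability
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [(word, score) for word, score in ranked if score >= threshold]


# Toy topics with made-up probabilities.
topics = [
    {"europa": 0.5, "verdrag": 0.25, "land": 0.125},
    {"europa": 0.25, "markt": 0.25, "land": 0.125},
]
keywords = extract_keywords(topics, threshold=0.25)
strict_keywords = extract_keywords(topics, threshold=0.5)
```

A word that is moderately probable in several topics ("land" here) can reach the same score as a word that is prominent in only one, which matches the remark that significance arises from occurrence within as well as across topics.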

The fourth and final stage of the analysis process consists of contextualising the keywords by finding collocates in the texts from which they were originally extracted. The user sets a maximum word distance from the keyword as well as the direction (left, right, or both) in which collocates must occur in order to qualify. As with the extraction of keywords, the option to include only specific part-of-speech tags is also provided for collocates. The set of collocates thus gathered for a given keyword is called a ‘frame’. The words appearing in a frame are ordered by the frequency of their co-occurrence with and their distance to the keyword with which they are associated, expressing their significance in framing a specific keyword.

The results of each of these stages are saved and accessible to the user in the form of comma-separated values (CSV) files. These can, for example, be used to visualise the graph of the keywords and their collocates in an application such as Gephi (https://gephi.org) in order to facilitate the interpretation of the results. By creating such network graphs for the Frame Generator results for a number of different time periods (see Figure 1 for an example) we found that newspaper reporting on ‘European unity’, while showing a remarkable degree of continuity, became less rich rhetorically, less international, and more focused on institutional technocracy than on intra-continental relations over the course of the twentieth century.
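A frame in this sense can be approximated with a simple windowed count. The sketch below is a hypothetical illustration, not the Frame Generator's implementation: the function name `build_frame` is invented, part-of-speech filtering is omitted, and the way frequency and distance are combined (frequency first, mean distance as tie-breaker) is an assumption, since the text does not specify it.

```python
from collections import Counter


def build_frame(tokens, keyword, max_distance, direction="both"):
    """Collect collocates of keyword within max_distance tokens.

    direction is "left", "right" or "both". Collocates are ordered by
    co-occurrence frequency (descending), then by mean distance to the
    keyword (ascending), then alphabetically.
    """
    counts, distances = Counter(), Counter()
    for i, token in enumerate(tokens):
        if token != keyword:
            continue
        start = i - max_distance if direction in ("left", "both") else i + 1
        stop = i + max_distance if direction in ("right", "both") else i - 1
        for j in range(max(start, 0), min(stop, len(tokens) - 1) + 1):
            if j == i or tokens[j] == keyword:
                continue
            counts[tokens[j]] += 1
            distances[tokens[j]] += abs(j - i)
    return sorted(counts, key=lambda w: (-counts[w], distances[w] / counts[w], w))


tokens = "het verenigd europa en het vrije europa".split()
frame_both = build_frame(tokens, "europa", max_distance=2)
frame_left = build_frame(tokens, "europa", max_distance=2, direction="left")
```

In practice the high-frequency function word "het" would be excluded by the part-of-speech restriction mentioned above; it is kept here only to show the ordering.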

This paper hypothesises that the Frame Generator, by laying bare the fundamental patterns in sets of thematically coherent texts, enables historians to better determine continuities and discontinuities in expressions of public opinion. The Frame Generator’s performance depends on that of its constituent tools (such as topic modelling), which have been described in the literature. Its advantages include its adaptability to other languages (given the availability of part-of-speech tagging), its flexibility (the user can set all variables) and its ‘all-in-one’ packaging (it requires no programming skills while generating not just frames but also keywords and topics). For domain experts (historians) the proof of the pudding will be in the eating: does this particular combination of tools – topic modelling, keyword extraction and identification of keyword collocates – offer useful results? The question can only be answered by running the tool on a variety of relatively homogeneous datasets.

Figure 1. Network graph in Gephi showing a frame of contextualised keywords related to spatial entities (green), concepts related to community formation (blue) and abstract terms (purple), based on newspaper articles from De Telegraaf (1925-1929; n=767).

4. Hybrid approaches to historical research: analysing the Anne Frank diaries with digital tools
Dr. Gerben Zaagsma, Lichtenberg Kolleg, Georg-August-Universität Göttingen

This paper argues for a hybrid approach to historical research that combines ‘traditional’ with digital hermeneutical approaches in a new practice of doing history. As the digital turn alters and affects all parts of the historical research process, this is a pressing challenge and need for all historians, not just for those engaged in ‘big data’ projects. Indeed, hybridity is, and should be, the new normal. Yet while most historians are accustomed to deploying digital approaches in the information gathering stage of their research, they often refrain from ‘going digital’ in its processing and especially analysis stages. Describing a number of digital tools used in work done on the diaries of Anne Frank, the paper critically analyses and demonstrates the added value of incorporating them in all stages of historical research. Digital approaches enhance the methodological repertoire furnished by ‘traditional’ close reading practices. Hybrid approaches thus expand our intellectual horizons and the analytical power we bring to bear upon our sources.

The paper consists of 1) a theoretical part, contextualising notions of ‘traditional’ and digital approaches to historical research and the uptake of the latter; and 2) a concrete case study of a hybrid approach to historical research.

The first part will briefly discuss discourses around ‘going digital’ that often oppose ‘traditional’ to digital approaches. On a general level, this either/or attitude is misleading; despite what is often assumed, or implicitly suggested, distinctions cannot be neatly mapped along lines of close reading/distant reading, quantitative/qualitative or positivist/narrative analysis either. More specifically, this opposition is also problematic because, for instance, close reading can also involve the use of digital tools, and the same obviously goes for qualitative analysis. In this respect, one should mention Frédéric Clavert’s use of Franco Moretti’s concept of ‘distant reading’ to propose a new way of reading and interpreting historical sources in the digital age using two axes – close reading/distant reading and human reading/computational reading.

The focus will then shift to the problem of uptake of digital approaches among historians. Here, a distinction is drawn between two broad strands of historical research in the digital era, as measured by their application of digital approaches:

• on the one hand, a number of digital historians take (big) data and digital tools (development) as their point of departure; their focus is on digital datasets (for instance newspapers) and the application of digital tools to an analysis of that data. This yields research results that are often as much, if not primarily, concerned with critical reflection on data and tools as with the research topic at hand. Tool development is often also part of the process, and project and research questions tend to be dictated by the available data and tools.

• on the other hand, there are those historians, arguably the majority, whose research does not start with datasets and tools; they depart from particular research questions pertinent to their topic of research that could be answered, at least in part, by digital means; the question then becomes how digital approaches can aid, enhance and complement their analyses. As will be clear, the problem of uptake is pertinent to this group of historians.

The second part of the paper concerns itself with a hybrid analysis of a concrete historical source: the diaries of Anne Frank. This part of the paper is one of the outcomes of a three-year research project (The diaries of Anne Frank. Research – Translations – Critical Edition) which was carried out at the Lichtenberg Kolleg of the Georg-August-Universität Göttingen. The project involves a new scholarly edition of the diaries as well as an accompanying multi-author research monograph which will focus on contextualisation, reception and representations of the diaries. Describing a number of digital tools (notably text mining and QDA software) used in analysing the diaries, the paper critically analyses and demonstrates the added value of incorporating them in all stages of historical research. The aim here is to apply Clavert’s basic model, as mentioned above, to a concrete case study and, ultimately, to provide historians with a concrete example of a hybrid approach to historical research.

Session J

1. Towards a tool and data criticism framework: A developer’s and user’s perspective
Sally Chambers 1, Joke Daems 1, Greta Franzini 2, Marco Büchler 2, Susan Aasman 3

1 Ghent Centre for Digital Humanities, Ghent University
2 Institute of Computer Science, University of Göttingen
3 Groningen Centre for Digital Humanities, University of Groningen

As the amount of accessible digitised data grows, so does the need for machine assistance to help process this overload of information. As cultural heritage institutions increasingly digitise their collections, they are in effect converting the collections into data. Particularly in the area of Digital Humanities, the need for ‘full-text’ collections for analysis is becoming increasingly important. For example, in 2016 the National Library of the Netherlands organised a workshop ‘Historical Newspapers as Big Data’.53 The focus of this workshop was to bring together researchers from a range of disciplines who were interested in using the digitised newspapers and other digital collections made available by the Delpher platform54 for (digital) humanities research.

In recent years, international initiatives such as the DiRT Digital Research Tools directory55, the Common Language Resources and Technology Infrastructure (CLARIN)56 and the Digital Research Infrastructure for the Arts and the Humanities (DARIAH)57 have been bringing together tools and resources to help scholars repurpose data for the advancement of research and knowledge.

Despite the proliferation of tools, little is known about their development and use. As Gibbs and Owens observed (2012), this grey area of knowledge concerns both the production and the user side of tools, raising questions about usability, purpose, effectiveness and usage.

From a development standpoint, assumptions are often made with regard to users and the use of a tool. While tools are typically designed to be part of the solution to a problem, by assuming knowledge they become part of the problem to be solved.

From a user perspective, perhaps the biggest barrier to the adoption of a tool is the absence of (sufficient) documentation on its application (i.e. “how to” instructions) and on its functionality (i.e. the “black box”). Functionality is key to the evaluation of computed results: if the inner workings of a tool are opaque, how, or to what extent, can the user trust the results? How useful is the tool?

Whereas developers need to be clear about what the tool is intended for, users need to be careful in selecting the appropriate tool to address their research question. An improved understanding is needed of both the developer’s intentions in tool development and the user’s requirements for answering their research question. Additionally, the particular data-set that a user wishes to analyse is a crucial factor when it comes to tool selection.

53 See: https://www.kb.nl/nieuws/2016/historische-kranten-als-big-data-ii-concepten-op-drift (Accessed: 9 February 2017).
54 Available at: http://www.delpher.nl/ (Accessed: 9 February 2017).
55 Available at: http://dirtdirectory.org/ (Accessed: 9 February 2017).
56 Available at: https://www.clarin.eu/ (Accessed: 9 February 2017).
57 Available at: http://www.dariah.eu/ (Accessed: 9 February 2017).

In light of the issues described, this contribution reiterates the need for tool criticism, previously expressed at the ‘Tool Criticism for Digital Humanities’ workshop (Traub and Ossenbruggen, 2015).58 We argue for tool criticism as a pedagogical and effective means of tackling the interdisciplinary challenges posed by the Digital Humanities and of fostering communication between developers and users. Furthermore, we propose an extension of the tool criticism framework to also include data. We argue that the data-set, as an integral part of the research process, is an important factor to consider when selecting the appropriate research tools. For example, if a user has a choice between two tools with equivalent functionality, then one or other of the tools may perform better given the structure of the chosen data-set. Additionally, we propose that ‘data criticism’ is an important element in its own right. For example, it is important to critically select the source of a particular data-set, based on a range of criteria. If a particular text is needed for analysis, it may be available from multiple sources. A framework to facilitate the selection of the most appropriate data source is therefore needed. This will build on existing ‘source criticism’ and ‘information evaluation’ frameworks (Hjorland, 2012).

As a first step towards tool and data criticism, we propose a number of evaluation criteria that seek to encourage a more critical approach to tools. These build upon analogous software studies (Jackson et al., 2011), the EVALITA campaigns59 and the very recent RIDE Digital Text Collections evaluation guidelines60, and are grouped as follows:

Tools
1. Usability
   a) User Experience (UX)
   b) Graphical User Interface (GUI)
2. Documentation
   a) Provenance (authors/organisations behind the tools)
   b) “How to” instructions
   c) Algorithms or methods implemented
   d) Limitations
   e) Target audience/research
   f) Availability of tutorials to train users to proficiently work with the tool
   g) Access and citation
   h) Rights
3. Maintenance
   a) Development responses to user feedback
4. Flexibility/Extent of Applicability

Data-sets
5. (Re-)Usability
   a) Format(s)
6. Documentation
   a) Provenance (curators/organisations behind the data-sets)
   b) Metadata (e.g. size, source, author, etc.)
   c) Limitations
   d) Access and citation
   e) Rights
7. Maintenance
   a) Development responses to user feedback

58 See: http://event.cwi.nl/toolcriticism/ (Accessed: 12 February 2017).
59 For more information about Evaluation of NLP and Speech Tools for Italian (EVALITA), see: http://www.evalita.it/2016 (Accessed: 12 February 2017).
60 See: http://ride.i-d-e.de/reviewers/call-for-reviews/special-issue-text-collections/ (Accessed: 13 February 2017).

We evaluate our criteria on three different projects – one data-set project, one tool and one application of our selected tool on our selected data-set – to compare the user and developer perspectives.

The intention is to foster an understanding of tool and data criticism towards a dialogue between users and developers, including how such a framework could be put into practice.

References

Gibbs, F., Owens, T. (2012) ‘Building Better Digital Humanities Tools: Toward broader audiences and user-centered designs’, Digital Humanities Quarterly, 6(2) [Online]. Available at: http://www.digitalhumanities.org/dhq/vol/6/2/000136/000136.html (Accessed: 12 February 2017).

Hjorland, Birger (2012) ‘Methods for evaluating information sources: An annotated catalogue’, Journal of Information Science, 38.3 (June 2012): 258-268.

Jackson, M., Crouch, S., Baxter, R. (2011) Software Evaluation: Criteria-based Assessment [Online]. Available at: https://www.software.ac.uk/sites/default/files/SSI-SoftwareEvaluationCriteria.pdf (Accessed: 12 February 2017).

Traub, M.C., Ossenbruggen, J. van (2015) Workshop on Tool Criticism in the Digital Humanities: Tech Report [Online]. Available at: http://oai.cwi.nl/oai/asset/23500/23500D.pdf (Accessed: 12 February 2017).

2. Supporting Digital Humanities in Dealing with Quality of Web Documents
Davide Ceolin, Lora Aroyo (Vrije Universiteit Amsterdam, de Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands)
Julia Noordegraaf (University of Amsterdam, Turfdraagsterpad 15, 1012 XT Amsterdam, The Netherlands)

This paper discusses the development of a new approach for assessing the quality of online documents, contributing a new methodological reflection on online source criticism. Online documents are, in fact, a useful source of information for very diverse groups of users, ranging from researchers and journalists to government officials, activists or parents. However, this information is only useful if we manage to filter out the spam and, most importantly, if we manage to retrieve the documents that better fit the qualitative requirements that specific users have. For example, while for laymen neutrality and readability may be important, for scholars accuracy and completeness may be more relevant.

Assessing the quality of online documents is a challenging task because of their intrinsic peculiarities: their volume, variety, and velocity make it impossible for humans to process them manually. A combination of human and automated processing needs to be devised to handle their quality assessment. Moreover, quality assessment is a challenging task on its own. The overall quality of a given document is the result of the aggregation of multiple facets (or quality dimensions), such as accuracy, completeness, and neutrality. How these facets are quantified and aggregated is mostly a subjective and context-dependent matter. Users with different tasks at hand have different qualitative requirements. Also, users with different backgrounds are likely to evaluate the same document in a different manner.

A general definition of quality is ‘fitness for purpose’, whereby ‘fitness’ varies with both context and purpose. Although this means that the assessment of the quality of online documents is a flexible, fluid process, we believe it is not impossible to model it. To do justice to the fact that different purposes imply different qualitative requirements (e.g., for writing a newspaper article, source neutrality may be less relevant than accuracy), it is crucial to create a reference system that allows for the quantification of document qualities (e.g., the extent to which a given document is neutral or accurate). When this reference system exists, then we can identify the most accurate, precise, or neutral documents that correspond to those of higher quality for a given task (see Figure 1).
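To make the idea of purpose-dependent quality concrete, the following Python sketch ranks the same documents differently for two purposes. It is only an illustration: the paper stresses that quantification and aggregation are subjective and context-dependent and does not prescribe a formula, so the weighted average used here, the function names, the document scores and the weight profiles are all assumptions.

```python
def fitness(scores, weights):
    """Weighted average of quality-dimension scores (each assumed in [0, 1])."""
    total = sum(weights.values())
    return sum(scores[dim] * w for dim, w in weights.items()) / total


def rank_documents(documents, weights):
    """Return document ids ordered from most to least fit for the purpose."""
    return sorted(documents, key=lambda d: -fitness(documents[d], weights))


# Hypothetical positions of three documents in the reference system.
documents = {
    "blog_post": {"accuracy": 0.5, "neutrality": 0.25, "completeness": 0.5},
    "official_report": {"accuracy": 1.0, "neutrality": 0.5, "completeness": 0.75},
    "overview": {"accuracy": 0.5, "neutrality": 1.0, "completeness": 0.5},
}

# Hypothetical purposes expressed as weights over quality dimensions.
journalist = {"accuracy": 3, "neutrality": 1}
layperson = {"neutrality": 3, "accuracy": 1}

for_journalist = rank_documents(documents, journalist)
for_layperson = rank_documents(documents, layperson)
```

The per-dimension scores play the role of the reference system, and only the weights change with the task, so the same assessments can serve several kinds of users.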

Figure 1. Reference system. Through the assessment, we create a reference system for assessing the relative quality of the documents (e.g., high accuracy, neutrality, or both). [Diagram labels: Documents; Assessment; Biased and inaccurate documents; Neutral and accurate documents.]

For the purpose of creating a reference system, we are benchmarking a large portion of online documents by employing a combination of human assessment and machine learning. In a pilot study, we used this human-machine interaction approach to assess a selection of online documents [1]. These documents focussed on the topic of vaccinations, and they were selected to provide an overview of the types of documents (blog posts, official documents, etc.), stances (pro, anti, neutral), and types of sources (government authorities, activists, etc.). These documents were assessed by experts (journalists and media scholars) who were asked to judge their relevance for writing an article on the vaccination debate. The experts were asked, first, to judge relevance based on certain automatically generated quality features (such as the trustworthiness of the document, the entities mentioned and the sentiment expressed in it) and, second, to highlight parts of the document and manually annotate the quality features (such as provenance, references, specific statements in the text itself). The results collected showed that the subjectivity of these assessments is limited by the fact that contributors share a similar background and by the clear definition of the task proposed (documents were assessed supposing they were used for a specific task). This exploratory study provided promising indications for automating and scaling up this process.

Currently, we are employing crowdsourcing to extend the coverage of human assessments of Web documents. By employing the crowd in place of niche experts, we can extend the number of documents assessed. Nevertheless, such a shift requires the document assessment tasks to be simplified because of the different typology of contributors, and because crowdsourcing tasks are usually shorter than nichesourcing tasks [2]: assessing the quality of Web documents is a lengthy and demanding task. However, since the crowdsourcing version of these tasks is intended to capture implicit quality evaluations that users usually make when reading online documents, such a simplification will affect the granularity and not the reliability of the results. For example, we still ask the contributors to assess the precision, completeness, and neutrality of documents, but we use a Boolean scale instead of a Likert scale, and we limit the depth of the argumentation requested.

We are also exploring the possibility of automating this assessment process (see Figure 2). We extracted a set of features from the documents in our pilot study. These included NLP features (e.g., named entities, sentiment analysis) and provenance (e.g., source trustworthiness), and we found that it is possible to employ algorithms like Support Vector Machines [3] to predict the quality assessments by using these features with an accuracy of up to 72%. We are extending this prediction to scale up the number of documents assessed and to improve the accuracy of the predictions. We are scaling up the process of feature extraction by parallelizing the natural language processing analysis to extract textual features from large collections of documents. We are also scaling up the prediction part, which takes as input the features extracted and the training data provided by the crowd and by the niches, and produces the quality estimations.

Figure 2: Automatic assessment setup. [Diagram labels: NLP; Provenance; Parallelized feature extraction; Training data; Training documents; Machine Learning Prediction; Documents to be assessed; Assessed documents (in red: low-quality docs; in green: high-quality).]

Convolutional neural networks [4] will be evaluated as a possible solution for extending the set of documents assessed. An analysis of the relation among the quality dimensions considered will also be important. So far, we have considered the diverse quality dimensions as independent targets to be predicted. However, it could be the case that some lower-order qualities provide the preconditions for the values of qualities of a higher order. For example, high neutrality and precision could be the necessary preconditions for high accuracy. Dependencies of this kind would be favorable for improving the estimation process.

References:

[1] D. Ceolin, J. Noordegraaf, L. Aroyo, Capturing the Ineffable: Collecting, Analysing and Automating Web Document Quality Assessments. In Proceedings of the 20th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2016), pages: 83-97. Springer. 2016.

[2] V. De Boer, M. Hildebrand, L. Aroyo, P. De Leenheer, C. Dijkshoorn, B. Tesfa, G. Schreiber, “Nichesourcing: harnessing the power of crowds of experts”. In: International Conference on Knowledge Engineering and Knowledge Management. pp. 16-20. Springer. 2012.

[3] C. Cortes and V. Vapnik, “Support-vector networks,” Mach. Learn., vol. 20, no. 3, pp. 273–297, 1995.

[4] Y. LeCun. "LeNet-5, convolutional neural networks". Retrieved 15 February 2017.

3. Building the ARTECHNE database: New directions in Digital Art History
Marieke Hendriksen, ARTECHNE project / Department of Art History, Utrecht University
Martijn van der Klis, Digital Humanities Lab, Utrecht University

The ARTECHNE project at Utrecht University / University of Amsterdam studies how technique was transmitted among artists, artisans, and scholars between 1500 and 1950. As part of the project, the researchers are currently working with the Utrecht University Digital Humanities Lab (DH Lab) to develop an online database containing searchable full-text early modern recipes, artist handbooks, and technical instructions, linked to other relevant information such as records of objects, works of art, conservation research and reconstructions (http://artechne.hum.uu.nl). We aim for both quantity and quality: the more enriched texts we add, the more complex the questions we can answer using the search and visualization functions in the database.

For example, a set of questions like ‘how did the use of cochineal as a pigment in oil paints change in Europe between 1500 and 1950, can we discern patterns in the spread of recipes for such paints, and are certain uses specific to particular geographical regions?’ can currently only be partly answered through many years of research on primary sources such as objects and texts. Given the number of relevant sources and their limited accessibility, it will be very difficult for a researcher to discover and visualize such patterns relying on traditional art historical methods. This database, containing a great number of fully searchable and annotated sources, will allow researchers in art history, conservation, and cultural heritage to ask such complex questions and answer them with a speed and accuracy that was impossible before. Moreover, tools to detect hierarchical distance/patterns of proximity or co-occurrence of particular terms will be integrated, which can give us insight into the changing meanings of concepts.

To reach these goals, we are creating the ARTECHNE database in collaboration with the DH Lab. We use Drupal (https://www.drupal.org/) to manage the database contents. The database is indexed using Apache Solr (http://lucene.apache.org/solr/), allowing researchers to use faceted search to find relevant results in the manuscripts. The data is geotagged and contains dating information, which also allows search results to be shown in a GIS with an extra time dimension. Moreover, the database allows data to be exported from the application in .csv format, has stable URIs and links to the Getty Vocabularies (ULAN for artist names, AAT for glossary terms and CONA for artefacts). The database thus adheres to the 5-star open data plan.
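As a rough illustration of what faceted search against a Solr index involves, the sketch below builds a query URL using Solr's standard request parameters (`q`, `fq`, `facet`, `facet.field`, `wt`). It says nothing about how the ARTECHNE database is actually configured: the host, the core name `artechne` and the field names `country` and `century` are hypothetical placeholders.

```python
from urllib.parse import urlencode


def faceted_query_url(base_url, text, facet_fields, filters=None):
    """Build a Solr /select URL that searches for text and requests facet counts.

    facet_fields: fields for which Solr should return value counts.
    filters: optional dict of field -> value used as filter queries (fq).
    """
    params = [("q", text), ("wt", "json"), ("facet", "true")]
    params += [("facet.field", field) for field in facet_fields]
    params += [("fq", f"{field}:{value}") for field, value in (filters or {}).items()]
    return f"{base_url}/select?{urlencode(params)}"


url = faceted_query_url(
    "http://localhost:8983/solr/artechne",  # hypothetical host and core name
    "cochineal",
    facet_fields=["country", "century"],     # hypothetical field names
    filters={"country": "Netherlands"},
)
```

The facet counts returned for such a request are what allow a search interface to offer narrowing by, for instance, region or period alongside the result list.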

By integrating various technologies and recently developed methods in digital humanities, such as OCR, GIS, semantic annotation, crowdsourcing, and Linked Open Data, in this database, we hope to firmly establish the use of enriched textual primary sources in digital art historical research, which traditionally relies heavily on images. The two authors of the paper, a historian and a scientific programmer, will present the first results of the database project. We will also reflect on the question of how much digital literacy on the part of historians and how much historical literacy on the part of scientific programmers is required to successfully set up research projects relying on new technologies.

4. From Tools to “Recipes”: Building a Media Suite within the Dutch Digital Humanities Infrastructure CLARIAH
Carlos Martinez-Ortiz, Roeland Ordelman, Marijn Koolen, Julia Noordegraaf, Liliana Melgar, Lora Aroyo, Jaap Blom, Victor de Boer, Willem Melder, Jasmijn Van Gorp, Eva Baaren, Kaspar Beelen, Norah Karrouche, Oana Inel, Rosita Kiewik, Themis Karavellas and Thomas Poell

Introduction

Scholars require access to multiple, large, multimedia collections of digital resources, as well as a wide range of information processing tools to access and work with those collections. These requirements raise the need for developing a synchronized national and cross-national infrastructure.

Common Lab Research Infrastructure for the Arts and Humanities (CLARIAH)61 is a distributed research infrastructure for the Humanities, included on the National Roadmap for Large-Scale Research Facilities (2015-2018) drawn up by the Netherlands Organisation for Scientific Research (NWO). CLARIAH designs, implements and exploits the Dutch part of the European CLARIN and DARIAH infrastructures.

There are different research domains within CLARIAH: linguistics, socio-economic history, and media studies. Each work package within the CLARIAH project places at the centre of development both the technical requirements of each media type (text, structured data, audio-visual media) and the specific research needs of its user community.

The CLARIAH Media Studies work package focuses on creating a research environment, the Media Suite (CLARIAH MS)62, as part of the CLARIAH infrastructure, aiming to serve the needs of media scholars by providing access to audio-visual collections and their contextual data. This paper describes the approach taken to build CLARIAH MS.

Background

CLARIAH MS incorporates a series of Digital Humanities (DH) tools and aims to make them sustainable. Prototypes are currently hosted on a new infrastructure at the Netherlands Institute for Sound and Vision (NISV) data centre. These prototypes are: AVResearcherXL, TROVe, CoMeRDa, Oral History Today (OHT) and DIVE+. Furthermore, CLARIAH MS aims to support audio-visual archives in opening up collections in a more standardized way. Once these objectives have been accomplished, scholars will be able to search and analyse these collections via a central workspace, thus enabling data-intensive research in the humanities.

61 http://www.clariah.nl/
62 http://mediasuite.clariah.nl/

AVResearcherXL is an exploratory tool which enables simultaneous queries and analytic visualizations of the collections’ metadata (Van Gorp et al., 2015). TROVe was developed to ease the combined access and visualization of archival collections and online social media. CoMeRDa is a web-based aggregated search system for visualizing search results (Bron et al., 2013). OHT is a prototype for search and enrichment (through Automatic Speech Recognition technology) of distributed Oral History collections in The Netherlands (Ordelman and de Jong, 2011). Finally, DIVE+ is a linked-data digital cultural heritage collection browser which provides access to heritage objects from heterogeneous collections, using historical events and narratives as context for searching, browsing and presenting the objects (de Boer et al., 2015).

These five tools support scholars in the “exploration” and “contextualization” phases of their research, a framework proposed by Bron et al. (2015). The original tools could not interoperate and did not operate on the same data, which limited their potential. Recreating them in a single configurable environment makes it possible to reuse functionalities across datasets and to reuse data across functionalities.

CLARIAH Media Suite

The DH community includes scholars with a wide diversity of research interests and goals; every research group in DH is working with different types of data and their research objectives have specific requirements which cannot be easily facilitated by tools using a single, generic approach. At the same time, there are similarities in the methods used by different scholars (de Jong et al., 2011) that can be used for generalised tool development. There are commonalities in research questions and methods among media scholars, which we grouped into Media aesthetics, Social history of media, Aesthetic historiography, Social and cultural history, Media representations or coverage, Transmedia analysis, and Memory studies (Melgar et al., 2017).

Figure 1. CLARIAH MS consists of functionalities, APIs and recipes, version 1, April 2017.

A generic infrastructure is required to cater for the general needs of every user group. The infrastructure needs to incorporate flexible functionality capable of addressing very specialized research questions. Media scholars expressed their desire to use the collections and tools which were previously “locked” together in the individual prototypes. CLARIAH MS has been designed in a modular way (Figure 1); each module performs a single, well-defined task. Modules can interoperate to construct more sophisticated functionality.

Metaphorically speaking: whereas previously users had access to predefined ‘meals’ (tools which could perform cross-collection search and visualize the results in the form of timelines, word clouds, snippets and/or thumbnails), we now provide users with single ingredients (individual functionalities such as searching) and ready-made recipes (combinations of several functionalities). Some ingredients may be used in different recipes; existing recipes may be complemented by adding extra ingredients.

Media Suite Architecture

CLARIAH MS consists of four layers of functionality, explained below (Figure 2):

Figure 2 - Architectural design of CLARIAH MS.

Data Sources contain the collections (e.g., television broadcasts from NISV, EYE Jean Desmet collection, DANS Oral History collection). All collections are registered in a common inventory (CKAN; see footnote 63) which describes their metadata. Collections are available in Elasticsearch (full text search) and RDF format (semantic search).
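As an illustration of what full text search over an Elasticsearch-backed collection can look like, the sketch below builds a request body using the `match` query of the Elasticsearch query DSL. The field name `transcript` and the search terms are hypothetical; the actual index mappings of the Media Suite collections are not described in this abstract.

```python
import json


def build_fulltext_query(terms, field="transcript", size=10):
    """Build an Elasticsearch query DSL body for a full-text match query.

    `field` is a hypothetical field name chosen for illustration only.
    """
    if not terms.strip():
        raise ValueError("terms must not be empty")
    return {
        "size": size,  # maximum number of hits to return
        "query": {
            "match": {
                # "operator": "and" requires every term to occur in the field
                field: {"query": terms, "operator": "and"}
            }
        },
    }


body = build_fulltext_query("watersnood 1953")
print(json.dumps(body, indent=2))
```

Such a body would be sent to the search endpoint of an index; which index and endpoint apply depends on the deployment.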

APIs facilitate the interaction with data from various collections:

• Collections API - high-level collection information (metadata: data format, size, etc.)
• Search API - searching for collection items.
• Annotation API - annotating existing data using the W3C Web Annotation standard (mainly for manual annotations) (Melgar et al. (2017)).
• Data Enrichment API - collection enrichment through automatic mechanisms (e.g. named entity recognition) or by human interaction (e.g. crowdsourcing).
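To make the Annotation API item more concrete, the following sketch builds a minimal annotation following the W3C Web Annotation Data Model, targeting a time segment of a media item with a media-fragment selector. The function name, the example URLs and the choice of a `commenting` motivation are assumptions made for illustration; the abstract does not specify how the Media Suite structures its annotations.

```python
import json


def make_annotation(anno_id, target_url, start_s, end_s, comment):
    """Return a minimal W3C Web Annotation (JSON-LD) for a media segment.

    All argument values used below are hypothetical examples.
    """
    if start_s >= end_s:
        raise ValueError("start_s must be smaller than end_s")
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {
            "type": "TextualBody",
            "value": comment,
            "format": "text/plain",
        },
        "target": {
            "source": target_url,
            "selector": {
                # Temporal fragment expressed with the Media Fragments syntax
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"t={start_s},{end_s}",
            },
        },
    }


example = make_annotation(
    "http://example.org/anno/1",
    "http://example.org/media/broadcast.mp4",
    30,
    60,
    "Opening sequence",
)
print(json.dumps(example, indent=2))
```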

The API design allows the integration of new data with different formats and data models.

Components in CLARIAH MS are software units which perform a single functionality: each component takes data as input and produces a meaningful output using standard formats, to be connected with other components (e.g., word cloud, timeline visualizations, topic identification in newspapers, searching content in collections).
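The idea of single-purpose components that can be connected can be sketched as plain functions chained into a pipeline, with a combination of components playing the role of a recipe in the metaphor used earlier. Everything in this sketch (function names, the document structure, the toy data) is hypothetical and only illustrates the modular principle, not the actual Media Suite implementation.

```python
from collections import Counter
from functools import partial, reduce


def search(term, documents):
    """Component: keep documents whose text mentions the term (case-insensitive)."""
    return [d for d in documents if term.lower() in d["text"].lower()]


def timeline(documents):
    """Component: count documents per year, ordered by year."""
    return dict(sorted(Counter(d["year"] for d in documents).items()))


def word_counts(documents):
    """Component: count lower-cased words, e.g. as input for a word cloud."""
    return Counter(w for d in documents for w in d["text"].lower().split())


def recipe(*components):
    """Combine components left to right into a single callable."""
    return lambda data: reduce(lambda acc, comp: comp(acc), components, data)


# Toy collection, invented for this example.
DOCS = [
    {"year": 1953, "text": "Flood relief broadcast"},
    {"year": 1953, "text": "Flood damage report"},
    {"year": 1960, "text": "Election night coverage"},
]

# The same search component is reused in two different combinations.
flood_timeline = recipe(partial(search, "flood"), timeline)
flood_cloud = recipe(partial(search, "flood"), word_counts)

print(flood_timeline(DOCS))              # {1953: 2}
print(flood_cloud(DOCS).most_common(1))  # [('flood', 2)]
```

Because each component only depends on its input and output format, the search step can be swapped or extended without touching the visualization steps.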

Recipes close the circle by integrating components to recreate the functionalities of the original tools. We focus on providing the complex functionality of the original tools in the form of four 'recipes'. Continuing the metaphor above, the concept of ingredients (components) allows researchers to prepare their own personal recipes (functionalities).

Figure 2 labels: Data sources, APIs, Components, Recipes

63 http://mediasuite.clariah.nl/datasources

Conclusion

In this paper we have explained the structure of the CLARIAH MS and how previously developed DH tools are being integrated in a sustainable infrastructure that allows flexible use of data collections and functionalities fitting the research needs of scholars. We have also sketched our strategy to enable the integration of alternative functionalities and data collections using a modular approach (ingredients and recipes). Future work includes user evaluation of the first version of the Media Suite (launched in April 2017), and co-development involving six CLARIAH research pilot projects (see footnote 64).

References

[Bron et al. (2013)] Marc Bron, Jasmijn Van Gorp, Frank F. Nack, Maarten de Rijke, Lotte B. Baltussen. Aggregated search interfaces in multi-session tasks. SIGIR 2013: 36th international ACM SIGIR conference on research and development in information retrieval. Dublin: ACM (2013)

[Bron et al. (2015)] Marc Bron, Jasmijn Van Gorp, and Maarten de Rijke. Media studies research in the data-driven age: How research questions evolve. Journal of the Association for Information Science and Technology (2015), https://doi.org/10.1002/asi.23458.

[de Jong et al. (2011)] Franciska de Jong, Roeland Ordelman, and Stef Scagliola. Audio-visual collections and the user needs of scholars in the humanities: a case for co-development. In Proceedings of the 2nd Conference on Supporting Digital Humanities (SDH 2011), Copenhagen, Denmark, 2011. Centre for Language Technology, Copenhagen.

[Melgar et al. (2017)] Liliana Melgar Estrada, Marijn Koolen, Hugo Huurdeman, and Jaap Blom. A process model of time-based media annotation in a scholarly context. In ACM Conference on Human Information Interaction and Retrieval (CHIIR), Oslo, 2017.

[Ordelman and de Jong (2011)] Roeland Ordelman and Franciska de Jong. Distributed access to oral history collections: Fitting access technology to the needs of collection owners and researchers. In Digital Humanities 2011: Conference Abstracts, pages 347–349, Stanford, 2011. Stanford University Library. URL http://purl.utwente.nl/publications/78347. ISBN 978-0-911221-47-3.

[de Boer et al. (2015)] Victor de Boer, Johan Oomen, Oana Inel, Lora Aroyo, Elco van Staveren, Werner Helmich, Dennis de Beurs. DIVE into the event-based browsing of linked historical media. J. Web Sem. 35: 152-158 (2015)

[Van Gorp et al. (2015)] Jasmijn Van Gorp, Sonja de Leeuw, Justin van Wees, Bouke Huurnink. Digital Media Archaeology - Digging into the Digital Tool AVResearcherXL. VIEW Journal of European Television History and Culture / E-journal, 4(7): 38-53 (2015)

64 http://www.clariah.nl/projecten/research-pilots

Session K

1. Digitally mediated emotions: representations and reinforcements

Anca Țenea - Doctoral School "Space, Image, Text, Territory" – CESI

In 2014, a study conducted by Facebook analyzed the news feeds of 689,003 users by delivering more positive or negative content to each user (Kramer et al. 8788). The conclusions affirmed the contagious nature of emotions in digital space, showing that people who were provided with, say, more positive content tended to express more positivity in return by distributing positive messages themselves. The assumption that machines stimulate emotions in order to enhance them was thus reconfirmed. Yet there are still many questions emerging from the current state of digital media.

The research project I am proposing analyses the connection between digital media and human emotions, as users participate in a digital spectacle in which they take an active part through their emotions. The manifestations of their emotions are read and interpreted by the platforms' algorithms in order to respond by providing a spectacle according to the user’s desires, beliefs, and actions. I propose a theoretical approach to the way human emotions, feelings, and affects are mirrored and enacted in digital space. In light of the new theories regarding the importance of sentiment mining and affective computing in shaping human knowledge, affects and behavior, I argue the necessity of analysing how emotions are stimulated in networked publics (Boyd) in order to enhance participation in the digital spectacle.

The digital spectacle addressed in this research refers to the collection of information that returns to the user in the form of personalized content, in response to their online activity, and to the information they display via the Internet. Transforming emotions into data constitutes a pivotal mechanism for digital technology, where the user is not only the spectator but a fully engaged participant. Interestingly, I argue, this has the potential to reveal the mechanism through which the user relates to the Lacanian Other in an online process of virtual identity formation. The elements that trigger a Freudian drive and encourage the interaction with the Lacanian Other, as well as the characteristics of digital platforms meant to provide fantasmatic and identitary projections, will also be examined.

My research focuses on a series of platforms and software that ensure such interactions. For instance, I am interested in how the Persado cognitive mechanisms are set to detect emotions through sentiment mining, for the benefit of commercial advertising. Persado identified 16 emotions as triggers for user action, and the case studies posted on their website show that their strategy increasingly improved marketing performance for different campaigns. A question arises: how are these commercial messages becoming triggers for user reaction?

I am also enquiring into the effects of Facebook’s reaction buttons and “On this day” feature on sentiment, affect and emotion reinforcement. Emotions are deeply connected to and influenced by social norms, which dictate how one should feel, and by behavioral codes, which influence how people express emotions (Benski and Fisher 3). I argue that Facebook, by means of its architecture, is built to enhance feelings and affects, and to expose a user to both social norms and behavioral codes, induced by other people’s posts, magazine and brand posts and ads. The reaction buttons stimulate users' emotional responses to different situations: like, love, angry, sad and wow are very similar to the ones defined by Paul Ekman in 1972, in “Emotion in the Human Face” (surprise, fear, sadness, happiness, disgust and rage), as key and prevalent emotions in human behavior.

The reaction symbols also partly correspond to emotions postulated by Jacques Lacan and Melanie Klein as fundamental drivers of identity construction, such as fear, pleasure, anxiety, fury, joy etc. The preponderant digital emotions can be linked to the good and bad objects (Melanie Klein), through the manifestation of the things we observe and interact with on the digital screen. I would argue that even though the speed of online emotional reactiveness is higher than in real-life interactions, the operative mechanism is basically the same.

This opens another discussion, on the visual triggers in social media, through psychoanalytical theories. Since the digital screen is composed of objects of desire, or Lacanian “objets petit a”, the digital interfaces can be investigated through visual pleasure and identification. Technology is often compared with fetishism, which involves syncing different symbols with objects or pleasure. This idea, states André Nusselder (19), is supported by the fact that technologies transcend the limits of regular life, offering pleasure and opening endless possibilities. He describes this aspect as a hallucinatory imagination of reality because digital technologies synchronize humans with the pleasure principle, postulated by Freud and reinforced by further psychoanalytical theories as the motor of human pleasure. The spectator is therefore no longer passive, as they continuously interact with the screen, influencing the content. Images and symbols online represent the objects of desire, which are part of the Imaginary that simulates and stimulates, creating the digital fantasy.

Platforms aim to offer users a spectacle compatible with their conceptual apparatus, reinforcing familiar mythologies and beliefs, as well as their registered common desires. In light of the question of why people react to certain content, be it on social media, in newsletters or in other particular advertising messages, I find it legitimate to ask which triggers make a user take an action.

Works Cited

Boyd, Danah (2010). "Social Network Sites as Networked Publics: Affordances, Dynamics, and Implications." Networked Self: Identity, Community, and Culture on Social Network Sites (ed. Zizi Papacharissi), 2010, pp. 39-58, www.danah.org/papers/2010/SNSasNetworkedPublics.pdf.

Benski, Tova, and Eran Fisher. Internet and Emotions. New York: Routledge, 2014.

Kramer, Adam D. I., Jamie E. Guillory, and Jeffrey T. Hancock. "Experimental evidence of massive-scale emotional contagion through social networks." Social Sciences - Psychological and Cognitive Sciences, vol. 111, no. 24, pp. 8788-8790, www.pnas.org.

Nusselder, André. Interface Fantasy: A Lacanian Cyborg Ontology. Cambridge: The MIT Press, 2009.