abstracts dhbenelux wednesday - universiteit utrecht · abstracts dhbenelux 2017 conference...
Abstracts DHBenelux 2017 conference
Wednesday 5 July 2017
Session A
1. TRACING TEXT TYPES IN BIBLICAL HEBREW
Wido van Peursen
Eep Talstra Centre for Bible and Computer (ETCBC), Vrije Universiteit Amsterdam
In the NWO-funded project “Tracing Syntactic Diversity in Biblical Hebrew Texts”, the Eep Talstra Centre for Bible and Computer investigates the various variables that account for the linguistic variation that can be observed in the Hebrew Bible, a collection of writings that were composed over a period of about a millennium. One of the parameters taken into account is text type.
Various types of communication show different usages of the language. The language of narratives is different from that of legal or sapiential texts. However, for a linguistic analysis a classification based on genre (“story”, “laws”) is insufficient. Genre may suggest a certain text type (e.g. a fairy tale is “narrative”), but within one text various text types may occur: in a story, the characters may use discursive text in quoted speech; in psalms or direct speech sections, a short story may develop. Sometimes the narrator addresses the listeners/readers directly and switches from narrative to discursive speech. This happens, e.g., when a fairy tale ends with Und wenn sie nicht gestorben sind, so leben sie heute noch (“And if they have not died, they are still alive today”).1
In Biblical Hebrew, there is a “narrative tense” (wayyiqtol) similar to, e.g., the French passé simple, which Harald Weinrich used for distinguishing two Tempusregister: besprechen and erzählen (corresponding more or less to Émile Benveniste’s histoire and discours). Weinrich’s focus was on Romance languages. Through the work of the Semitist and Egyptologist Hans-Jakob Polotsky, Weinrich’s work found its way into Egyptian and Semitic studies and, through Ariel Shisha-Halevy, one of Polotsky’s students, also into Celtic studies. Wolfgang Schneider2 introduced Weinrich’s insights into Biblical studies, where they were further developed by Eep Talstra,3 Alviero Niccacci,4 Gino Kalkman5 and others.
Building on the work of these scholars, we use the category “text type” as a feature in our linguistic database of the Hebrew Bible. We distinguish between Narrative (N), Quotation (Q) and Discursive (D). We consider text type a feature of a clause, rather than of a larger literary unit, so that we can easily handle text type shifts within one and the same literary unit. We assign the labels on the basis of syntax, rather than literary considerations. E.g.: a clause containing wayyiqtol is assigned the text type label N; a clause containing a vocative or a 1st or 2nd person reference receives the label Q; the so-called Hebrew Imperfect interrupting a narrative yields the label D, indicating those cases where the narrator directly addresses the readers.
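The three syntactic rules just listed can be sketched as a small decision function. This is only an illustration under stated assumptions: the `Clause` record, its field names and the order in which the rules are tried are hypothetical and are not taken from the ETCBC database or its algorithms.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Clause:
    """Hypothetical clause record; the field names are illustrative only."""
    verb_form: str                 # e.g. "wayyiqtol", "imperfect", "other"
    has_vocative: bool = False
    person: Optional[int] = None   # grammatical person referred to, if any


def local_text_type(clause: Clause, in_narrative: bool) -> Optional[str]:
    """Return N, Q or D for one clause, or None if none of the three rules applies.

    The precedence of the rules (N before Q before D) is an assumption.
    """
    if clause.verb_form == "wayyiqtol":
        return "N"
    if clause.has_vocative or clause.person in (1, 2):
        return "Q"
    if clause.verb_form == "imperfect" and in_narrative:
        return "D"
    return None


print(local_text_type(Clause("wayyiqtol"), in_narrative=False))
print(local_text_type(Clause("other", person=2), in_narrative=True))
print(local_text_type(Clause("imperfect"), in_narrative=True))
```

A real assignment would need many more features and a principled way of resolving clauses that match several rules at once; the sketch only shows that the criteria are formal enough to be written down as code.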
1 Harald Weinrich, Tempus: Besprochene und erzählte Welt (München: H.C. Beck, 1964).
2 Wolfgang Schneider, Grammatik des biblischen Hebräisch (München: Claudius, 1974).
3 E.g., Eep Talstra, “Text Grammar and Hebrew Bible I”, Bibliotheca Orientalis 35 (1978), 168–175.
4 E.g., Alviero Niccacci, The Syntax of the Verb in Classical Hebrew Prose (Sheffield: Sheffield Academic Press, 1990).
5 G.J. Kalkman, Verbal Forms in Biblical Hebrew Poetry: Poetic Freedom or Linguistic System? (PhD dissertation, Vrije Universiteit Amsterdam, 2015).
The use of one text type within another results in cumulative labels. Thus, a direct speech within a narrative domain receives the label NQ. Sometimes a direct speech may be quoted in another direct speech section, resulting in the label NQQ. Within a direct speech a narrative text type may appear, which introduces a short story in the mouth of one of the characters, resulting in the label NQN. Such a Sprosserzählung (Schneider) may again contain a direct speech, resulting in an NQNQ text type. Cf. 2 Kings 1:6.
N       They [the messengers] said (wayyiqtol) to him [the king]
NQ      “There came a man to meet us,
NQN     and said (wayyiqtol) to us
NQNQ    “Go back to the king who sent you, and say to him,
NQNQQ   “Thus says the Lord
NQNQQQ  “Is it because there is no God in Israel that you are sending to inquire of Ba’al-zebub, the god of Ekron?
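The cumulative labels in the example above can be reproduced with a simple stack of embedding levels. The sketch below is a minimal illustration, assuming that each clause comes with an embedding depth and a local text type; this input format is hypothetical and not the ETCBC data model.

```python
def cumulative_labels(clauses):
    """Build cumulative text type labels.

    clauses: iterable of (depth, local_type) pairs, where depth 0 is the
    outermost level and local_type is "N", "Q" or "D".
    """
    path = []     # local types of the levels enclosing the current clause
    labels = []
    for depth, local_type in clauses:
        if depth > len(path):
            raise ValueError("embedding depth may only grow one level at a time")
        del path[depth:]          # leave any deeper embeddings
        path.append(local_type)   # enter (or continue at) the current level
        labels.append("".join(path))
    return labels


# The six clauses of 2 Kings 1:6 as analysed in the example.
kings = [(0, "N"), (1, "Q"), (2, "N"), (3, "Q"), (4, "Q"), (5, "Q")]
print(cumulative_labels(kings))
# ['N', 'NQ', 'NQN', 'NQNQ', 'NQNQQ', 'NQNQQQ']
```

Because the path is truncated whenever the depth decreases, the same function also handles a return from a quotation to the surrounding narrative.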
In an earlier phase of the ETCBC, the text types were assigned by human researchers in interactive procedures. Recently we changed this workflow and developed algorithms for automatically assigning text types based on the above-mentioned syntactic observations. It is instructive to see the differences between the former assignment of text types in the human-computer interaction and the current automatic assignment. One such case where the automatic text type assignment leads to a different analysis occurs in Isaiah 3:14–15, where we find Q without an explicit direct speech introduction: “The Lord will bring his charge against the elders and officers of His people: “It is you …”.” Here the program assigned a Q on the basis of the 2nd person form. This had escaped the human researcher.
The automatic assignment of text types has several advantages:
• The well-defined formal criteria render the text type assignment traceable and repeatable.
• The strict application of formal criteria reveals phenomena that escape human intuition (cf. example from Isaiah 3:14–15).
• In the above-mentioned NWO project it proved to be a good starting point for investigating to what extent text type accounts for linguistic variation in the Bible.
• It makes complex textual layers and embedding visible (cf. 2 Kings 1:6).
However, there are also some challenges:
• There is the risk of circular argument based on interrelated labels such as “narrative tense” and “narrative text type”.
• The text type assignments may lead to counter-intuitive complex labels such as QNDNDN (Psalm 78:45) or NDNDN (Isaiah 9:19).
• This approach, which we inherited from the start of the ETCBC 40 years ago (when the creation of our Hebrew database started), runs the risk of being somewhat idiosyncratic, relying too much on one single study from the 1960s.
We try to parry these challenges by
• Analysing text types in interaction with other parameters, such as genre. In our statistical analysis of the distribution of linguistic phenomena in R, we take into account all other kinds of variables, as well as their possible interdependence.
• Investigating if and how we can harmonize Weinrich’s useful, but also somewhat provocative and outdated views (e.g. his denial of the expression of tense and aspect as functions of the verb) with current insights about tense and aspect in Biblical Hebrew,6 and more recent studies on “Discourse Modes” and their linguistic correlates.7
This research thus provides an interesting case study of the interaction between linguistic theory and textual analysis, of the confrontation between research traditions within a certain discipline and a data-driven approach, and of the potential and limitations of automated analysis of ancient texts.
2. Automating genre classification of historical newspaper articles. Mapping the development of journalism’s modes of expression
Frank Harbers – University of Groningen
Juliette Lonij – Dutch National Library (KB)
This paper discusses a machine learning approach to automate the genre classification of Dutch historical newspaper articles and reflects on its challenges and value. First, we discuss how we used an existing set of metadata to create a training set for the genre classifier and the challenges we faced in connecting the metadata to the original digitized historical newspaper articles. Subsequently, the paper outlines a machine learning approach to predict the genre of a newspaper article, discussing and evaluating the different tools that were tested in the process.8 Finally, it reflects on the way a traditional rule-based approach to determining genre relates to a machine learning approach.
Examining genre
Defined as “language use in a conventionalized communicative setting in order to give expression to a specific set of communicative goals of a disciplinary or social institution, which give rise to stable structural forms” (Handford 2010), genre can elucidate the underlying goals, norms and practices of journalism as a discourse. Examining journalistic genres from a historical perspective therefore elucidates how newspapers’ conception of journalism developed. Yet, this type of longitudinal textual research is highly time consuming and still scarce. Moreover, the few attempts to systematically examine newspaper material, using social scientific methods such as quantitative content analysis, still only cover a fraction of the available material (Broersma, 2011; Harbers, 2014).
Automating such content analyses would be highly beneficial for research into the discursive development of newspaper journalism. This paper therefore critically discusses an approach to automate genre classification. This is a daunting task as genres are dynamic and can change or fade away over time while new ones can emerge. Moreover, genres are ideal types, which means the textual manifestations do not always match all the characteristics perfectly, nor can they always be clearly delineated from other genres.
6 E.g., Jan Joosten, The Verbal System of Biblical Hebrew. A New Synthesis Elaborated on the Basis of Classical Prose (Jerusalem Biblical Studies 10; Jerusalem: Simor, 2012).
7 E.g., Carlota S. Smith, Modes of Discourse. The Local Structure of Texts (Cambridge Studies in Linguistics 103; Cambridge: Cambridge University Press, 2003).
8 The source code for training the classifier and applying it to new examples is available on GitHub (https://github.com/jlonij/genre-classifier) and everybody can experiment with the classifier through a graphical web interface created at http://www.kbresearch.nl/genre
2 This dataset was the result of a large-scale research project into the historical development of European newspapers with the title ‘Reporting at the boundaries of the public sphere. Form, Style and Strategy of European Journalism, 1880-2005’.
A machine learning approach to automatically classify genre
Building on an existing set of manually coded metadata, describing several textual characteristics, such as genre, of a large sample (33,000) of historical newspaper articles,2 this paper outlines a machine learning approach to automate the genre classification of historical newspaper articles. This dataset thus provided us with metadata about a number of historical articles that was used to train and formally evaluate a classifier that is able to automatically predict the genre of additional samples of historical newspaper articles. Yet, the existing metadata needed to be linked to the corresponding digitized articles in the digital newspaper archives of the KB.
The paper will first discuss this linking process. We first selected the most promising candidate links for each item in the original dataset, based on the position of the article on the page, its size, and the presence of images and quotes. A simple classifier was then trained to select the best link from the candidate set, if any, based on more precise features such as the size difference between the article and the candidate, as well as author mentions and subject matter. By only accepting links predicted with a relatively high confidence value, approximately 50% of all articles could be automatically linked, with an error rate of 0.5%.
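The acceptance step, keeping only links predicted with high confidence, might look roughly as follows. This is a sketch under assumptions: the candidate identifiers, the confidence scores and the threshold of 0.9 are hypothetical, and the classifier that produces the scores is not shown.

```python
def select_link(candidates, threshold=0.9):
    """Pick the best candidate link, or None if it is not confident enough.

    candidates: list of (candidate_id, confidence) pairs, with confidence
    in [0, 1] as produced by some link classifier (not shown here).
    threshold: hypothetical cut-off; the value used in the project is not stated.
    """
    if not candidates:
        return None
    best_id, best_confidence = max(candidates, key=lambda pair: pair[1])
    return best_id if best_confidence >= threshold else None


print(select_link([("article-17", 0.97), ("article-18", 0.40)]))  # article-17
print(select_link([("article-21", 0.62), ("article-22", 0.55)]))  # None
```

Raising the threshold trades coverage for precision: fewer articles are linked automatically, but fewer of the accepted links are wrong.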
Subsequently, we will outline and discuss how the resulting dataset was used to train the actual genre classifier. After the articles were pre-processed with the Natural Language Processing suite ‘Frog’, the annotated texts were examined for their textual features, including the length of the article, the number of direct quotes, the number of adjectives, various types of pronouns, and the number and position of named entities in the text. The selection of these features is based on the genre definitions of the codebook of the manual content analysis.
These features were used to train a classifier to choose one of eight possible genres for each article, ranging from news report to opinion article. We evaluated the performance through 10-fold cross-validation, using stratified sampling to create relevant subsets. A linear SVM classifier was chosen after comparison of various evaluation metrics with a number of other options (Naïve Bayes, non-linear SVMs and some simple neural networks), yielding the best results with an accuracy of 65%. It is important to note here that human coders do not always agree on what the right genre is. The intercoder agreement for genre in the manual content analysis was around 80% (Krippendorff’s alpha, taking into account chance, was between 0.7 and 0.8 in different groups of coders). As such, 65% is considered a very promising result.
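To illustrate what stratified sampling means for cross-validation, the following standard-library sketch distributes the articles of each genre evenly over the folds. It is not the project's code (which is available in the repository cited in the footnote); the function name and the toy labels are assumptions made for the example.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Split item indices into k folds that preserve the label proportions.

    labels: list of genre labels, one per article.
    Returns a list of k lists of indices into `labels`.
    """
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    folds = [[] for _ in range(k)]
    for indices in by_label.values():
        # Deal the items of each genre round-robin over the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Toy data: 20 news reports and 10 opinion articles.
labels = ["news report"] * 20 + ["opinion article"] * 10
folds = stratified_folds(labels, k=10)
print([len(fold) for fold in folds])
```

In each of the ten rounds one fold serves as test set and the other nine as training set, so that every fold contains each genre in roughly the proportion found in the whole sample. When class sizes are not multiples of k, this simple dealing scheme yields folds of slightly unequal size.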
Finally, we reflect on the relation between a rule-based and a machine learning approach to the classification of genre. We will discuss the significance of individual features in the machine learning process and show how the ‘confusion matrix’ provides valuable information about the common mistakes of the classifier and which genres are most difficult to predict. Moreover, as the probability for the predicted genre as well as for the other genres is known, we will discuss how these numbers offer insights into the dynamic nature of journalistic genres.
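How a confusion matrix exposes the classifier's typical mistakes can be shown with a few lines of Python. The labels below are invented toy data, not results from the project, and the helper names are hypothetical.

```python
from collections import Counter


def confusion_counts(true_labels, predicted_labels):
    """Count how often each (true genre, predicted genre) pair occurs."""
    return Counter(zip(true_labels, predicted_labels))


def most_confused(counts):
    """Return the most frequent (true, predicted) pair that is an error."""
    errors = {pair: n for pair, n in counts.items() if pair[0] != pair[1]}
    return max(errors, key=errors.get) if errors else None


# Invented toy data for illustration only.
true = ["news report", "news report", "news report", "opinion article", "opinion article"]
pred = ["news report", "opinion article", "opinion article", "opinion article", "news report"]

counts = confusion_counts(true, pred)
print(counts[("news report", "news report")])  # 1
print(most_confused(counts))                   # ('news report', 'opinion article')
```

The diagonal entries (true genre equals predicted genre) are the correct predictions; large off-diagonal entries point to pairs of genres that are hard to tell apart.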
Bibliography
- Broersma, M. (2011). ‘Nooit meer bladeren. Digitale krantenarchieven als bron’. In: Tijdschrift voor Mediageschiedenis 14(2): 29-55.
- Handford, M. (2010). ‘What can a corpus tell us about specialist genres’. In: O’Keeffe, A. & McCarthy, M. (eds.), The Routledge Handbook of Corpus Linguistics. New York: Routledge.
- Harbers, F. (2014). Between Personal Experience and Detached Information. The Development of Reporting and the Reportage in Great Britain, the Netherlands and France, 1880-2005. PhD University of Groningen.
3. Generating Interactive Narratives from Wikipedia Articles
Keywords: Computational Creativity, Interactive Narratives, Wikipedia, Chatbots
Ben Burtenshaw, Computational Linguistics & Psycholinguistics Research Center, Universiteit Antwerpen, benjamin.burtenshaw@uantwerpen.be
Tom De Smedt, Experimental Media Research Group, Sint Lucas Antwerpen School of Arts, info@emrg.be
Mike Kestemont, Computational Linguistics & Psycholinguistics Research Center, Universiteit Antwerpen, mike.kestemont@uantwerpen.be
Stories play a vital role in the lives of children. The alternative worlds they produce encourage imagination and creativity, but also transform knowledge into structures that children can understand and relate to. We present an interactive story system that creates narratives from Wikipedia articles, and reveals them through dialogue with a user. Using state-of-the-art narrative generation tools and a chatbot dialogue system, information from Wikipedia is revealed to the child based on their input. Generating narratives from any text has long been a goal of Artificial Intelligence researchers because narrative structures are useful for learners of all ages. However, many of the existing story generation systems have relied on hand-written techniques that cannot meet the scale of data online. In recent years search-based systems have been able to incorporate broader topics, but they have sacrificed continuity, producing fragmented narrative. Here we propose a search-based system that scours Wikipedia for articles relating to user input, and then restricts its generation material to that article; in doing so, the system utilises the defined topic and chronology of the article to retain context.
Narrative generation has been a central topic within the field of computational creativity for decades. One of the first examples is Tale-Spin, a system that generates Aesop's Fables guided by the user’s input (Meehan 1977). Tale-Spin produces an innovative form of interface; however, it struggles to deal with undirected user input. Universe by Michael Lebowitz draws on a database of character definitions, plot outlines, and dialogues, to weave together new stories; however, the system has a tendency to become repetitive due to its limited content. Callaway and Lester produced Storybook, a narrative prose generator that identifies the temporal markers in a text, and uses them to generate a new narrative text (Callaway and Lester 2002). Systems of this kind tend to reproduce narrative tropes, and ultimately become boring (Swanson and Gordon 2008). McIntyre and Lapata developed one of the first search-based narrative systems, which uses user input to search a database for related phrases (McIntyre and Lapata 2009). This architecture produces interesting relationships between phrases, but is too sporadic to create a meaningful narrative. To counter these semantic fluctuations, Riedl and Bulitko use dialogue to guide the generated text (Riedl and Bulitko 2012). However, by their own admission, this becomes monotonous to the user, with little room for surprise.
We propose a dialogue system that takes input from the user and generates a story in relation to a relevant Wikipedia article. This uses a search-based retrieval approach similar to previous systems (Riedl and Bulitko 2012; McIntyre and Lapata 2009); however, we take advantage of Wikipedia for topic and contextual grounding. For example, by taking the proper nouns in a template sentence, and replacing them with those from a Wikipedia article, the system produces a story featuring places and characters from history. The user might ask “Why did the Egyptians have pyramids?”; the system would then use key points in the Wikipedia article ‘Pyramids of Ancient Egypt’ to assemble a set of narrative concepts, and query the user to see if they are interested in those topics.
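The proper-noun substitution mentioned above can be illustrated with a deliberately naive sketch. It assumes that the proper nouns of the template and of the article have already been identified (for instance by a named entity recogniser, not shown); the template sentence and the names are toy input, not output of the system.

```python
def fill_template(template, template_nouns, article_nouns):
    """Replace each proper noun of the template with one from the article.

    The nouns are paired in order. Replacement is sequential, so this
    simple version assumes that no article noun also occurs among the
    template nouns that are still to be replaced.
    """
    story = template
    for old, new in zip(template_nouns, article_nouns):
        story = story.replace(old, new)
    return story


# Toy input for illustration only.
template = "Long ago, Alice travelled to Wonderland."
print(fill_template(template, ["Alice", "Wonderland"], ["Khufu", "Giza"]))
# Long ago, Khufu travelled to Giza.
```

A usable system would also have to match entity types (person for person, place for place) and adjust the surrounding grammar, which this sketch ignores.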
We have trained the system on children’s stories, which means it uses a machine learning approach to select responses that are most similar to those stories. A dialogue is started by inputting a topic which is used to retrieve a complete Wikipedia page. The page structure guides the dialogue sequence, and the language is used to create responses. First the system defines a set of narrative concepts based on the sections of the Wikipedia page, then it uses grammar-based techniques to create phrases for each concept. The narrative concepts are vector-based representations of each page section, which are used to compare the most important strings from a section. Dialogue begins when the system has built a database of possible responses. As the dialogue progresses, core phrases are added and removed, in effect acting as a narrative context. Dialogue is facilitated by a Markov Decision Process that matches the user’s input to possible responses based on a reward function from training. Dialogue history is also added to the database, which means the system learns from the dialogue itself, and also avoids repetition.
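As an illustration of what a vector-based comparison between user input and page sections can look like, the sketch below uses plain bag-of-words vectors and cosine similarity. This is a simplified stand-in, not the system's actual representation or its Markov Decision Process; the section titles and texts are invented.

```python
import math
from collections import Counter


def vectorise(text):
    """Bag-of-words vector: lower-cased token counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def closest_section(user_input, sections):
    """Return the title of the section most similar to the user's input."""
    query = vectorise(user_input)
    return max(sections, key=lambda title: cosine(query, vectorise(sections[title])))


# Invented toy sections for illustration only.
sections = {
    "Construction": "workers moved stone blocks to build the pyramid",
    "Purpose": "the pyramid was a tomb for the king",
}
print(closest_section("why was it a tomb", sections))  # Purpose
```

In a fuller system the chosen section would then supply the phrases from which the next response is generated.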
Using named entity recognition to add names and places grounds the narrative of the story. This practice draws on pedagogical theories of inquiry, which propose that children learn efficiently through self-initiated requests for information (Conle 2000; Mcquiggan et al. 2008). Children are able to construct this information into their own story, which in turn fortifies that knowledge. In practice, a narrative system like this would need to be contextualised for the child; this could be a fictional characterisation of the system as naive and in need of guidance, for example a forgetful robot that needs help to explain its story.
Narratives are a proven part of education, and generating them autonomously is a long-term aim of artificial intelligence. Here we speculate upon a contained usage of narrative generating technology, which uses the structure of Wikipedia to rearticulate text in a form suitable for children. Over the next four years our aim is to develop a general story generation system for children. At DHBenelux we will present a working prototype of this system.
References
Callaway, C., and J. Lester. 2002. ‘Narrative Prose Generation’. Artificial Intelligence 139 (2): 213–252.
Conle, Carola. 2000. ‘Narrative Inquiry: Research Tool and Medium for Professional Development’. European Journal of Teacher Education 23 (1): 49–63. doi:10.1080/713667262.
McIntyre, Neil, and Mirella Lapata. 2009. ‘Learning to Tell Tales: A Data-Driven Approach to Story Generation’. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1, 217–25. Association for Computational Linguistics.
Mcquiggan, Scott W, Jonathan P Rowe, Sunyoung Lee, and James C Lester. 2008. ‘Story-Based Learning: The Impact of Narrative on Learning Experiences and Outcomes’. In International Conference on Intelligent Tutoring Systems, 530–539. Springer Berlin Heidelberg.
Meehan. 1977. ‘TALE-SPIN, An Interactive Program That Writes Stories’. In 5th International Joint Conference on Artificial Intelligence, 91–98.
Riedl, Mark O., and Vadim Bulitko. 2012. ‘Interactive Narrative: an Intelligent Systems Approach’. AI Magazine 34 (1): 67.
Swanson, R. and Gordon, A. 2008. ‘Say Anything: A Massively Collaborative Open Domain Story Writing Companion’. In Proceedings of the 1st International Conference on Interactive Digital Storytelling. Lecture Notes in Computer Science. Vol. 5334. Berlin: Springer.
Session B
1. Linking multi-disciplinary data sources for a historical research platform
Kalliopi Zervanou1, Wouter Klein2, Peter van den Hooff2, Frans Wiering1 and Toine Pieters2
1 Information & Computing Sciences Department, Utrecht University
2 Freudenthal Institute, History & Philosophy of Science, Utrecht University
The problem of information access is a challenge in making digitised data sources available. Historians need to identify information in digital material pools, scattered across collections and often lacking semantic links to a topic of interest. This problem is addressed by the development of various collection-specific metadata schemas, such as MARC 21 (Library of Congress, 2010), and generic metadata schemas, such as the Dublin Core Metadata Initiative (DCMI, 2011). Moreover, diverse metadata schemas are mapped to each other (Bountouri and Gergatsoulis, 2009), or to custom (Liao et al., 2010) or standard ontologies (Lourdi et al., 2009), such as the CIDOC Conceptual Reference Model (CIDOC, 2006). A dominant trend in recent approaches is the linked-data approach (Berners-Lee, 2006; Bizer et al. 2009). Besides information access, the amount and the complexity of information accessible gives rise to an information presentation challenge, whereby data overviews should highlight interesting data aspects requiring detailed inspection. Additionally, for digital methods to support collaborative research, the problem of information validation and sharing must be addressed. This issue calls for transparency of data and methods and reproducibility of results, or verification of the arguments made. It also entails validation of computational results and algorithmic processes determining information access, in such a way that the eventual data or system limitations and biases are known and the processes are trustworthy and verifiable.
In our work, we address these challenges of information access, presentation, validation and sharing from a twofold research perspective:
I. Integration and semantic linking of existing, multidisciplinary data sources;
II. Development of a research platform that supports access, presentation, validation and sharing of complex, interlinked data.
Our particular domain of application relates to the history of botanical drug components from the New World in the early modern period (17th-18th century). More specifically, it concerns highlighting phenomena denoting developmental processes of remedies or drug trajectories, such as the evolution of economic importance, ethical attitudes, scientific interests, trade and knowledge circulation (Gijswijt-Hofstra et al., 2002; Pieters, 2004; Friedrich and Müller-Jahncke, 2009; Klein & Pieters, 2016).
For this purpose, we integrate sources comprising pharmaceutical data, such as the Pharmaceutical Historical Thesaurus (Klein & van den Hooff, 2013); archaeobotanical data, such as RADAR (van Haaster & Brinkkemper, 1995; RCE, 2013); botanical data, such as the National Herbarium of the Netherlands (Creuwels, 2014), the Economic Botany database (Hoffman, 2011) and the Snippendaal Catalogue database (van Reenen, 2007); colonial trade data, such as the database of the accounting books (Boekhouder-Generaal) of the Dutch East India Company (Schooneveld-Oosterling et al., 2013); and linguistic dictionaries, such as the Chronological Dictionary of Dutch (van der Sijs, 2001).
A notable recent approach to the issue of digital formats integration is the one adopted in the Timbuctoo infrastructure (Andersen et al., 2013). Most approaches opt for conversion to a recommended metadata schema, such as SKOS (Miles & Bechhofer, 2009), or a common data model such as the Europeana Data Model (EDM, 2016). However, apart from the diversity of digital formats, an important aspect in integration lies in the reuse and re-purposing of resources originally built for a different audience and purpose.
In our approach, integration entails concept mapping, not only across disciplines, but also in time. Thus, data source integration must accommodate the way in which science has re-classified and re-defined concepts from the 16th century onwards. Additionally, it entails dealing with phenomena of historical term variation and ambiguity which gradually give way to spelling standardisation and current nomenclature conventions in e.g. botany and biology. Furthermore, we account for under-specificity and ambiguity of information found in historical sources while maintaining associations with potentially related concepts and context. Most importantly, we provide references for information provenance tracing and validation. For these purposes, we resort to designing our own ontology, where e.g. ambiguous terms are connected to multiple concepts, temporal periods and reference sources, and where mappings are provided across essential historical and current taxonomies. Our data sources are semi-automatically enriched with additional information, such as geographical coordinates and named entities. Moreover, inconsistencies within and across datasets are semi-automatically identified and normalised. Finally, data sources are integrated following a linked data approach allowing for extensions to other linked open data and eventually capitalising on techniques such as reasoning, which may extend explicit information in our datasets with implicitly inferred information.
Our Time Capsule research platform9 implements our solutions to information access, presentation and validation challenges. It is a scalable working platform currently querying more than 55 million RDF triples. It is often difficult for a non-expert user to perform queries, either because they are unfamiliar with the required terminology, or because they are unfamiliar with the underlying data model. Our solution to this issue lies in providing two querying strategies, one that supports a faceted, exploratory, guided search and browsing of information by means of links, photos, and keyword auto-completion suggestions and one that supports the creation of ad hoc queries. Our exploratory search mode is intended to engage a wider audience and reveal to both experts and non-expert users the underlying data content and structure. Ad hoc queries are in essence ad hoc RDF SPARQL queries (Prud'hommeaux & Seaborne, 2008) to our data. However, given that most users are neither familiar with SPARQL, nor with the content and structure of our data store, a query wizard is provided that assists users in forming natural language queries, such as “Which drug component(s) are made out of the plant Acorus calamus L. and which parts of the plant were used?”
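To give an impression of how a wizard choice could be turned into a SPARQL query, the sketch below builds a query string for the example question above. The `ex:` namespace and the property names `ex:scientificName`, `ex:madeFrom` and `ex:usesPlantPart` are invented for illustration; the actual Time Capsule vocabulary is not given in this abstract.

```python
def build_query(plant_name):
    """Return a SPARQL SELECT query as a string.

    The ex: vocabulary below is hypothetical and only serves the example.
    """
    # Escape backslashes and double quotes so the name is a valid SPARQL literal.
    escaped = plant_name.replace("\\", "\\\\").replace('"', '\\"')
    return f'''PREFIX ex: <http://example.org/timecapsule#>
SELECT ?component ?part WHERE {{
  ?plant ex:scientificName "{escaped}" .
  ?component ex:madeFrom ?plant ;
             ex:usesPlantPart ?part .
}}'''


print(build_query("Acorus calamus L."))
```

The point of such a wizard is that the user only supplies the plant name and picks the question pattern, while the query syntax and the data model stay hidden.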
Search results are presented as an overview of all available information on the query topic and users may "zoom-in" on specific information by following links that provide more detailed geographical, temporal and concept relation visualisations. Such visualisations are mainly intended to provide overviews of the evolution of phenomena related to drug trajectories, such as for instance change in a plant part used as medical ingredient, trade routes of botanical products, or geographical distributions in time of known concepts in Latin/scientific terms vs. lay terms, the latter indicating public knowledge and familiarity with a given plant or drug.
References
Andersen, J.A., Filarski, G.J., Haentjens Dekker, R., Maas, M. & Ravenek, W. (2013). Timbuctoo data repository infrastructure (version 1.0), Huygens ING – ICT, Amsterdam, The Netherlands.
Berners-Lee, T. (2006). Linked Data. Document version: June 2009. In: Design Issues, W3C. Available online at: https://www.w3.org/DesignIssues/LinkedData.html
9 Time Capsule system: http://timecapsule.science.uu.nl/timecapsule/#/login Logging in as a Guest allows full access to the system functionality except saving your search results.
Bizer, C., Heath T. and Berners-Lee T. (2009). Linked Data - The Story So Far. International Journal on Semantic Web and Information Systems, vol. 5(3), pp. 1-22. DOI: 10.4018/jswis.2009081901
Bountouri L. and Gergatsoulis M. (2009). Interoperability between archival and bibliographic metadata: An EAD to MODS crosswalk. Journal of Library Metadata, 9(1-2): 98–133.
CIDOC (2006). The CIDOC Conceptual Reference Model. CIDOC Documentation Standards Working Group, International Documentation Committee, International Council of Museums. Available online at: http://www.cidoc-crm.org/.
Creuwels, J. (2014). The National Herbarium of the Netherlands. Naturalis Biodiversity Center, Leiden. Available online at: http://herbarium.naturalis.nl/
DCMI (2011). The Dublin Core Metadata Initiative. Available online at: http://dublincore.org/.
EDM (2016). Europeana Data Model – Mapping Guidelines v2.3, 18 November 2016, Europeana Network Association. Available online at: http://pro.europeana.eu/page/edm-documentation
Friedrich, C. and Müller-Jahncke, W.-D. (eds.) (2009). Arzneimittelkarrieren: zur wechselvollen Geschichte ausgewählter Medikamente: die Vorträge der Pharmaziehistorischen Biennale in Husum vom 25-28. April 2008, Stuttgart: Wissenschaftliche Verlagsgesellschaft.
Gijswijt-Hofstra, M., Van Heteren, G.M. and Tansey, E.M. (eds.) (2002). Biographies of remedies: drugs, medicines and contraceptives in Dutch and Anglo-American healing cultures. Clio medica 66, Amsterdam: Rodopi.
van Haaster H. and Brinkkemper O. (1995). RADAR, a Relational Archaeobotanical Database for Advanced Research. Vegetation History and Archaeobotany, vol. 4(2), pp. 117-125, Springer.
Hoffman, B. (2011). The Naturalis Economic Botany database. Naturalis Biodiversity Center, Leiden.
Klein, W. and Pieters, T. (2016). The Hidden History of a Famous Drug: Tracing the Medical and Public Acculturation of Peruvian Bark in Early Modern Western Europe (c. 1650–1720). Journal of the History of Medicine and Allied Sciences, Vol. 71(4), pp. 400–421. DOI: 10.1093/jhmas/jrw004
Klein, W. and van den Hooff, P.C. (2013). Farmaceutische Historische Thesaurus. National Museum for the History of Pharmacy, Utrecht.
Liao, S.-H., Huang, H.-C., and Chen, Y.-N. (2010). A semantic web approach to heterogeneous metadata integration. In: Proceedings of ICCCI ’10, LNCS vol. 6421, pp. 205–214, Kaohsiung, Taiwan. Springer.
Library of Congress (2010). MARC standards. Network Development and MARC Standards Office, Library of Congress, USA. Available online at: http://www.loc.gov/marc/index.html.
Lourdi, I., Papatheodorou C., and Doerr M. (2009). Semantic integration of collection description: Combining CIDOC/CRM and Dublin Core collections application profile. D-Lib Magazine, 15(7/8).
Miles, A. and Bechhofer S. (eds) (2009). SKOS Simple Knowledge Organization System Reference. W3C Recommendation, 18 August 2009. Available online at: http://www.w3.org/TR/skos-reference
Pieters, T. (2004). Historische trajecten in de farmacie: medicijnen tussen confectie en maatwerk. Inaugural lecture – Hilversum.
Prud'hommeaux, E. and Seaborne A. (eds.) (2008). SPARQL Query Language for RDF. W3C Recommendation, 15 January 2008. Available online at: https://www.w3.org/TR/rdf-sparql-query/
RCE (2013). RADAR, a Relational Archaeobotanical Database for Advanced Research. Rijksdienst voor het Cultureel Erfgoed, Ministerie van Onderwijs, Cultuur en Wetenschap. Available online at: https://archeologieinnederland.nl/bronnen-en-kaarten/radar
van Reenen, G. (2007). Snippendaal catalogus database. Hortus Botanicus Amsterdam. Available online at: http://dehortus.nl/en/Snippendaal-Catalogue
Schooneveld-Oosterling, J., Knaap, G., Karskens, N., Smit-Maarschalkerweerd, D., Tetteroo, S., van den Tol, J., Nijhuis, H., van Wijk, K., Kunst, A., Buijs, J., Jongma, M., Boer, R. (2013). Boekhouder-Generaal Batavia. Huygens ING. Available online at: http://resources.huygens.knaw.nl/boekhoudergeneraalbatavia
van der Sijs, N. (2001). Chronologisch Woordenboek. Available online at: http://dbnl.org/tekst/sijs002chro01_01/
2. A Linked Data Approach to Disclose Handwritten Biodiversity Heritage Collections
Lise Stork, Leiden Institute of Advanced Computer Science (LIACS), Leiden University, Niels Bohrweg 1, 2333 CA Leiden
Andreas Weber, Department of Science, Technology and Policy Studies (STePS), University of Twente, PO Box 217, 7500 AE Enschede
Over the last decade, natural history museums in and beyond the Netherlands have heavily invested in digitizing and extracting biodiversity information from manuscript and specimen collections (Heerlien et al. 2015; Pethers and Huertas, 2015; Svensson, 2015). In particular handwritten field notes describing occurrences of species in nature (see illustration) form an important but often neglected starting point for researchers interested in long-term habitat developments of a specific area and the history of scientific ordering, writing and collecting practices (Blair 2010; Bourget 2010; Eddy 2016). In order to disclose handwritten descriptions of flora and fauna and related specimen and drawings collections, natural history museums usually resort to manual enrichment methods such as full text transcription or keyword tagging (Ridge 2014; Franzoni et al. 2014). Often these methods rely on crowdsourcing, where online volunteers annotate pages with unstructured textual labels (Field Book Project 2016). More recently, curators of archives, data scientists and historians have started to experiment with semi-automatic annotation systems for historical manuscript collections such as the MONK system (Schomaker et al. 2016). Since MONK is a supervised learning system, a large amount of properly recognized textual labels is necessary to safeguard the system’s recognition abilities.
Thus, although such practices have the potential to yield high quality data, merely annotating pages with unstructured textual labels raises two problems. First, without suggestions driven by semantic knowledge, it will be hard for volunteers or a machine to start annotating handwritten pages. Not only in the context of our case study, which deals with field notes written in early nineteenth century insular Southeast Asia, but also in the context of other manuscript collections, one needs a thorough knowledge of paleography, and historical and taxonomic background information (Causer and Terras 2014). Semantics can aid the annotation process when dealing with ambiguity or provide suggestions in cases where words are hard to read and too few example instances are available. For instance, when a field note describes an expedition in East-Java, a species of frogs of West-Celebes can be ruled out. Second, unstructured textual annotation will eventually result in an inefficient search process on the side of the user. Traditional keyword-based search leads to many irrelevant results or requires specific prior knowledge regarding the content. To answer more general and expressive queries, semantic relations between annotations need to be considered as well (Elbassuoni et al. 2010).
In order to help solve such problems this paper argues for the development and application of a semantic model for semi-automatic semantic annotation. The model aggregates existing metadata standards and ontologies, following the Linked Data principles, and prepares them for semantically annotating and interpreting the Named Entities (NEs) in the field notes of digitized natural historical collections.10
The case study of this paper is a collection of 8000 field notes gathered by the Committee for Natural History of the Netherlands Indies (Natuurkundige Commissie voor Nederlandsch-Indië, further referred to by the acronym NC). In the first half of the nineteenth century, naturalists of the NC charted the natural and economic state of the Indonesian Archipelago and returned a wealth of scientific observations which are now stored in the archives and depot of Naturalis Biodiversity Center in Leiden (Mees 1994; Klaver 2007). An in-depth historical analysis reveals that Heinrich Kuhl (1797-1821), Johan Coenraad van Hasselt (1797-1823) and other travelers of the NC use the following NEs to structure their field notes (see illustration displaying a bundle of NC field notes) while traveling in insular Southeast Asia: collecting localities, dates, collectors’ names, taxonomic names, and references to other printed or handwritten sources. Kuhl and Van Hasselt, for instance, regularly use the illustrations of printed works such as the Voyage de découvertes aux terres australes (1807-1816) by M.F. Péron as visual point of reference for their field note descriptions. While links to published resources can be easily established by linking them to domain specific repositories of digitized books such as the Biodiversity Heritage Library (BHL), collection localities, taxonomic names and collectors’ names are more difficult to process.
In order to be able to identify, annotate and interlink such NEs in a semi-automatic way, this paper proposes the implementation of a Knowledge Base (KB). The KB has two goals: first, the underlying data structure of the KB enables cross-matching of resources within and across field note collections. In order to realize this function, a lightweight application ontology written in RDF[11] and OWL[12] is suggested that serves as a schema to semantically structure the KB. It expresses species observations, ensures their provenance in relation to the digitized field notes and builds on existing metadata and ontology standards. Entities in turn are described using uniform resource identifiers (URIs). This allows for an integration of the field note annotations into the web of Linked Data (LD) and ensures interoperability with other digital collections (Hallo et al. 2016). Second, the logical characteristics of the properties in the ontology enable a reasoner system to suggest possible NEs. In order to provide possible labels regarding these NEs, the KB is prepopulated with lists extracted from thesauri, gazetteers, and taxonomies. As regards collection localities we, for instance, draw upon the GEOnet Names Server (GNS), a large semantically structured database containing historical and present-day geographical locations in insular Southeast Asia. Biological species names can be drawn from the Linnaean taxonomy of species, which was already well established at the time of the NC (Farber 2000; Beckman 2012). As regards person names we rely on the database Cyclopedia of Malaysian Collectors, which M. J. van Steenis-Kruseman compiled in the 1960s and 1970s.[13] Taken together, by prompting users to annotate with terms from the KB, a semantic network of annotations is formed that is able to improve the quality of the annotations and bootstrap the annotation process. The ontology and an implementation of the KB based on our case study, together with possibilities regarding supported querying and reasoning techniques, will be discussed in more detail during the presentation.
10 The project Semantic Blumenbach thinks in a similar direction, but then with a focus on published material (Wettlaufer et al. 2015).
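The idea of a prepopulated KB that suggests candidate labels can be illustrated with a minimal sketch. It is not the ontology or implementation proposed in the paper: the `ex:` identifiers, the `ex:occursIn` property and the placeholder taxa are all assumptions invented for illustration, and a plain Python list of triples stands in for an RDF store. It only shows how a known collecting locality can narrow the taxonomic names offered to an annotator, as in the East Java example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical, hand-made KB content; none of these identifiers come from the paper.
KB = [
    Triple("ex:place/East-Java", "ex:label", "East Java"),
    Triple("ex:place/West-Celebes", "ex:label", "West Celebes"),
    Triple("ex:taxon/frog-A", "ex:label", "Frog species A"),
    Triple("ex:taxon/frog-A", "ex:occursIn", "ex:place/East-Java"),
    Triple("ex:taxon/frog-B", "ex:label", "Frog species B"),
    Triple("ex:taxon/frog-B", "ex:occursIn", "ex:place/West-Celebes"),
]


def objects(kb, subject, predicate):
    """All objects of triples matching the given subject and predicate."""
    return {t.obj for t in kb if t.subject == subject and t.predicate == predicate}


def subjects(kb, predicate, obj):
    """All subjects of triples matching the given predicate and object."""
    return {t.subject for t in kb if t.predicate == predicate and t.obj == obj}


def suggest_taxa(kb, locality_uri):
    """Sorted labels of taxa recorded as occurring in the given locality."""
    labels = set()
    for taxon in subjects(kb, "ex:occursIn", locality_uri):
        labels |= objects(kb, taxon, "ex:label")
    return sorted(labels)


# A field note located in East Java only receives East Java candidates.
candidates = suggest_taxa(KB, "ex:place/East-Java")
```

In a real setting the triples would come from the gazetteers and taxonomies named above, and a reasoner rather than a hand-written filter would derive the suggestions.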
Bibliography
Beckman, J. “The Swedish Taxonomy Initiative: Managing the Boundaries of ‘Sweden’ and ‘Taxonomy’.” In Scientists and Scholars in the Field: Studies in the History of Fieldwork and Expeditions, edited by K. H. Nielsen, H. Harbsmeier, and Ch. J. Ries, 395–414. Aarhus: Aarhus University Press, 2012.
Bourguet, M.-N. “A Portable World: The Notebooks of European Travellers (Eighteenth to Nineteenth Centuries).” Intellectual History Review 20, no. 3 (2010): 377–400.
Causer, T. and M. Terras. “‘Many Hands Make Light Work. Many Hands Together Make Merry Work’: Transcribe Bentham and Crowdsourcing Manuscript Collections.” In Crowdsourcing Our Cultural Heritage, 57–88. Surrey: Ashgate, 2014.
Eddy, M. D. “The Interactive Notebook: How Students Learned to Keep Notes during the Scottish Enlightenment.” Book History 19, no. 1 (2016): 86–131.
Elbassuoni, S., Ramanath, M., Schenkel, R., and Weikum, G. “Searching RDF Graphs with SPARQL and Keywords.” IEEE Data Eng. Bull. 33, no. 1 (2010): 16–24.
Farber, P. L. Finding Order in Nature: The Naturalist Tradition from Linnaeus to E. O. Wilson. Baltimore, Md.: Johns Hopkins University Press, 2000.
Field Book Project, Smithsonian National Museum of Natural History: http://naturalhistory.si.edu/fieldbooks/ [accessed 15 February 2017].
Franzoni, Ch. and H. Sauermann. “Crowd science: The organization of scientific research in open collaborative projects.” Research Policy 43, no. 1 (2014): 1–20.
11 https://www.w3.org/RDF/ [accessed February 15, 2017].
12 https://www.w3.org/OWL/ [accessed February 15, 2017].
13 The database is available online: http://www.nationaalherbarium.nl/FMCollectors/ [accessed February 15, 2017].
GEOnet Names Server, http://geonames.nga.mil/gns/html/ [accessed February 15, 2017].
Hallo, M., et al. “Current state of Linked Data in digital libraries.” Journal of Information Science 42, no. 2 (2016): 117–127.
Heerlien, M., J. Van Leusen, S. Schnörr, S. De Jong-Kole, N. Raes, and Kirsten Van Hulsen. “The Natural History Production Line: An Industrial Approach to the Digitization of Scientific Collections.” J. Comput. Cult. Herit. 8, no. 1 (February 2015): 3:1–3:11.
Klaver, Ch. J. J. Inseparable Friends in Life and Death: The Life and Work of Heinrich Kuhl (1797-1821) and Johan Conrad van Hasselt (1797-1823), Students of Prof. Theodorus van Swinderen. Groningen: Barkhuis, 2007.
Mees, G. F. and C. van Achterberg. “Vogelkundig onderzoek op Nieuw Guinea in 1828: terugblik op de ornithologische resultaten van de reis van Zr. Ms. Korvet Triton naar de zuidwestkust van Nieuw-Guinea.” Zoologische Bijdragen 40 (1994): 3–64.
Péron, F., N. Baudin, L. C. Desaulses de Freycinet, Ch. Alexandre Lesueur, and N.-M. Petit. Voyage de Découvertes Aux Terres Australes. Paris: De l’Imprimerie impériale, 1807.
Pethers, H. and B. Huertas. “The Dollmann Collection: A Case Study of Linking Library and Historical Specimen Collections at the Natural History Museum, London.” The Linnean 31, no. 2 (2015): 18–22.
Ridge, M. (ed.). Crowdsourcing our cultural heritage. Farnham: Ashgate, 2014.
Schomaker, L., A. Weber, M. Thijssen, M. Heerlien, A. Plaat, S. Nijssen, et al. “Making Sense of Illustrated Handwritten Archives.” In Book of Abstracts, Digital Humanities Conference 2016 Krakow, 764–66, 2016.
Svensson, A. “Global Plants and Digital Letters: Epistemological Implications of Digitising the Directors’ Correspondence at the Royal Botanic Gardens, Kew.” Environmental Humanities 6 (2015): 73–102.
Wettlaufer, J., Ch. Johnson, M. Scholz, M. Fichtner, and S. Ganesh Thotempudi. “Semantic Blumenbach: Exploration of Text–Object Relationships with Semantic Web Technology in the History of Science.” Digital Scholarship in the Humanities 30, Suppl. 1 (December 1, 2015): 187–98.
3. Linked cultural events: Digitizing past events and its implications for analyzing and theorizing the ‘creative city’
Harm Nijboer (Huygens ING), Claartje Rasterhoff (University of Amsterdam)
Introduction
This paper introduces ‘linked cultural events’ (LCE) as a novel methodological framework that allows for the systematic analysis of cultural expressions in their urban context. The events-based approach is inspired by datasets developed in the research program CREATE: Creative Amsterdam: An E-Humanities Perspective (University of Amsterdam, 2014-present).[14] In this program, the cultural sectors of the performing arts take up a particularly prominent position, as data on, for instance, music, theatre and cinema programming is available in various formats. In terms of methodology, the data on performing arts allows us to move beyond biographical data on producers and develop a methodological framework in which different data types can be studied in conjunction. The framework of LCE has two main characteristics: 1) it posits cultural events as analytical units with structural properties and linkages to actors, institutions and urban properties (linked cultural events); and 2) it is connected to a data structure which allows for querying the connections between these units of analysis (linked data). In this paper, we discuss how the concept of ‘linked events’ can be used to map and analyse urban cultural life.
14 www.create.humanities.uva.nl.
Events and the city
Studies in the social sciences and humanities offer valuable insights into conditions and mechanisms favorable to creativity and innovation, emphasizing for instance the role of agglomeration, and labour mobility and diversity.[15] Little to no attention is being paid to what actually makes cities come to life: the cultural expressions themselves, and in particular events such as exhibitions, concerts, plays, and publications. Recent historical research has, furthermore, emphasized the limits of such generalizations on sources of creativity, stressing the importance of time- and place-specific characteristics and circumstances.[16]
The events-based approach may help to address some of these issues. Much has been written about how events should be conceptualized and about the role of events in studying and writing history.[17] Moreover, theoretical and conceptual thinking about events is not limited to historiography but expands to the fields of action theory in philosophy and social theory.[18] Events also feature as devices in structuring heritage data and as building blocks for online reconstructions of historical narratives.[19] Datasets of events over time have, moreover, been used in the broader field of data analytics, for instance in event-based network analyses, to add temporality and dynamism to otherwise static information systems.[20] Building on insights from these different lines of research, we emphasize that networks of events should also be considered as units of analysis.
Linked cultural events
A large number of contemporary social theorists reject the notion of a cultural act or event as an expression (or representation) of a given culture. Instead, culture should be understood as a collection of performative acts or events.[21] By performativity we mean that an event calls or recalls something (a piece of art, a cultural code or trait) into being. A play, for instance, must be performed (staged, read, remembered) to be there. Events, moreover, do not occur in isolation. Each event involves the actions and/or presence of a number of entities. These entities can be human agents (e.g. performers and spectators), non-human agents (organizations), material objects (places, artifacts, etc.) or immaterial objects (concepts, code). And these entities are in their turn likely to be involved in other events as well. Already in 1964 the ethnolinguist Dell Hymes defined ‘communities’ as ‘systems of communicative events’.[22] In order to operationalize this interpretation for historical research, we conceptualize cultural communities as webs of linked cultural events (LCEs).
15 Cf. Marjatta Hietala and Peter Clark, ‘Creative Cities’, in: The Oxford Handbook of Cities in World History, edited by Peter Clark. Oxford: Oxford University Press 2013.
16 Ilja Van Damme and Bert De Munck (eds.), Creative Cities 1500-2000. The Historical Fabrication of Cities as Agents of Economic Innovation and Creativity, London: Routledge forthcoming.
17 Cf. Ryan Shaw, ‘A Semantic Tool for Historical Events’, Proceedings of the 1st Workshop on EVENTS: Definition, Detection, Coreference, and Representation, Atlanta, Georgia, 14 June 2013: 38–46; W. H. Sewell, ‘Historical events as transformations of structures: Inventing revolution at the Bastille’, Theory and Society 1996, 25: 841-881.
18 Roberto Casati & Achille Varzi, "Events", in: Edward N. Zalta (ed.), The Stanford Encyclopedia of Philosophy (Winter 2015 Edition), http://plato.stanford.edu/archives/win2015/entries/events.
19 Cf. Victor de Boer, Johan Oomen, Oana Inel, Lora Aroyo, Elco van Staveren, Werner Helmich, Dennis de Beurs, ‘DIVE into the event-based browsing of linked historical media’, Journal of Web Semantics, 35/3: 152-158. DOI: 10.1016/j.websem.2015.06.003. See also: http://www.ehumanities.nl/events-working-group.
20 E.g. Joshua O'Madadhain, Jon Hutchins, Padhraic Smyth (2005), ‘Prediction and Ranking Algorithms for Event-based Network Data’, SIGKDD Explor. Newsl., 7 (2), pp. 23-30, doi: 10.1145/1117454.1117458.
21 Peter Dirksmeier & Ilse Helbrecht (2008), ‘Time, Non-representational Theory and the "Performative Turn" – Towards a New Methodology in Qualitative Social Research’, Forum: Qualitative Social Research 9 (2), pp. 1-24. http://www.qualitative-research.net/index.php/fqs/article/view/385/839.
LCEs thus form an infrastructure for combining, analyzing, and visualizing existing cultural datasets in a network that exposes their relations and interdependencies, and that allows for quantitative analysis. On the level of advanced data handling, the LCE approach has a strong affinity with Semantic Web technology and the associated Linked Open Data paradigm, which have evolved into leading principles in the handling of historical and cultural heritage data in recent years. Notwithstanding the limitations and complexities of Semantic Web technology, its great practical advantage is that it enables us to connect single-resource data to external resources. This is not only important for our understanding of cultures as webs of LCEs, but also proves to be inherent to the data on, for instance, theatre and concert programs.
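The notion of a web of linked cultural events can be made concrete with a small sketch. It assumes, purely for illustration, that each event is reduced to the set of entities it involves and that two events count as linked when they share at least one entity. The event and entity identifiers are hypothetical placeholders, not records from ONSTAGE or FELIX, and the authors' actual data structure is not described at this level of detail.

```python
from itertools import combinations

# Hypothetical events, each mapped to the entities (agents, places, works) it involves.
EVENTS = {
    "performance-1": {"play-X", "theatre-A", "performer-P"},
    "performance-2": {"play-X", "theatre-B", "performer-Q"},
    "concert-1": {"hall-C", "performer-Q"},
    "lecture-1": {"society-D"},
}


def shared_entities(events):
    """Map each pair of events to the entities they share, skipping unlinked pairs."""
    links = {}
    for a, b in combinations(sorted(events), 2):
        shared = events[a] & events[b]
        if shared:
            links[(a, b)] = shared
    return links


def events_involving(events, entity):
    """Sorted names of all events in which the given entity takes part."""
    return sorted(name for name, entities in events.items() if entity in entities)


# The resulting links form the edges of an event network.
LINKS = shared_entities(EVENTS)
```

Here `performance-1` and `performance-2` are linked through the play they both stage, while `lecture-1` stays isolated. The same pairing logic would apply if the entities were URIs pointing to external resources.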
Visualising and analyzing LCEs
In the final section of the paper we will use two recently developed datasets to present analyses of linked cultural events. ONSTAGE (Online Datasystem of Theatre in Amsterdam in the Golden Age) contains information on the repertoire, performances, popularity and revenues of the cultural program in Amsterdam’s public theatre during the period 1637-1772.[23] The FELIX: Felix Meritis Programming Database stores and links data on concerts held in the famous Amsterdam concert hall Felix Meritis between 1832 and 1888.[24] In these datasets linkages have been created to, for instance, genre characteristics and biographical data in external international resources such as VIAF (Virtual International Authority File) and the data branches of the Wikipedia family (DBpedia and Wikidata). By linking data on plays and concerts to these resources, a wealth of external data on artefacts and actors enriches our local resources, and we make our local data available in a global context.
Visualizing webs of LCEs over time requires techniques that go beyond the standard features of visualization tools and libraries. The challenge in visualizing such networks is that we have to account for both multimodality and time. In our paper, we explore the possibilities and limitations of visualizations by looking at the networks behind the reception of the French playwright Molière in Dutch theatre in the 17th and 18th centuries. This treatment of the international linkages of local theatrical performances effectively shows how operationalizing cultural life through the concept of, and data on, linked cultural events may assist researchers in 1) mapping cultural life in both quantitative and qualitative ways, and 2) analysing the organisation of cultural life beyond a single event or fixed network of local actors.
22 Dell Hymes (1964), ‘Introduction: Toward ethnographies of communication’, American Anthropologist 66 (6-II), pp. 1-34, p. 13. https://www.jstor.org/stable/668159.
23 http://www.vondel.humanities.uva.nl/onstage. Kim Jautze, Frans Blom, Leonor Álvarez Francés (2016), ‘Spaans theater in de Amsterdamse Schouwburg (1638-1672). Kwantitatieve en kwalitatieve analyse van de creatieve industrie van het vertalen’. De Zeventiende Eeuw. Cultuur in de Nederlanden in interdisciplinair perspectief 32 (1), pp. 12–39. DOI: http://doi.org/10.18352/dze.100006; Kim Jautze, ‘ONSTAGE! Presentation at the Conference for “Werkgroep voor de Zeventiende Eeuw” in Nijmegen, 29 August 2015’, EMagazine eHumanities Royal Netherlands Academy of Arts and Sciences 6, http://ehumanities.leasepress.com/emagazine-6/recent-events/onstage.
24 Mascha van Nieuwkerk and Harm Nijboer, ‘Nineteenth century concert programs in a digital research environment; the case of Felix Meritis’. Poster presentation at DH Benelux Belval, 9-10 June 2016. http://www.dhbenelux.org/wp-content/uploads/2016/05/60_nieuwkerk_nijboer_FinalAbstract_poster.pdf
Session C
1. The Quest for Questions in Digital History: A Comparative View on the Werner and Delors Reports on Economic and Monetary Union
Florentina Armaselu and Elena Danescu
1. Introduction
In The Formation of the Scientific Mind, Bachelard (2002: 25) considers “the sense of the problem” as the core of the construction of scientific knowledge: “all knowledge is an answer to a question”. Within the framework of Digital Humanities and text analysis, Ramsay (2003: 171, 173) proposes the term “algorithmic criticism”, implying a way to assess, beyond hypothesis validation, “how successful the algorithms were in provoking thought and allowing insight”. In the context of Digital History (Seefeldt and Thomas, 2009) and language study in history (Bertrand et al., 2011), this proposal deals less with how textual analysis confirms or disconfirms previous hypotheses and more with how digital tools help articulate research questions and foster new paths for interpretation. The analysed texts, the Werner and Delors reports, were selected for their contrastive, comparative potential and their importance in the history of Economic and Monetary Union (EMU).
2. Thematic snapshot: Werner and Delors Reports
At the Hague Summit (December 1969), an expert committee chaired by Pierre Werner (Prime Minister of Luxembourg) was set up to explore progress towards EMU in the European Community (EC). The result was the Werner report (1970), which offered a full definition of EMU (3 stages over 1971–80). Its goals were to achieve irreversible convertibility between the Member States’ currencies, the complete liberalization of capital movements, the irrevocable fixing of exchange rates, and even a single European currency. Two main principles underpinned this report: gradual realization of EMU and parallelism between economic and monetary convergence. In 1974 the Werner report was suspended. In 1988 a committee charged with the study of EMU was set up, chaired by Jacques Delors (President of the European Commission). The result was the Delors report (1989), which adopted the overall philosophy and structure of the Werner report.
3. Methodology
The quest for questions started as a comparison of the documents, using the corpus analysis framework TXM. TXM was chosen for its contrastive potential via the specificities feature, which highlights the properties that are specific, through overuse or deficit (Lafon, 1980), to a part versus the rest of a corpus.
The corpus contains the reports in txt format, both as wholes and as fragments (numbered parts and sections) in separate files. The files were imported into TXM (TXT+CSV) and tagged via TreeTagger (French). Partitions were created for the entire reports and the parts/sections. The analysis was based on:
• lexical table and specificities, noun-adjective query: [frpos="NOM.*"] [frpos="ADJ.*"] (Vmax=500, Edit=frlemma);
• specificities, part of speech (frpos).
The selection of properties was driven by relevance and by specificity scores beyond the TXM default banality threshold (+/-2.0). The derived diagrams were used to formulate research questions.
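How a specificity score relates to the banality threshold can be sketched as follows. The sketch assumes a simplified reading of Lafon's hypergeometric model, in which the score is the signed base-10 logarithm of the tail probability of observing a form f times in a part of t tokens, given F occurrences in a corpus of T tokens. The function names and the toy counts are illustrative assumptions, and TXM's own computation may differ in detail.

```python
from math import comb, log10


def hypergeom_pmf(k: int, T: int, F: int, t: int) -> float:
    """Probability that a part of t tokens holds exactly k of the F occurrences."""
    return comb(F, k) * comb(T - F, t - k) / comb(T, t)


def specificity(f: int, T: int, F: int, t: int) -> float:
    """Signed log10 tail probability: positive for overuse, negative for deficit."""
    lowest = max(0, t - (T - F))   # fewest occurrences the part can contain
    highest = min(F, t)            # most occurrences the part can contain
    if not lowest <= f <= highest:
        raise ValueError("f is impossible for the given T, F and t")
    expected = F * t / T
    if f >= expected:
        # Overuse: probability of seeing f or more occurrences in the part.
        tail = sum(hypergeom_pmf(k, T, F, t) for k in range(f, highest + 1))
        sign = 1.0
    else:
        # Deficit: probability of seeing f or fewer occurrences in the part.
        tail = sum(hypergeom_pmf(k, T, F, t) for k in range(lowest, f + 1))
        sign = -1.0
    tail = min(max(tail, 1e-300), 1.0)  # guard against underflow and rounding
    return sign * -log10(tail)


def is_specific(score: float, threshold: float = 2.0) -> bool:
    """True when the score lies beyond the banality threshold in either direction."""
    return abs(score) > threshold
```

With toy counts, `specificity(10, 1000, 20, 100)` describes a form that occurs 10 times in a part where about 2 occurrences would be expected, which falls well beyond the threshold, whereas a form occurring exactly as often as expected stays within the banal range.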
4. TXM Analysis and Questions Formulation
Figure 1 shows a selection of concepts defining the “monetary” aspect of EMU as reflected by the Werner and Delors reports. From the specificity scores and diagram, the following question was formulated:
Q1: How do monetary matters (currency, budgetary and fiscal topics) differ in the Werner and Delors reports?
Despite the apparent similarity between the two reports, the diagram shows a contrast within several notions (inter-Community margins, Community currencies versus monetary union).
Fig. 1. Specificities: EMU “monetary” aspect, Werner-Delors report (RWe-RDe) (whole view)
More details on the differences are provided by TXM co-occurrences: “monnaie communautaire” with “fluctuation”, “intervention”, “marge(s)”, focusing on the monetary stabilisation process (RWe); “union monétaire” with “convertibilité totale et irreversible”, emphasising the “monnaie unique” objective (RDe).
Figure 2 illustrates other oppositions related to the “economic” aspect of EMU and the question:
Q2: How do economic matters (economic policy, market) differ in the Werner and Delors reports?
Fig. 2. Specificities: EMU “economic” aspect, RWe-RDe (whole view)
The process/objective distinction may be further observed via TXM co-occurrences: “politique économique” with “convergence”, “coordination”, “centre de décision” (RWe); “marché intérieur/unique” with “programme d’achèvement”, and “déséquilibre économique” with “corriger” (RDe).
A similar analysis applied to the parts/sections corpus provided further incentives for enquiries on terminological and “actors”-related matters (Q3, Q4).
Q3: Can we speak of an evolution of the EMU terminology between 1970 and 1989 as reflected by the structure of the two documents?
Q4: What influence did the structure of the Werner Committee membership (mainly politicians) and of the Delors Committee (mainly central bankers) have upon the EMU construction? What terminology for what people at what moment?
In the European integration process, many concepts evolved from hypotheses to reality between 1970 and 1989. This is why some terms (central bank, European system, intra-Community margins, monetary union) are over- or under-represented in certain sections. The structure of RWe reflects the distribution of roles: politicians designed the scope, elements and stages of the EMU process (main part); central bankers (appendix 5) set up the technicalities of the European currency and the architecture of the ESCB (European System of Central Banks) (Fig. 3, 4).
Fig. 3. Specificities: EMU “monetary” aspect, RWe-RDe (structure view)
Q5: Were the Monetary Union and the Economic Union processes really designed on a symmetrical and simultaneous basis? A process granularity analysis.
In RDe, the economic union process is described in less detail than the monetary union process. This may be inferred by looking at the document sections showing high specificity scores for these terms (Fig. 3, 4).
Fig. 4. Specificities: EMU “economic” aspect, RWe-RDe (structure view)
The analysis of specificities computed according to the part of speech (frpos) revealed other salient oppositions related to the use of verbal forms, adjectives, pronouns and citation marks (Fig. 5, 6).
Fig. 5. Specificities: part of speech, RWe-RDe (whole view)
Fig. 6. Specificities: part of speech, RWe-RDe (structure view)
One can observe a dominance of the future verbal form in RWe versus conditionals in RDe, leading to:
Q6: What is behind the range of verbal forms in the Werner and in the Delors report? Decoding hidden political meanings and national interests.
Q7: What is the degree of certainty and inter-conditionality between the single market and EMU?
RWe defined a ten-year projection for the EMU process, while RDe built upon its first-stage achievements but in an uncertain environment. This may prompt further investigation of the usage of verbal forms (Q6, Q7).
5. Conclusions
The proposal revisits traditional methodologies in contemporary history and DH from an epistemological perspective: the use of comparative textual analysis to formulate research questions.
The first experiments with two crucial EMU documents suggest that digital tools may serve not only to validate hypotheses or conclusions but also as a means of discovering paths of exploration in the construction of new knowledge.
References
Bachelard, Gaston. The Formation of the Scientific Mind: A Contribution to a Psychoanalysis of Objective Knowledge, Clinamen Press, 2002.
Bertrand, Jean-Marie. Boilley, Pierre. Genet, Jean-Philippe. Schmitt-Pantel, Pauline (éditeurs). Langue et Histoire, Paris, Publication de la Sorbonne, 2011.
Lafon, Pierre. 1980. “Sur la variabilité de la fréquence des formes dans un corpus”, In Mots. Saussure, Zipf, Lagado, des méthodes, des calculs, des doutes et le vocabulaire de quelques textes politiques, N°1, pp. 127-165. http://www.persee.fr/web/revues/home/prescript/article/mots_0243-6450_1980_num_1_1_1008.
Ramsay, Stephen. “Special Section: Reconceiving Text Analysis: Toward an Algorithmic Criticism”, Lit Linguist Computing (2003) 18 (2): 167-174. DOI: https://doi.org/10.1093/llc/18.2.167. Published: 01 June 2003.
Seefeldt, Douglas. Thomas, William G. “Intersections: History and New Media. What Is Digital History?”, In Perspectives on History, The Newsmagazine of the American Historical Association, May 2009.
TXM, Textométrie, http://textometrie.ens-lyon.fr/?lang=en.
Sources
Rapport au Conseil et à la Commission concernant la réalisation par étapes de l’Union économique et monétaire dans la Communauté (rapport Werner). Luxembourg: 8 octobre 1970, document L6.956/II/70-D. In Journal officiel des Communautés européennes, n° C 136, supplément au Bulletin 11/1970, Luxembourg, 11 novembre 1970.
Rapport sur l'Union économique et monétaire dans la Communauté européenne (rapport Delors). 12 avril 1989. In Europe Documents. Bruxelles, 20 avril 1989, n° 1550/1551.
2. Transparency as Rupture: Open Data and the Datafied Society of Hong Kong
Rolien Hoyng, Lingnan University
This paper deals with Open Data and the datafication of governance in Hong Kong. It addresses contestations over “transparency” as a techno-political construction that is embodied in, and performed by, the infrastructures and techniques of data-driven governance. Transparency is a site of negotiating distributions of cognition and perception in the context of transformations of citizenship and governance in the datafied society. I specifically inquire into the infrastructures, protocols, techniques, and practices of Open Data, which promises to simultaneously enhance government accountability and stimulate data-driven “smart” governance. Accordingly, I look at techno-political organizations of digital data and data infrastructures that support particular modes and distributions of cognition and perception (Halpern 2014; Hayles 2014; Kitchin 2014). I distinguish two data regimes, respectively revolving around “representation” and “prediction.” I review these issues in relation to the question of globalization. The case study of Hong Kong suggests that datafication does not result in globally homogeneous cybernetic control. Rather, the process of adapting Open Data is (structurally) incomplete, disruptive, and disrupted in the encounter with residual rationalities of statecraft, which means it opens up a field of struggle and contestation. In this paper, “disruption” functions as a methodological device to explore the politics of Open Data and datafication at large. Rather than appropriating disruption as a revelatory moment undoing the “black-boxing” of technology per se, my aim is to rethink the politics of transparency and secrecy in more complex terms and inquire into the possibility of activism and intervention (Birchall 2015).
I deploy mixed methods including interviews with actors and analyses of policy documents and technical literature as well as material architectures, formats, protocols, interfaces, and data visualizations. On the basis of examples including the data.gov.hk website and fintech apps, I argue that the two data regimes of “representation” and “prediction” enact particular “fields” of visibility: organized articulations of strategies, techniques, and discourses (Halpern 2014). First, the data regime of “representation” provides cognition and perception in terms of oversight and transparency. Ordering data (capturing, aggregating, and organizing) forms part and parcel of ordering society. Data forms evidence for what exists “out there” and affords referential, descriptive capability. Hence, it is supposed to assist in the production of knowledge and truth. Second, the data regime of prediction, which is afforded by digital data processing techniques and infrastructure, orients perception and cognition onto diagnosis of potential and the prediction of tendencies. Data is generated without a specific question or purpose in mind. Rather than depicting the world, at stake is modeling the world for tactical interventions in shifting patterns and trends (Andrejevic 2013). Distribution of this mode of perception and cognition induces society’s mediation by algorithmic data processing techniques.
Rather than recognizing data regimes in an ideal-typical fashion, my main question addresses the processes of adaptation and the contradictions that emerge due to intersecting forms of data expediency. This focus underscores Open Data’s paradox of promising fortified transparency and accountability, while simultaneously advancing covert forms of modulation, control, dataveillance, and concentrations of cognition. For instance, citizen-consumers as users of apps are interpellated into positions that seemingly democratize predictive perception and cognition, yet they are simultaneously subjected to dataveillance and algorithmic governance. However, the datafied society does not present itself as a fait accompli, in other words, fully operational and all-encompassing. Rather, [...] (experienced) failure, disruption, and deferment; it generates contradictions, interferences, and articulations between co-existing data regimes and multifarious political rationalities (cf. Chan 2013). These moments might offer possibilities for imagining more radical notions of transparency and secrecy.
If transparency and secrecy are co-constituted, the question is what escapes the particular constructions of transparency in Open Data (Birchall 2015). For instance, if the government opens up certain datasets, does this enable public scrutiny of statecraft or does it merely benefit the expansion of what Easterling (2015) calls extrastatecraft, now by means of data-driven apparatuses belonging to institutions that do not open their own proprietary datasets? How do data and data infrastructures mediate citizens’ relation to private-public governance? To what extent are Open Data activists able to not just reclaim public scrutiny over statecraft but radicalize transparency, for instance by introducing uncontrollable data motility and reversible transitions between data and information? Following a more speculative turn, should transparency always be the goal, or does secrecy have its merits too in order to intervene in the effects of predictive perception and cognition on society?
References
Andrejevic, Marc. 2013. Infoglut: How Too Much Information Is Changing the Way We Think. New York: Routledge.
Birchall, Clare. 2015. “‘Data.gov-in-a-box’: Delimiting Transparency.” European Journal of Social Theory 18 (2):
Bratton, Benjamin. 2015. The Stack: On Software and Sovereignty. Cambridge MA: MIT Press.
Chan, Anita Say. 2013. Networking Peripheries: Technological Futures and the Myth of Digital Universalism. Cambridge MA: MIT Press.
Easterling, Keller. 2015. Extrastatecraft: The Power of Infrastructure Space. London: Verso.
Halpern, Orit. 2014. Beautiful Data: A History of Vision and Reason since 1945. Durham: Duke University Press.
Hayles, N. Katherine. 2014. “Cognition Everywhere: The Rise of the Cognitive Nonconscious and the Costs of Consciousness.” New Literary History 45 (2): 199-220.
Kitchin, Rob. 2014. The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences. Los Angeles: Sage.
Ong, Aihwa. 2006. Neoliberalism as Exception. Durham: Duke University Press.
3. Oral history online – User perspectives and behavior in a transforming WW2 memory culture
Dr. Susan Hogervorst | Open Universiteit Nederland / Erasmus University Rotterdam
Since the 1980s, Second World War (WW2) memory culture has been increasingly characterized by the forthcoming disappearance of the eyewitness generations. One way in which this problem has been addressed is by recording eyewitness testimonies. By now, multiple oral history collections have been created throughout the western world, in which tens of thousands of interviews have been preserved on audio and video (Apostolous and Pagenstecher 2013, Keilbach 2013). Currently, we see a shift from collecting and preserving testimonies to disclosing them for wider audiences (Scagliola and F. de Jong 2014; S. de Jong 2013; Bothe and Lücke, 2013). This is partly due to technological developments, but also to the dynamics of WW2 memory culture, of which transmitting eyewitness memories on to younger generations has become a key feature (Wieviorka 2006; Erll and Rigney 2009; Hogervorst 2010; Sabrow and Frei 2012). How will the disappearance of the eyewitness generations affect the transmission of eyewitness memories, and what role do online interview collections play in this process?
Methods and data
The aim of my postdoc research project is to provide substantiated data on this matter, partly acquired by analyzing the content, use, and users of an online video interview collection: the Dutch web portal ‘Getuigenverhalen.nl’. This portal gives access to circa 500 video interviews with eyewitnesses about different WW2-related topics. I deploy this interview collection as a digital barometer of current, transforming memory culture: through the web statistics, an online questionnaire, focus group interviews with (student) history teachers, and screen recordings of their interaction with the website while selecting interview fragments they would use in a lesson about WW2. This innovative approach not only underlines the value of incorporating digital sources and methods into the field. It also enables foregrounding the user and his/her agency in the analysis of the rather top-down, officially supported process of WW2 memory transmission, while users often remain elusive in traditional memory studies (and historical) research (Kansteiner 2002; Erll and Rigney 2009).
Findings and discussion
The findings indicate both continuity and change regarding the use of WW2 video interviews compared to live testimonies in classrooms. The portal is first and foremost an online archive; it is not explicitly meant as an educational tool. Inquiries in the educational field in the Netherlands point at an interest in, but also an unfamiliarity with, such collections and their didactical possibilities. This is confirmed by both the questionnaire and the web statistics. Only a few respondents identify themselves as teacher or student, and relatively many do not find what they were looking for. Site search is not used often, and the most watched interviews are the ones highlighted on the homepage. Two focus group interviews with an international group of student history teachers offered an opportunity to get a closer and more in-depth view of the portal’s use, and (more importantly to a scholar of cultural memory) of users’ criteria for selecting relevant material from the quite abundant reservoir of eyewitness testimonies available. First, according to the participants, a suitable interview fragment should ‘bring the past closer’ (which was to be achieved in different manners). Indeed, the participants could quite easily find fragments that suited these purposes. Second, suitable fragments should (according to the participants) confirm existing historical knowledge. The latter indicates a more fundamental, both ethical and epistemological view on the position and value of eyewitnesses and their functioning as sources of historical knowledge.
Both aspects correspond to the way participants would use live testimonies in their lessons. The plenary evaluation of the selected fragments and the criteria used pointed at an important difference: the distance through the screen. This distance enabled raising critical questions about the nature and value (reliability) of oral testimonies, which is rather uncommon in settings in which eyewitnesses are physically present. Another characteristic of searchable interview collections, namely that they enable comparing different testimonies and experiences, and therewith supplementing or challenging (besides confirming and illustrating) existing historical knowledge and perspectives, was not mentioned by the participants. This might point at the fact that working with digital testimonies is still in an early stage in the Netherlands. Currently, in Germany, Austria, and the United States, educational projects are developed around WW2 video testimonies. Besides recommendations for improvements of the Dutch portal website, this study contributes to a more critical and didactically more relevant interaction with testimonies in Dutch history education, as well as to a better understanding of current transforming memory culture.
References
Apostolous, N. and C. Pagenstecher (eds.), Erinnern an Zwangsarbeit. Zeitzeugen-Interviews in der digitalen Welt (Berlin 2013).
Bothe, A. and M. Lücke, ‘Shoah und historisches Lernen mit virtuellen Zeugnissen’, P. Gautschi et al. (eds.), Shoa und Schule. Lehren und Lernen im 21. Jahrhundert (Zürich 2013) 55-74.
Erll, A. and A. Rigney, Mediation, remediation, and the dynamics of cultural memory (Berlin/New York 2009).
Hirsch, M. and L. Spitzer, ‘The witness in the archive. Holocaust studies/memory studies’, Memory Studies vol. 2 (2009) 2, 151-170.
Hogervorst, S., Onwrikbare herinnering. Herinneringsculturen van Ravensbrück in Europa, 1945-2010 (Hilversum 2010).
Huijgen, T., & Holthuis, P. (2016). Dutch voices: exploring the role of oral history in Dutch secondary history teaching. In D. Trškan (Ed.), Oral history education: dialogue with the past (1st ed., Vol. 1, pp. 43-58). Ljubljana: Slovenian National Commission for UNESCO.
Jong, S. de, ‘Im Spiegel der Geschichten: Objekte und Zeitzeugenvideos in Museen des Holocaust und des Zweiten Weltkrieges’, Werkstatt Geschichte 62 (2013) 19-41.
Kansteiner, W., ‘Finding meaning in memory. A methodological critique of collective memory studies’, History and Theory 41 (2002) 179-197.
Keilbach, J., ‘Collecting, Indexing and Digitizing Survivors. Holocaust Testimonies in the Digital Age’, A. Bangert et al. (eds.), Holocaust Intersections. Genocide and Visual Culture at the New Millennium (London 2013) 46-63.
Scagliola, S. and F. de Jong, ‘Clio’s talkative daughter goes digital’, R. Bod et al. (eds.), The Making of the Humanities, Volume III: The Modern Humanities (Amsterdam 2013) 511-526.
Sabrow, M. and N. Frei (eds.), Die Geburt des Zeitzeugen nach 1945 (Göttingen 2012).
Wieviorka, A., The era of the witness (Ithaca 2006).
Session D
1. Collections as networks, Uncovering information exchanges and information networks in the collections of the Meertens Institute (KNAW)
Douwe A. Zeldenrust
Meertens Institute (KNAW)
This paper is about uncovering information exchanges and information networks in humanities research collections. Most humanities researchers focus on obtaining data from research collections, without realizing that those collections can also be seen as the results of epistemological experiments. That is: every collection is the outcome of the process of gathering information and is therefore interconnected with the presuppositions, foundations and activities that have led to the knowledge it contains. Charles Jeurgens (2012) states about this connection that ‘(…) understanding that bond has to precede understanding the records’ (p. 51). The ‘bond’ Jeurgens writes about is not only dictated by the goal(s) of forming the collection, but is also determined by (among other things) the cultural, administrative, scientific and social climate. Moreover, it is dependent on the individuals who were collecting, their scientific experience, their interests and personalities.
The paper will reflect on the issues of extracting, visualising and processing this context information, using the concept of ‘deep networks’. Charles van den Heuvel introduced this concept in his article ‘Mapping knowledge exchange in early modern Europe: intellectual and technological geographies and network representation’ (2015). The concept allows the contextualisation of networks and the visualisation of uncertainty while creating layers of historical sources in multiple perspectives. Furthermore, it combines pattern recognition in textual and visual big data with traditional hermeneutic methods. The vast collections of the Meertens Institute (Royal Netherlands Academy of Arts and Sciences) will be used as a use case in order to take the first steps in realizing these networks within the framework of archival studies (Meertens, 2016).
The collections of the Meertens Institute have been accumulated over a period of more than 80 years and concentrate on the diversity in language and culture in the Netherlands (Jongenburger, 2013). Access to the more than 15 terabytes of data, 6000 hours of (digital) audio and 2 kilometres of archival material is provided by a record keeping system containing data about, amongst other things, the researchers involved in collecting. The information captured in this record keeping system creates the first layer of the network. A second layer of information, regarding the provenance of the collections, is extracted from the annual reports of the Meertens Institute. Those reports contain information about, for instance, the acquisition of the collections. A third layer of information is obtained from the Biographical Portal of the Netherlands (Biografisch portaal, 2016). This online reference work contains short descriptions of the lives of persons (amongst them prominent scholars and influential managers of research institutes) who distinguished themselves or played a role of some significance in the past in the Netherlands.
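The three layers described above can be pictured as one edge list in which every relation between two persons records the layers that attest it. The following is only a minimal sketch under stated assumptions: the layer names (`record_keeping`, `annual_reports`, `biographical_portal`), the function names and the sample persons are hypothetical placeholders, not data or software from the project, and whether Palladio or Nodegoat would accept exactly these columns has not been verified here.

```python
import csv
import io
from collections import defaultdict


def merge_layers(records):
    """Combine (person_a, person_b, layer) records into undirected edges.

    Each edge remembers which layers attest it, so a relation found in
    several sources can be told apart from one found in a single source.
    """
    merged = defaultdict(set)
    for person_a, person_b, layer in records:
        pair = tuple(sorted((person_a, person_b)))  # undirected: A-B equals B-A
        merged[pair].add(layer)
    return {pair: sorted(layers) for pair, layers in merged.items()}


def edges_to_csv(edges):
    """Serialise merged edges as CSV text, one row per pair of persons."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["source", "target", "layers", "layer_count"])
    for (source, target), layers in sorted(edges.items()):
        writer.writerow([source, target, ";".join(layers), len(layers)])
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical sample records; the names are placeholders.
    sample = [
        ("Person B", "Person A", "record_keeping"),
        ("Person A", "Person B", "annual_reports"),
        ("Person A", "Person C", "biographical_portal"),
    ]
    print(edges_to_csv(merge_layers(sample)))
```

Keeping the layer as an attribute of the edge, rather than building three separate graphs, is one possible design choice: it makes it easy to filter the network by source or to highlight relations that are attested in more than one layer.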
The objective of this research is threefold: first, it would demonstrate that building such a network is feasible. The web-based software platform Palladio will be used for processing the data and visualizing the network (Palladio, 2016). As this research is ongoing, experiments with other, more versatile network analysis tools, such as Nodegoat, will be considered (Nodegoat, 2016).25 The network will consist of hundreds of nodes and thousands of (potential) edges in order to include the relations among the most prominent persons involved in collecting the information. Second, this method can, with local modifications, be reused by other humanities researchers to generate networks for archival studies. And third, the outcomes will be incorporated in my PhD research, which is about the history of the collections of the Meertens Institute. As this PhD research started in January 2016 and is ongoing, this paper will show the first results.
25 Various data visualization platforms and network architectures have been developed. For visualizing data, Palladio is one of the platforms that is advised as a network visualization tool for the humanities (Düring,
References:
Düring, M. (2016). On Dilettantes and Dialogues in Digital History. European Social Science History Conference 2016.
Heuvel, C. van den (2015). Mapping Knowledge Exchange in Early Modern Europe. International Journal of Humanities and Arts Computing, 9(1), 95-114.
Jeurgens, K.J.P.F.M. (2012). Information on the move. Colonial archives: pillars of past global information exchange. Colonial Legacy in South East Asia. The Dutch Archives. Eds. K.J.P.F.M. Jeurgens, A.C.M. Kappelhof & M. Karabinos. 's-Gravenhage: Stichting Archiefpublicaties. 45-65.
Jongenburger, W., A.W.H. Jansen & D.A. Zeldenrust (2013). Collectieplan Meertens Instituut, 2013-2018. Amsterdam: Meertens Instituut.
Websites:
http://ckcc.huygens.knaw.nl (Accessed December 08, 2016)
https://nodegoat.net (Accessed December 0817, 2016)
http://palladio.designhumanities.org (Accessed December 08, 2016)
http://www.biografischportaal.nl (Accessed December 08, 2016)
http://www.meertens.knaw.nl (Accessed December 08, 2016)
2. Mapping Controversies of Digital Curation
Dana Mustata
University of Groningen, NL
The emergence of digital technologies and digitized data in humanities research (a phenomenon that has been shaping the contours and the incentives behind the organization of digital humanities as a field of study) has raised more questions than it has answered. Are digital technologies changing our research practices and if so, how? Do they incite new research questions? Are established fields of study in the humanities drastically altering in the face of these new phenomena that have technology at their centre? What is new and what is old in the way we do research in digital environments? If there is any underlying assumption traversing all these questions, it is that digital humanities is a ‘transformative practice’ (Svensson, 2009). The primary difficulty has been to describe and explain what it is that is changing in our research practices. This paper tackles this particular concern.
Has there been a shift in our traditional research practices rooted in the analogue era? What does this shift consist of? How do we redefine ourselves from traditionally analogue researchers into digital humanities researchers? These questions contribute to furthering the definition of digital humanities as a field of study and, more specifically, to redefining the practice of doing scholarship with digital technologies and digitized data. Without advocating for essentialist, stable and fixed definitions of digital humanities as an arena in which scholarship is produced, I am particularly interested in what Bruno Latour (1991) calls the ‘socio-logics’ characterizing digital humanities as a transformative field of study. Socio-logics refer to how knowledge is mobilised, constructed and accumulated in the face of a ‘controversy’.
2016). Nodegoat was used in the project ‘Circulation of Knowledge and Learned Practices in the 17th-century Dutch Republic’ (Huygens, 2016).
“The word “controversy” refers here to every bit of science and technology which is not yet stabilized, closed or “black boxed” ... we use it as a general term to describe shared uncertainty.” (Macospol, 2007: 6, cited in Venturini, 2010: 260).
In other words, as Tommaso Venturini explains:
“controversies are situations where actors disagree (or better, agree on their disagreement). The notion of disagreement is to be taken in the widest sense: controversies begin when actors discover that they cannot ignore each other and controversies end when actors manage to work out a solid compromise to live together. Anything between these two extremes can be called a controversy.” (Venturini, 2010: 261).
The starting point of my paper is thus to approach digital humanities as a controversial arena, one in which researchers, tools, tool developers and data providers (humans and non-humans, actors and actants in Latour’s terms) collide; an arena in which scientific and technological clashes play out. It is through the chains of association between researchers, tools, tool developers and data providers, as well as through the transformations prompted by the clashes between these actors and actants, that the socio-logic of this new field of study is rendered visible.
The paper will ‘map the controversies’ of curating digital objects in a virtual research environment. These controversies relate to translating academic knowledge into tool design and implementation, translating historical narratives into user functionalities, finding a shared work language and collaborating at the intersection of different fields of expertise and fields of knowledge production.
Mapping controversies is a hands-on method rooted in the ANT tradition of thought, which explores, describes, visualizes and makes sense of issues that emerge at the intersection of collaborative work done between, in this particular case, researchers, tools, designers, tool developers and digital data providers. This particular method provides insights into practices of working with digital tools and digitised data and the subsequent process of knowledge production.
The paper will map controversies through the practices of designing, researching and curating the virtual exhibitions (VEs) on the online platform www.euscreen.eu. This platform makes freely accessible thousands of audiovisual items originating from 21 content providers in Europe. The VEs were curated by researchers in collaboration with 1) content providers (CPs), consisting of audiovisual archives and responsible for co-selecting and uploading their content to the ‘Special Collections’ on the platform; 2) tool developers in charge of developing the VE builder and all the user functionalities around it; 3) designers responsible for the design of the front end of the exhibition, the design of the user experience, as well as for mediating the common ground between the researchers’ needs and the potential of tool development; 4) the VE builder, which allowed the researchers to curate their exhibitions. The researchers drafted the content selection strategy for the CPs; viewed, researched, further selected and then bookmarked the content uploaded to the Special Collections; defined and advised on the development and design of the VE builder through wireframing, paper prototyping and joint work sessions with the designers and tool developers; and, last but not least, curated their virtual exhibitions as an end result of all these collaborative work practices.
Taking a practice-oriented anthropological approach to the researchers’ journey through curating the VEs, the paper will explore digital curation through what Latour (1991) called the ‘chain of associations’ and the ‘series of transformations’ undergone by the actors and actants involved, as well as through the ‘translations’ that took place throughout the collaborative work process, which saw the initial enunciations of the VEs turn into the final products that were published online.
By mapping the associations that researchers entered into, the transformations they underwent as part of these associations and the translations that took place from their first VE ideas to the final curated objects online, this paper tries to pin down what it is that changes in the practice of doing humanities research when knowledge is (co)produced with digital tools, digitized data and at the intersection of different fields of expertise.
Making sense of digital humanities as a transformative practice of (co-)producing knowledge is a fertile ground for coming to terms with this emerging field of study. It helps us understand ‘digital practices’ in terms of what humanities scholars do with digital tools in digital environments, to paraphrase Couldry’s (2010) understanding of ‘media practices’. I argue, thus, that understanding the production of knowledge in digital humanities becomes an archaeological act of retracing associations and transformations through different spaces of expertise, different actors and actants, lending itself to what Foucault called ‘principles of discontinuity’. This may help bridge the gaps between digital technologies and (analogue) researchers, technicians and scientists that are at the core of controversies in the field.
References
Nick Couldry, ‘Theorising Media as Practice’ in Social Semiotics, Vol. 14, Issue 2, 2004, pp. 115-132. Published online: 13 Oct 2010, http://dx.doi.org/10.1080/1035033042000238295
Michel Foucault, Archaeology of Knowledge, Routledge, 1972
Bruno Latour, ‘Technology is Society Made Durable’ in: J. Law, ed., A Sociology of Monsters: Essays on Power, Technology and Domination, Sociological Review Monograph N°38, pp. 103-132, 1991
Patrik Svensson, ‘Humanities Computing as Digital Humanities’, Digital Humanities Quarterly, 3(3), 2009
Tommaso Venturini, ‘Diving in magma: how to explore controversies with actor-network theory’ in: Public Understanding of Science, 19(3), 2010, pp. 258–273
3. Research opportunities for the archived web in the Benelux
Sally Chambers, Ghent Centre for Digital Humanities, Ghent University
Peter Mechant, Media, Innovation and Communication Technologies (MICT), Ghent University
Kees Teszelszky, National Library of the Netherlands
Yves Maurer, National Library of Luxembourg
Web-archiving, or collecting portions of the web to ensure the information is preserved in an archive, began in 1996 with the Internet Archive initiative26 and its well-known digital archive ‘The Wayback Machine’27. Others have followed, from national and state libraries and archives to museums and nonprofits such as ‘Common Crawl’, whose corpus contains more than 3.14 billion web pages and about 250 TB of uncompressed content28.
26 https://archive.org/
27 https://archive.org/web/
Although most researchers in the humanities have yet to begin exploring the potential of these archives, some projects have already investigated it, e.g. the BUDDAH project29 (Big UK Domain Data for the Arts and Humanities), which awarded bursaries to researchers to carry out research in their subject area using the UK web archive, or the RESAW network30 (Research Infrastructure for the Study of Archived Web Materials), which aims at promoting a collaborative European research infrastructure for the study of archived web materials. Despite such initiatives, researchers in the humanities still struggle with the computational turn of their field on theoretical, methodological (e.g. developing theoretical and methodological frameworks within which to study web archives) and practical levels (e.g. they lack the expertise and knowledge to use web archives and to apply digital methods or big data techniques to their corpus).
Although the three Benelux countries are geographically very close, (the history of) national web archiving is very different in each of them:
In the Netherlands, the National Library started mapping the Dutch web as early as 1992 by compiling web directories or web lists. A first web archiving pilot was conducted in 2003 and web-archiving started as a regular activity in 2007 using a selective harvesting strategy based on a selection of the existing web directory (governmental, cultural and academic websites, sites that mirror trends on the web, and ‘endangered’ websites which are considered as Dutch digital cultural heritage). As of January 2017, 12,000 websites have been harvested with Heritrix31 (26 TB of compressed .arc files, 211 million URLs). A linguistic analysis of the collection has not been done yet, but 368 Frisian websites are included. The Dutch web archive is available in the reading rooms of the National Library of the Netherlands, and researchers can request access to the data for specific projects32.
In Luxembourg, a pilot project for web-archiving was undertaken in 2005 and subsequently the legal deposit law was extended in 2009 to also cover content published on the web. Due to funding issues, the regular harvests for the .lu domain and other websites hosted in Luxembourg only started in August 2016. A second crawl finished in January 2017. These crawls were supplemented with data from a number of targeted crawls of governmental sites. Currently, the archive contains 15 TB of compressed WARC files (250 million URLs) with around 40% of the hosts in the ccTLD .lu and 35% in .com. Similar to the Dutch archive, a detailed linguistic examination of the Luxembourg collection has not been done yet, but a basic linguistic analysis shows the presence of 30% English, 30% French, 15% German, 5% Luxembourgish and a ‘long tail’ of other languages. The Luxembourg web archive will be available in the reading rooms of the national library and researchers can be granted access to the underlying dataset on a case-by-case basis.
Although the .be domain33 was introduced in June 1988, the Belgian web is currently not systematically archived. As of February 2017, 1,561,932 domains are registered by DNS Belgium34. Without a Belgian web archive, the content of these websites will not be preserved for future generations and a significant portion of Belgian history will be lost forever. In December 2016, a pilot web-archiving project called PROMISE (PReserving Online Multiple Information: towards a Belgian StratEgy) was funded. The aim of the project is to (i) identify current best practices in web-archiving and apply them to the Belgian context, (ii) pilot Belgian web-archiving, (iii) pilot access (and use) of the pilot Belgian web archive for scientific research, and (iv) make recommendations for a sustainable web-archiving service for Belgium. This pilot project is considered a first step towards implementing a long-term web archiving strategy for Belgium. Similar to Luxembourg, Belgium has various official languages35 that will need to be considered during the pilot phase.
28 http://commoncrawl.org/
29 http://buddah.projects.history.ac.uk/
30 http://resaw.eu/
31 https://webarchive.jira.com/wiki/display/Heritrix
32 https://www.kb.nl/en/organisation/research-expertise/long-term-usability-of-digital-resources/web-archiving
33 History of the Belgian web: https://www.dnsbelgium.be/en/history
34 DNS Belgium: https://www.dnsbelgium.be/en
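Figures such as the share of hosts per top-level domain reported for the Luxembourg archive can in principle be derived from a list of crawled URLs. The sketch below is an illustration only: the function name `tld_shares` and the sample URLs are hypothetical, it counts each distinct host once, and it takes the last dot-separated label as the top-level domain, which is a simplification that none of the archives described above is known to use.

```python
from collections import Counter
from urllib.parse import urlsplit


def tld_shares(urls):
    """Return the share of distinct hosts per top-level domain.

    URLs without a parseable host (for example, missing a scheme)
    are skipped rather than guessed at.
    """
    hosts = set()
    for url in urls:
        host = urlsplit(url).hostname
        if host:
            hosts.add(host.lower())

    counts = Counter(host.rsplit(".", 1)[-1] for host in hosts)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tld: count / total for tld, count in counts.items()}


if __name__ == "__main__":
    # Hypothetical sample; not taken from any of the national archives.
    sample = [
        "http://example-a.lu/page1",
        "http://example-a.lu/page2",
        "https://example-b.com/",
        "http://example-c.lu",
    ]
    for tld, share in sorted(tld_shares(sample).items()):
        print(f".{tld}: {share:.0%}")
```

Counting hosts rather than URLs keeps one very large site from dominating the distribution; counting URLs instead would answer a different question about the volume of archived content.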
From a web-archivist perspective, a key challenge is how to collaborate on joint web-archiving initiatives to enable ‘trans-national’ research opportunities, for example, by taking either a site-, topic-, or domain-centric archiving approach, or by unifying methodological approaches for discovery, acquisition and description of web content. From the viewpoint of a researcher in the humanities, web-archives are rich ‘born-digital’ resources, which can be analysed alongside other digitised and analogue sources in a wide range of humanities subject areas.
For researching the archived web in the Benelux, possible ‘tri-national’ research questions could include a linguistic analysis of the ‘Benelux web’, or a geo-spatial analysis of the geographic distribution of web-domains across the Benelux region36. Similarly, research questions could focus on just two of the Benelux countries, such as the geographic distribution of Dutch-language websites in the Netherlands and Flanders; or the German-language websites in Belgium and Luxembourg. Furthermore, as many European Union institutions (with websites in the .eu domain) are located within the Benelux region, this also offers a further wealth of opportunities for humanities researchers.
While the increased availability of such (big) born-digital datasets opens up opportunities for using computational research methods, it also points to the need to acquire new skills. It will therefore be important to establish a range of standard tools and methods which are widely accepted for archived web research37. Despite these challenges, the archived web offers substantial opportunities for digital humanities researchers, both in the Benelux and beyond.
This paper a) discusses the potential of web archives for digital humanities researchers, b) introduces the web-archives in the Netherlands, Luxembourg and Belgium, and c) presents the possibilities for trans-national research that collaboration between the Benelux web-archives could enable.
35 The three official languages of Belgium are Dutch, French and German, with English also being widely used. Official languages of Belgium: https://en.wikipedia.org/wiki/Languages_of_Belgium
36 DNS Belgium has mapped the geographic distribution of web-domains by local authority across Belgium, see: https://www.dnsbelgium.be/whois/stats. This mapping could usefully be extended to the whole Benelux region.
37 For examples, see: Truman, Gail. 2016. Web Archiving Environmental Scan. Harvard Library Report: https://dash.harvard.edu/handle/1/25658314
Session E
Be FAIR or be square: Stakeholders’ perspectives on data quality in the Digital Humanities
Reinier de Valk, Data Archiving and Networked Services (DANS)
Vanessa Hannesschläger, ÖAW-ACDH (Austrian Centre for Digital Humanities)
Klaus Illmayer, ÖAW-ACDH (Austrian Centre for Digital Humanities)
Francesca Morselli, Data Archiving and Networked Services (DANS)
Emily Thomas, Data Archiving and Networked Services (DANS)
Introduction
Digital data is created every day. Not only have cultural and research institutes been massively digitising their analogue content over the past decades (digitised objects), but research institutes and individual researchers are also constantly producing new digital data (born-digital objects). This is not a new revelation: within the natural sciences, researchers have been using and producing structured (and, more recently, machine-readable) data for centuries. However, over the last decades the research landscape has been changing: within the social sciences and humanities (SSH) disciplines, too, the use of existing digital data and the production of new digital data has increased enormously [2, 12, 13, 16]. This entails several issues that must be addressed [5, 9, 18, 23].
The necessity to preserve and ensure reusability for the increasing quantity of this data, which tends to be quite heterogeneous, has made the issue of data quality a specifically urgent one [1, 15, 17]. In order to deposit research data in a trusted repository, it needs to meet a minimum set of quality criteria, such as completeness, reliability, and correct formal structure by means of the implementation of interoperable or discipline-specific standards [3, 22].
Moreover, new actors as well as new relationships among them have emerged, a consequence of the reuse and sharing of research data among researchers and institutions [20]. Not only researchers and research institutions, but also cultural heritage institutions, research infrastructures and European projects (all of which can be referred to as stakeholders) are now heavily involved in data exchange processes, with the aim of increasing data interoperability and visibility.
Against such a complex background, however, it is difficult to develop and mutually agree on a truly shared vision of what high-quality data is, and what is required to achieve it. One innovative approach to reach common ground is applying the FAIR principles.
The FAIR principles
Following a life sciences workshop in Leiden in 2014 entitled Jointly designing a data FAIRPORT [4], a minimal set of community-agreed guiding principles was formulated by a diverse group of stakeholders sharing an interest in scientific data publication and reuse. The aim was to make data more easily discoverable, accessible, appropriately integrated and reusable, and adequately citable for both machines and people. The principles that were constructed here are now well known as the FAIR principles [6, 7, 8, 24], and act as a guide to data publishers and stewards rather than being a standard or specification. Although these principles were conceived within a life sciences context, the social sciences and humanities face similar issues as they become more digitised, making the topic of FAIR data management also applicable to these fields. In simpler words, the FAIR principles provide a set of mileposts for data producers and publishers to help ensure that all data will be Findable (defined by a persistent identifier and detailed metadata), Accessible (well-defined license and access conditions), Interoperable (ready to be combined with other data by humans and machines: standardised formats and vocabulary) and Reusable (ready to be reused in future research and processed using computational methods).
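As a rough illustration of how the four mileposts could be turned into a first-pass check on a dataset description, the sketch below maps each principle to a few metadata fields and reports which are missing. Everything in it is an assumption made for the example: the field names (`identifier`, `description`, `license`, `access_conditions`, `format`, `vocabulary`, `provenance`), the choice of `provenance` as a stand-in for reusability, and the function name `fair_report` are hypothetical. Since the principles are a guide rather than a standard or specification, a checklist like this can only flag obvious gaps.

```python
# Hypothetical mapping from each FAIR principle to metadata fields that
# loosely correspond to the parenthetical definitions in the text above.
FAIR_CHECKS = {
    "findable": ("identifier", "description"),
    "accessible": ("license", "access_conditions"),
    "interoperable": ("format", "vocabulary"),
    "reusable": ("provenance",),  # assumption: provenance as a reuse proxy
}


def fair_report(metadata):
    """Return, per principle, the list of fields that are missing or empty."""
    report = {}
    for principle, fields in FAIR_CHECKS.items():
        missing = []
        for name in fields:
            value = metadata.get(name)
            if value is None or not str(value).strip():
                missing.append(name)
        report[principle] = missing
    return report


if __name__ == "__main__":
    # Hypothetical dataset description with placeholder values.
    record = {
        "identifier": "doi:10.0000/example",
        "description": "Interview transcripts, placeholder description",
        "license": "CC-BY-4.0",
        "format": "text/csv",
    }
    for principle, missing in fair_report(record).items():
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{principle}: {status}")
```

A report of this kind says nothing about whether the identifier actually resolves or the licence is appropriate; different stakeholders would likely weigh and extend the fields differently.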
Stakeholders
When it comes to repository and data quality, the main factor shaping individual needs and requirements is not the discipline the data comes from, but rather the type of stakeholder, which is crucial for the perspective on the data and the necessities that come with it. In order to motivate stakeholders to commit to the FAIR principles, the different types first have to be identified and their specific interests have to be investigated. Examples of stakeholders are research communities, funders, data archives, research infrastructures, projects, and cultural heritage institutions. It is important that these groups are brought together to align their different perspectives on data as producers, consumers and providers. For instance, findability from a broader perspective might also mean having a user-friendly interface for a researcher to find datasets, whilst for a research infrastructure, the availability of substantial metadata would be the core interest. Therefore, different strategies of communicating and implementing the FAIR principles will be necessary to reach the various types of stakeholders.
Moreover, bringing together different stakeholders allows discussion of collaborations in the implementation of the FAIR principles and sharing experiences of ongoing efforts. Especially when it comes to data quality and data exchange, a discussion of the general framework of the FAIR principles can help to better coordinate the different approaches and interests of stakeholders.
The proposed panel
We propose a panel discussion with representatives of different stakeholders, both current and potential future FAIR implementers. Their discussion will focus on the application of the FAIR principles to improve data quality, formulating FAIR data management requirements (e.g., by funders) and assessing the quality of datasets (e.g., by repositories). This will help determine common approaches as well as variations in perspective. The focus will be on the exchange between those who have already (started to) implement the FAIR principles and those who have not yet done so; this will help analyse which goals are attainable by the various stakeholders. We plan to invite representatives of the following types of stakeholders:
• researchers
• research institutes
• cultural heritage institutions
• research infrastructures
• projects
• funders.
The following (deliberately slightly controversial) seven statements are intended to guide the discussion:
1. Data quality is compromised by changing research methods and increased collaboration among stakeholders.
2. As a consequence, stakeholders do not sufficiently address the challenge of guaranteeing data quality due to changing research methods and techniques.
3. Data quality (including implementations of FAIR) is not high enough on the agenda of the various stakeholder groups.
4. Therefore, stakeholders should implement the FAIR principles, even if this means that current approaches have to be adapted.
5. Stakeholders should raise awareness about using the FAIR principles.
6. Data producers are responsible for ensuring the FAIRness of their data.
7. The implementation of FAIR principles should be monitored in institutions and/or among different stakeholders.
Panelists (to be confirmed shortly)
Selected bibliography
[1] Batini, C., and Scannapieco, M. (2006). Data quality: Concepts, methodologies and techniques. Berlin: Springer.
[2] Borgman, C. L. (2015). Big data, little data, no data: Scholarship in the networked world. Cambridge, MA: MIT Press.
[3] Brown, A. (2008). Selecting File Formats for Long-Term Preservation. https://www.nationalarchives.gov.uk/documents/selecting-file-formats.pdf
[4] Data FAIRport. Data FAIRport conference: Jointly designing a data FAIRport. http://www.datafairport.org/component/content/article/8-news/9-item1
[5] Dix, A., Cowgill, R., Bashford, C., McVeigh, S., and Ridgewell, R. (2014). Authority and judgement in the digital archive. In 1st Digital Libraries for Musicology Workshop, London, UK.
[6] Dutch Techcentre for Life Sciences. FAIR Data. http://www.dtls.nl/fair-data/
[7] Dutch Techcentre for Life Sciences. GO-FAIR initiative. http://www.dtls.nl/go-fair/
[8] Force11. Guiding principles for Findable, Accessible, Interoperable and Re-usable data publishing version b1.0. https://www.force11.org/fairprinciples
[9] Giaretta, D. (2011). Advanced digital preservation. Berlin: Springer.
[10] Griffin, G., and Hayler, M., eds. (2016). Research methods for reading digital data in the Digital Humanities. Edinburgh: Edinburgh University Press.
[11] Hayler, M., and Griffin, G., eds. (2016). Research methods for creating and curating data in the Digital Humanities. Edinburgh: Edinburgh University Press.
[12] Kaplan, F. (2015). A map for big data research in digital humanities. Frontiers in Digital Humanities 2(1): 1-7.
[13] Lane, R. J. (2016). The big humanities: Digital Humanities/digital laboratories. London: Routledge.
[14] LERU (2013). LERU roadmap for research data. http://www.leru.org/files/publications/AP14_LERU_Roadmap_for_Research_data_final.pdf
[15] NISO (2007). A framework of guidance for building good digital collections. 3rd ed. http://www.niso.org/publications/rp/framework3.pdf
[16] Owens, T. (2011). Defining data for humanists: Text, artifact, information or evidence? Journal of Digital Humanities 1(1): n.p. http://journalofdigitalhumanities.org/1-1/defining-data-for-humanists-by-trevor-owens/
[17] Peer, L., Green, A., and Stephenson, E. (2014). Committing to data quality review. International Journal of Digital Curation 9(1): 263-291.
[18] Pryor, G., ed. (2012). Managing research data. London: Facet Publishing.
[19] Purdy, J. P., and Walker, J. R. (2010). Valuing digital scholarship: Exploring the changing realities of intellectual work. Profession 1: 177-195.
[20] Quan-Haase, A., Suarez, J. L., and Brown, D. M. (2014). Collaborating, connecting, and clustering in the humanities: A case study of networked scholarship in an interdisciplinary, dispersed team. American Behavioral Scientist 59(5): 565-581.
[21] Terras, M. (2010). Digital curiosities: Resource creation via amateur digitization. Literary and Linguistic Computing 25(4): 425-438.
[22] Tjalsma, H., and Rombouts, J. (2011). Selection of research data: Guidelines for appraising and selecting research data. The Hague: DANS.
[23] Van Zundert, J. (2012). If you build it, will we come? Large scale digital infrastructures as a dead end for Digital Humanities. Historical Social Research 37(3): 165-186.
[24] Wilkinson, M. D., et al. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data 3(160018): 1-9.
Session F
A Pragmatic Approach to Understanding and Utilizing Events in Cultural Heritage
Lora Aroyo1, Marnix van Berchum4, Lizzy Jongma3, Willem Robert van Hage5, Gerard Kuys6, Susan Legene1, Annelies Van Nispen3, Jacco van Ossenbruggen2, Lodewijk Petram4 and Piek Vossen1
1 Vrije Universiteit Amsterdam, The Netherlands
2 Centrum Wiskunde & Informatica (CWI), The Netherlands
3 NIOD Instituut voor Oorlogs-, Genocide- en Holocauststudies, The Netherlands
4 Huygens ING, The Netherlands
5 Netherlands eScience Center, The Netherlands
6 Nationaal Archief, The Netherlands
Introduction
Cultural heritage institutions are continuously rethinking the access to their collections to allow the public as well as scholars and professionals to interpret and contribute to their collections. Their collections are challenged by the advancement of the Web. They need to be presented in a sustainable way online, and to be instantly searchable and understandable for experts and lay audiences [1]. Hermeneutics is the humanities theory of interpretation. Currently it is being extended to digital hermeneutics to form the appropriate context for thinking about providing access to and interpretation of online cultural heritage collections [2].
An important role in the interpretation of cultural heritage collections is played by ‘historic events’, whose meaning keeps being re-discovered and re-interpreted in light of modern discussions. History changes over time, and with the presence of the social Web it is under continuous evolution. “It is not only ‘grand’ historical events that are subject to changes in interpretation. Single words, concepts, ideas and books can also have different meanings across time, space and social groups.” [3]. Automatic text analysis techniques provide the means to mine large amounts of unstructured data and give scholars access to ‘big data’. To better understand this ‘big data’, we observe a shift towards deeper data mining focussed on the retrieval of meaningful units, e.g. answers, entities, events, discussions, and perspectives. Additionally, we observe a push towards the automatic creation of knowledge graphs populated with rich semantic units, e.g. entities, relations, activities and events, which make it possible to dive into more detail and address more complex questions. All this comes as a response to the need to better understand ‘events’ and their semantic structure and thus, on the one hand, help heritage institutions assign meaning and value to online collection objects and, on the other hand, help humanities scholars in the exploration and contextualization of their tasks [3].
Methodology
This work is motivated by (1) the demand for facilitating deeper understanding of online cultural heritage collections, and (2) the fact that events have emerged as a key element in the representation of data in areas such as history, cultural heritage, and multimedia. We bring together computer scientists, computational linguists and humanities and social sciences scholars in order to build upon and expand the results in existing research communities, e.g. NLP, Information Retrieval, Semantic Web, Social Web Analytics, Multimedia analysis, and provide structure and deeper understanding in history, media, journalism and cultural heritage research, with a specific focus on how events are used as a key concept for representing knowledge and organising media in online web collections. The ultimate goal is to distill a research and application roadmap for events in Cultural Heritage, e.g. achieving a social consensus on processes, identifying practical standards and protocols, and defining the infrastructure needed.
Ourapproachistwo-foldfollowingtwoparalleltracks.Ontheonehand,wedivetop-downtoprovideancomprehensiveanalysisofthestate-of-the-artaroundeventsandtheirpivotalroleinenrichingthecontentofcollectionsintheseareas.Inthiscontext,westudytheiradded-valueinenablingnewmeaningfulinteractionswithmultimediacollectionsonlineforhumanitiesscholars,heritageprofessionalsandlayaudiences[4].Wealsostudytheirvariousaspectsandpotentialbenefitsofassigningeventsintherepresentationandorganisationofknowledgeandmedia[5].Forthis,weexploremethodsandtechniquestosupport(1)detection,modelingandrepresentationofeventsinonlinecollections;and(2)searching,explorationandinterpretationofonlinecollectionsenrichedwithevents.Forexample,weassesstheutilityofexistingeventmodelstosupportusersinderivingusefulfacts.
Ontheotherhand,weemergeabottom-upanalysisofconcreteusecasesanddatasets.Weguideourexplorationsthrougheventdetectionandanalysisperformedbymachine[6,7]andhuman[8,9]computationondifferentcollectionsinthecontextofconcreteusecases.Weidentifyfourgroupsofresearchquestionsrelatedto(1)eventidentityanddefinitions,(2)eventdetectionandextraction,(3)eventmodellingandrepresentation,and(4)eventrelationshipsandinteractionswithapplications.
In the context of studying event identity and definitions, we are interested in understanding better what the internal structure of an event is; what the differences are between events, actions and states; when two or more events are the same; and what the different points of view and interpretations of the same event are.
To continuously improve methods and tools for event detection and extraction, the research needs to be guided by a deeper understanding of how events can be recognised in different media types; how we can assign a notion of novelty and veracity to events; how we can assign a level of granularity to events; and how different events are related.
To model events and represent knowledge about events across different domains, we seek a deeper understanding of shared, open or proprietary knowledge structures, such as vocabularies, taxonomies and thesauri, that can form the backbone of such models. We further study how we can achieve interoperability of event structures, and what the event representation requirements are for different types of events, e.g. historical, cultural and personal events. It is also interesting to know how the Social Web influences or contributes to the understanding of events.
In this context, we move beyond the typical philosophical-level discussions about events and provide the landscape of the different points of view and schools of thought on the matter. To facilitate a shared and pragmatic approach to dealing with events, we focus on existing models, such as the Simple Event Model,38 LODE,39 EVENT, Schema.org and Wikidata. Each of them has been developed to make use of existing vocabularies and data sources that describe events, where events refer to everything that happens, even fictional events.
Finally, we aim to understand the diversity of event relationships and their interactions with applications and data, i.e. how events can be represented to support collection browsing, serendipitous exploration and narrative building; what useful tools exist for event annotation by experts and lay crowds; what efficient ways there are of crowdsourcing event annotations; and what successful methods exist for event visualisation and interaction.
38 http://semanticweb.cs.vu.nl/2009/11/sem/
39 http://linkedevents.org/ontology/
References
[1] C van den Akker, A van Nuland, L van der Meij, L Aroyo (2013). From information delivery to interpretation support: evaluating cultural heritage access on the web.
[2] Capurro, R. (2010). Digital Hermeneutics: An Outline. AI & Society 2010, 35(1), 35-42.
[3] Wyatt, S., Millen, D. (Eds.) Meaning and Perspectives in the Digital Humanities. A White Paper for the establishment of a Center for Humanities and Technology (CHAT), KNAW, 2014.
[4] V De Boer, J Oomen, O Inel, L Aroyo, E Van Staveren (2015). DIVE into the event-based browsing of linked historical media.
[5] M van Erp, J Oomen, R Segers, C van den Akker, L Aroyo, G Jacobs (2011). Automatic heritage metadata enrichment with historic events. Archives & Museum Informatics, Toronto.
[6] Sprugnoli, R., Tonelli, S. (2016). ‘One, no one and one hundred thousand events: Defining and processing events in an inter-disciplinary perspective’, Natural Language Engineering, pp. 1–22.
[7] T Ploeger, M Kruijt, L Aroyo, F de Bakker, I Hellsten (2013). Extracting activist events from news articles using existing NLP tools and services.
[8] A Dumitrache, O Inel, B Timmermans, L Aroyo, R J Sips (2015). CrowdTruth: Machine-Human Computation Framework for Harnessing Disagreement in Gathering Annotated Data.
[9] L Aroyo, C Welty (2013). Harnessing disagreement for event semantics. In the proceedings of Detection, Representation, and Exploitation of Events in the Semantic Web.
Session G
Panel: Strategies for integrating Digital Humanities skills and practices in the Humanities Curriculum
Susan Aasman (University of Groningen)40
Stefania Scagliola (University of Luxembourg)41
In this panel we intend to evaluate policies and strategies that have been applied to integrate both DH skills and practices in Humaniora curricula. Most teaching on DH is offered in separate Minor and/or Master programs and is scarcely integrated in the regular curriculum. This reflects the skepticism with regard to the status of digital approaches to humanities research: are they supposed to gradually merge into the regular humanities curriculum or is Digital Humanities going to remain a distinctive field? (Reid, 2012). Overall, a lack of consensus on the level of expertise that should be taught can be discerned. Should students be trained to be able to make an educated choice in their future job about which kind of technological expertise they should seek? Or should the goal be for them to use the tools themselves and be able to customize them to their specific needs? Even teaching students the very basic skills of how they can search, access, process, analyse and create information with digital tools requires more space and time than is available within a traditional subject such as ‘methods of research’. Nor does the introduction to the services of the Library that is offered yearly at the start of a humanities bachelor suffice to cover all necessary skills (Ferrari et al., 2014; Clement, 2012). This knowledge gap is remarkable, considering the widely shared belief that DH skills are an important asset for increasing the chances of students on the job market (Clement, 2012; Scagliola et al., 2014). It is clear that the future of Digital Humanities teaching faces a number of institutional, political, logistical and pedagogical challenges.
Our intent is to offer an alternative to the usual ideal-typical agendas on what should be done to solve this problem. We intend to gather best practices through the active involvement of the attendees of the workshop. We will start with a short overview of existing DH teaching courses within the Benelux that can be retrieved through the DARIAH/CLARIAH web-based Course Registry. Following the short overview, three exemplary use cases from our own teaching practice will be introduced.
Case 1: Integration in curriculum: Master program Digital Humanities in a faculty of Arts
• Context: designing a Master program for a Faculty of Arts (History, Art History, Journalism and Media Studies, Literature, Film, European Languages and Culture, Communication Science, Information Science and Archeology), which is open to all students with a BA in one of the Arts
• Goal: offering an all-round program that combines theoretical reflection on Digital Humanities and the role of digital data in contemporary culture and society (including Art) with skill courses (coding for Humanities, creating a database) and data handling (creating/analyzing/visualizing)
• Credits: 60 EC program, no entry requirements
• Obstacle: making the shift in just one year from a regular Bachelor program based on the traditional hermeneutic framework to more quantitative approaches requiring new skills and new methods, while keeping close to the disciplinary background. This poses dilemmas on what to leave out and include.
40 Susan Aasman is a media historian and works at the Department for Media Studies at the University of Groningen. She also coordinates the Master program Digital Humanities and is Director of the Groninger Centre for Digital Humanities.
41 Stefania Scagliola is a historian and works as a postdoc at C2DH, the Centre for Contemporary and Digital History of the University of Luxembourg. She is developing a platform for teaching Digital Source Criticism.
The second case is a course that is in preparation.
Case 2: Integration in curriculum: subject Digital Source Criticism in a traditional faculty of History
• Context: designing a platform for Digital Source Criticism for bachelor and master students to teach digital history, within a traditional faculty of history.
• Goal: teaching students the practical and theoretical implications of historical sources in digital form, and teaching them how to create a digital object/exhibit/publication.
• Credits: to be decided
• Obstacle: the amount of time needed to train skills and to create a digital object is not available within the existing curriculum. DH is method oriented, whereas most history classes are thematic.
Case 3: Integration of the Digital Humanities Course Registry in the educational resources of a humanities curriculum
• Context: a search environment has been designed that offers an overview of DH courses that can be taken up in the Benelux
• Goal: the goal is to offer students and lecturers the opportunity to get an overview of the courses that are taught. Students can orientate themselves and choose a bachelor, master or subject; lecturers with an interest in taking up DH in their teaching can orientate themselves with regard to content and approach by drawing on the effort of their peers.
• Credits: not applicable
• Obstacle: the resource has been created, but is not integrated into the standard educational resources that are offered to students and lecturers to orientate themselves. Reaching out to the intended audience is problematic.
After a short introduction, the participants will be divided into three groups and each group will be given a case study with an assignment related to the challenges that the case poses. They will be requested to brainstorm on possible solutions and document their suggestions in a collective online document. This will form the basis for a broader collective document that can be crowdsourced within the teaching community, turned into a publication and presented at a next Benelux DH gathering. After the brainstorm, each group presents its findings.
Our expectation is that the focus on a concrete teaching practice by scholars directly involved in the field will yield useful insights into their best practices and strategies for expanding the interest in DH. One of the central issues remains the question whether Digital Humanities should be considered as an entity in itself competing with regular subjects or whether it should be integrated into the regular curriculum and become a standard practice.
Literature
Anusca Ferrari, Barbara Neža Brečko, Yves Punie, ‘DIGCOMP: A Framework for Developing and Understanding Digital Competence in Europe’, in: eLearning Papers 38, May 2014 – www.openeducationeurope.eu/en/elearning_papers, n. 38
T. Clement (2012), ‘Multiliteracies in the Undergraduate Digital Humanities Curriculum: Skills, Principles, and Habits of Mind’. In Hirsch, B. (ed), Digital Humanities Pedagogy: Practices, Principles and Politics. Cambridge, U.K.: Open Book Publishers. 365-388.
R. Reid (2012), ‘Graduate education and the ethics of the Digital Humanities’, in: Matthew K. Gold (ed), Debates in Digital Humanities, Minnesota, USA: University of Minnesota.
S. Scagliola, F. Maas, E. Stronks, ‘The Teething Troubles of Teaching Digital Humanities: Sharing knowledge and mapping challenges’, presentation at DHBenelux 2014.
Session H
1. Is the Europe of Knowledge the talk of the town? Exploring the potential of digital data on MEP speeches in the European Parliament
Martina Vukasovic1,a, Julie M. Birkholz1,b, Jelena Brankovic1,2,c
1 Centre for Higher Education Governance Ghent (CHEGG), Department of Sociology, Faculty of Political and Social Sciences, Ghent University, Korte Meer 5, 9000 Gent, Belgium
2 Universität Bielefeld, Faculty of Sociology, Gebäude X C2-201, Bielefeld, Nordrhein-Westfalen, DE 33501
a Corresponding author: [email protected], [email protected], [email protected], +49-521-106-12978
Abstract
We explore whether and how the increasing competences of the European Parliament (EP) across policy areas have impacted its approach to higher education (HE). Using a new digital dataset containing more than 10,000 speeches delivered in the EP plenary between 2000 and 2014, we find that the total number of speeches did increase over time, particularly during the adoption of action programmes in the area of HE and related budgetary decisions. HE was less often referred to in the EP speeches as a stand-alone issue than in relation to other policy areas in which the EU has strong jurisdiction. We also provide tentative evidence that the variance in whether a member of the EP (MEP) speaks about HE is more linked to the MEP’s country of origin than to party affiliation. Promises and pitfalls of digital data analysis and possible avenues for further research are also discussed.
Keywords: higher education; policy; European Parliament; Europe of Knowledge; digital data collection; semi-automatic content analysis

Acknowledgements
We would like to thank colleagues at the CHEGG (in particular Marco Seeber) and participants at the CHER 2015 conference. Any remaining errors are our own. Acknowledgements to Talk of Europe staff, editors and reviewers to be added.

Funding
This work was supported by the Research Foundation – Flanders (FWO), under Grant G.OC42.13N.
Introduction
Since 2000, the European Union (EU) has put knowledge at the centre of its strategic endeavours. The aim of the Lisbon Strategy was for Europe to become the most advanced knowledge-based economy in the world by 2010. As a result, during the 2000s, the European Commission put forward several communications focusing on the role of universities in this process and the necessity for a university modernization strategy (e.g. European Commission, 2006), culminating with the Europe 2020 strategy, in which knowledge is essential for ensuring smart, inclusive and sustainable growth (European Commission, 2010). Throughout this period, knowledge has been ‘exported’ to other policy areas as a policy solution (Elken, Gornitzka, Maassen, & Vukasović, 2011), while the funding of EU programmes fostering cooperation in this area has been increasing, despite the financial crisis – e.g. for the 2014-2020 period, there is a 30% increase of funds allocated to research cooperation, and a 40% increase for education.
Although these developments have been the focus of many studies,42 most of them are concerned with the creation of specific institutions (e.g. the European Institute of Technology) or the Bologna Process and its relationship with the EU initiatives, whereby they typically highlight the individual policy entrepreneurs or the role of the European Commission (and sometimes also the European Court of Justice, ECJ). Other EU institutions, in particular the European Parliament (EP) and its involvement in policy coordination in this area, have received far less attention thus far, which reflects neither the importance of HE for the whole European project nor the increasing importance of the Parliament in EU decision-making (see below).
With this in mind, the present study focuses on the extent and the manner in which HE has been discussed in the EP since 2000, exploring a new digital dataset containing MEP speeches delivered during the period studied. We start by outlining the changes in how the EU approaches the topic of HE and the overall role of the EP in EU decision-making. From this, we derive a set of expectations concerning how HE is discussed in the EP, which we investigate through an exploratory research design. Namely, we apply digital data collection methods and semi-automatic content analysis coding to a set of more than 10,000 speeches given in the EP between January 2000 and December 2014, identified through a set of search terms using ‘The Talk of Europe’ dataset.43 We analyse the data and identify patterns related to ‘when, who and how’ speaks about HE in the EP. We then discuss our findings, as well as the promises and pitfalls of digital data analysis as a method, and offer some directions for future research.
Context and analytical patterns of interest

HE and the EU
In the EU context, HE has largely been considered a specialised policy area steered by national ministerial administrations and strongly influenced by expert committees and local sectoral interests. Education in general has long remained an area of national competence (Gornitzka, 2009), meaning that the legislative bodies of the EU (the EP and the Council) do not have regulative competences in the area of HE. Prior to the Treaty of Lisbon, this was reinforced in the principle of subsidiarity – decisions were taken at the lowest possible governance level, in this case the national authorities. From 1 December 2009 onwards, when the Treaty of Lisbon came into force, education has been considered a supporting EU competence allowing ‘the Union to carry out actions to support, coordinate or supplement Member States’ actions’ in this area. This change potentially provides more leeway to the EU in this domain, although the EU still cannot engage in these actions on its own. However, a number of caveats need to be addressed.
First, interest in European-level policy coordination in the area of HE has existed since the early days of the European project. As Corbett (2005) states, ever since the European Coal and Steel Community, European-level policy entrepreneurs in various EU institutions have been pushing for different European initiatives targeting education. Their efforts eventually laid the ground for the Erasmus programme and a number of pilot projects focusing on policy coordination, such as cooperation in the area of quality assurance (ENQA 2010; EU Council 1998). This trend has been strengthened by several rulings of the ECJ concerning recognition of qualifications (see e.g. Corbett 2005 on the Gravier decision), as well as regulation concerning recognition of qualifications, in particular for regulated professions (Beerkens, 2008). There are also indications that HE (and research) may increasingly become subject to EU primary law (i.e. EU level regulation) concerning competition (an
42 See e.g. Amaral et al. (2009), Chou and Gornitzka (2014), Corbett (2005), Huisman and de Jong (2014), Maassen and Olsen (2007).
43 http://www.talkofeurope.eu/data/ (accessed 22 January 2017).
exclusive competence of the EU after the Treaty of Lisbon), due to the blurring of the distinction between public and private aspects of HE (Gideon, 2015).
Second, (higher) education, research and innovation comprise the so-called knowledge triangle, which has been the cornerstone of the EU’s strategic documents since the Lisbon Summit in 2000. The focus on developing the EU as a Europe of Knowledge or Innovation Europe has remained strong. Thus, one could argue that integration in this area can be considered a sine qua non of European integration as such (European Commission, 2010). HE is being ‘exported’ to other policy areas – economic competitiveness, social cohesion, environment, security, foreign relations etc. – as a policy solution, and modernization of HE is seen as a key ingredient of political, social, economic and cultural development (Elken et al., 2011). Due to this functional ‘spill-over’ from areas in which the EU does have formal regulative competences, HE is becoming a topic of growing interest for EU institutions.
Third, in the area of HE the EU has been employing the so-called Open Method of Coordination (hereinafter: OMC). OMC relies on voluntary setting of standards and benchmarks, and includes development of procedures designed to monitor progress. While this approach may, at first glance, seem rather soft given its voluntary nature and ample room for window-dressing, evidence suggests that the possibility inherent in the OMC to ‘name and shame’ laggards can actually be a powerful instrument leading to significant changes on both the national and institutional level (Gornitzka, 2014). The fact that these changes do not necessarily result in clear and deep convergence is less an indication of OMC’s softness and more an indication of the complexity of implementation processes in HE (Musselin, 2005).
In sum, the EU has been increasingly focusing on policy coordination in the area of HE, either on its own or due to spill-over from other policy areas in which it has explicit competences. While most of the activities in this area have been led by the executive branch, the European Commission (EC), other EU institutions have focused on HE as well, including the EP, which, among other things, is tasked with oversight of the EC.
The role of the European Parliament in the EU decision-making
Overall, in the case of EU decision-making, the distribution of power is assessed as rather complex (Börzel, 2010). While the executive, judicial and legislative powers in the EU are shared, the basic distinction between government, courts and the parliament that exists on national levels does not exist in the same way on the European level. This is in particular the case for legislative competences, which are currently shared between the EP and the Council, a set-up referred to as ‘co-decision’.
The specification and division of tasks between the different EU institutions have been evolving since the very beginning of the European integration project, and this is in particular true for the “legislative powers of the EP [which] have grown sequentially” (Pollack, 2010, p. 31). The seeds of the EP’s empowerment can be found already in the treaties of 1970 and 1975, which gave the EP some control over the EU budget and also introduced the Court of Auditors. The Single European Act from 1986 also gave the EP increased legislative power and expanded the overall EU policy scope, extending and deepening the EU’s competences in more areas (Wallace, Pollack, & Young, 2010). Co-decision between the EP and the Council – implying that a decision needs to be accepted by both bodies – was first introduced in the 1992 Treaty on EU (Maastricht), while the 1997 Treaty of Amsterdam introduced strong requirements for the EP’s assent on enlargement and appointments of the Commission. Overall, the EP’s involvement in EU decision-making has evolved from a non-binding consultation procedure to a co-decision procedure with the Council in the 1990s, only to be further strengthened in the 2000s by establishing co-decision as the standard operating procedure used for the majority of policy areas (Pollack, 2010).
Furthermore, the EP is tasked with approving the EU budget and discharging the accounts of the previous year (Laffan & Lindner, 2010). Concerning budget approval, these decisions are important because they are highly visible to MEPs’ constituents and are possibly contentious, given that potential winners and losers can be clearly identified – “since it has been granted budgetary powers in 1975, the EP has regarded the EU finances as one of its key channels of influence vis-à-vis the Council” (Laffan & Lindner, 2010, p. 214). This is where the EP tries to influence decisions at both the macro level, concerning multi-annual funding frameworks, and the micro level, concerning specific programmes and projects. An example of the former concerns the strong focus on research and technology in the discussion of the Financial Perspective for 2007-2013 (reflecting the findings of the so-called Sapir report), where the EP also supported the EC’s proposal to strengthen expenditure for public goods, effectively positioning itself against some Member States (Laffan & Lindner, 2010). Examples of the latter are the decision concerning the Erasmus Mundus budget in 2003 (Corbett, 2005) and the EP’s concern over the Juncker Commission’s intention to use 2/3 of the Horizon 2020 funding for the EU’s investment fund.44 Given that the multi-annual budget plan actually has the status of a law, binding for several years, the EP’s deliberations and decisions on budget issues have become even more important.
The EP also plays an important role in appointing the Commission; it approves the Commission President and has the power to hold the Commission accountable. For example, it was effectively the EP which forced the Santer Commission to resign in the late 1990s, following the claims of insufficient transparency in spending of EU funds. The EP also delayed the endorsement of the 2004 Commission due to its proposed composition, as well as the endorsement of the 2009 Commission President. For the Juncker Commission, the MEPs put forth a number of requests concerning the portfolios of the different Commissioners and the proportion of female Commissioners before approving the overall composition.
Overall, the EP is currently in the position to constrain the agenda-setting activities of the EC and it can also explicitly ask the EC to deal with specific issues (Young, 2010). Given its role in the co-decision procedure, it can effectively act as a veto player and block decision-making (Finke, 2010). However, it has almost no involvement in the process of implementation and policy evaluation (Young, 2010). In general, since the early days of the European integration project, the EP has increased its influence on European-level decision-making, whereas in exchange the EC and the Council have arguably lost some influence. However, a detailed analysis of inter-institutional power relationships would need to take into account that none of these institutions is a unitary actor and that their own internal dynamics are important as well.
What goes on inside the EP and why is it important?
The EP is composed of MEPs, the vast majority of whom are organized into European party families, while the rest are ‘non-attached’ MEPs. The candidates for MEPs run at nationally organized elections, where the number of MEPs to be elected from each state depends on the country’s population. However, once elected, the MEPs are grouped in the EP not according to their countries, but in accordance with their partisan affiliation. The composition of the EP and the total number of MEPs per parliamentary term of interest for this study are presented in Table 1.
Given that it is a supranational legislature, the EP’s connection to the electorate is by some considered “notably weak” (Young, 2010, p. 58), although it should be acknowledged that the EP is effectively the only EU institution whose members are directly elected. Evidence suggests that, once in the EP, the decisions of the MEPs are more determined by the generic left-right political cleavages between
44 See e.g. https://euobserver.com/economic/128867 (page accessed 1 March 2017).
the different European party families and their positions concerning the scope and level of appropriate European integration than by the MEPs’ country affiliations (Finke, 2010; Pollack, 2010).
Table 1 – Number of MEPs across party families and parliamentary terms. Source: EP website.

Number of MEPs (per party family)                                             5th term     6th term     7th term
                                                                              1999-2004    2004-2009    2009-2014 45
European United Left/Nordic Green Left (GUE-NGL)                              42           41           35
Progressive Alliance of Socialists and Democrats (S&D); formerly PES          180          200          184
Greens/European Free Alliance (Greens/EFA)                                    48           42           55
Alliance of Liberals and Democrats (ALDE); previously ELDR                    50           88           84
European People’s Party (EPP), formerly EPP-ED                                233          268          265
Europe of Freedom and Direct Democracy (EFDD), formerly IND/DEM or EFD        16           37           32
European Conservatives and Reformists (ECR), formerly UEN                     30           27           54
Non-attached (NA)                                                             9            29           27
Total                                                                         626          732          736
While preparatory work is carried out in specialist committees of the EP (Wallace, 2010), plenary sessions taking place every month in Strasbourg serve as an opportunity for MEPs to address each other, as well as other EU institutions and the public (Proksch & Slapin, 2010; Slapin & Proksch, 2010). The speeches during these sessions serve several purposes: (a) arguing in favour of or against a legislative proposal, (b) scrutinizing other actors, in particular those over which the EP has oversight (e.g. the EC), and sending signals to (c) national constituents, (d) other members of the party group or (e) other members of the EP (Slapin & Proksch, 2010). The sessions are sometimes structured around an opening statement or a proposal by the EC, followed by a rapporteur of the relevant EP committee (Proksch & Slapin, 2010). Rapporteurs play a particularly important role: they are the ones steering negotiations within the committees and working on ensuring support across different political groups (Kohler, 2014). While this ‘behind-the-scenes’ work potentially limits the possibilities for debate and conflict in the plenary session between the different party families (Kohler, 2014), rapporteurs’ speeches in the plenary are nevertheless important as indicators of the outcomes of negotiations within the committees.46 After the rapporteur, the speaking time is allocated to party families, with MEPs of the largest family speaking first. Allocation of time between the MEPs within one family is done internally, and an individual speech cannot last more than three minutes. At the end of the debate and before the vote, the EC representatives may reply and indicate the EC’s position on the proposal (Proksch & Slapin, 2010). Importantly, as the EP also has the power to put forward issues on its own, and not only to follow the EC’s agenda, MEPs can speak on a wide range of topics, both those in which the EP has explicit competences concerning regulation adoption (the so-called ‘hard law’) and those subject to softer policy coordination. Effectively, MEPs use plenary sessions as the opportunity to give speeches both to communicate their own positions towards the
45 The number of MEPs in the Seventh Parliamentary term changed twice, first due to the Lisbon Treaty entering into force in December 2009 (to 754 MEPs) and then due to Croatia joining in July 2013 (to 766 MEPs).
46 The ‘Talk of Europe’ dataset includes only speeches made in the plenary session.
general public and their own constituents, as well as to coordinate with other actors, relying on discursive practices as instruments of change (Schmidt, 2010).
Expectations
Given that (a) HE has become increasingly important for the overall EU strategic development, that (b) HE has been exported to other policy areas as a policy solution, that (c) despite the lack of strong regulative competences, there is significant HE policy coordination at the EU level, that (d) there has been a gradual empowerment of the EP with regard to EU-level decision-making, in particular when it comes to budget decisions, and that (e) the behaviour of MEPs in general seems to be determined more by their party affiliation than by their country affiliation, the following patterns with regard to how HE is considered in the EP can be expected:
1. The total number of MEP speeches referring to HE increases over time. The most significant increase is expected in relation to the adoption of EU action programmes and related budgetary decisions.
2. HE is more often referred to in the EP speeches in relation to other policy areas in which the EU has regulative competences than as a stand-alone issue.
3. Whether or not an MEP makes a speech addressing HE is more strongly linked to his/her party family affiliation than to the country of origin.
Data and method
To investigate the role of HE in the EP we studied speeches delivered in the EP plenary using the ‘Talk of Europe’, a linked open data infrastructure (van Aggelen, Hollink, Kemman, Kleppe, & Beunders, 2016), which comprises speeches given in the EP from 1999–2014 (translated into English) and related data available through the European Data Portal.47 ‘Talk of Europe’ allows the use of semantic queries to retrieve data stored in the Resource Description Framework (RDF), a computer data language (Juric, Hollink, & Houben, 2012). By formally linking traditionally distributed datasets, queries can be implemented to identify specific artefacts and related meta-data. These digital provisions offer a number of advantages to researchers wishing to investigate this data. First, one can automatically identify a large number of documents in a straightforward manner, as opposed to querying one database of MEPs’ speeches, querying another to retrieve data about dates, agenda items, MEPs, etc. and then merging them. In addition, instead of manually inferring possible connections (e.g. that John Smith in one database is the same person as in the other), the data is automatically linked and compiled as one artefact, which saves substantial time. Depending on the query size, such queries may run in minutes, if not seconds.
However, with the advent of these tools come challenges associated with designing, conducting, and interpreting research results (Bar-Ilan, 2001). As the data is highly sensitive to the query commands, the specifics of the query influence the data returned. Thus the researcher has to be acquainted with the database, its possibilities and the nature of the request, so as not to jeopardize the validity of the design. For example, the researcher must know whether the database is continuously being updated, what characteristics are available for the artefacts being queried, and how the data needs to be structured to conduct the appropriate analysis given the specific research design. Taking into consideration the nature of such data and the focus on the development of a query to identify specific, crafted valid samples within these relatively large datasets, a total description of the entire dataset is rarely feasible or conducted. This implies that normalization – comparing an entire dataset to the selected sample – for the purposes of verifying the representativeness of the sample (a standard in quantitative research) is in this study conducted using other means (see ‘Results and discussion’). Although it may question some assumptions of specific designs, for example procedures
47 https://www.europeandataportal.eu/ (page accessed 1 March 2017).
and characteristics necessary for inferential statistics, we would argue that such issues do not warrant disregarding such data, but rather require that these specificities are transparently presented and openly discussed, as we work towards building a methodological toolkit fit for analysing such digital data. Despite these pitfalls, we contend that exploring such data would offer valuable and perhaps unique insights about phenomena studied.
In analysing these data we take an exploratory approach. We started our research by developing a set of terms related to HE, which were then reviewed by a number of HE researchers (see Appendix for the list of all terms queried). In order to identify speeches, a query using these keywords was developed by the second author, with the assistance of the ‘Talk of Europe’ team. This returned the speeches in a text format, as well as the related meta-data (if available): the title of the speech, the date of the speech, the URL of the original speech, the identification of the speaker, the speaker’s country affiliation, and the speaker’s party affiliation (if known or applicable). Querying these words resulted in a set of 10,180 unique speeches (all including at least one HE term from our list, duplicates removed) and related meta-data representing all potential discussions on HE in the EP since 1999. The meta-data, which constitute essentially textual data, were coded in order to allow for a systematic analysis of temporal, topical and country/party affiliation patterns. Importantly, the query developed focused on identifying specific terms used in speeches, not on a description of the entire dataset of speeches. This approach mimics techniques implemented in other studies using this dataset (van Aggelen et al., 2016). In addition to this data, we used publicly available information on the number of MEPs per country or per party family during the period studied. The treatment of the data is presented in Table 2.
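The keyword-based selection described above can be illustrated with a minimal sketch. It is not the authors' query, which ran against the ‘Talk of Europe’ infrastructure; the `Speech` record, the abbreviated `HE_TERMS` list and the choice of the URL as the deduplication key are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Speech:
    """Hypothetical record holding a speech and part of its meta-data."""
    url: str
    title: str
    text: str


# Hypothetical, abbreviated term list; the study's full list is in its Appendix.
HE_TERMS = ("higher education", "university", "universities")


def select_speeches(speeches, terms=HE_TERMS):
    """Keep speeches whose text contains at least one term, dropping duplicates.

    Assumption: two records with the same URL are the same speech.
    """
    seen_urls = set()
    selected = []
    for speech in speeches:
        if speech.url in seen_urls:
            continue  # duplicate of a speech that was already selected
        body = speech.text.lower()
        if any(term in body for term in terms):
            seen_urls.add(speech.url)
            selected.append(speech)
    return selected


sample = [
    Speech("u1", "Erasmus Mundus", "Support for higher education matters."),
    Speech("u1", "Erasmus Mundus", "Support for higher education matters."),
    Speech("u2", "Fisheries", "Quotas for cod are under review."),
]
selected = select_speeches(sample)
```

Plain substring matching like this is sensitive to the wording of the term list, which is one concrete form of the query sensitivity discussed earlier.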
Table 2 – Variables and treatment. Source: Authors.

Variable                                   Type of data    Treatment
Title of the speech                        Textual data    Manually coded data in relation to the topics, see Table 3.
Date of the speech                         Date            n/a (not treated here)
Unique ID of the speaker, given by EP      Nominal         n/a (not treated here)
Speaker’s country of affiliation           Textual data    Coded to nominal data
Speaker’s party affiliation, if known      Textual data    Coded to nominal data
To efficiently identify topics according to keywords in the speech titles, as presented in Table 3, speeches were semi-automatically coded, using both manual and computational coding (Lewis, Zamith, & Hermida, 2013). This resulted in one code per speech, following a hierarchical schema: a title containing one of the HE terms identified earlier takes primacy, followed by a non-HE topic (e.g. geographical determinant, demographic determinant or references to other policy sectors). For example, if the title refers to Roma or the Danube Region but the speech mentions a HE term, it constitutes a non-HE topic where HE has been discussed in relation to another policy topic. Speeches that included ‘vote’, ‘budget’ or a reference to a procedural matter were also coded into separate categories (see Table 3 for details). Voting and budget formally represent two different activities, whereby one is possibly a specific discussion of the budget, compared to a discussion on the voting itself as a decision-making process of the EP. We acknowledge that in a small number of cases these may overlap, given that the EP often votes on budgets. All other formal activities related specifically to procedure and the organization of the EU in general are considered as procedural topics.
In order to explore these patterns, these coded data, together with the above-mentioned meta-data of text origin (string variables), were transformed into nominal categorical variables. The dataset was then used to explore a) the temporal patterns of the use of HE terms in speeches over time, b) the
topical patterns of the use of HE terms in the different types of speeches, c) the role of the country and party in explaining the use of these terms over time and in specific topics.
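Once the meta-data are nominal variables, the temporal, topical and affiliation patterns reduce to counting speeches per category or per combination of categories. The following is only a sketch of that idea under assumed field names (`date`, `topic`, `country`, `party`) and with invented example records; it does not reproduce the authors' dataset or tooling.

```python
from collections import Counter
from datetime import date

# Hypothetical coded records, one dict per speech (values invented for illustration).
records = [
    {"date": date(2003, 5, 13), "topic": "HE", "country": "BE", "party": "EPP"},
    {"date": date(2003, 10, 21), "topic": "Budget", "country": "DE", "party": "S&D"},
    {"date": date(2008, 9, 2), "topic": "Non-HE", "country": "BE", "party": "ALDE"},
]


def count_by_year(records):
    """Temporal pattern: number of speeches per calendar year."""
    return Counter(r["date"].year for r in records)


def count_by(records, *keys):
    """Count speeches for each combination of the given nominal variables."""
    return Counter(tuple(r[k] for k in keys) for r in records)


per_year = count_by_year(records)                         # a) temporal pattern
per_topic = count_by(records, "topic")                    # b) topical pattern
per_topic_country = count_by(records, "topic", "country")  # c) topic by country
```

Comparing such counts across countries or party families would additionally require the number of MEPs per country or family as a denominator, which the text obtains from publicly available information.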
Table 3 - Coding scheme. Source: Authors.
Code | Description
HE | Speeches with a title that included one of our keywords and addressed HE as the specific topic
Non-HE | Speeches with the mention of a geographical place in the title (e.g. country, city or region), or a specific group of people in the title (e.g. women, youth, disabled, elderly, Roma), or an issue that is not specifically related to HE (e.g. economy, human rights, employment, labour, resources, security, environment, defence, transportation, and so forth).
Vote | Speeches with the title Vote or Votes
Budget | Speeches with the mention of the word budget in the title
Procedural | Speeches with a mention in the title on procedural matters of the EU itself (e.g. review of EC notes, announcements, and so forth)
Unidentified | Speeches that are not attributable to a topic given the lack of detail in the title
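The hierarchical coding of titles described above can be illustrated with a minimal Python sketch. It is not the authors' procedure, which combined manual and computational coding. The term lists, the simple substring matching, the precedence of HE over Budget, and the treatment of an empty title as Unidentified are all assumptions made for illustration.

```python
# Hypothetical subset of the HE term list (the full list is in the Appendix).
HE_TERMS = ["higher education", "university", "erasmus", "bologna process"]

# Hypothetical cues for procedural titles; only "announcement" echoes Table 3.
PROCEDURAL_CUES = ["announcement", "order of business"]


def contains_any(title, terms):
    """True if any term occurs in the title (case-insensitive substring match)."""
    lowered = title.lower()
    return any(term in lowered for term in terms)


def code_title(title):
    """Assign exactly one topic code to a speech title."""
    stripped = title.strip()
    if not stripped:
        # Assumption: a title with no detail at all cannot be attributed.
        return "Unidentified"
    if stripped.lower() in {"vote", "votes"}:
        return "Vote"
    if contains_any(stripped, HE_TERMS):
        # An HE term in the title takes primacy over the other topics.
        return "HE"
    if contains_any(stripped, ["budget"]):
        return "Budget"
    if contains_any(stripped, PROCEDURAL_CUES):
        return "Procedural"
    # Everything else is treated as a non-HE policy topic.
    return "Non-HE"


for example in ["Votes", "Erasmus Mundus II programme",
                "Situation of Roma in the Danube Region"]:
    print(example, "->", code_title(example))
```

In practice the ambiguous titles would still go to a human coder; the sketch only shows how a fixed order of checks yields one code per speech.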
We acknowledge a number of limitations to our design. The reliability of the public data is related to the accuracy of the EU Open Data Portal and the ‘Talk of Europe’ infrastructure in both publishing and accurately linking related data. Given the bigger nature of this data, it is expected that less significant ‘bugs’ may occur, but that this ‘noise’ would be systemic and thus would not significantly influence results. In this respect, we have encountered an unprecedented amount of unattributed party affiliations in the 7th EP session which reflected missing data. Thus, to ensure validity, in considering the extent to which an MEP’s speaking on issues related to HE is determined by his/her country of origin or party family affiliation we have not analysed the 7th term. Within the available data we were not able to confirm whether all speakers were MEPs, or guest speakers, although we could safely assume that the number of non-MEPs speaking in the EP plenary sessions is very low and thus not significant in a way that could distort our findings.
Results and discussion
As previously indicated, our query retrieved a total of 10,180 speeches containing one or more of the terms in our ‘dictionary’ (see Appendix). Given that the output of the query does not contain a list of terms that were found in a particular speech, it was not possible to systematically measure the co-occurrences of the terms across all of the 10,180 speeches and to use such data to test the sensitivity of the query to the content of the ‘dictionary’.
Given these limitations, we have devised an alternative approach. We focused on the potentially most problematic terms in the ‘dictionary’, i.e. terms that may appear in speeches with no linkage to HE whatsoever: innovation, mobility, science, technology, and training. We have queried the ‘Talk of Europe’ infrastructure for these five terms separately and analysed the overlap between speeches retrieved this way and speeches retrieved when querying for two terms definitely linked to HE: ‘higher education’ and ‘university’. The results are presented in Table 4.
Table 4 – Number of speeches containing one or more of the selected terms. Source: Authors.
Number of speeches… (X =) | innovation | mobility | science | technology | training
A: containing one of the terms (X) | 923 | 752 | 534 | 973 | 1011
B: containing X AND 'higher education' (Y) | 262 | 269 | 180 | 264 | 281
C: containing X AND 'university' (Z) | 370 | 322 | 258 | 421 | 403
containing (X AND Y) OR (X AND Z) = B + C | 632 | 591 | 438 | 685 | 684
D: containing Y AND Z | 221 | 221 | 221 | 221 | 221
containing (X AND NOT Y) OR (X AND NOT Z) = A - (B + C - D) | 512 | 382 | 317 | 509 | 548
Thus, there are 2,268 speeches (22.2% of the total number of speeches in our dataset) that contain at least one of the five problematic terms (X), but do not contain ‘higher education’ (Y) or ‘university’ (Z), i.e. potentially there are 22.2% of speeches in the dataset that should not be there. However, we need to stress that this is actually the maximum possible value, for two reasons: (1) we just explored the co-occurrence of the problematic terms (X) with two other terms (‘higher education’ and ‘university’) and not with other terms in the ‘dictionary’ which may also be closely linked to HE (e.g. student, academic); and (2) we ignored the possibility that there may be co-occurrences of the different Xs in the same speech (e.g. ‘innovation’ and ‘technology’) and merely added the different numbers in the last row of Table 4. Notwithstanding that the actual proportion of speeches that do not belong in the dataset is very likely lower than 22.2%, we will proceed with our exploration of the data taking this into account.
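The arithmetic behind this upper bound can be retraced from the published counts. The sketch below is only a re-computation under one observation: the values in the last row of Table 4 are reproduced by A - (B + C - D) per term, and their sum gives the 2,268 speeches. Variable and function names are illustrative.

```python
TOTAL_SPEECHES = 10_180

# Rows A, B and C of Table 4, keyed by problematic term X.
table4 = {
    "innovation": {"A": 923, "B": 262, "C": 370},
    "mobility": {"A": 752, "B": 269, "C": 322},
    "science": {"A": 534, "B": 180, "C": 258},
    "technology": {"A": 973, "B": 264, "C": 421},
    "training": {"A": 1011, "B": 281, "C": 403},
}
D = 221  # row D: speeches containing both 'higher education' and 'university'


def without_he_anchor(row, overlap=D):
    """Speeches with X but without Y or Z, computed as A - (B + C - D)."""
    return row["A"] - (row["B"] + row["C"] - overlap)


per_term = {term: without_he_anchor(row) for term, row in table4.items()}
upper_bound = sum(per_term.values())
share = upper_bound / TOTAL_SPEECHES

print(per_term)
print(upper_bound, f"{share:.4f}")
```

Because a speech containing several of the five terms is counted once per term in this sum, the total overstates the number of distinct speeches, which is the second reason the text gives for treating the figure as a maximum.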
In relation to our expectation that the total number of MEP speeches mentioning HE increases over time, Figure 1 presents the frequency of such speeches for the 5th, 6th and 7th term (aggregated for a four-month period).
Fig. 1 - Speeches referring to HE, over time. Source: Authors.
As Figure 1 shows, there is an increase over time in the number of speeches containing at least one of the terms we identified as being attributed to HE. The figure also helps us identify moments of increased frequency, such as the end of 2008, parts of 2011 or the end of 2013.
A closer look into the dataset reveals that these increases are related to the activity around the adoption of specific programmes and decisions concerning HE, such as:
• the ‘Erasmus Mundus II’ programme: 31 speeches on this topic on 20 October 2008 alone;
• European Quality Assurance Reference Framework for VET: 38 in December 2008;
• the report on the ‘Youth on the Move’ (which also includes student mobility programmes, such as, currently, Erasmus+): 57 speeches in May 2011;
• Agenda for new skills and jobs: 53 speeches in October 2011;
• Modernising Europe’s Higher Education Systems: 38 in April 2012;
• a debate titled ‘Is Erasmus in danger?’: 42 speeches in October 2012;
• ‘Erasmus+’ programme under ‘Erasmus for All’ item on the agenda: 142 speeches in November 2013.
This also means that the increased frequency cannot be due to the 22.2% potentially problematic speeches in our dataset. At the same time, moments of lowest frequency pertain to the transition between different parliamentary terms.
Concerning the context in which HE is referred to, in the majority of speeches HE is not discussed as a stand-alone issue but rather in relation to other policy issues or in relation to the vote explanations (Table 5). These other policy issues include areas that could be considered closely related to HE, such as general education, youth issues or recognition of professional qualifications, but also include areas that can be considered as rather distant from HE, e.g. visa issues, arms sales, maritime policy etc. The high proportion of the speeches categorized under the ‘Vote’ topic indicates that HE-related terms are also referred to during explanations of voting procedures as well as discussions concerning implications of the votes.
Table 5 – Distribution of speeches referring to HE in relation to their main topic. Source: Authors.
Main topic | Number of speeches | % in relation to all speeches including HE terms
Non-HE | 3,844 | 37.76%
Vote | 3,131 | 30.76%
HE | 1,358 | 13.34%
Procedure | 1,122 | 11.02%
Budget | 713 | 7.00%
NI | 12 | 0.12%
While the structure of our dataset does not allow for a more refined analysis with regards to how HE is referred to in relation to other policy issues or voting, it is clear that HE does not feature prominently as a stand-alone issue but that it is most often referred to in relation to other policy issues in which the EU has regulatory competences, even when taking into account that potentially 22.2% of the speeches, all of which would be on topics other than HE, perhaps should not be in our dataset.
Concerning our third expectation, we focused only on the 5th and 6th term for which we had clear party family affiliation for each MEP and restructured the dataset so the MEP (and not an individual speech) is the data unit. We then calculated for each MEP the proportion of speeches that had HE as its main topic in relation to the total number of speeches given by said MEP (hereinafter: HE speeches), and based on this explored the variance in proportion of HE speeches in SPSS with a two-way ANOVA using country and party as fixed factors (Field, 2009). The results show that a statistically significant difference in the proportion of MEP speeches that are on HE exists only for country affiliation (and that only at p<0.05 level of significance) while the difference for party affiliation is not significant. This can be considered as a suggestion that the country of origin is more strongly linked to the variance in proportion of speeches an MEP makes that have as their main topic HE, though primarily a tentative one, given the potential that a certain number of speeches (likely less than 22.2% because this analysis concerns only 5th and 6th term) should not be considered in this dataset.
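The restructuring step, in which the MEP rather than the speech becomes the data unit, can be sketched as follows. The records, identifiers and field names are hypothetical, and the sketch stops at the per-MEP proportion: the two-way ANOVA itself was run in SPSS and is not reproduced here.

```python
from collections import defaultdict

# Hypothetical coded speeches: (MEP id, country, party family, topic code).
speeches = [
    ("mep1", "NL", "EPP", "HE"),
    ("mep1", "NL", "EPP", "Vote"),
    ("mep1", "NL", "EPP", "Non-HE"),
    ("mep1", "NL", "EPP", "HE"),
    ("mep2", "FI", "S&D", "Budget"),
    ("mep2", "FI", "S&D", "HE"),
    ("mep2", "FI", "S&D", "Vote"),
]


def he_proportion_per_mep(rows):
    """Collapse speech-level rows into one record per MEP."""
    totals = defaultdict(int)
    he_counts = defaultdict(int)
    attributes = {}
    for mep, country, party, code in rows:
        totals[mep] += 1
        if code == "HE":
            he_counts[mep] += 1
        attributes[mep] = (country, party)
    return {
        mep: {
            "country": attributes[mep][0],
            "party": attributes[mep][1],
            # Share of this MEP's speeches whose main topic is HE.
            "he_share": he_counts[mep] / totals[mep],
        }
        for mep in totals
    }


per_mep = he_proportion_per_mep(speeches)
for mep, record in per_mep.items():
    print(mep, record)
```

The resulting table has one row per MEP with two categorical factors and one continuous outcome, which is the shape a two-way ANOVA expects.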
Conclusions
The findings presented in this paper are the result of the first exploration of the ‘Talk of Europe’ dataset. They suggest that Europe of Knowledge is becoming the talk of the town in the European Parliament. The total number of MEP speeches, either specifically dedicated to HE or mentioning HE in speeches dedicated to other issues, appears to have increased over time, particularly during the adoption of EU action programmes in the area of HE and related budgetary decisions. Moreover, over the period analysed, HE was less referred to in the EP speeches as a stand-alone issue than in relation to other policy areas in which the EU has strong jurisdiction. Finally, the tentative findings indicate that the variance in whether an MEP speaks about HE is more linked to country of origin than party affiliation.
More generally, these findings attest to the increasing role of the EP in HE policy making, which has been largely overlooked in studies on European-level dynamics in HE. As this study demonstrates, a closer look at the EP potentially offers a wealth of information on how HE, both in relation to other areas and as a policy issue in its own right, is considered and decided upon by the only directly elected institution of the European Union. Available databases, such as the ‘Talk of Europe’, but also the rich repositories of publicly available information on the European Union websites, therefore, offer a promise of a better insight into these matters.
In this study we have also tried to explore the potentials and limitations of using big(ger) digital data. We believe we have accurately shown how such methods and analysis can be useful in policy research, while also trying to highlight some of their shortcomings. Regarding the latter, the choice of research design did not allow for traditionally accepted methods of analysis to be employed, together with the assumptions necessary to conduct those analyses. An example of this would be the difficulties with developing a valid query from a frequently changing dataset, which in turn prevents normalization of data. We argue, and have shown in this research, that such difficulties should not automatically mean that such data is of no use but rather that the methods distinct to digital data need to be employed and transparency needs to be ensured. Specifically to this project, we acknowledge a limitation of the developed query to individually identify terms where co-occurrence can be assessed. In developing future queries using the ‘Talk of Europe’ infrastructure one should attempt to build a dataset that would allow a query to consider an additional number of characteristics, at least: (1) the total number of speeches at the date of data collection (to enable normalization), and (2) data on co-occurrence of different terms of interest. These two additions would allow for further quantitative analysis, but also more confidence in claiming that certain mechanism(s) and relationships are at play, which we can now only provide as tentative findings.
Taking into account the above mentioned measures concerning the data infrastructure, a number of possible avenues for further research become open. First, in-depth analysis of the textual data contained in selected speeches would allow for further exploring the content of these speeches, e.g. what preferences and positions MEPs are putting forward and how this may change over time. Moreover, relationships between the EP and other EU institutions, such as the European Commission and the Council of the EU, can be analysed by, for example, analysing the extent to which MEPs refer to HE when responding to initiatives of other EU institutions compared to speaking about HE without an external prompt. The initial explorations presented in this paper can thus serve as the backdrop for further in-depth analysis of MEP behaviour concerning higher education using digital data.
References
Bar-Ilan, J. (2001). Data collection methods on the Web for infometric purposes - A review and analysis. Scientometrics, 50(1), 7-32. doi:10.1023/a:1005682102768
Beerkens, E. (2008). The Emergence and Institutionalisation of the European Higher Education and Research Area. European Journal of Education, 43(4), 407-425. doi:10.1111/j.1465-3435.2008.00371.x
Börzel, T. A. (2010). European Governance: Negotiation and Competition in the Shadow of Hierarchy. JCMS: Journal of Common Market Studies, 48(2), 191-219. doi:10.1111/j.1468-5965.2009.02049.x
Chou, M.-H., & Gornitzka, Å. (Eds.). (2014). Building the knowledge economy in Europe: New constellations in European research and higher education governance. Cheltenham: Edward Elgar.
Corbett, A. (2005). Universities and the Europe of knowledge: Ideas, institutions and policy entrepreneurship in European Union Higher Education Policy, 1955–2005. Basingstoke: Palgrave MacMillan.
Elken, M., Gornitzka, Å., Maassen, P., & Vukasović, M. (2011). European integration and the transformation of higher education. Oslo: University of Oslo.
European Commission. (2006). Delivering on the modernization agenda for universities: Education, research and innovation. Brussels.
European Commission. (2010). Europe 2020: A strategy for smart, sustainable and inclusive growth. (COM(2010) 2020 final). Brussels: EC.
Field, A. (2009). Discovering statistics using SPSS: (and sex and drugs and rock 'n' roll). Los Angeles: SAGE.
Finke, D. (2010). European integration and its limits: intergovernmental conflicts and their domestic origins. Colchester: ECPR Press.
Gideon, A. (2015). The Position of Higher Education Institutions in a Changing European Context: An EU Law Perspective. JCMS: Journal of Common Market Studies, n/a-n/a. doi:10.1111/jcms.12235
Gornitzka, Å. (2009). Networking Administration in Areas of National Sensitivity: The Commission and European Higher Education. In A. Amaral, G. Neave, C. Musselin, & P. Maassen (Eds.), European Integration and the Governance of Higher Education and Research (Vol. 26, pp. 109-131): Springer Netherlands.
Gornitzka, Å. (2014). How strong are the European Union's soft modes of governance? The use of the Open Method of Coordination in national policy-making in the knowledge policy domain. In M.-H. Chou & Å. Gornitzka (Eds.), Building the Knowledge Economy in Europe: New constellations in European Research and Higher Education Governance (pp. 160-187). Cheltenham: Edward Elgar.
Huisman, J., & de Jong, D. (2014). The Construction of the European Institute of Innovation and Technology: The Realisation of an Ambiguous Policy Idea. Journal of European Integration, 36(4), 357-374. doi:10.1080/07036337.2013.845179
Juric, D., Hollink, L., & Houben, G. J. (2012). Bringing parliamentary debates to the Semantic Web. Paper presented at the 11th International Semantic Web Conference, workshop on Detection, Representation and Exploitation of Events in the Semantic Web (DeRIVE 2012), Boston.
Kohler, M. (2014). European Governance and the European Parliament: From Talking Shop to Legislative Powerhouse. JCMS: Journal of Common Market Studies, 52(3), 600-615. doi:10.1111/jcms.12095
Laffan, B., & Lindner, J. (2010). The Budget. Who Gets What, When, and How? In H. Wallace, M. A. Pollack, & A. R. Young (Eds.), Policy-Making in the European Union (pp. 208-228). Oxford: Oxford University Press.
Lewis, S. C., Zamith, R., & Hermida, A. (2013). Content Analysis in an Era of Big Data: A Hybrid Approach to Computational and Manual Methods. Journal of Broadcasting & Electronic Media, 57(1), 34-52. doi:10.1080/08838151.2012.761702
Maassen, P., & Olsen, J. P. (Eds.). (2007). University Dynamics and European integration. Dordrecht: Springer.
Musselin, C. (2005). Change or Continuity in Higher Education Governance? In I. Bleiklie & M. Henkel (Eds.), Governing knowledge: A study of continuity and change in higher education (Vol. 9, pp. 65-79). Dordrecht: Springer Netherlands.
Pollack, M. A. (2010). Theorizing EU policy-making. In H. Wallace, M. A. Pollack, & A. R. Young (Eds.), Policy-Making in the European Union (pp. 15-44). Oxford: Oxford University Press.
Proksch, S.-O., & Slapin, J. B. (2010). Position Taking in European Parliament Speeches. British Journal of Political Science, 40(03), 587-611. doi:10.1017/S0007123409990299
Schmidt, V. A. (2010). Taking ideas and discourse seriously: explaining change through discursive institutionalism as the fourth ‘new institutionalism’. European Political Science Review, 2(01), 1-25. doi:10.1017/S175577390999021X
Slapin, J. B., & Proksch, S.-O. (2010). Look who’s talking: Parliamentary debate in the European Union. European Union Politics, 11(3), 333-357. doi:10.1177/1465116510369266
van Aggelen, A., Hollink, L., Kemman, M., Kleppe, M., & Beunders, H. (2016). The debates of the European Parliament as Linked Open Data. Semantic Web (Preprint), 1-10.
Wallace, H. (2010). An Institutional Anatomy and Five Policy Modes. In H. Wallace, M. A. Pollack, & A. R. Young (Eds.), Policy-Making in the European Union (pp. 69-104). Oxford: Oxford University Press.
Wallace, H., Pollack, M. A., & Young, A. R. (2010). An Overview. In H. Wallace, M. A. Pollack, & A. R. Young (Eds.), Policy-Making in the European Union (pp. 3-13). Oxford: Oxford University Press.
Young, A. R. (2010). The European Policy Process in Comparative Perspective. In H. Wallace, M. A. Pollack, & A. R. Young (Eds.), Policy-Making in the European Union (pp. 45-68). Oxford: Oxford University Press.
Appendix – HE terms queried (Note: the ‘Talk of Europe’ database can query terms consisting of one or two words)
Academia
Academic, academics
bachelor, bachelors
Bologna Process
Copenhagen process
COST
Curriculum
diploma supplement, diploma supplements
ECTS
EHEA
employability
Erasmus
Erasmus Mundus
Erasmus+
European Institute (to identify European Institute of Technology)
European Standards (to identify European Standards and Guidelines)
European University
Framework Programme
Graduate, graduates
higher education
Horizon 2020
innovation
knowledge (to identify knowledge-based economy)
learning (to identify lifelong learning issues and Lifelong Learning Programme)
LLP
master, master's
Mobility
polytechnic
Quality assurance
science
Skill and skills
Socrates
STEM
Student and students
technology
Tempus
tertiary education
Training
university
VET
vocational training
2. Mining meaning. From network analysis to algorithmic semantic data mining
Paul Verhaar and Mirko Tobias Schäfer
Introduction
Bringing digital methods to linguistics, our paper uses a dataset consisting of tweets about the refugee crisis as a baseline for semantic analysis. This paper describes how the analysis of a corpus of status messages can lead to the definition of linguistic fingerprints for detecting ideological positions. We use a network analysis of retweets to reveal the different positions of participants within the highly polarized debate. Mapping the network returns two opposing clusters in the debate; one expresses a mildly positive stance towards refugees, considering refugees to be victims, and is up to a certain extent welcoming them. The other one is opposed to refugees, portraying them as criminals or profiteers. The clusters can also roughly be divided by political preference: left wing versus right wing. As the political differences are obvious, we use this as a baseline for further analysis based on the content of tweets. Mining the timelines provides insights into the distinctive use of language by the participants of the opposed clusters. Our paper describes a general method for analysing a corpus of status updates in order to identify ‘linguistic fingerprints’ revealing ideological positions.
Linguistics meets digital methods
Our approach combines digital methods for Twitter analysis with linguistic methods. This means that we analyse the structure of the debate and consider how language plays a role in carrying keys for identifying ideological positions within the debate. Analysing Twitter or social media messages is not new. Previous research has focussed on email analysis (Groh and Hauffa, 2011), network analysis on retweet behaviour (Paßmann et al., 2014), abstracting personality from social media (Schwartz et al., 2013) and style accommodation (Danescu-Niculescu-Mizil et al., 2011). We are not aware of a study combining digital methods for network analysis with linguistic analysis to establish a connection with a solid baseline. Many concepts in new media studies and linguistics can be combined. Register is an important part of our everyday conversation, useful for research on the online domain. People tend to convey their messages in different ways and forms depending on the person they share information with (Danescu-Niculescu-Mizil et al., 2011). Register is a strong and important aspect in this, as proposed by Pennebaker, who claims that words can be a “window to the soul” (2011). The linguistic characteristics that are used can be seen as markers within distinct groups as shown on Wikipedia (Danescu-Niculescu-Mizil et al., 2012). Language coordination is strongly dependent on the accommodation theory and power differences within social groups. On Twitter this is done by posting, replying and retweeting. Their interactions rely on linguistic style markers, such as the use of content and function words and certain keywords (Anger and Kittl, 2011). All in all, not only the network a person moves in, but also the language is important for analysing online social formations on social media platforms.
The data
Data for this study consist of Dutch Twitter messages from January 2015 to October 2015: in total 561,179 tweets, of which 363,079 were unique tweets and 198,100 retweets. Selection was based on two relevant terms in the refugee debate, both subject to different forms of connotation and representation: “vluchteling” (refugee) and “gelukszoeker” (literally happiness seeker, or fortune seeker or economic refugee). For the purpose of building a corpus for semantic mining the completeness of the dataset was of lesser importance than its representation of distinct clusters.
Findings
Our findings address two aspects in political debates on Twitter:
Structure of participants and debate: network analysis confirmed the polarisation within the refugee debate into two opposing clusters, the dynamic of media outlets and opinion leaders in shaping the debate and the interaction of the various participants.
Semantic analysis: a quantitative analysis of language use revealed significant distinctions between the two opposing groups: e.g. the right wing uses more adjectives, needs more words to convey a message and uses more six-letter words than the opposing cluster.
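Two of the surface measures named above, message length and the share of long words, can be computed with a few lines of Python. The sketch below is illustrative only: the example tweets and cluster labels are invented, "six-letter words" is read here as words of at least six letters (an assumption), and adjective counts are left out because they would need a part-of-speech tagger.

```python
import re
from statistics import mean

# Hypothetical tweets, grouped by retweet-network cluster.
tweets_by_cluster = {
    "cluster_a": [
        "Vluchtelingen verdienen onze steun",
        "Welkom in Nederland",
    ],
    "cluster_b": [
        "Gelukszoekers misbruiken volgens velen de opvang in onze gemeente",
    ],
}

WORD_RE = re.compile(r"\w+")


def style_profile(tweets, min_long=6):
    """Mean words per tweet and share of words with >= min_long letters.

    Assumes at least one tweet containing at least one word.
    """
    token_lists = [WORD_RE.findall(tweet.lower()) for tweet in tweets]
    all_tokens = [tok for toks in token_lists for tok in toks]
    long_words = sum(len(tok) >= min_long for tok in all_tokens)
    return {
        "mean_words_per_tweet": mean(len(toks) for toks in token_lists),
        "share_long_words": long_words / len(all_tokens),
    }


profiles = {name: style_profile(tweets) for name, tweets in tweets_by_cluster.items()}
for name, profile in profiles.items():
    print(name, profile)
```

Comparing such profiles between clusters already identified through the retweet network is what keeps the linguistic comparison anchored to an independent baseline.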
This paper focuses on the distinctions in language use, which open new possibilities for automatically mining Twitter or any corpus for meaning. With regard to recent efforts to create models and ways of algorithmic analysis of social media content (Burnap and Williams, 2015), our paper indicates the possibility to move from network analysis to semantic analysis of large corpora. While Ranganath et al. (2016) propose a model for predicting protest tweets, our concept suggests the detection of extreme political positions in social media debates. The limitation in Ranganath et al. is their dependence on the network structure and history of social interaction of the various participants. In our example, the network provides merely the baseline for further linguistic analysis. Developing this further would entail creating an algorithm based on the findings from our initial corpus. However, this raises issues about the quality of public political debate, freedom of expression and privacy. Data retention and social media metrics provide the means for a coherent analysis of political expressions and deliver powerful tools for security authorities to monitor the political expression online.
References
Anger, I., & Kittl, C. (2011). Measuring influence on Twitter. Proceedings of the 11th International Conference on Knowledge Management and Knowledge Technologies.
Burnap, P., & Williams, M. (2015). Cyber Hate Speech on Twitter: An Application of Machine Classification and Statistical Modeling for Policy and Decision Making. Policy & Internet, 7(2), 223–242.
Conover, M. D., Ratkiewicz, J., Francisco, M., Goncalves, B., Flammini, A., & Menczer, F. (2011). Political Polarization on Twitter. Proceedings of the 5th International AAAI Conference on Weblogs and Social Media.
Danescu-Niculescu-Mizil, C., Lee, L., Pang, B., & Kleinberg, J. (2012). Echoes of power: Language effects and power differences in social interaction. Proceedings of the 21st International Conference on World Wide Web.
Danescu-Niculescu-Mizil, C., Gamon, M., & Dumais, S. (2011). Mark my words! Linguistic Style Accommodation in Social Media. WWW, 78(11).
Gilbert, E. (2012). Predicting tie strength in a new medium. Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work.
Groh, G., & Hauffa, J. (2011). Characterizing Social Relations Via NLP-based Sentiment Analysis. Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media, 502–505.
Paßmann, J., Boeschoten, T., & Schäfer, M. T. (2014). The Gift of the Gab: Retweet Cartels and Gift Economies on Twitter. Twitter and Society, 331–344.
Pennebaker, J. W. (2011). Your use of pronouns reveals your personality. Harvard Business Review, 89(December), 32–3.
Ranganath, S., Morstatter, F., Hu, X., Tang, J., Wang, S., & Liu, H. (2016). Predicting Online Protest Participation of Social Media Users. Association for the Advancement of Artificial Intelligence.
Schwartz, H. A., Eichstaedt, J. C., Kern, M. L., Dziurzynski, L., Ramones, S. M., et al. (2013). Personality, Gender, and Age in the Language of Social Media: The Open-Vocabulary Approach. PLoS ONE, 8(9).
3. Learning complementary alternative medicine socially? Topic modeling health consciousness with big online discussion forum data
Marjoriikka Ylisiurua, Consumer Society Research Centre, University of Helsinki
Introduction
Individuals have varying abilities to understand and retain health-related information. This health literacy forms the basis of individuals’ knowledge in making the decisions concerning their health (Chinn, 2011; Sorensen et al., 2012; Walsh & Elhadad, 2014). The concept of health literacy is widely employed in healthcare research as a measurable, rational skill.
Nevertheless, individuals with health literacy levels that standard instruments score as “adequate” often rely on biomedically controversial Complementary and Alternative Medicine (CAM) treatments (Bains & Egede, 2011; Stoneman, Sturgis, & Allum, 2012). One potential explanation comes from Chinn (2011) and Puuronen (2015, in Finnish), both of whom highlight the effects of social community on what information individuals accept as relevant. Puuronen (2015) refers to “extended” health literacy as health consciousness, which includes (sub-)cultural codes and socially constructed meanings. To complement earlier research, this paper analyses a large online dataset to study how online communities shape health consciousness in the field of CAM.
The material for this research consists of discussions on the CAM field of homeopathy. Online discussion forums spread “traditional biomedical” knowledge, health experiences and peer advice, making social media a field where health literacy is acquired and required (Centola, 2013; Cline & Haynes, 2001). The task is to analyze discussion topics (DiMaggio, Nag, & Blei, 2013) and to investigate how writers learn health consciousness socially, while expressing health literacy capabilities. To that end, the study employed both the topic modeling algorithm LDA (Blei, Ng, & Jordan, 2003), and close reading.
Materials & methods
Suomi24.fi ("Finland24.fi") is the largest and one of the oldest Finnish discussion forums in which readers and contributors may either register or use a temporary alias. Various discussion subfora consist of discussion threads that engage contributors in conversations which occasionally last as long as several years.
A dataset covering Suomi24 activity from years 2001 to 2015 is available for academic use at The Finnish Language Bank Fin-Clarin database48. It consists of over 55 million discussion comments and their metadata, e.g. timestamps and contributor nicknames. From this database, the full Homeopathy subforum dataset was acquired in CSV format. Each row included an original discussion sentence, its lemmatized sentence with its stopwords removed, and sentence metadata. The data file totals 26 MB (52,729 sentences, or 9,326 comments).
48 Resource description: http://urn.fi/urn:nbn:fi:lb-2017021503

As the first step, an LDA algorithm developed with the Python Gensim package was run with a varying number of topics. After algorithmic modeling, comments and their surrounding discussion threads were analyzed with close reading. The original texts were then sampled using topic model keywords as CSV/Excel search keywords. This resulted in a 15-topic model, combined into frames as follows:
• Fields of controversy: Both “proponents” and “opponents” of homeopathy discussing scientific evidence for/against homeopathy and its areas of application. (topics #A)
• Historical context: Mostly proponents describing the lengths of their personal experiences, as well as the history of homeopathy. (2 topics)
• Celebrity discussion: Mostly proponents discussing a physician who is a public proponent of homeopathy. (1 topic)
• Help gained from homeopathy: Contributors employed this frame to describe their experiences with homeopathy. (topic #B)
• Asking questions: Primarily “new initiates” describing their condition and asking for help. (topic #C)
After recognizing the discussion frames algorithmically, analysis continued with further close reading, including sampled, original online conversations.
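The sampling step, in which topic model keywords serve as search keywords over the CSV rows, could look roughly like the Python sketch below. It is a minimal illustration, not the procedure actually used: the column names, the example rows and the keyword sets per topic are all hypothetical, and the topic model that would produce such keywords is not run here.

```python
import csv
import io

# Hypothetical excerpt; the real file's column names are not given in the abstract.
SAMPLE_CSV = """comment_id,sentence,lemmas
1,Homeopathy helped my child.,homeopathy help child
2,There is no scientific evidence.,scientific evidence
3,Can anyone recommend a remedy?,recommend remedy
"""

# Hypothetical top keywords per topic, as a topic model might return them.
TOPIC_KEYWORDS = {
    "A": {"evidence", "study"},
    "B": {"help", "child"},
    "C": {"recommend", "ask"},
}


def sample_by_topic(csv_text, topic_keywords):
    """Return {topic: [original sentences whose lemmas contain a topic keyword]}."""
    hits = {topic: [] for topic in topic_keywords}
    for row in csv.DictReader(io.StringIO(csv_text)):
        lemmas = set(row["lemmas"].split())
        for topic, keywords in topic_keywords.items():
            if lemmas & keywords:
                hits[topic].append(row["sentence"])
    return hits


samples = sample_by_topic(SAMPLE_CSV, TOPIC_KEYWORDS)
for topic, sentences in samples.items():
    print(topic, sentences)
```

Matching on the lemmatized column while returning the original sentence keeps the sampled material readable for the close reading that follows.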
Results
Typically, homeopathy “proponents” see the treatment as complementary to traditional biomedicine. The proponents’ positioning of homeopathy in relation to biomedical medicine is revealed in the frames on homeopathic history and celebrities. In this paper however, the focus is on the three remaining frames (#A, #B, #C) to observe health consciousness and health literacy expressions in dialogues between proponents and opponents of homeopathy.
Self-professed “initiates” to homeopathy (topic #C) often seek peer experiences on how homeopathy handles certain conditions. In response (topic #B), some proponents underline the importance of finding the right homeopathic practitioner. In contrast, some proponents describe the use of homeopathic products that they administer independently, without the homeopaths’ advice. Furthermore, some authors describe homeopathy’s failure to cure their condition, whereas “opponents” promote reliance on biomedicine.
Experience and opinion sharing elicits dialogue, which often turns heated. Especially the self-professed, experienced proponents sought to actively defend their individual experiences against attacks from opponents of homeopathy (topic #B), as highlighted below.
Personal experience with my child’s chronic otitis: The child couldn’t stomach many antibiotics due to allergies and a sensitivity to medicines. Medicines would often cause severe symptoms so alternatives were needed, and homeopathy provided the answer. How could a small child under 3 years old pretend that a substance is helpful, they wouldn’t know. And yet, homeopathy was the only effective measure. The later check-ups confirmed that the infection was remedied. Our pets, too, have had success with homeopathy, and they can’t pretend either. A good homeopath can choose a suitable substance. It needs to be the correct one to be effective.49
The homeopathic community supports its proponents facing opponent scrutiny. Some strategies involve capabilities and lexis that suggest high health literacy. For example, opponents may defend biomedical treatments or accuse homeopathy of lacking scientific evidence, using biomedical terms like “Pediatric Neurotransmitter Disorders”50. The proponents then express their capability in counterattacks, citing commercial nature of big pharma or downsides of excessive use of antibiotics, using biomedical terms like “Hospital-Acquired Infections” (topics #A).

49 Author “ArnicaD”, 31.8.2014, http://keskustelu.suomi24.fi/t/12228806/homeopatiaa-kokeilleiden-kokemuksia-kaivataan!
50 http://keskustelu.suomi24.fi/t/2428127/tarkkaavaisuushairio
This study suggests an inter-group conflict mechanism for social health consciousness learning. The Suomi24 homeopathic community exhibits traces of health literacy, yet positions CAM treatments differently than its opponents. The material and methods obviously do not allow concluding that all comments reflect the authors’ actual experiences and opinions. However, forum discussions may be an important learning environment for contributors and non-contributing visitors alike. To understand the social process of health consciousness in the field of homeopathy, this study should be complemented with interviews and ethnographic research.
REFERENCES
Bains, S. S., & Egede, L. E. (2011). Association of Health Literacy with Complementary and Alternative Medicine Use: A Cross-Sectional Study in Adult Primary Care Patients. BMC Complementary and Alternative Medicine, 11(138), 7. http://doi.org/10.1186/1472-6882-11-138
Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet Allocation. Journal of Machine Learning Research, 3, 993–1022.
Centola, D. (2013). Social media and the science of health behavior. Circulation, 127(21), 2135–2144. http://doi.org/10.1161/CIRCULATIONAHA.112.101816
Chinn, D. (2011). Critical health literacy: a review and critical analysis. Social Science & Medicine, 73(1), 60–67. http://doi.org/10.1016/j.socscimed.2011.04.004
Cline, R., & Haynes, K. (2001). Consumer health information seeking on the Internet: the state of the art. Health Education Research, 16(6), 671–92.
DiMaggio, P., Nag, M., & Blei, D. (2013). Exploiting affinities between topic modeling and the sociological perspective on culture: Application to newspaper coverage of U.S. government arts funding. Poetics, 41(6), 570–606. http://doi.org/10.1016/j.poetic.2013.08.004
LexisNexis. (2007). How Many Pages in a Gigabyte? Retrieved from https://www.lexisnexis.com/applieddiscovery/lawlibrary/whitePapers/ADI_FS_PagesInAGigabyte.pdf
Puuronen, A. (Ed.). (2015). Terveystaju Nuoret politiikka ja käytäntö. Helsinki, Finland: Nuorisotutkimusverkosto.
Sorensen, K., Van Den Broucke, S., Fullam, J., Doyle, G., Pelikan, J., Slonska, Z., … European Health Literacy Project Consortium (HLS-EU). (2012). Health literacy and public health: a systematic review and integration of definitions and models. BMC Public Health, 12(1), 80. http://doi.org/10.1186/1471-2458-12-80
Stoneman, P., Sturgis, P., & Allum, N. (2012). Understanding support for complementary and alternative medicine in general populations: Use and perceived efficacy. Health, 17(5), 512–529. http://doi.org/10.1177/1363459312465973
Walsh, C., & Elhadad, N. (2014). Modeling Clinical Context: Rediscovering the Social History and Evaluating Language from the Clinic to the Wards. 224–231.
4. Web data extraction allows independent evaluation of Global Absolute Poverty51
Michail Moatsos, Utrecht University
The widely applied “dollar-a-day” methodology identifies global absolute poverty as declining precipitously since the early 80's throughout the developing world. The methodological underpinnings of the dollar-a-day approach have been questioned in terms of adequately representing equivalent welfare conditions in different countries and years [Reddy and Pogge, 2010; Deaton, 2010; Srinivasan, 2010; Aten and Heston, 2010; Subramanian, 2015; Moatsos, 2015]. If empirically substantiated, such criticism directly questions the validity of the dollar-a-day methodology since in international poverty measurement “the first-order issue is to demand welfare consistency” [Ravallion, 2015, p. 4].
However,anindependentexaminationofthelevelsandtrendsofglobalpovertyisaverydemandingtask.Inmostofitspartthisisduetotherestrictedaccessonnationaleconomicdistributionsofincomeorconsumptionthatareutmostessentialforthecalculations.TheeasiestwaytousethosedistributionsisthePovcalNetwebsiteofferedbytheWorldBank.Unfortunately,theBankdoesnotmaketheunderlyingdistributionaldataavailable.Insteaditonlyconditionallyallowsdirectcalculationsofglobalpoverty.Theconditionbeingthattheindependentresearcheracceptsthevalidityofthedollar-a-dayapproachthattheWorldBankfollowsratherreligiously.Thusaseriousproblemappearswhenoneseekstoevaluateglobalpovertyusingadifferentmethodology,orwhenonetriestoquantifytheaforementionedconcernsagainsttheWorldBankmethodology.
The solution to this conundrum is the use of IT methods that scrape the underlying data from the PovcalNet website, so that they can be used free of any conditions. This was first done by Dykstra et al. [2014], but with several discrepancies between the data offered by PovcalNet and those made available by the authors. Building on their work, I simplify the process one step further by allowing automated evaluation of poverty in bulk based on independently calculated poverty lines. This approach has the advantage of querying the PovcalNet service “as it is”, without discrepancies.
The alternative to the dollar-a-day methodological approach is to estimate absolute poverty on a global level using appropriately defined consumption baskets for each country and year separately. Allen [2001] defines bare bones baskets (BBBs) for use in the historical real wages literature, and de Zwart et al. [2014] apply them on a global scale. Table 1 contains the overview. The BBBs are constructed so as to represent bare minimum absolute poverty levels in consumption terms. However, the absolute poverty yardstick can be expanded to account for other essential elements of life and wellbeing, such as education and health, as both the Copenhagen Declaration and the Universal Declaration of Human Rights stipulate. Table 1 also describes one such BBB derivative, which allows for considerably higher welfare levels than the basic BBB.
51 This paper is largely based on a forthcoming article in the Journal of Globalization and Development entitled “Global Absolute Poverty: Behind the Veil of Dollars”.
Table 1: The composition of bare bones baskets in real wages and the two derivatives applied here.
Item                     | Unit/Year | Real Wages Basket | BBB                            | BCS
Energy Target            | kcal      | 1455/2100         | MDER                           | MDER
Minimization             |           | -                 | cheapest bundle                | mean of 3 cheapest bundles
Main staple              | kg        | 155-413*          | based on kcal/protein target** | based on kcal/protein target**
Beans or peas            | kg        | -/20/45           | LP                             | 40 at minimum
Meat or fish             | kg        | 3 or 6            | 3 or 6                         | 12 or 24
Butter or oil or ghee    | kg        | 3                 | 3                              | 12
Sugar                    | kg        | -/2               | 2                              | 8
Linen (applied)          | share     | 8%                | 8%±2%                          | WBGC
Lamp oil                 | liter     | 1.3               | 1.3                            | WBGC
Soap                     | kg        | 1.3               | 1.3                            | WBGC
Candles                  | kg        | 1.3               | 1.3                            | WBGC
Fuel                     | mbtu      | 3                 | f(T in °C)                     | WBGC
Cooking                  | mbtu      | -                 | MDER                           | WBGC
Housing                  | mark-up   | 5%                | 5%±2%                          | WBGC
Health, Education, Water | %         | -                 | -                              | WBGC
Additional shares        | %         | -                 | -                              | WBGC
Note: The Barebones basket with Consumption Shares (dubbed BCS) uses the average of the three cheapest bundles, and a four times larger meat/fish, butter and sugar allowance. In addition, an allowance covering health, education, and water is included using the consumption budget shares from the World Bank Global Consumption dataset (noted as WBGC in the table). Consumption budget shares are also used for energy, housing, and clothing, and allowances for Personal Care, ICT, Financial Services, and Others are included in the additional shares.
*: depending on the country and main staple
**: To avoid inflating the price of the consumption bundle, priority in linear programming is given to the kcal target, and the protein target is allowed to overshoot by 200% at maximum if necessary. Only for the Dominican Republic does this cap increase the bundle price by more than 20%, and for Belarus by more than 10%, compared to allowing for unlimited protein overshooting. For all other countries the increase, if any, is restricted to only a few percentage points.
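The "Minimization" row of Table 1 can be illustrated with a minimal sketch. It assumes a simplified setting in which each candidate staple has a price per kg and an energy content per kg; the staple names, prices and the 2100 kcal/day target are invented for the example and are not the paper's data. The protein target, which the paper handles through linear programming, is left out.

```python
# Hypothetical illustration: cheapest staple bundle (BBB) versus the mean of
# the three cheapest bundles (BCS). All numbers are made up for the example.

DAYS_PER_YEAR = 365

def annual_cost(price_per_kg, kcal_per_kg, kcal_per_day):
    """Cost of meeting the daily energy target for one year with one staple."""
    kg_needed = kcal_per_day * DAYS_PER_YEAR / kcal_per_kg
    return kg_needed * price_per_kg

def bundle_costs(staples, kcal_per_day):
    """Return (cost, name) pairs sorted from cheapest to most expensive."""
    costs = [
        (annual_cost(price, kcal, kcal_per_day), name)
        for name, (price, kcal) in staples.items()
    ]
    return sorted(costs)

def bbb_cost(staples, kcal_per_day):
    """BBB-style poverty line component: the single cheapest bundle."""
    return bundle_costs(staples, kcal_per_day)[0][0]

def bcs_cost(staples, kcal_per_day):
    """BCS-style component: the mean of the three cheapest bundles."""
    cheapest_three = bundle_costs(staples, kcal_per_day)[:3]
    return sum(cost for cost, _ in cheapest_three) / len(cheapest_three)

# name: (price per kg, kcal per kg), invented values
staples = {
    "maize": (0.30, 3600),
    "rice": (0.50, 3600),
    "wheat": (0.40, 3400),
    "cassava": (0.20, 1600),
}

print(bbb_cost(staples, 2100), bcs_cost(staples, 2100))
```

Averaging over three bundles can never produce a lower line than the single cheapest bundle, which is one reason the BCS line sits above the BBB line.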
The results (Figure 1) show that, in terms of levels, on the one hand the target of alleviating absolute poverty is not as far off as was thought, but on the other hand absolute BBB poverty has shown remarkable persistence throughout the period. The difference with the PovcalNet estimates is enormous throughout. Comparing the 1990 and 2014 estimates leaves little room for celebrating the achievement of halving absolute global poverty between 1990 and 2015⁵². Using the BBB poverty lines, the point estimate for global poverty is 5.6% in 1990 and 3% in 2014. Using the BCS poverty lines, the corresponding rates are 62% and 33%. In turn, this shows that the conclusion about the questionable MDG1 success does not result from the very low welfare level that the BBB poverty lines encapsulate.
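The halving claim can be checked directly from the four point estimates quoted above. The sketch below only does that arithmetic; the function name is illustrative, and the end year is 2014 rather than the 2015 target year.

```python
def relative_reduction(rate_start, rate_end):
    """Share of the initial poverty rate that has been eliminated."""
    return (rate_start - rate_end) / rate_start

# 1990 and 2014 poverty rates in percent, as quoted in the text
estimates = {"BBB": (5.6, 3.0), "BCS": (62.0, 33.0)}

for line, (rate_1990, rate_2014) in estimates.items():
    cut = relative_reduction(rate_1990, rate_2014)
    print(f"{line}: {cut:.1%} reduction, halved: {cut >= 0.5}")
```

Under both poverty lines the reduction comes out at roughly 46 to 47 percent, just short of the 50 percent that halving would require.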
Figure 1: Evolution of poverty in the Developing World, 1983-2014. PCN 2005/11 refer to the World Bank global poverty estimates based on the 2005 or 2011 so-called ICP rounds.
The vast differences between the BBB welfare level and the International Poverty Line (iPL) can be attributed to two elements: first, the much lower costs of bare bones subsistence compared to the $1.9 value for the vast majority of countries and years; and second, the differential between the consumer price index and the BBB price index. The also very large differences between the iPL and BCS, especially in the later years of the period, are attributable to the inability of the iPL to encapsulate expenses that are necessary for escaping absolute poverty as described in international treaties and conventions.
52 Millennium Development Goal 1: “Target 1.A: Halve, between 1990 and 2015, the proportion of people whose income is less than $1.25 a day”. The World Bank has announced that this goal was achieved as early as 2010, five years ahead of schedule.
This research demonstrates that the use of digital techniques for scraping online data that are not explicitly provided for downloading can provide answers to big questions such as the level and trends of global poverty. It is important that this work can be performed independently of the custodian institutions for monitoring and reducing poverty (the World Bank). Institutionally based decisions, evaluations and calculations are not necessarily beyond dispute, and must not be. The next step in this project is to elaborate on the proper accounting of uncertainties in the estimates using the Monte Carlo method for pseudo-experiments. This computationally very demanding task would allow for a more appropriate comparison of the poverty estimates between the target years of MDG1.
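What such Monte Carlo pseudo-experiments might look like can be sketched as follows. This is an assumption-laden illustration, not the author's procedure: the consumption data are synthetic, the poverty line is taken to be normally distributed around a central value, and all names and parameters are hypothetical.

```python
import random
import statistics

def headcount_ratio(consumption, poverty_line):
    """Share of people whose consumption falls below the poverty line."""
    return sum(c < poverty_line for c in consumption) / len(consumption)

def monte_carlo_headcount(consumption, line_mean, line_sd, n_draws=1000, seed=0):
    """Propagate uncertainty in the poverty line to the headcount ratio.

    Each pseudo-experiment draws one poverty line and recomputes the
    headcount; the result is the mean and an approximate 95% interval.
    """
    rng = random.Random(seed)
    draws = sorted(
        headcount_ratio(consumption, rng.gauss(line_mean, line_sd))
        for _ in range(n_draws)
    )
    lower = draws[int(0.025 * n_draws)]
    upper = draws[int(0.975 * n_draws) - 1]
    return statistics.mean(draws), (lower, upper)

# Invented data: log-normally distributed daily consumption for 2,000 people.
_rng = random.Random(1)
population = [_rng.lognormvariate(1.0, 0.6) for _ in range(2000)]

mean_rate, (low, high) = monte_carlo_headcount(population, line_mean=1.5, line_sd=0.1)
print(f"headcount: {mean_rate:.3f} (95% interval {low:.3f} to {high:.3f})")
```

Comparing two years then amounts to checking whether their intervals overlap, which is more informative than comparing two point estimates.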
References
Allen, R. C. (2001). The Great Divergence in European Wages and Prices from the Middle Ages to the First World War. Explorations in Economic History, 38:411-447.
Aten, B. and Heston, A. (2010). Use of Country Purchasing Power Parities for International Comparisons of Poverty Levels: Potential and Limitations. In Anand, S., Segal, P., and Stiglitz, J. E., editors, Debates on the Measurement of Global Poverty. Oxford University Press, Oxford.
de Zwart, P., van Leeuwen, B., and van Leeuwen-Li, J. (2014). Real Wages. In van Zanden, J. L., Baten, J., D'Ercole, M. M., Rijpma, A., and Timmer, M. P., editors, How Was Life? Global Well-being since 1820, chapter 4, pages 73-86. OECD Publishing, Paris.
Deaton, A. (2010). Measuring Poverty in a Growing World (or Measuring Growth in a Poor World). In Anand, S., Segal, P., and Stiglitz, J. E., editors, Debates on the Measurement of Global Poverty, pages 187-222. Oxford University Press.
Dykstra, S., Dykstra, B., and Sandefur, J. (2014). We Just Ran Twenty-Three Million Queries of the World Bank's Website.
Moatsos, M. (2015). Global Absolute Poverty: Behind the Veil of Dollars. CGEH Working Paper Series, (77).
Ravallion, M. (2015). Toward Better Global Poverty Measures. Center for Global Development, Working Paper (417).
Reddy, S. G. and Pogge, T. (2010). How not to count the poor. In Anand, S., Segal, P., and Stiglitz, J. E., editors, Debates on the Measurement of Global Poverty, pages 42-51. Oxford University Press.
Srinivasan, T. N. (2010). Irrelevance of the $1 a Day Poverty Line. In Anand, S., Segal, P., and Stiglitz, J. E., editors, Debates on the Measurement of Global Poverty, pages 143-151. Oxford University Press, New York.
Subramanian, S. (2015). Once more unto the breach. Economic & Political Weekly, L(45):35-40.
Session I
1. Sociology of French Video Game Magazines
Björn-Olav Dozo
The first video game magazine in French, Tilt, was published by Editions Mondiales in September 1982, just a few months after the first release of Computer and Video Games (UK, November 1981) and Electronic Games (US, November 1981). It established a model for future French-speaking video game magazines, with a stable structure (news, previews, tests) present in every magazine until the early 2000s.
Graph of the relations between magazines of the corpus.
The 1990s were a very profitable decade for these magazines, as the editorial field was structured to support game developers, with a pro-Nintendo pole and a pro-Sega pole. While magazine titles stood in rhetorical opposition (Super Power pro-Nintendo vs. Mega Force pro-Sega, about 120,000 monthly copies each), they shared the same editorial boards: the same journalists wrote in different magazines of one publisher, but under different pseudonyms. At times, they simulated competition between the various editorial boards, giving readers the feeling of belonging to a community. This kind of strategy was common until 1996, but when a new challenger (Sony) entered the fray, some magazines chose to merge with old competitors from the same press group in order to survive.
In 2003, “Future France” bought almost all the video game magazine titles available on the French market. This hegemonic strategy, however, did not prove profitable in the long term: many of these titles, even long-running magazines with faithful audiences, ceased publication in the years following the buyout. My talk will examine the context of these cessations of activity. Different reasons could be given: the explosion of video game news websites, the weakness of the economic model of the print press, or the demotivation of journalists. Other initiatives emerged at this time, such as Canard PC and Gaming, proposing a different business model (independent press). After this first stage, I will further analyse the career paths of these specialised journalists with a social network analysis, following their moves between different editorial teams in this very small world. The database that I use is compiled from the examination of about 80 titles of French-speaking video game magazines over 30 years. With these data, I will show the evolution of the field, with the migration of some journalists between different publications, sometimes on the basis of a kind of “mercato” (transfer market) of local writing stars.
2. Musical networks – Networks of music
Marnix van Berchum, Utrecht University
Network models are widely used in Digital Humanities for understanding relational structures in information. The mathematical tools of network science allow scholars to analyse their material in a quantitative manner, for example to find relative centrality measures for certain network entities or to discover community structures. Visualisation tools assist in discovering large-scale patterns in the network, pointing to areas where a more thorough, qualitative analysis is needed. The presence of network-related contributions in the programmes of Digital Humanities conferences, the ongoing emergence of new tools for building and visualising networks, and the many humanities projects making use of these publications and tools attest to this popularity. With differing levels of intensity, most Humanities disciplines make use of network methodologies. There is, for example, a strong community of historians working with/on networks, demonstrated by the extended bibliography and overview of tools at http://historicalnetworkresearch.org. The sessions of the ‘Arts, Humanities and Complex Networks’ satellites at the yearly NetSci meetings show a variety of disciplines – including art history, film history, literary history and musicology – making use of the methods and tools of network science.
In recent years a growing number of publications have appeared that combine networks and music. The subjects are wide-ranging, from social networks between seventeenth-century composers (Smith and Taylor 2014), to co-occurrence networks of composers on CD recordings (Park et al. 2015; Bae et al. 2016), to the “Computer Music Network” of the late 70s/early 80s (Gresham-Lancaster 2014). Matching this range of subjects, a variety of network methodologies applied to music is discernible in these publications. I will compare the approaches of the selected publications and answer questions on how they relate to the more ‘traditional’ musicological discourse. The paper will discuss the biases present in the data used in the publications and how these affect the musicological conclusions drawn. It touches upon the tension between the quantitative (‘distant’) character of network science and the qualitative (‘close’) character of musicological research.
The aim of my own PhD research is to bridge this gap. In my research I use a network approach to shed light on the dissemination of polyphonic music in the sixteenth century, the age of the emergence of printed music. Primary musical sources and the compositions they contain are the two entities that form the network. Since one source contains several compositions, and one specific composition may be present in multiple sources, a bipartite network of sources and compositions comes into existence. Both extensive musicological studies of these sources and compositions and high-level network structures are used to formulate a model for the dissemination of music. In this paper I will compare and relate my own experience with the evaluation of the selected publications, concluding with an insight into what networks, social network analysis and related methods and tools offer – and may offer in the future – to the field of Musicology.
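The bipartite structure described above, and its projection onto a network of sources only, can be sketched in a few lines. This is a minimal illustration with invented source and composition names, not the author's data or code; the projection rule (link two sources when they share at least one composition) is an assumption made for the example.

```python
from collections import defaultdict
from itertools import combinations

def project_sources(contents):
    """Project a source -> compositions mapping onto a source-source network.

    Two sources are linked when they share at least one composition; the
    edge weight counts the compositions they have in common.
    """
    sources_by_composition = defaultdict(set)
    for source, compositions in contents.items():
        for composition in compositions:
            sources_by_composition[composition].add(source)

    weights = defaultdict(int)
    for sources in sources_by_composition.values():
        for a, b in combinations(sorted(sources), 2):
            weights[(a, b)] += 1
    return dict(weights)

# Invented toy data: three sources and the compositions they transmit.
contents = {
    "MS A": {"Missa X", "Motet Y"},
    "MS B": {"Motet Y", "Chanson Z"},
    "Print C": {"Chanson Z", "Missa X", "Motet Y"},
}

edges = project_sources(contents)
print(edges)
```

Keeping the weight on each edge preserves how strongly two sources overlap, information that a plain "linked or not" projection would discard.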
Illustration taken from Bae et al. 2016, showing the co-occurrence of musicians (composers and performing artists) on CD recordings; data taken from ArkivMusic.
Illustration from the author’s PhD research, showing the network of manuscripts from the Alamire scriptorium; each node represents a manuscript, and an edge represents at least one musical composition that two manuscripts have in common.
Bibliography
Bae, Arram, Doheum Park, Yong-Yeol Ahn, and Juyong Park. 2016. ‘The Multi-Scale Network Landscape of Collaboration’. PLOS ONE 11 (3): e0151784. doi:10.1371/journal.pone.0151784.
Falk, Casey. 2014. ‘Using Network Analysis on the Du Chemin Music Dataset to Reconstruct Missing Music’ [unpublished paper].
Giannetti, Francesca. 2016. ‘A Review of Network Approaches in Music Studies’. Music Reference Services Quarterly 19 (2): 156–63. doi:10.1080/10588167.2016.1166842.
Gresham-Lancaster, Scot. 2014. ‘Computer Music Network’. Leonardo 47 (3): 266–67. doi:10.1162/LEON_a_00771.
Park, Doheum, Arram Bae, Maximilian Schich, and Juyong Park. 2015. ‘Topology and Evolution of the Network of Western Classical Music Composers’. EPJ Data Science 4 (1). doi:10.1140/epjds/s13688-015-0039-z.
Piekut, Benjamin. 2014. ‘Actor-Networks in Music History: Clarifications and Critiques’. Twentieth-Century Music 11 (02): 191–215. doi:10.1017/S147857221400005X.
Smith, David J, and Rachelle Taylor. 2014. Networks of Music and Culture in the Late Sixteenth and Early Seventeenth Centuries: A Collection of Essays in Celebration of Peter Philips’s 450th Anniversary.
3. The “Frame Generator”. An alternative method for approximating core meanings in texts
Joris van Eijnatten (Universiteit Utrecht), Juliette Lonij (Koninklijke Bibliotheek)
Tracing semantic patterns over time on the basis of texts is still in its infancy. Most approaches build on a linguistic principle which states that the meanings of words are determined ‘by the company they keep’. In other words, meanings arise from contexts defined as distributions of words, which suggests that we can trace meanings over time by examining changing contexts. Topic modelling is at this moment the only technique based on the principle of word distributions that has gone beyond an experimental stage and has proven its value by achieving results that domain experts (in this case historians not necessarily involved in computer-assisted research) recognize.
This paper discusses a new tool, dubbed the ‘Frame Generator’, aimed at meaningfully reducing a set of (possibly thousands of) Dutch texts to word patterns that cut across the distributions generated by topic modelling, thus providing additional insight into the content of the dataset. The method implemented builds on topic modelling by combining it with two other proven techniques: (1) the automatic extraction of keywords and (2) the identification of collocates. The Python source code of the tool, offering a command line interface, is available for download on GitHub (https://github.com/jlonij/frame-generator). An online demo with a graphical user interface, showcasing the tool’s main functionality for a small dataset, can be found at http://kbresearch.nl/frames/.
The Frame Generator was developed to assist in the investigation of popular perspectives on the concept of ‘Europe’ arising from the KB collection of Dutch historical newspapers. To this end, a dataset was prepared of articles that mentioned the word ‘Europe’ at least once. A subset of articles was then selected on the basis of (Dutch-language) synonyms for the words ‘unity’ and ‘unification’ (such as ‘integration’, ‘agreement’, ‘settlement’, ‘consensus’, ‘treaty’, ‘harmony’, etc.). This subset was assumed to contain news articles that discuss Europe as a unified political/cultural/economic entity, or as an entity involved in a process of unification. The other subset was based on synonyms for competitions (such as ‘match’, ‘prize’, ‘winner’, ‘cup’, etc.); this subset was assumed to contain articles on sports and other competitions.
The Frame Generator process of analyzing these datasets consists of four stages. The first stage concerns the pre-processing of the dataset. During this stage the dataset is cleaned by normalizing spelling variations and correcting OCR errors on the basis of user-provided lists of regular expressions and their replacements. In addition, the dataset is tokenized, lemmatized and part-of-speech tagged with the Natural Language Processing suite Frog (https://languagemachines.github.io/frog/). The user has the option of splitting larger documents into smaller units of analysis by specifying the maximum number of sentences to be contained in each unit.
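The cleaning and splitting steps of this first stage might look roughly as follows. This is a sketch using only Python's standard `re` module, not the Frame Generator's own code; the function names and the three replacement rules are invented for illustration, and tokenization, lemmatization and tagging (done by Frog in the tool) are not shown.

```python
import re

def normalise(text, rules):
    """Apply (pattern, replacement) pairs in the order given."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text

def split_into_units(sentences, max_sentences):
    """Cut a document (a list of sentences) into units of at most max_sentences."""
    return [
        sentences[i:i + max_sentences]
        for i in range(0, len(sentences), max_sentences)
    ]

# Invented user-provided rules: two older spelling variants and one OCR confusion.
rules = [
    (r"\bEuropeesche\b", "Europese"),
    (r"\bEuropeesch\b", "Europees"),
    (r"\bEnropa\b", "Europa"),  # OCR misread of 'u' as 'n'
]

print(normalise("De Europeesche eenheid in Enropa", rules))
print(split_into_units(["s1", "s2", "s3", "s4", "s5"], max_sentences=2))
```

Because the rules are applied in order, more specific patterns should be listed before more general ones.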
The second stage in the process is topic modelling, which generates specific, substantive themes or topics based on frequently recurring distributions of words. The Frame Generator offers two methods of topic modelling: one based on Mallet (http://mallet.cs.umass.edu), the other on the Gensim topic modelling library. The user is able to control the number of topics generated and the number of words making up each topic by means of various command line arguments. This stage also involves the manual, hermeneutic interpretation of the topics based on historical domain knowledge.
The third stage focuses on the extraction of a single, ranked list of keywords from the set of topics resulting from the previous stage. The relevance of each word occurring in the set of topics is determined by taking the sum of the probability scores for the word over all topics in which it occurs. A word is accorded the status of keyword if its score reaches a certain threshold, set at the discretion of the researcher. The Frame Generator can also produce a keyword list on the basis of tf-idf scores, thus allowing the researcher to compare the results of different approaches. The option is available to restrict the candidates for the keyword list to words with specific part-of-speech tags. The keywords thus obtained may be regarded as core elements in a series of thematically uniform texts; their significance arises from the frequency of their occurrence within as well as across topics.
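The scoring rule described above (sum a word's probabilities over all topics, then apply a threshold) is simple enough to sketch directly. The topic-word probabilities below are invented, and the function name and data layout are assumptions for the example rather than the tool's actual interface.

```python
from collections import defaultdict

def extract_keywords(topics, threshold):
    """Rank words by the sum of their probabilities over all topics.

    `topics` is a list of {word: probability} dicts, one per topic.
    Words whose summed score reaches `threshold` are returned, best first.
    """
    scores = defaultdict(float)
    for topic in topics:
        for word, probability in topic.items():
            scores[word] += probability
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [(word, score) for word, score in ranked if score >= threshold]

# Invented topic-word probabilities for two topics.
topics = [
    {"europa": 0.25, "verdrag": 0.125, "eenheid": 0.0625},
    {"europa": 0.125, "markt": 0.125, "verdrag": 0.0625},
]

keywords = extract_keywords(topics, threshold=0.125)
print(keywords)
```

A word that is moderately probable in several topics can thus outrank a word that is prominent in only one, which matches the text's point that significance arises both within and across topics.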
The fourth and final stage of the analysis process consists of contextualising the keywords by finding collocates in the texts from which they were originally extracted. The user sets a maximum word distance from the keyword as well as the direction (left, right, or both) in which collocates must occur in order to qualify. As with the extraction of keywords, the option to include only specific part-of-speech tags is also provided for collocates. The set of collocates thus gathered for a given keyword is called a ‘frame’. The words appearing in a frame are ordered by the frequency of their co-occurrence with and their distance to the keyword with which they are associated, expressing their significance in framing a specific keyword.
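A frame in the sense just described can be approximated with a windowed collocate count. The sketch below is a simplification under stated assumptions: it orders collocates by frequency only (the tool also takes distance into account), it ignores part-of-speech filtering, and the function name, parameters and example sentence are invented.

```python
from collections import Counter

def build_frame(tokens, keyword, max_distance, direction="both"):
    """Count collocates of `keyword` within `max_distance` tokens.

    `direction` is "left", "right" or "both". Returns (word, count) pairs,
    most frequent first, ties broken alphabetically.
    """
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != keyword:
            continue
        if direction in ("left", "both"):
            counts.update(tokens[max(0, i - max_distance):i])
        if direction in ("right", "both"):
            counts.update(tokens[i + 1:i + 1 + max_distance])
    counts.pop(keyword, None)  # a keyword is not its own collocate
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

tokens = "de eenheid van europa en de toekomst van europa".split()
frame = build_frame(tokens, "europa", max_distance=2)
print(frame)
```

In practice function words such as "van" and "de" would dominate such a frame, which is why restricting collocates to specific part-of-speech tags matters.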
The results of each of these stages are saved and accessible to the user in the form of comma-separated values (CSV) files. These can, for example, be used to visualise the graph of the keywords and their collocates in an application such as Gephi (https://gephi.org) in order to facilitate the interpretation of the results. By creating such network graphs for the Frame Generator results for a number of different time periods (see Figure 1 for an example) we found that newspaper reporting on ‘European unity’, while showing a remarkable degree of continuity, became less rich rhetorically, less international, and more focused on institutional technocracy than on intra-continental relations over the course of the twentieth century.
This paper hypothesises that the Frame Generator, by laying bare the fundamental patterns in sets of thematically coherent texts, enables historians to better determine continuities and discontinuities in expressions of public opinion. The Frame Generator’s performance depends on that of its constituent tools (such as topic modelling), which have been described in the literature. Its advantages include its adaptability to other languages (given the availability of part-of-speech tagging), its flexibility (the user can set all variables) and its ‘all-in-one’ packaging (it requires no programming skills while generating not just frames but also keywords and topics). For domain experts (historians) the proof of the pudding will be in the eating: does this particular combination of tools – topic modelling, keyword extraction and identification of keyword collocates – offer useful results? The question can only be answered by running the tool on a variety of relatively homogeneous datasets.
Figure 1. Network graph in Gephi showing a frame of contextualised keywords related to spatial entities (green), concepts related to community formation (blue) and abstract terms (purple), based on newspaper articles from De Telegraaf (1925-1929; n=767)
4. Hybrid approaches to historical research: analysing the Anne Frank diaries with digital tools
Dr. Gerben Zaagsma, Lichtenberg Kolleg, Georg-August-Universität Göttingen
This paper argues for a hybrid approach to historical research that combines ‘traditional’ with digital hermeneutical approaches in a new practice of doing history. As the digital turn alters and affects all parts of the historical research process, this is a pressing challenge and need for all historians, not just for those engaged in ‘big data’ projects. Indeed, hybridity is, and should be, the new normal. Yet while most historians are accustomed to deploying digital approaches in the information gathering stage of their research, they often refrain from ‘going digital’ in its processing and especially analysis stages. Describing a number of digital tools used in work done on the diaries of Anne Frank, the paper critically analyses and demonstrates the added value of incorporating them in all stages of historical research. Digital approaches enhance the methodological repertoire furnished by ‘traditional’ close reading practices. Hybrid approaches thus expand our intellectual horizons and the analytical power we bring to bear upon our sources.
The paper consists of 1) a theoretical part, contextualising notions of ‘traditional’ and digital approaches to historical research and the uptake of the latter; and 2) a concrete case study of a hybrid approach to historical research.
The first part will briefly discuss discourses around ‘going digital’ that often oppose ‘traditional’ to digital approaches. On a general level, this either/or attitude is misleading; despite what is often assumed, or implicitly suggested, distinctions cannot be neatly mapped along lines of close reading/distant reading, quantitative/qualitative or positivist/narrative analysis either. More specifically, this opposition is also problematic because, for instance, close reading can also involve the use of digital tools, and the same obviously goes for qualitative analysis. In this respect, one should mention Frédéric Clavert’s use of Franco Moretti’s concept of ‘distant reading’ to propose a new way of reading and interpreting historical sources in the digital age using two axes – close reading/distant reading and human reading/computational reading.
The focus will then shift to the problem of uptake of digital approaches among historians. Here, a distinction is drawn between two broad strands of historical research in the digital era, as measured by their application of digital approaches:
• on the one hand, a number of digital historians take (big) data and digital tools (development) as their point of departure; their focus is on digital datasets (for instance newspapers) and the application of digital tools to an analysis of that data. This yields research results that are often as much, if not primarily, concerned with critical reflection on data and tools as with the research topic at hand. Tool development is often also part of the process, and project and research questions tend to be dictated by the available data and tools.
• on the other hand there are those historians, arguably the majority, whose research does not start with datasets and tools; they depart from particular research questions pertinent to their topic of research that could be answered, at least in part, by digital means; the question then becomes how digital approaches can aid, enhance and complement their analyses. As will be clear, the problem of uptake is pertinent to this group of historians.
The second part of the paper concerns itself with a hybrid analysis of a concrete historical source: the diaries of Anne Frank. This part of the paper is one of the outcomes of a three-year research project (The diaries of Anne Frank. Research – Translations – Critical Edition) which was carried out at the Lichtenberg Kolleg of the Georg-August-Universität Göttingen. The project involves a new scholarly edition of the diaries as well as an accompanying multi-author research monograph which will focus on contextualisation, reception and representations of the diaries. Describing a number of digital tools (notably text mining and QDA software) used in analysing the diaries, the paper critically analyses and demonstrates the added value of incorporating them in all stages of historical research. The aim here is to apply Clavert’s basic model, as mentioned above, to a concrete case study and, ultimately, to provide historians with a concrete example of a hybrid approach to historical research.
Session J
1. Towards a tool and data criticism framework
A developer’s and user’s perspective
Sally Chambers¹, Joke Daems¹, Greta Franzini², Marco Büchler², Susan Aasman³
¹ Ghent Centre for Digital Humanities, Ghent University
² Institute of Computer Science, University of Göttingen
³ Groningen Centre for Digital Humanities, University of Groningen
As the amount of accessible digitised data grows, so does the need for machine assistance to help process this overload of information. As cultural heritage institutions increasingly digitise their collections, they are in effect converting the collections into data. Particularly in the area of Digital Humanities, the need for ‘full-text’ collections for analysis is becoming increasingly important. For example, in 2016 the National Library of the Netherlands organised a workshop ‘Historical Newspapers as Big Data’.⁵³ The focus of this workshop was to bring together researchers from a range of disciplines who were interested in using the digitised newspapers and other digital collections made available by the Delpher platform⁵⁴ for (digital) humanities research.
In recent years, international initiatives such as the DiRT Digital Research Tools directory⁵⁵, the Common Language Resources and Technology Infrastructure (CLARIN)⁵⁶ and the Digital Research Infrastructure for the Arts and the Humanities (DARIAH)⁵⁷ have been bringing together tools and resources to help scholars repurpose data for the advancement of research and knowledge.
Despite the proliferation of tools, little is known about their development and use. As Gibbs and Owens observed (2012), this grey area of knowledge concerns both the production and the user side of tools, raising questions about usability, purpose, effectiveness and usage.
From a development standpoint, assumptions are often made with regard to users and the use of a tool. While tools are typically designed to be part of the solution to a problem, by assuming knowledge they become part of the problem to be solved.
From a user perspective, perhaps the biggest barrier to the adoption of a tool is the absence of (sufficient) documentation on its application (i.e. “how to” instructions) and on its functionality (i.e. the “black box”). Functionality is key to the evaluation of computed results: if the inner workings of a tool are opaque, how, or to what extent, can the user trust the results? How useful is the tool?
Whereas developers need to be clear about what the tool is intended for, users need to be careful in selecting the appropriate tool to address their research question. An improved understanding is needed of both the developer’s intentions in tool development and the user’s requirements for answering their research question. Additionally, the particular data-set that a user wishes to analyse is a crucial factor when it comes to tool selection.
53 See: https://www.kb.nl/nieuws/2016/historische-kranten-als-big-data-ii-concepten-op-drift (Accessed: 9 February 2017).
54 Available at: http://www.delpher.nl/ (Accessed: 9 February 2017).
55 Available at: http://dirtdirectory.org/ (Accessed: 9 February 2017).
56 Available at: https://www.clarin.eu/ (Accessed: 9 February 2017).
57 Available at: http://www.dariah.eu/ (Accessed: 9 February 2017).
In light of the issues described, this contribution reiterates the need for tool criticism, previously expressed at the ‘Tool Criticism for Digital Humanities’ workshop (Traub and Ossenbruggen, 2015).⁵⁸ We argue for tool criticism as a pedagogical and effective means of tackling the interdisciplinary challenges posed by the Digital Humanities and of fostering communication between developers and users. Furthermore, we propose an extension of the tool criticism framework to also include data. As the data-set is an integral part of the research process, we argue that it is an important factor to consider when selecting the appropriate research tools. For example, if a user has a choice between two tools with equivalent functionality, one of the tools may perform better than the other given the structure of the chosen data-set. Additionally, we propose that ‘data criticism’ is an important element in its own right. For example, it is important to critically select the source of a particular data-set, based on a range of criteria. If a particular text is needed for analysis, it may be available from multiple sources. A framework to facilitate the selection of the most appropriate data source is therefore needed. This will build on existing ‘source criticism’ and ‘information evaluation’ frameworks (Hjorland, 2012).
As a first step towards tool and data criticism, we propose a number of evaluation criteria that seek to encourage a more critical approach to tools. These build upon analogous software studies (Jackson et al., 2011), the EVALITA campaigns⁵⁹ and the very recent RIDE Digital Text Collections evaluation guidelines⁶⁰, and are grouped as follows:
Tools
1. Usability
   a) User Experience (UX)
   b) Graphical User Interface (GUI)
2. Documentation
   a) Provenance (authors/organisations behind the tools)
   b) “How to” instructions
   c) Algorithms or methods implemented
   d) Limitations
   e) Target audience/research
   f) Availability of tutorials to train users to proficiently work with the tool
   g) Access and citation
   h) Rights
3. Maintenance
   a) Development responses to user feedback
4. Flexibility/Extent of Applicability
Data-sets
5. (Re-)Usability
   a) Format(s)
6. Documentation
   a) Provenance (curators/organisations behind the data-sets)
   b) Metadata (e.g. size, source, author, etc.)
   c) Limitations
   d) Access and citation
   e) Rights
7. Maintenance
   a) Development responses to user feedback
58 See: http://event.cwi.nl/toolcriticism/ (Accessed: 12 February 2017).
59 For more information about Evaluation of NLP and Speech Tools for Italian (EVALITA), see: http://www.evalita.it/2016 (Accessed: 12 February 2017).
60 See: http://ride.i-d-e.de/reviewers/call-for-reviews/special-issue-text-collections/ (Accessed: 13 February 2017).
We evaluate our criteria on three different projects (one data-set project, one tool, and one application of our selected tool to our selected data-set) to compare the user and developer perspectives.
The intention is to foster an understanding of tool and data criticism and to work towards a dialogue between users and developers, including how such a framework could be put into practice.
References
Gibbs, F., Owens, T. (2012) ‘Building Better Digital Humanities Tools: Toward broader audiences and user-centered designs’, Digital Humanities Quarterly, 6(2) [Online]. Available at: http://www.digitalhumanities.org/dhq/vol/6/2/000136/000136.html (Accessed: 12 February 2017).
Hjorland, Birger (2012) ‘Methods for evaluating information sources: An annotated catalogue’, Journal of Information Science, 38.3 (June 2012): 258-268.
Jackson, M., Crouch, S., Baxter, R. (2011) Software Evaluation: Criteria-based Assessment [Online]. Available at: https://www.software.ac.uk/sites/default/files/SSI-SoftwareEvaluationCriteria.pdf (Accessed: 12 February 2017).
Traub, M. C., Ossenbruggen, J. van (2015) Workshop on Tool Criticism in the Digital Humanities: Tech Report [Online]. Available at: http://oai.cwi.nl/oai/asset/23500/23500D.pdf (Accessed: 12 February 2017).
2. Supporting Digital Humanities in Dealing with Quality of Web Documents
Davide Ceolin, Lora Aroyo, Julia Noordegraaf
Vrije Universiteit Amsterdam, de Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands
University of Amsterdam, Turfdraagsterpad 15, 1012 XT Amsterdam, The Netherlands
This paper discusses the development of a new approach for assessing the quality of online documents, contributing a new methodological reflection on online source criticism. Online documents are a useful source of information for very diverse groups of users, ranging from researchers and journalists to government officials, activists or parents. However, this information is only useful if we manage to filter out the spam and, most importantly, if we manage to retrieve the documents that best fit the qualitative requirements of specific users. For example, while for laymen neutrality and readability may be important, for scholars accuracy and completeness may be more relevant.
Assessing the quality of online documents is a challenging task because of their intrinsic peculiarities: their volume, variety, and velocity make it impossible for humans to process them manually. A combination of human and automated processing needs to be devised to handle their quality assessment. Moreover, quality assessment is a challenging task in its own right. The overall quality of a given document is the result of the aggregation of multiple facets (or quality dimensions), such as accuracy, completeness, and neutrality. How these facets are quantified and aggregated is mostly a subjective and context-dependent matter. Users with different tasks at hand have different qualitative requirements. Also, users with different backgrounds are likely to evaluate the same document in a different manner.
A general definition of quality is ‘fitness for purpose’, whereby ‘fitness’ varies with both context and purpose. Although this means that the assessment of the quality of online documents is a flexible, fluid process, we believe it is not impossible to model it. To do justice to the fact that different purposes imply different qualitative requirements (e.g., for writing a newspaper article, source neutrality may be less relevant than accuracy), it is crucial to create a reference system that allows for the quantification of document qualities (e.g., the extent to which a given document is neutral or accurate). Once this reference system exists, we can identify the most accurate, precise, or neutral documents, that is, those of higher quality for a given task (see Figure 1).
Figure 1. Reference system. Through the assessment, we create a reference system for assessing the relative quality of the documents (e.g., high accuracy, neutrality, or both).
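To make the idea of a task-dependent reference system concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each document already carries scores in [0, 1] for three quality dimensions and that each task is described by a set of weights over those dimensions. The class `Assessment`, the table `TASK_WEIGHTS`, the task names, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assessment:
    """Quality scores of one document, each in [0, 1] (hypothetical scale)."""
    doc_id: str
    accuracy: float
    completeness: float
    neutrality: float


# Hypothetical task profiles: how much each quality dimension matters per task.
TASK_WEIGHTS = {
    "scholar": {"accuracy": 0.5, "completeness": 0.4, "neutrality": 0.1},
    "layman": {"accuracy": 0.3, "completeness": 0.1, "neutrality": 0.6},
}


def fitness(doc, task):
    """Weighted sum of the quality dimensions for the given task."""
    weights = TASK_WEIGHTS[task]
    return sum(weight * getattr(doc, dim) for dim, weight in weights.items())


def rank(docs, task):
    """Order documents from most to least fit for the given task."""
    return sorted(docs, key=lambda d: fitness(d, task), reverse=True)


docs = [
    Assessment("blog-post", accuracy=0.4, completeness=0.5, neutrality=0.9),
    Assessment("official-report", accuracy=0.9, completeness=0.8, neutrality=0.3),
]

for task in TASK_WEIGHTS:
    print(task, [d.doc_id for d in rank(docs, task)])
```

The same two documents end up in a different order for the two hypothetical tasks, which is the sense in which ‘fitness’ varies with purpose. A weighted sum is only one possible aggregation; the abstract leaves open how the facets are combined.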
For the purpose of creating a reference system, we are benchmarking a large portion of online documents by employing a combination of human assessment and machine learning. In a pilot study, we used this human-machine interaction approach to assess a selection of online documents [1]. These documents focussed on the topic of vaccinations, and they were selected to provide an overview of the types of documents (blog posts, official documents, etc.), stances (pro, anti, neutral), and types of sources (government authorities, activists, etc.). These documents were assessed by experts (journalists and media scholars) who were asked to judge their relevance for writing an article on the vaccination debate. The experts were asked, first, to judge relevance based on certain automatically generated quality features (such as the trustworthiness of the document, the entities mentioned and the sentiment expressed in it) and, second, to highlight parts of the document and manually annotate the quality features (such as provenance, references, specific statements in the text itself). The results collected showed that the subjectivity of these assessments is limited by the fact that contributors share a similar background and by the clear definition of the task proposed (documents were assessed supposing they were used for a specific task). This exploratory study provided promising indications for automating and scaling up this process.
Currently, we are employing crowdsourcing to extend the coverage of human assessments of Web documents. By employing the crowd in place of niche experts, we can extend the number of documents assessed. Nevertheless, such a shift requires the document assessment tasks to be simplified because of the different typology of contributors, and because crowdsourcing tasks are usually shorter than nichesourcing tasks [2]: assessing the quality of Web documents is a lengthy and demanding task. However, since the crowdsourcing version of these tasks is intended to capture the implicit quality evaluations that users usually make when reading online documents, such a simplification will affect the granularity and not the reliability of the results. For example, we still ask the contributors to assess the precision, completeness, and neutrality of documents, but we use a Boolean scale instead of a Likert scale, and we limit the depth of the argumentations requested.
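One way to picture what a Boolean scale yields is the sketch below, which aggregates yes/no crowd judgments per document and quality dimension. The abstract does not say how crowd judgments are combined; majority voting, the function name `aggregate_boolean_votes`, and the sample data are assumptions made purely for illustration.

```python
from collections import defaultdict


def aggregate_boolean_votes(judgments):
    """Aggregate Boolean crowd judgments by majority vote (an assumed strategy).

    judgments: iterable of (doc_id, dimension, vote) with vote a bool.
    Returns {(doc_id, dimension): (majority_vote, agreement)} where
    agreement is the share of contributors who voted with the majority.
    """
    counts = defaultdict(lambda: [0, 0])  # [number of "no", number of "yes"]
    for doc_id, dimension, vote in judgments:
        counts[(doc_id, dimension)][int(vote)] += 1

    result = {}
    for key, (no, yes) in counts.items():
        majority = yes > no  # ties are resolved as "no"
        result[key] = (majority, max(yes, no) / (yes + no))
    return result


# Hypothetical judgments from three contributors on one document.
sample = [
    ("doc-1", "neutrality", True),
    ("doc-1", "neutrality", True),
    ("doc-1", "neutrality", False),
    ("doc-1", "precision", False),
]
print(aggregate_boolean_votes(sample))
```

The agreement share gives a coarse indication of how contested a judgment was, which partly compensates for the granularity lost by dropping the Likert scale.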
We are also exploring the possibility of automating this assessment process (see Figure 2). We extracted a set of features from the documents in our pilot study. These included NLP features (e.g., named entities, sentiment analysis) and provenance (e.g., source trustworthiness), and we found that it is possible to employ algorithms like Support Vector Machines [3] to predict the quality assessments by using these features with an accuracy of up to 72%. We are extending this prediction to scale up the number of documents assessed and to improve the accuracy of the predictions. We are scaling up the process of feature extraction by parallelizing the natural language processing analysis that extracts textual features from large collections of documents. We are also scaling up the prediction part, which takes as input the features extracted and the training data provided by the crowd and by niche experts, and produces the quality estimations.
Figure 2: Automatic assessment setup. [Diagram labels: Training documents; Documents to be assessed; Parallelized feature extraction (NLP, Provenance, ...); Niche and crowdsourcing; Training data; Machine Learning Prediction; Prediction; Assessed documents (in red: low-quality docs; in green: high-quality).]
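The parallelized feature extraction step can be sketched as follows. This is a toy illustration under stated assumptions, not the pipeline used in the study: the two word lists stand in for a real sentiment analyser, and the names `extract_features` and `extract_all` are hypothetical. Only Python's standard `concurrent.futures` module is used.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy sentiment lexicons (assumption: stand-ins for a real NLP component).
POSITIVE = {"safe", "effective", "good"}
NEGATIVE = {"dangerous", "harmful", "bad"}


def extract_features(text):
    """Compute a few simple textual features for a single document."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    positive = sum(t in POSITIVE for t in tokens)
    negative = sum(t in NEGATIVE for t in tokens)
    return {"n_tokens": len(tokens), "sentiment": positive - negative}


def extract_all(texts, max_workers=4):
    """Extract features for many documents concurrently, preserving order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(extract_features, texts))


features = extract_all([
    "Vaccines are safe and effective.",
    "This is dangerous!",
])
print(features)
```

Threads are used here to keep the sketch simple; for CPU-bound analysis, `ProcessPoolExecutor` offers the same `map` interface and spreads the work over several processes. The resulting feature dictionaries would then serve as input to a classifier such as the Support Vector Machine mentioned above.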
Convolutional neural networks [4] will be evaluated as a possible solution for extending the set of documents assessed. An analysis of the relations among the quality dimensions considered will also be important. So far, we have considered the diverse quality dimensions as independent targets to be predicted. However, it could be the case that some lower-order qualities provide the preconditions for the values of qualities of a higher order. For example, high neutrality and precision could be the necessary preconditions for high accuracy. Dependencies of this kind would be favorable for improving the estimation process.
References:
[1] D. Ceolin, J. Noordegraaf, L. Aroyo, “Capturing the Ineffable: Collecting, Analysing and Automating Web Document Quality Assessments”. In: Proceedings of the 20th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2016), pp. 83-97. Springer, 2016.
[2] V. De Boer, M. Hildebrand, L. Aroyo, P. De Leenheer, C. Dijkshoorn, B. Tesfa, G. Schreiber, “Nichesourcing: harnessing the power of crowds of experts”. In: International Conference on Knowledge Engineering and Knowledge Management, pp. 16-20. Springer, 2012.
[3] C. Cortes and V. Vapnik, “Support-vector networks,” Mach. Learn., vol. 20, no. 3, pp. 273–297, 1995.
[4] Y. LeCun, “LeNet-5, convolutional neural networks”. Retrieved 15 February 2017.
3. Building the ARTECHNE database: New directions in Digital Art History
Marieke Hendriksen, ARTECHNE project / Department of Art History, Utrecht University
Martijn van der Klis, Digital Humanities Lab, Utrecht University
The ARTECHNE project at Utrecht University / University of Amsterdam studies how technique was transmitted among artists, artisans, and scholars between 1500 and 1950. As part of the project, the researchers are currently working with Utrecht University Digital Humanities Lab (DH Lab) to develop an online database containing searchable full-text early modern recipes, artist handbooks, and technical instructions, linked to other relevant information such as records of objects, works of art, conservation research and reconstructions (http://artechne.hum.uu.nl). We aim for both quantity and quality: the more enriched texts we add, the more complex the questions we can answer using the search and visualization functions in the database.
For example, a set of questions like ‘how did the use of cochineal as a pigment in oil paints change in Europe between 1500 and 1950, can we discern patterns in the spread of recipes for such paints, and are certain uses specific for particular geographical regions?’ can currently only be partly answered through many years of research on primary sources such as objects and texts. Given the number of relevant sources and their limited accessibility, it will be very difficult for a researcher to discover and visualize such patterns relying on traditional art historical methods. This database, containing a great number of fully searchable and annotated sources, will allow researchers in art history, conservation, and cultural heritage to ask such complex questions and answer them with a speed and accuracy that was impossible before. Moreover, tools to detect hierarchical distance / patterns of proximity or co-occurrence of particular terms will be integrated, which can give us insight into the changing meanings of concepts.
To reach these goals, in collaboration with the DH Lab, we are creating the ARTECHNE database. We use Drupal (https://www.drupal.org/) to manage the database contents. The database is indexed using Apache Solr (http://lucene.apache.org/solr/), allowing researchers to use faceted search to find relevant results in the manuscripts. The data is geotagged and contains dating information, which also allows search results to be shown in a GIS with an extra time dimension. Moreover, the database allows data to be exported from the application to .csv format, has stable URIs and links to the Getty Vocabularies (ULAN for artist names, AAT for glossary terms and CONA for artefacts). The database thus adheres to the 5-star open data plan.
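As an illustration of what faceted search against a Solr index looks like from the client side, the sketch below builds a query URL using Solr's standard `q`, `rows`, `wt`, `facet` and `facet.field` parameters. The host, the core name `recipes`, and the field names `region` and `century` are hypothetical; the actual ARTECHNE schema is not described in the abstract.

```python
from urllib.parse import urlencode


def faceted_search_url(base_url, core, text, facet_fields, rows=10):
    """Build a Solr select URL that asks for facet counts on the given fields."""
    params = [
        ("q", text),        # full-text query
        ("rows", rows),     # number of documents to return
        ("wt", "json"),     # response format
        ("facet", "true"),  # switch faceting on
    ]
    # One facet.field parameter per field to count values for.
    params += [("facet.field", field) for field in facet_fields]
    return f"{base_url}/{core}/select?{urlencode(params)}"


# Hypothetical local Solr instance, core, and field names.
url = faceted_search_url(
    "http://localhost:8983/solr", "recipes", "cochineal", ["region", "century"]
)
print(url)
```

A response to such a query would list, next to the matching documents, how many hits fall into each region and century, which is what lets a researcher narrow down results step by step.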
By integrating various technologies and recently developed methods in digital humanities, such as OCR, GIS, semantic annotation, crowdsourcing, and Linked Open Data in this database, we hope to firmly establish the use of enriched textual primary sources in digital art historical research, which traditionally relies heavily on images. The two authors of the paper, a historian and a scientific programmer, will present the first results of the database project. We will also reflect on the question of how much digital literacy on the part of historians and how much historical literacy on the part of scientific programmers is required to successfully set up research projects relying on new technologies.
4. From Tools to “Recipes”: Building a Media Suite within the Dutch Digital Humanities Infrastructure CLARIAH
Carlos Martinez-Ortiz, Roeland Ordelman, Marijn Koolen, Julia Noordegraaf, Liliana Melgar, Lora Aroyo, Jaap Blom, Victor de Boer, Willem Melder, Jasmijn Van Gorp, Eva Baaren, Kaspar Beelen, Norah Karrouche, Oana Inel, Rosita Kiewik, Themis Karavellas and Thomas Poell
Introduction
Scholars require access to multiple, large, multimedia collections of digital resources, as well as a wide range of information processing tools for accessing and working with those collections. These requirements raise the need for developing a synchronized national and cross-national infrastructure.
Common Lab Research Infrastructure for the Arts and Humanities (CLARIAH; see footnote 61) is a distributed research infrastructure for the Humanities, included on the National Roadmap for Large-Scale Research Facilities (2015-2018) drawn up by the Netherlands Organisation for Scientific Research (NWO). CLARIAH designs, implements and exploits the Dutch part of the European CLARIN and DARIAH infrastructures.
There are different research domains within CLARIAH: linguistics, socio-economic history, and media studies. Each work package within the CLARIAH project places at the centre of development both the technical requirements of each media type (text, structured data, audio-visual media) and the specific research needs of its user community.
The CLARIAH Media Studies work package focuses on creating a research environment, the Media Suite (CLARIAH MS; see footnote 62), as part of the CLARIAH infrastructure, aiming to serve the needs of media scholars by providing access to audio-visual collections and their contextual data. This paper describes the approach taken to build CLARIAH MS.
Background
CLARIAH MS incorporates a series of Digital Humanities (DH) tools and aims to make them sustainable. Prototypes are currently hosted on a new infrastructure at The Netherlands Institute for Sound and Vision (NISV) data centre. These prototypes are: AVResearcherXL, TROVe, CoMeRDa, Oral History Today (OHT) and DIVE+. Furthermore, CLARIAH MS aims to support audio-visual archives in opening up collections in a more standardized way. Once these objectives have been accomplished, scholars will be able to search and analyse these collections via a central workspace, thus enabling data-intensive research in the humanities.
61 http://www.clariah.nl/
62 http://mediasuite.clariah.nl/
AVResearcherXL is an exploratory tool which enables simultaneous queries and analytic visualizations of the collections' metadata (Van Gorp et al. (2015)). TROVe was developed to ease the combined access and visualization of archival collections and online social media. CoMeRDa is a web-based aggregated search system for visualizing search results (Bron et al. (2013)). OHT is a prototype for search and enrichment (through Automatic Speech Recognition technology) of distributed Oral History collections in The Netherlands (Ordelman and de Jong (2011)). Finally, DIVE+ is a linked-data digital cultural heritage collection browser which provides access to heritage objects from heterogeneous collections, using historical events and narratives as context for searching, browsing and presenting the objects (de Boer et al. (2015)).
These five tools support scholars in the “exploration” and “contextualization” phases of their research, a framework proposed in Bron et al. (2015). The original tools could not interoperate and did not operate on the same data, which limited their potential. Recreating them in a single configurable environment makes it possible to reuse functionalities across datasets and to reuse data across functionalities.
CLARIAH Media Suite
The DH community includes scholars with a wide diversity of research interests and goals; every research group in DH is working with different types of data, and their research objectives have specific requirements which cannot be easily facilitated by tools using a single, generic approach. At the same time, there are similarities in the methods used by different scholars (de Jong et al. (2011)) that can be used for generalised tool development. There are commonalities in research questions and methods among media scholars, which we grouped into Media aesthetics, Social history of media, Aesthetic historiography, Social and cultural history, Media representations or coverage, Transmedia analysis, and Memory studies (Melgar et al., 2017).
Figure 1. CLARIAH MS consists of functionalities, APIs and recipes, version 1, April 2017
A generic infrastructure is required to cater for the general needs of every user group. The infrastructure needs to incorporate flexible functionality capable of addressing very specialized research questions. Media scholars expressed their desire to use the collections and tools which were previously “locked” together in the individual prototypes. CLARIAH MS has been designed in a modular way (Figure 1); each module performs a single, well-defined task. Modules can interoperate to construct more sophisticated functionality.
Metaphorically speaking: whereas previously users had access to predefined ‘meals’ (tools which could perform cross-collection search and visualize the results in the form of timelines, word clouds, snippets and/or thumbnails), we now provide users with single ingredients (individual functionalities such as searching) and ready-made recipes (combinations of several functionalities). Some ingredients may be used in different recipes, and existing recipes may be complemented by adding extra ingredients.
Media Suite Architecture
CLARIAH MS consists of four layers of functionality, explained below (Figure 2):
Figure 2 - Architectural design of CLARIAH MS.
Data Sources contain the collections (e.g., television broadcasts from NISV, EYE Jean Desmet collection, DANS Oral History collection). All collections are registered in a common inventory (CKAN; see footnote 63) which describes their metadata. Collections are available in Elasticsearch (full-text search) and RDF format (semantic search).
APIs facilitate the interaction with data from various collections:
• Collections API - high-level collection information (metadata: data format, size, etc.)
• Search API - searching for collection items.
• Annotation API - annotating existing data using the W3C Web Annotation standard (mainly for manual annotations) (Melgar et al. (2017)).
• Data Enrichment API - collection enrichment through automatic mechanisms (e.g. named entity recognition) or by human interaction (e.g. crowdsourcing).
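To give an impression of the kind of record an annotation service based on the W3C Web Annotation standard might exchange, the sketch below builds an annotation on a time segment of a video. The structure (`@context`, `type`, `body`, `target`, and a `FragmentSelector` conforming to Media Fragments) follows the W3C Web Annotation Data Model. The function name `make_annotation`, the example URL and the comment text are hypothetical, and nothing here is taken from the actual CLARIAH Annotation API.

```python
import json


def make_annotation(target_url, start_s, end_s, comment):
    """Build a W3C Web Annotation commenting on a time segment of a media item."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "motivation": "commenting",
        "body": {
            "type": "TextualBody",
            "value": comment,
            "format": "text/plain",
        },
        "target": {
            "source": target_url,
            # Media Fragments syntax: t=<start>,<end> in seconds.
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"t={start_s},{end_s}",
            },
        },
    }


# Hypothetical broadcast URL and comment.
annotation = make_annotation(
    "http://example.org/broadcasts/1234", 30, 60, "Interview segment starts here."
)
print(json.dumps(annotation, indent=2))
```

Because the annotation points at its target by URL and selector rather than embedding the media, the same record can in principle be stored and reused independently of the tool that created it.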
The API design allows the integration of new data of different formats and data models.
Components in CLARIAH MS are software units which each perform a single function: each component takes data as input and produces a meaningful output using standard formats, so that it can be connected with other components (e.g., word cloud, timeline visualizations, topic identification in newspapers, searching content in collections).
Recipes close the circle by integrating components to recreate the functionalities of the original tools. We focus on providing the complex functionality of the original tools in the form of four 'recipes'. Continuing the metaphor above, the concept of ingredients (components) allows researchers to prepare their own personal recipes (functionalities).
63 http://mediasuite.clariah.nl/datasources
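The relation between components and recipes can be pictured with the following sketch, in which each component is a function from data to data and a recipe is simply a chain of components. This is an illustration of the modular idea only; the names `make_recipe`, `search_component`, `word_cloud_component` and `timeline_component`, as well as the sample collection, are hypothetical and do not reflect the Media Suite code base.

```python
from collections import Counter
from functools import reduce


def make_recipe(*components):
    """Chain components so that the output of one is the input of the next."""
    def run(data):
        return reduce(lambda acc, component: component(acc), components, data)
    return run


def search_component(query):
    """Ingredient: keep only the items whose text mentions the query."""
    def component(items):
        return [item for item in items if query.lower() in item["text"].lower()]
    return component


def word_cloud_component(items):
    """Ingredient: word frequencies, the data behind a word cloud."""
    return Counter(word for item in items for word in item["text"].lower().split())


def timeline_component(items):
    """Ingredient: number of items per year, the data behind a timeline."""
    return Counter(item["year"] for item in items)


# Hypothetical mini-collection.
collection = [
    {"text": "Election debate on television", "year": 1994},
    {"text": "Election results", "year": 1998},
    {"text": "Football match", "year": 1994},
]

# Two recipes that share the same search ingredient.
timeline_recipe = make_recipe(search_component("election"), timeline_component)
cloud_recipe = make_recipe(search_component("election"), word_cloud_component)

print(timeline_recipe(collection))
print(cloud_recipe(collection))
```

The search ingredient is reused unchanged in both recipes, which mirrors the point that some ingredients may be used in different recipes and that a recipe can be extended by adding further ingredients.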
Conclusion
In this paper we have explained the structure of the CLARIAH MS and how previously developed DH tools are being integrated in a sustainable infrastructure that allows flexible use of data collections and functionalities fitting the research needs of scholars. We have also sketched our strategy to enable the integration of alternative functionalities and data collections using a modular approach (ingredients and recipes). Future work includes user evaluation of the first version of the Media Suite (launched in April 2017), and co-development involving six CLARIAH research pilot projects (see footnote 64).
References
[Bron et al. (2013)] Marc Bron, Jasmijn Van Gorp, Frank F. Nack, Maarten de Rijke, Lotte B. Baltussen. Aggregated search interfaces in multi-session tasks. SIGIR 2013: 36th international ACM SIGIR conference on research and development in information retrieval. Dublin: ACM (2013)
[Bron et al. (2015)] Marc Bron, Jasmijn Van Gorp, and Maarten de Rijke. Media studies research in the data-driven age: How research questions evolve. Journal of the Association for Information Science and Technology (2015), https://doi.org/10.1002/asi.23458.
[de Jong et al. (2011)] Franciska de Jong, Roeland Ordelman, and Stef Scagliola. Audio-visual collections and the user needs of scholars in the humanities: a case for co-development. In Proceedings of the 2nd Conference on Supporting Digital Humanities (SDH 2011), Copenhagen, Denmark, 2011. Centre for Language Technology, Copenhagen.
[Melgar et al. (2017)] Liliana Melgar Estrada, Marijn Koolen, Hugo Huurdeman, and Jaap Blom. A process model of time-based media annotation in a scholarly context. In ACM Conference on Human Information Interaction and Retrieval (CHIIR), Oslo, 2017.
[Ordelman and de Jong (2011)] Roeland Ordelman and Franciska de Jong. Distributed access to oral history collections: Fitting access technology to the needs of collection owners and researchers. In Digital Humanities 2011: Conference Abstracts, pages 347–349, Stanford, 2011. Stanford University Library. URL http://purl.utwente.nl/publications/78347. ISBN 978-0-911221-47-3.
[de Boer et al. (2015)] Victor de Boer, Johan Oomen, Oana Inel, Lora Aroyo, Elco van Staveren, Werner Helmich, Dennis de Beurs: DIVE into the event-based browsing of linked historical media. J. Web Sem. 35: 152-158 (2015)
[Van Gorp et al (2015)] Jasmijn Van Gorp, Sonja de Leeuw, Justin van Wees, Bouke Huurnink. Digital Media Archaeology - Digging into the Digital Tool AVResearcherXL. VIEW. Journal of European Television History and Culture / E-journal, 4(7): 38-53 (2015)
64 http://www.clariah.nl/projecten/research-pilots
Session K
1. Digitally mediated emotions: representations and reinforcements
Anca Țenea - Doctoral School "Space, Image, Text, Territory" - CESI
In 2014, a study conducted by Facebook analyzed the news feeds of 689,003 users, delivering more positive or more negative content to each user (Kramer et al. 8788). The conclusions pointed to the contagious nature of emotions in digital space, showing that people who were provided with, say, more positive content tended to express more positivity in return by distributing positive messages themselves. The assumption that machines stimulate emotions in order to enhance them was thus reconfirmed. Yet, there are still many questions emerging from the current state of digital media.
The research project I am proposing analyses the connection between digital media and human emotions, as users participate in a digital spectacle in which they take an active part through their emotions. The manifestations of their emotions are read and interpreted by the platforms' algorithms in order to respond by providing a spectacle according to the user’s desires, beliefs, and actions. I propose a theoretical approach to the way human emotions, feelings, and affects are mirrored and enacted in digital space. In light of the new theories regarding the importance of sentiment mining and affective computing in shaping human knowledge, affects and behavior, I argue for the necessity of analysing how emotions are stimulated in networked publics (Boyd) in order to enhance participation in the digital spectacle.
The digital spectacle addressed in this research refers to the collection of information that returns to the user in the form of personalized content, in response to their online activity, and to the information they display via the Internet. Transforming emotions into data constitutes a pivotal mechanism for digital technology, where the user is not only the spectator but a fully engaged participant. Interestingly, I argue, this has the potential to reveal the mechanism through which the user relates to the Lacanian Other in an online process of virtual identity formation. The elements that trigger a Freudian drive and encourage the interaction with the Lacanian Other, as well as the characteristics of digital platforms meant to provide fantasmatic and identitary projections, will also be examined.
My research focuses on a series of platforms and software that ensure such interactions. For instance, I am interested in how the Persado cognitive mechanisms are set to detect emotions through sentiment mining, for the benefit of commercial advertising. Persado identified 16 emotions as triggers for user action, and the case studies posted on their website show that their strategy increasingly improved marketing performance for different campaigns. A question arises: how are these commercial messages becoming triggers for user reaction?
I am also enquiring into the effects of Facebook’s reaction buttons and “On this day” feature on sentiment, affect and emotion reinforcement. Emotions are deeply connected to and influenced by social norms, which dictate how one should feel, and by behavioral codes, which influence how people express emotions (Benski and Fisher 3). I argue that Facebook, by means of its architecture, is built to enhance feelings and affects, and to expose a user to both social norms and behavioral codes, induced by other people’s posts, magazine and brand posts, and ads. The reaction buttons stimulate users' emotions in response to different situations: like, love, angry, sad and wow are very similar to the ones defined by Paul Ekman in 1972, in “Emotion in the Human Face” (surprise, fear, sadness, happiness, disgust and rage), as key and prevalent emotions in human behavior.
The reaction symbols also partly correspond to emotions postulated by Jacques Lacan and Melanie Klein as fundamental resorts of identity construction, such as fear, pleasure, anxiety, fury, joy etc. The preponderant digital emotions can be linked to the good and bad objects (Melanie Klein), through the manifestation of the things we observe and interact with on the digital screen. I would argue that even though the speed of online emotional reactiveness is higher than in real-life interactions, the operative mechanism is basically the same.
This opens another discussion, on the visual triggers in social media, through psychoanalytical theories. Since the digital screen is composed of objects of desire, or Lacanian “objets petit a”, digital interfaces can be investigated through visual pleasure and identification. Technology is often compared with fetishism, which involves syncing different symbols with objects or pleasure. This idea, states André Nusselder (19), is supported by the fact that technologies transcend the limits of regular life, offering pleasure and opening endless possibilities. He describes this aspect as a hallucinatory imagination of reality because digital technologies synchronize humans with the pleasure principle, postulated by Freud and reinforced by further psychoanalytical theories as the motor of human pleasure. The spectator is therefore no longer passive, as they continuously interact with the screen, influencing the content. Images and symbols online represent the objects of desire, which are part of the Imaginary that simulates and stimulates, creating the digital fantasy.
Platforms aim to offer users a spectacle compatible with their conceptual apparatus, reinforcing familiar mythologies and beliefs, as well as their registered common desires. In light of the question of why people react to certain content, be it on social media, in newsletters or in other advertising messages, I find it legitimate to ask which triggers make a user take an action.
Works Cited
Boyd, Danah (2010). "Social Network Sites as Networked Publics: Affordances, Dynamics, and Implications." Networked Self: Identity, Community, and Culture on Social Network Sites (ed. Zizi Papacharissi), 2010, pp. 39-58, www.danah.org/papers/2010/SNSasNetworkedPublics.pdf.
Fisher, Eran; Benski, Tova. Internet and Emotions. Routledge, New York, 2014.
Kramer, Adam D. I., Jamie E. Guillory, and Jeffrey T. Hancock. "Experimental evidence of massive-scale emotional contagion through social networks." Social Sciences - Psychological and Cognitive Sciences, vol. 111, no. 24, pp. 8788-8790, www.pnas.org.
Nusselder, André. Interface Fantasy: A Lacanian Cyborg Ontology. Cambridge: The MIT Press, 2009.