
Lecture Notes in Computer Science 10450

Commenced Publication in 1973
Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board

David Hutchison
Lancaster University, Lancaster, UK

Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA

Josef Kittler
University of Surrey, Guildford, UK

Jon M. Kleinberg
Cornell University, Ithaca, NY, USA

Friedemann Mattern
ETH Zurich, Zurich, Switzerland

John C. Mitchell
Stanford University, Stanford, CA, USA

Moni Naor
Weizmann Institute of Science, Rehovot, Israel

C. Pandu Rangan
Indian Institute of Technology, Madras, India

Bernhard Steffen
TU Dortmund University, Dortmund, Germany

Demetri Terzopoulos
University of California, Los Angeles, CA, USA

Doug Tygar
University of California, Berkeley, CA, USA

Gerhard Weikum
Max Planck Institute for Informatics, Saarbrücken, Germany


More information about this series at http://www.springer.com/series/7409


Jaap Kamps • Giannis Tsakonas
Yannis Manolopoulos • Lazaros Iliadis
Ioannis Karydis (Eds.)

Research and Advanced Technology for Digital Libraries

21st International Conference on Theory and Practice of Digital Libraries, TPDL 2017
Thessaloniki, Greece, September 18–21, 2017
Proceedings



Editors

Jaap Kamps
Faculteit der Geesteswetenschappen
Universiteit van Amsterdam
Amsterdam
The Netherlands

Giannis Tsakonas
Library & Information Center
University of Patras
Patras
Greece

Yannis Manolopoulos
Aristotle University of Thessaloniki
Thessaloniki
Greece

Lazaros Iliadis
Civil Engineering
University of Thrace
Kimmeria
Greece

Ioannis Karydis
Informatics
Ionian University
Kerkyra
Greece

ISSN 0302-9743
ISSN 1611-3349 (electronic)
Lecture Notes in Computer Science
ISBN 978-3-319-67007-2
ISBN 978-3-319-67008-9 (eBook)
DOI 10.1007/978-3-319-67008-9

Library of Congress Control Number: 2017952390

LNCS Sublibrary: SL3 – Information Systems and Applications, incl. Internet/Web, and HCI

© Springer International Publishing AG 2017

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper

This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland


Preface

This volume of proceedings contains the reviewed papers presented at the 21st International Conference on Theory and Practice of Digital Libraries (TPDL), which was held in Thessaloniki, Greece from September 18 to 21, 2017. The conference was organized by the Aristotle University of Thessaloniki and the Democritus University of Thrace. The general theme of the 21st International Conference on Theory and Practice of Digital Libraries was “Part of the Machine: Turning Complex into Scalable” and its aim was to create a dialogue that addressed the challenge of creatively transforming these highly-synthesized environments into solutions that can scale for the benefit of varied communities.

TPDL 2017 received 85 full-paper submissions, up from 50 at TPDL 2016 and 44 at TPDL 2015, making the conference in 2017 very competitive and selective, and requiring the Program Committee to uphold the highest possible academic standards. We introduced a two-layered structure for oral presentations, long and short oral, in order to include an adequate number of interesting papers that expand the field of digital libraries on innovative topics and to strengthen the areas already known. Of the 85 long-paper submissions, only 20 (24%) were accepted for a long oral presentation, and an additional 19 (22%) long papers were accepted for a shorter oral presentation. This makes a grand total of 39 (46%) full papers accepted for the proceedings.

Of the 8 short-paper submissions, only 4 (50%) were accepted, and of the 5 poster/demo submissions, only 2 (40%) were accepted. Selected full-paper submissions were redirected for evaluation as potential short or poster/demo papers, following the recommendations of the reviewers.
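The acceptance percentages quoted above can be rechecked with a few lines of arithmetic. The following is only a sketch: the counts are taken from the preface, while the variable names and the helper `rate` are ours, and rounding to the nearest whole percent is an assumption about how the figures were derived.

```python
# Counts as reported in the preface; names below are illustrative only.
FULL_SUBMITTED = 85
LONG_ORAL = 20
SHORT_ORAL = 19
SHORT_SUBMITTED, SHORT_ACCEPTED = 8, 4
POSTER_SUBMITTED, POSTER_ACCEPTED = 5, 2


def rate(accepted: int, submitted: int) -> int:
    """Acceptance rate rounded to the nearest whole percent (assumed rounding)."""
    return round(100 * accepted / submitted)


full_accepted = LONG_ORAL + SHORT_ORAL

print("long oral:  ", rate(LONG_ORAL, FULL_SUBMITTED), "%")
print("short oral: ", rate(SHORT_ORAL, FULL_SUBMITTED), "%")
print("full total: ", full_accepted, "papers,", rate(full_accepted, FULL_SUBMITTED), "%")
print("short paper:", rate(SHORT_ACCEPTED, SHORT_SUBMITTED), "%")
print("poster/demo:", rate(POSTER_ACCEPTED, POSTER_SUBMITTED), "%")
```

Under that rounding assumption the computed values agree with the percentages stated in the text.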

Each submission was reviewed by at least three Program Committee members, and two Senior Program Committee members, and the two chairs oversaw the reviewing and often extensive follow-up discussion. Where the discussion was not sufficient to make a decision, the paper went through an extra review by the Program Committee. Each paper was discussed individually, based on the reviews, the meta reviews, and the discussion at a PC meeting, where the final decisions were made.

The conference was honored by three very interesting keynote speeches by Paul Groth on “Machines are People Too”, Elton Barker on “Back to the Future: Annotating, Collaborating, and Linking in a Digital Ecosystem” and Dimitrios Tzovaras on “Visualization in the Big Data Era: Data Mining from Networked Information”. All three covered important areas of the digital library field.

The program of TPDL 2017 also included a doctoral consortium track and four tutorials on “enriching digital collections using tools for text mining, indexing and visualization”, “putting historical data in context: how to use DSpace-GLAM”, “innovation search”, and “enabling precise identification and citability of dynamic data – recommendations of the RDA working group on data citation”. Finally, four workshops were organized in conjunction with the main conference, namely the long-established “European Networked Knowledge Organization Systems (NKOS)” in its 17th year and the newly introduced “(Meta)-data Quality Workshop”, “International Workshop on Temporal Dynamics in Digital Libraries”, and “Modeling Societal Future (FUTURITY)”.

We would like to thank all our colleagues for trusting their papers to the conference, as well as our Program Committee members, both the senior and the regular, for the precise and thorough work they put into reviewing the submissions. A word of gratitude must be addressed to our workshop chairs, Philipp Mayr and Kjetil Nørvåg, our tutorial chairs, Thomas Risse and Gianmaria Silvello, our panel chair, Cristina Ribeiro, our posters/demo chairs, Vangelis Banos and Annika Hinze, and our doctoral consortium chairs, Maja Žumer and Heiko Schuldt, for the substantial effort they put into running their tracks.

September 2017
Thessaloniki

Jaap Kamps
Giannis Tsakonas
Yannis Manolopoulos
Lazaros Iliadis
Ioannis Karydis


Organization

TPDL 2017 was organized by the Aristotle University of Thessaloniki, Greece and the Democritus University of Thrace, Greece.

General Chairs

Yannis Manolopoulos Aristotle University of Thessaloniki, Greece
Lazaros Iliadis Democritus University of Thrace, Greece

Program Chairs

Jaap Kamps University of Amsterdam, The Netherlands
Giannis Tsakonas University of Patras, Greece

Organizing Chair

Apostolos Papadopoulos Aristotle University of Thessaloniki, Greece

Publicity/Publication Chair

Ioannis Karydis Ionian University, Greece

Workshop Chairs

Philipp Mayr GESIS, Germany
Kjetil Nørvåg Norwegian University of Science and Technology, Norway

Tutorial Chairs

Thomas Risse L3S, Germany
Gianmaria Silvello University of Padua, Italy

Panel Chair

Cristina Ribeiro University of Porto, Portugal

Posters/Demo Chairs

Vangelis Banos Aristotle University of Thessaloniki, Greece
Annika Hinze University of Waikato, New Zealand


Doctoral Consortium Chairs

Maja Žumer University of Ljubljana, Slovenia
Heiko Schuldt University of Basel, Switzerland

Program Committee

Senior Program Committee

Trond Aalberg Norwegian University of Science and Technology, Norway
David Bainbridge University of Waikato, New Zealand
Tobias Blanke University of Glasgow, UK
Jose Borbinha IST/INESC-ID, Portugal
George Buchanan City University London, UK
Donatella Castelli CNR-ISTI, Italy
Stavros Christodoulakis Technical University of Crete, Greece
Milena Dobreva University of Malta, Malta
Nicola Ferro University of Padua, Italy
Edward Fox Virginia Polytechnic Institute and State University, USA
Ingo Frommholz University of Bedfordshire, UK
Norbert Fuhr University of Duisburg-Essen, Germany
Richard Furuta Texas A&M University, USA
Marcos Goncalves Federal University of Minas Gerais, Brazil
Annika Hinze University of Waikato, New Zealand
Sarantos Kapidakis Ionian University, Greece
Laszlo Kovacs Hungarian Academy of Sciences, Hungary
Clifford Lynch CNI, USA
Wolfgang Nejdl L3S and University of Hanover, Germany
Michael Nelson Old Dominion University, USA
Erich Neuhold University of Vienna, Austria
Christos Papatheodorou Ionian University, Greece
Edie Rasmussen University of British Columbia, Canada
Andreas Rauber Vienna University of Technology, Austria
Thomas Risse L3S Research Center, Germany
Laurent Romary Inria and HUB-ISDL, France
Seamus Ross University of Toronto, Canada
Heiko Schuldt University of Basel, Switzerland
Mário J. Silva Instituto Superior Técnico, Universidade de Lisboa, Portugal
Hussein Suleman University of Cape Town, South Africa
Costantino Thanos ISTI-CNR, Italy
Herbert Van De Sompel Los Alamos National Laboratory, USA


Program Committee

Hamed Alhoori Northern Illinois University, USA
Robert Allen Yonsei University, South Korea
Avishek Anand L3S Research Center, Germany
Vangelis Banos Aristotle University of Thessaloniki, Greece
Valentina Bartalesi ISTI-CNR, Italy
Christoph Becker University of Toronto, Canada
Maria Bielikova Slovak University of Technology in Bratislava, Slovakia
Pável Calado Instituto Superior Técnico, Universidade de Lisboa, Portugal
Vittore Casarosa ISTI-CNR, Italy
Lillian Cassel Villanova University, USA
Panos Constantopoulos Athens University of Economics and Business, Greece
Fabio Crestani University of Lugano (USI), Switzerland
Sally Jo Cunningham Waikato University, New Zealand
Theodore Dalamagas IMIS-“Athena” R.C., Greece
Makx Dekkers Spain
Giorgio Maria Di Nunzio University of Padua, Italy
Fabien Duchateau Université Claude Bernard Lyon 1 – LIRIS, France
Maria Economou University of Glasgow, UK
Schubert Foo Nanyang Technological University, Singapore
Nuno Freire INESC-ID, Portugal
Dimitris Gavrilis Athena Research Centre, Greece
Manolis Gergatsoulis Ionian University, Greece
C. Lee Giles Pennsylvania State University, USA
Julio Gonzalo UNED, Spain
Paula Goodale University of Sheffield, UK
Sergiu Gordea Austrian Institute of Technology, Austria
Stefan Gradmann KU Leuven, Belgium
Jane Greenberg Drexel University, USA
Mark Michael Hall Edge Hill University, UK
Bernhard Haslhofer AIT-Austrian Institute of Technology, Austria
Frank Hopfgartner University of Glasgow, UK
Nikos Houssos RedLink, Greece
Antoine Isaac Europeana Foundation, Belgium
Adam Jatowt Kyoto University, Japan
Nattiya Kanhabua Aalborg University, Denmark
Roman Kern Graz University of Technology, Austria
Claus-Peter Klas GESIS – Leibniz Institute for Social Sciences, Germany
Martin Klein Los Alamos National Laboratory, USA
Petr Knoth KMi, The Open University, UK
Stefanos Kollias National Technical University of Athens, Greece
Ronald Larsen University of Pittsburgh, USA
Séamus Lawless Trinity College Dublin, Ireland


Hyowon Lee Singapore University of Technology and Design, Singapore
Suzanne Little Dublin University, Ireland
Zinaida Manžuch Vilnius University, Lithuania
Bruno Martins IST – Instituto Superior Técnico, Portugal
Philipp Mayr GESIS, Germany
Cezary Mazurek Poznań Supercomputing and Networking Center, Poland
Robert H. Mcdonald Indiana University/Data to Insight Center, USA
Dana Mckay Swinburne University of Technology, Australia
Andras Micsik SZTAKI, Hungary
David Nichols University of Waikato, New Zealand
Jeppe Nicolaisen University of Copenhagen, Denmark
Ragnar Nordlie Oslo and Akershus University College, Norway
Moira Norrie ETH Zurich, Switzerland
Kjetil Nørvåg Norwegian University of Science and Technology, Norway
Raul Palma Poznań Supercomputing and Networking Center, Poland
Nils Pharo Oslo & Akershus University College of Applied Sciences, Norway
Dimitris Plexousakis Institute of Computer Science, FORTH, Greece
Panayiota Polydoratou ATEI of Thessaloniki, Greece
Cristina Ribeiro University of Porto, Portugal
Ian Ruthven University of Strathclyde, UK
J. Alfredo Sánchez UDLAP, Mexico
Michalis Sfakakis Ionian University, Greece
Gianmaria Silvello University of Padua, Italy
Nicolas Spyratos University of Paris South, France
Shigeo Sugimoto University of Tsukuba, Japan
Tamara Sumner University of Colorado at Boulder, USA
Anastasios Tombros Queen Mary University of London, UK
Theodora Tsikrika Information Technologies Institute, CERTH, Greece
Chrisa Tsinaraki European Union - Joint Research Center (EU-JRC), Belgium
Douglas Tudhope University of Glamorgan, UK
Yannis Tzitzikas University of Crete and FORTH-ICS, Greece
Stefanos Vrochidis Information Technologies Institute, CERTH, Greece
Michele Weigle Old Dominion University, USA
Marcin Werla Poznań Supercomputing and Networking Center, Poland
Iris Xie University of Wisconsin-Milwaukee, USA
Maja Žumer University of Ljubljana, Slovenia


Additional Reviewers

Agathos, Michail
Alian Nejadi, Mohammad
Cancellieri, Matteo
Carvalho, André
Chuda, Daniela
Fafalios, Pavlos
Giachanou, Anastasia
Kalogeros, Eleftherios
Kamateri, Eleni
Kanellos, Ilias
Kaššák, Ondrej
Kim, Kunho
Kompan, Michal
Kondylakis, Haridimos
Kotzinos, Dimitris
Körner, Martin
Landoni, Monica
Li, Liuqing
Liu, Lu
Marketakis, Yannis
Medina, María Auxilio
Minadakis, Nikos
Mountantonakis, Michalis
Papachristopoulos, Leonidas
Papadakos, Panagiotis
Pride, David
Rocha Da Silva, João
Rörden, Jan
Santos, Rui
Schlarb, Sven
Srba, Ivan
Tzouramanis, Theodoros
Vergoulis, Thanasis
Williams, Kyle
Wu, Jian
Zhang, Xuan

Sponsors

The Coalition for Networked Information (CNI)


Keynotes


Back to the Future: Annotating, Collaborating and Linking in a Digital Ecosystem

Elton Barker

The Open University, UK

Abstract. Classical philology has rarely been a self-enclosed discipline: in order to interpret Greek and Latin texts, it is necessary to place them in context, grounding them in the histories of the time and exploring them in and against those cultural horizons. Using the linking potential of the Web, Pelagios Commons (http://commons.pelagios.org/) has been pioneering a means of digital ‘mutual contextualization’, whereby any online document (be it a text, map, database or image) can be connected to another simply by virtue of having something in common with it, and then draw on this external content to enrich its own, or in turn be drawn upon by and enrich another. In Pelagios this linking is achieved through the method of annotating places. From having originally been seeded in collaboration with partners who already curated data and had the technical know-how to align datasets, Pelagios Commons now offers any researcher, librarian, museum curator, student or member of the public a simple, intuitive means to encode place information in a document of their choosing.
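The idea of connecting documents "by virtue of having something in common" can be illustrated with a small sketch. Nothing here reflects how Pelagios is actually implemented: the document names, the place identifiers and the function `link_by_shared_places` are all hypothetical, and the only assumption is that each document already carries a set of place annotations.

```python
from itertools import combinations

# Hypothetical documents, each annotated with a set of made-up place identifiers.
annotations = {
    "text:histories-book5": {"place:athens", "place:sparta", "place:miletus"},
    "map:aegean": {"place:athens", "place:miletus"},
    "image:coin-42": {"place:sparta"},
    "database:inscriptions": {"place:delphi"},
}


def link_by_shared_places(annotations):
    """Return {(doc_a, doc_b): shared_places} for every pair with a place in common."""
    links = {}
    for doc_a, doc_b in combinations(sorted(annotations), 2):
        shared = annotations[doc_a] & annotations[doc_b]
        if shared:
            links[(doc_a, doc_b)] = shared
    return links


links = link_by_shared_places(annotations)
for (doc_a, doc_b), places in links.items():
    print(doc_a, "<->", doc_b, "via", sorted(places))
```

In this toy data the inscriptions database stays unlinked because it shares no place with any other document, which is the sense in which the annotations, rather than the document type, drive the linking.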

This presentation will set out and explain this annotation process in the Web-based, Open Source platform, Recogito (http://recogito.pelagios.org/) developed by the Pelagios team. It will go through the steps that the researcher would take in order to geoannotate their material: first identifying the place entity in their document, then resolving that information to a central authority file: i.e. a gazetteer of placenames (e.g. http://pleiades.stoa.org/). It also considers the potential uses of this kind of semantic annotation, outlining the mapping of places in texts, the repurposing of the data in other systems (such as GIS), and the linking to other related resources. Throughout, however, it will be concerned to identify challenges and persistent issues that are not only related to the technical development and use; using Recogito puts a primary demand on defining and conceptualising place. Thus, contrary to much current thinking, this presentation hopes to show how digital tools can enhance the close reading of texts and facilitate a more nuanced understanding of the status and role of places in our historical sources.
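The two annotation steps named above (identify a place mention, then resolve it against a gazetteer) can be sketched as follows. This is not Recogito's implementation or API: the tiny gazetteer, its placeholder identifiers, the sample sentence and the names `PlaceAnnotation` and `annotate_places` are all assumptions made for illustration, and exact string matching stands in for the judgement a human annotator would apply.

```python
import re
from dataclasses import dataclass

# Hypothetical miniature gazetteer: place name -> placeholder identifier.
# These are NOT real Pleiades URIs.
GAZETTEER = {
    "Athens": "gazetteer:place/1",
    "Sparta": "gazetteer:place/2",
    "Miletus": "gazetteer:place/3",
}


@dataclass(frozen=True)
class PlaceAnnotation:
    text: str      # surface form found in the document
    start: int     # character offset where the mention begins
    end: int       # character offset just past the mention
    place_id: str  # gazetteer identifier the mention was resolved to


def annotate_places(document: str, gazetteer: dict[str, str]) -> list[PlaceAnnotation]:
    """Step 1: find mentions by exact name; step 2: attach the gazetteer identifier."""
    found = []
    for name, place_id in gazetteer.items():
        for match in re.finditer(r"\b" + re.escape(name) + r"\b", document):
            found.append(PlaceAnnotation(match.group(), match.start(), match.end(), place_id))
    return sorted(found, key=lambda a: a.start)


# Invented sample sentence, used only to exercise the sketch.
sample = "The envoys left Miletus and sailed first to Sparta, then to Athens."
result = annotate_places(sample, GAZETTEER)
for a in result:
    print(a.start, a.text, "->", a.place_id)
```

Exact matching cannot decide whether a given "Athens" is the city, a namesake, or a metaphor, which is one concrete form of the demand on "defining and conceptualising place" that the abstract raises.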

Elton Barker is Reader in Classical Studies, having joined The Open University as a Lecturer in July 2009. Before then, he had been a Tutor and Lecturer at Christ Church, Oxford (2004-09), and also lectured at Bristol, Nottingham and Reading. He has been a Junior Research Fellow at Wolfson College, Cambridge (2002-04) and a Visiting Fellow at Venice International University (2003-04). From 2012-2013 he had a Research Fellowship for Experienced Researchers awarded by the Alexander von Humboldt Foundation for research at the Freie Universität Berlin and the University of Leipzig. He has been awarded a Graduate Teaching Award from Pembroke College (Cambridge) and twice won awards from the University of Oxford for an Outstanding Contribution to Teaching.

His research interests cross generic and disciplinary boundaries. Since 2008, he has been leading and co-running a series of collaborative projects, which are using digital resources to rethink spatial understanding of the ancient world. The Hestia project investigates the underlying ways in which Herodotus constructs space in book 5 of his Histories. Meanwhile, the Pelagios project has been establishing the Web infrastructure by which data produced and curated by different content providers – from academic projects like the Perseus Classical Library to cultural heritage institutions like the British Museum – can be linked through their common references to places.


Machines are People Too

Paul Groth

Elsevier Labs, Elsevier Inc., USA

Abstract. The theory and practice of digital libraries provides a long history of thought around how to manage knowledge, ranging from collection development to cataloging and resource description. These tools were all designed to make knowledge findable and accessible to people. Even technical progress in information retrieval and question answering is targeted at helping answer a human’s information need.

However, demand is increasingly for data: data that is needed not for people’s consumption but to drive machines. As an example of this demand, there has been explosive growth in job openings for Data Engineers – professionals who prepare data for machine consumption. In this talk, I overview the information needs of machine intelligence and ask the question: Are our knowledge management techniques applicable for serving this new consumer?

Paul Groth is Disruptive Technology Director at Elsevier Labs. He holds a Ph.D. in Computer Science from the University of Southampton (2007) and has done research at the University of Southern California and the Vrije Universiteit Amsterdam. His research focuses on dealing with large amounts of diverse contextualized knowledge with a particular focus on the web and science applications. This includes research in data provenance, data science, data integration and knowledge sharing. He leads architecture development for the Open PHACTS drug discovery data integration platform. Paul was co-chair of the W3C Provenance Working Group that created a standard for provenance interchange. He is co-author of “Provenance: An Introduction to PROV” and “The Semantic Web Primer: 3rd Edition” as well as numerous academic articles. He blogs at http://thinklinks.wordpress.com. You can find him on twitter: @pgroth.


Visualization in the Big Data Era: Data Mining from Networked Information

Dimitrios Tzovaras

Information Technologies Institute, Centre for Research and Technology, Greece

Abstract. Network graphs have long formed a widely adopted and acknowledged practice for the representation of inter- and intra-dependent information streams. Nowadays, they are largely attracting the interest of the research community mainly due to the vastly growing amount (size & complexity) of semantically dependent data produced world-wide as a result of the rapid expansion of data sources.

In this context, the efficient processing of the big amounts of information, also known as Big Data, forms a major challenge for both the research community and a wide variety of industrial sectors, involving security, health and financial applications.

In order to address these needs the current presentation describes a proprietary platform built upon state-of-the-art algorithms that are combined to implement a top-down approach for the facilitation of Data & Graph Mining processes, like behavioral clustering, interactive visualizations, etc.

The applicability of this platform has been validated on a series of distinct real-world use cases that involve large amounts of intra-exchanged information and can thus be held as characteristic examples of modern Big Data problems. In particular, they refer to (i) DoS attacks in real-world mobile networks, (ii) early event detection in social media communities, (iii) traffic management and (iv) DNA sequences analysis.

In all these cases, the large volumes of data are addressed via a Data Minimization approach that starts with an aggregated overview of the network as a whole, and gradually the focus is put on smaller data subsets (i.e. approach upon successive levels of abstraction). In parallel, insights on the network’s operations are allowed through the detection of behavioral patterns. Similarly, a dynamic hypothesis formulator and the corresponding backend solver can subsequently be exploited through graph traversing and pattern mining. This way, an analyst is provided with the appropriate equipment to set and verify concrete hypotheses through simulation and extract useful conclusions.
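The "overview first, then smaller subsets" pattern described above can be sketched on a toy graph. The platform in the abstract is proprietary and its algorithms are not given, so this is only an illustration of the general idea: the edge list, the node-to-group mapping and the functions `aggregate` and `drill_down` are hypothetical.

```python
from collections import Counter

# Hypothetical traffic graph: (source, target, message_count).
edges = [
    ("a1", "b1", 5), ("a1", "a2", 2), ("a2", "b2", 7),
    ("b1", "b2", 1), ("b2", "c1", 4), ("c1", "a1", 3),
]
# Hypothetical assignment of nodes to coarser groups (e.g. network cells).
group_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}


def aggregate(edges, group_of):
    """Overview level: collapse nodes into groups and sum the weights between groups."""
    totals = Counter()
    for src, dst, weight in edges:
        totals[(group_of[src], group_of[dst])] += weight
    return totals


def drill_down(edges, group_of, src_group, dst_group):
    """Detail level: keep only the original edges behind one aggregated edge."""
    return [
        (src, dst, weight)
        for src, dst, weight in edges
        if group_of[src] == src_group and group_of[dst] == dst_group
    ]


overview = aggregate(edges, group_of)
busiest = max(overview, key=overview.get)       # heaviest group-to-group flow
detail = drill_down(edges, group_of, *busiest)  # the smaller subset to inspect next

print("overview:", dict(overview))
print("busiest aggregated edge:", busiest, overview[busiest])
print("underlying edges:", detail)
```

Picking the heaviest aggregated edge is just one possible focusing rule; an analyst could equally drill into a group chosen interactively from the overview.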

Dr. Dimitrios Tzovaras is a Senior Researcher Grade A’ (Professor) and Director at CERTH/ITI (the Information Technologies Institute of the Centre for Research and Technology Hellas). He received the Diploma in Electrical Engineering and the Ph.D. in 2D and 3D Image Compression from the Aristotle University of Thessaloniki, Greece in 1992 and 1997, respectively. Prior to his current position, he was a Senior Researcher on the Information Processing Laboratory at the Electrical and Computer Engineering Department of the Aristotle University of Thessaloniki. His main research interests include network and visual analytics for network security, computer security, data fusion, biometric security, virtual reality, machine learning and artificial intelligence. He is author or co-author of over 110 articles in refereed journals and over 300 papers in international conferences.

Since 2004, he has been Associate Editor in the following International journals: Journal of Applied Signal Processing (JASP) and Journal on Advances in Multimedia of EURASIP. Additionally, he is Associate Editor in the IEEE Signal Processing Letters journal (since 2009) and Senior Associate Editor in the IEEE Signal Processing Letters journal (since 2012), while since mid-2012 he has been also Associate Editor in the IEEE Transactions on Image Processing journal. Over the same period, Dr. Tzovaras acted as ad hoc reviewer for a large number of International Journals and Magazines such as IEEE, ACM, Elsevier and EURASIP, as well as International Scientific Conferences (ICIP, EUSIPCO, CVPR, etc.).

Since 1992, Dr. Tzovaras has been involved in more than 100 European projects, funded by the EC and the Greek Ministry of Research and Technology. Within these research projects, he has acted as the Scientific Responsible of the research group of CERTH/ITI, but also as the Coordinator and/or the Technical/Scientific Manager of many of them (coordinator or technical manager in 21 projects – 10 H2020, 1 FP7 ICT IP, 7 FP7 ICT STREP, 3 FP6 IST STREP and 1 Nationally funded project).


Contents

Linked Data

Exploiting Interlinked Research Metadata
Shirin Ameri, Sahar Vahdati, and Christoph Lange

Preserving Bibliographic Relationships in Mappings from FRBR to BIBFRAME 2.0
Sofia Zapounidou, Michalis Sfakakis, and Christos Papatheodorou

Exploring Ontology-Enhanced Bibliography Databases Using Faceted Search
Tadeusz Pankowski

What Should I Cite? Cross-Collection Reference Recommendation of Patents and Papers
Julian Risch and Ralf Krestel

Corpora

Taxonomic Corpus-Based Concept Summary Generation for Document Annotation
Ikechukwu Nkisi-Orji, Nirmalie Wiratunga, Kit-Ying Hui, Rachel Heaven, and Stewart Massie

RussianFlu-DE: A German Corpus for a Historical Epidemic with Temporal Annotation
Tran Van Canh, Katja Markert, and Wolfgang Nejdl

A Digital Repository for Physical Samples: Concepts, Solutions and Management
Anusuriya Devaraju, Jens Klump, Victor Tey, Ryan Fraser, Simon Cox, and Lesley Wyborn

Facet Embeddings for Explorative Analytics in Digital Libraries
Sepideh Mesbah, Kyriakos Fragkeskos, Christoph Lofi, Alessandro Bozzon, and Geert-Jan Houben


Data in Digital Libraries

Automatic Hierarchical Categorization of Research Expertise Using Minimum Information
Gustavo Oliveira de Siqueira, Sérgio Canuto, Marcos André Gonçalves, and Alberto H.F. Laender

Extracting Event-Centric Document Collections from Large-Scale Web Archives
Gerhard Gossen, Elena Demidova, and Thomas Risse

Information Governance Maturity Model Final Development Iteration
Diogo Proença, Ricardo Vieira, and José Borbinha

Challenges of Research Data Management for High Performance Computing
Björn Schembera and Thomas Bönisch

Quality in Digital Libraries

How Linked Data can Aid Machine Learning-Based Tasks . . . . . . . . . . . . . 155Michalis Mountantonakis and Yannis Tzitzikas

Can Plausibility Help to Support High Quality Contentin Digital Libraries? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169

José María González Pinto and Wolf-Tilo Balke

Classifying Document Types to Enhance Search and Recommendations in Digital Libraries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181

Aristotelis Charalampous and Petr Knoth

Understanding the Influence of Hyperparameters on Text Embeddings for Text Classification Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193

Nils Witt and Christin Seifert

Digital Humanities

Europeana: What Users Search for and Why . . . . . . . . . . . . . . . . . . . . . . . 207

Paul Clough, Timothy Hill, Monica Lestari Paramita, and Paula Goodale

Metadata Aggregation: Assessing the Application of IIIF and Sitemaps Within Cultural Heritage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220

Nuno Freire, Glen Robson, John B. Howard, Hugo Manguinhas, and Antoine Isaac

A Decade of Evaluating Europeana - Constructs, Contexts, Methods & Criteria . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233

Vivien Petras and Juliane Stiller

On the Uses of Word Sense Change for Research in the Digital Humanities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246

Nina Tahmasebi and Thomas Risse

Entities

Multi-aspect Entity-Centric Analysis of Big Social Media Archives . . . . . . . . 261

Pavlos Fafalios, Vasileios Iosifidis, Kostas Stefanidis, and Eirini Ntoutsi

A Comparative Study of Language Modeling to Instance-Based Methods, and Feature Combinations for Authorship Attribution . . . . . . . . . . . . . . . . . 274

Olga Fourkioti, Symeon Symeonidis, and Avi Arampatzis

What Others Say About This Work? Scalable Extraction of Citation Contexts from Research Papers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287

Petr Knoth, Phil Gooch, and Kris Jack

Semantic Author Name Disambiguation with Word Embeddings . . . . . . . . . . 300

Mark-Christoph Müller

Scholarly Communication

Towards a Knowledge Graph Representing Research Findings by Semantifying Survey Articles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315

Said Fathalla, Sahar Vahdati, Sören Auer, and Christoph Lange

Integration of Scholarly Communication Metadata Using Knowledge Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328

Afshin Sadeghi, Christoph Lange, Maria-Esther Vidal, and Sören Auer

Analysing Scholarly Communication Metadata of Computer Science Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342

Said Fathalla, Sahar Vahdati, Christoph Lange, and Sören Auer

High-Pass Text Filtering for Citation Matching . . . . . . . . . . . . . . . . . . . . . . 355

Yannis Foufoulas, Lefteris Stamatogiannakis, Harry Dimitropoulos, and Yannis Ioannidis

Sentiment Analysis

Sentiment Classification over Opinionated Data Streams Through Informed Model Adaptation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369

Vasileios Iosifidis, Annina Oelschlager, and Eirini Ntoutsi

Mining Semantic Patterns for Sentiment Analysis of Product Reviews . . . . . . 382

Sang-Sang Tan and Jin-Cheon Na

A Comparison of Pre-processing Techniques for Twitter Sentiment Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 394

Dimitrios Effrosynidis, Symeon Symeonidis, and Avi Arampatzis

Employing Twitter Hashtags and Linked Data to Suggest Trending Resources in a Digital Library . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 407

Ioannis Papadakis, Konstantinos Kyprianos, Apostolos Karalis, and Christos Douligeris

Information Behavior

Social Tagging: Implications from Studying User Behavior and Institutional Practice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 421

Õnne Mets and Jaagup Kippar

The Ghost in the Museum Website: Investigating the General Public’s Interactions with Museum Websites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 434

David Walsh, Mark Hall, Paul Clough, and Jonathan Foster

Evaluating the Usefulness of Visual Features for Supporting Document Triage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446

Dagmar Kern, Maria Lusky, and Dirk Wacker

Building User Groups Based on a Structural Representation of User Search Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 459

Wilko van Hoek and Zeljko Carevic

Information Retrieval

Multiple Random Walks for Personalized Ranking with Trust and Distrust . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473

Dimitrios Rafailidis and Fabio Crestani

Plagiarism Detection Based on Citing Sentences . . . . . . . . . . . . . . . . . . . . . 485

Sidik Soleman and Atsushi Fujii

Lexicon Induction for Interpretable Text Classification . . . . . . . . . . . . . . . . . 498

Jérémie Clos and Nirmalie Wiratunga

The Clustering-Based Initialization for Non-negative Matrix Factorization in the Feature Transformation of the High-Dimensional Text Categorization System: A Viewpoint of Term Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . 511

Le Nguyen Hoai Nam and Ho Bao Quoc

Short Paper

Analysis of Interactive Multimedia Features in Scientific Publication Platforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 525

Camila Wohlmuth da Silva and Nuno Correia

Extending R2RML with Support for RDF Collections and Containers to Generate MADS-RDF Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 531

Christophe Debruyne, Lucy McKenna, and Declan O’Sullivan

Building the Brazilian Academic Genealogy Tree . . . . . . . . . . . . . . . . . . . . 537

Wellington Dores, Elias Soares, Fabrício Benevenuto, and Alberto H.F. Laender

When a Metadata Provider Task Is Successful . . . . . . . . . . . . . . . . . . . . . . 544

Sarantos Kapidakis

Semantic Enrichment of Web Query Interfaces to Enable Dynamic Deep Linking to Web Information Portals . . . . . . . . . . . . . . . . . . . . . . . . . 553

Arne Martin Klemenz and Klaus Tochtermann

A Complete Year of User Retrieval Sessions in a Social Sciences Academic Search Engine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 560

Philipp Mayr and Ameni Kacem

Social Dendro: Social Network Techniques Applied to Research Data Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 566

Nelson Pereira, João Rocha da Silva, and Cristina Ribeiro

Incidental or Influential? - Challenges in Automatically Detecting Citation Importance Using Publication Full Texts . . . . . . . . . . . . . . . . . . . . 572

David Pride and Petr Knoth

User Interactions with Bibliographic Information Visualizations . . . . . . . . . . 579

Athena Salaba and Tanja Merčun

Towards Building Knowledge Resources from Social Media Using Semantic Roles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 585

Diana Trandabăț

Poster and Demonstration Paper

Towards Finding Animal Replacement Methods . . . . . . . . . . . . . . . . . . . . . 595

Nadine Dulisch and Brigitte Mathiak

Environmental Monitoring of Libraries with MonTreAL . . . . . . . . . . . . . . . 599

Marcel Großmann, Steffen Illig, and Cornelius Matějka

Introducing Solon: A Semantic Platform for Managing Legal Sources . . . . . . 603

Marios Koniaris, George Papastefanatos, Marios Meimaris, and Giorgos Alexiou

Towards a Semantic Search Engine for Scientific Articles . . . . . . . . . . . . . . 608

Bastien Latard, Jonathan Weber, Germain Forestier, and Michel Hassenforder

Development of an RDF-Enabled Cataloguing Tool . . . . . . . . . . . . . . . . . . 612

Lucy McKenna, Marta Bustillo, Tim Keefe, Christophe Debruyne, and Declan O’Sullivan

Towards Semantic Quality Control of Automatic Subject Indexing . . . . . . . . 616

Martin Toepfer and Christin Seifert

Doctoral Consortium Paper

Research Data in Scholarly Practices: Observations of an Interdisciplinary Horizon2020 Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 623

Madeleine Dutoit

Research Data in Norway: How Do Expectations, Demands and Solutions Correspond in the Knowledge Infrastructure for Research Data? . . . . . . . . . . 628

Live Kvale

Top-Down and Bottom-up Approaches to Identify the Users, the Services and the Interface of a 2.0 Digital Library . . . . . . . . . . . . . . . . . 632

Elina Leblanc

Cross-Language Record Linkage Across Humanities Collections Using Metadata Similarities Among Languages . . . . . . . . . . . . . . . . . . . . . . 640

Yuting Song

Machine Learning Architectures for Scalable and Reliable Subject Indexing: Fusion, Knowledge Transfer, and Confidence . . . . . . . . . . . . . . . . 644

Martin Toepfer

Explaining Pairwise Relationships Between Documents . . . . . . . . . . . . . . . . 648

Nils Witt

Studying Conceptual Models for Publishing Library Data to the Semantic Web . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 652

Sofia Zapounidou

Tutorials

Putting Historical Data in Context: How to Use DSpace-GLAM . . . . . . . . . . 659

Andrea Bollini and Claudio Cortese

Innovation Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 661

Michail Salampasis

Enabling Precise Identification and Citability of Dynamic Data: Recommendations of the RDA Working Group on Data Citation . . . . . . . . . 663

Andreas Rauber

Enriching Digital Collections Using Tools for Text Mining, Indexing and Visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 665

Riza Batista-Navarro, Axel J. Soto, Nhung T.H. Nguyen, William Ulate, and Sophia Ananiadou

Workshops

NKOS 2017 – 17th European Networked Knowledge Organization Systems Workshop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 669

Philipp Mayr, Douglas Tudhope, Koraljka Golub, Christian Wartena, and Ernesto William De Luca

MDQual – (Meta)-Data Quality Workshop . . . . . . . . . . . . . . . . . . . . . . . . . 671

Dimitris Gavrilis and Christos Papatheodorou

TDDL 2017 – 1st International Workshop on Temporal Dynamics in Digital Libraries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 673

Annalina Caputo, Nattiya Kanhabua, Pierpaolo Basile, and Séamus Lawless

FUTURITY 2017 – Workshop on Modeling Societal Future . . . . . . . . . . . . 675

Daniela Gîfu and Diana Trandabăţ

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 677
