
Lecture Notes in Computer Science 9283

Commenced Publication in 1973
Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board

David Hutchison, Lancaster University, Lancaster, UK

Takeo Kanade, Carnegie Mellon University, Pittsburgh, PA, USA

Josef Kittler, University of Surrey, Guildford, UK

Jon M. Kleinberg, Cornell University, Ithaca, NY, USA

Friedemann Mattern, ETH Zurich, Zürich, Switzerland

John C. Mitchell, Stanford University, Stanford, CA, USA

Moni Naor, Weizmann Institute of Science, Rehovot, Israel

C. Pandu Rangan, Indian Institute of Technology, Madras, India

Bernhard Steffen, TU Dortmund University, Dortmund, Germany

Demetri Terzopoulos, University of California, Los Angeles, CA, USA

Doug Tygar, University of California, Berkeley, CA, USA

Gerhard Weikum, Max Planck Institute for Informatics, Saarbrücken, Germany

More information about this series at http://www.springer.com/series/7409

Josiane Mothe • Jacques Savoy
Jaap Kamps • Karen Pinel-Sauvagnat
Gareth J.F. Jones • Eric SanJuan
Linda Cappellato • Nicola Ferro (Eds.)

Experimental IR Meets Multilinguality, Multimodality, and Interaction

6th International Conference of the CLEF Association, CLEF’15
Toulouse, France, September 8–11, 2015
Proceedings


Editors

Josiane Mothe
Institut de Recherche en Informatique de Toulouse
Toulouse, France

Jacques Savoy
University of Neuchâtel
Neuchâtel, Switzerland

Jaap Kamps
University of Amsterdam
Amsterdam, The Netherlands

Karen Pinel-Sauvagnat
Institut de Recherche en Informatique de Toulouse
Toulouse, France

Gareth J.F. Jones
Dublin City University
Dublin, Ireland

Eric SanJuan
Université d’Avignon et des Pays de Vaucluse
Avignon, France

Linda Cappellato
University of Padua
Padua, Italy

Nicola Ferro
University of Padua
Padua, Italy

ISSN 0302-9743  ISSN 1611-3349 (electronic)
Lecture Notes in Computer Science
ISBN 978-3-319-24026-8  ISBN 978-3-319-24027-5 (eBook)
DOI 10.1007/978-3-319-24027-5

Library of Congress Control Number: 2015947945

LNCS Sublibrary: SL3 – Information Systems and Applications, incl. Internet/Web, and HCI

Springer Cham Heidelberg New York Dordrecht London
© Springer International Publishing Switzerland 2015

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper

Springer International Publishing AG Switzerland is part of Springer Science+Business Media (www.springer.com)

Preface

Since 2000, the Conference and Labs of the Evaluation Forum (CLEF) has played a leading role in stimulating research and innovation in the domain of multimodal and multilingual information access. Initially founded as the Cross-Language Evaluation Forum and running in conjunction with the European Conference on Digital Libraries (ECDL/TPDL), CLEF became a standalone event in 2010, combining a peer-reviewed conference with a multi-track evaluation forum. CLEF 2015 [1] was hosted by the SIG team of the Institut de Recherche en Informatique de Toulouse UMR 5505 CNRS, Université de Toulouse, France.

The CLEF conference addresses all aspects of information access in any modality and language. The conference has a clear focus on experimental IR as done at evaluation forums (CLEF Labs, TREC, NTCIR, FIRE, MediaEval, RomIP, SemEval, TAC, …), paying special attention to the challenges of multimodality, multilinguality, and interactive search. We invited submissions on significant new insights demonstrated on the resulting IR test collections, on analysis of IR test collections and evaluation measures, as well as on concrete proposals to push the boundaries of the Cranfield/TREC/CLEF paradigm. The conference format consisted of keynotes, contributed papers, lab sessions, and poster sessions, including reports from other benchmarking initiatives from around the world. It was an honor and a privilege to have Gregory Grefenstette (INRIA Saclay, France), Mounia Lalmas (Yahoo Labs, London, UK), and Douglas W. Oard (University of Maryland, USA) as keynote speakers. Greg talked about personal information systems and personal semantics, Mounia addressed the topic of user engagement evaluation, and Doug examined issues in privacy and ethics when searching among secrets.

CLEF 2015 received a total of 68 submissions, a dramatic increase over previous years. Each submission was reviewed by at least three PC members, and the two program chairs oversaw the reviewing and often extensive follow-up discussion. Where the discussion was not sufficient to make a decision, the paper went through an extra review by the PC. A novel feature of the CLEF 2015 conference was to invite CLEF 2014 lab organizers to nominate a “best of the labs” paper, which was reviewed as a full paper submission to the CLEF 2015 conference according to the same review criteria and PC. This resulted in 8 accepted full papers, one for each of the CLEF 2014 labs. We received 24 regular full paper submissions, of which 8 (33 %) were accepted for regular oral presentation; a further 7 full paper submissions (29 %, for a total of 63 %) were accepted with a short oral presentation and a poster. We received 36 short paper submissions, and accepted 20 (55 %).

[1] http://clef2015.clef-initiative.eu/

In addition to these talks, the eight benchmarking labs reported the results of their year-long activities in overview talks and lab sessions [2]. The eight labs running as part of CLEF 2015 were as follows:

CLEFeHealth provided scenarios aiming to ease patients’ and nurses’ understanding and accessing of eHealth information. The goals of the lab were to develop processing methods and resources in a multilingual setting, to enrich difficult-to-understand eHealth texts, and to provide valuable documentation. The tasks were: information extraction from clinical data, and user-centered health information retrieval.

ImageCLEF provided four main tasks with a global objective of benchmarking automatic annotation and indexing of images. The tasks tackled different aspects of the annotation problem and aimed at supporting and promoting cutting-edge research addressing the key challenges in the field: image annotation, medical classification, medical clustering, and liver CT annotation.

LifeCLEF provided image-based plant, bird, and fish identification tasks addressing multimodal data by (i) considering birds and fish in addition to plants, (ii) considering audio and video content in addition to images, and (iii) scaling up the evaluation data to hundreds of thousands of life media records and thousands of living species. The tasks were: an audio record-based bird identification task (BirdCLEF), an image-based plant identification task (PlantCLEF), and a fish video surveillance task (FishCLEF).

Living Labs for IR (LL4IR) provided a benchmarking platform for researchers to evaluate their ranking systems in a live setting with real users in their natural task environments. The lab acted as a proxy between commercial organizations (live environments) and lab participants (experimental systems), facilitated data exchange, and made comparisons between the participating systems. The tasks were: product search, and web search.

News Recommendation Evaluation Lab (NEWSREEL) provided two tasks designed to address the challenge of real-time news recommendation. Participants could (a) develop news recommendation algorithms and (b) have them tested by millions of users over the period of a few weeks in a living lab. The tasks were: benchmarking news recommendations in a living lab, and benchmarking news recommendations in a simulated environment.

Uncovering Plagiarism, Authorship, and Social Software Misuse (PAN) provided evaluation of uncovering plagiarism, authorship, and social software misuse. PAN offered three tasks at CLEF 2015 with new evaluation resources consisting of large-scale corpora, performance measures, and web services that allowed for meaningful evaluations. The main goal was to provide sustainable and reproducible evaluations, and to get a clear view of the capabilities of state-of-the-art algorithms. The tasks were: plagiarism detection, author identification, and author profiling.

Question Answering (QA) provided QA from the starting point of a natural language question. However, answering some questions may need to query linked data

[2] The full details for each lab are contained in a separate publication, the Working Notes, which are available online at http://ceur-ws.org/Vol-1391/.


(especially if aggregations or logical inferences are required), whereas some questions may need textual inferences and querying free text; answering some queries may need both. The tasks were: QALD (question answering over linked data), entrance exams (questions from reading tests), BioASQ (large-scale biomedical semantic indexing), and BioASQ (biomedical question answering).

Social Book Search (SBS) provided evaluation of real-world information needs, which are generally complex, yet almost all research focuses instead on either relatively simple search based on queries or recommendation based on profiles. The goal of the Social Book Search Lab was to investigate techniques to support users in complex book search tasks that involve more than just a query and a results list. The tasks were: suggestion track, and interactive track.

A rich social program was organized in conjunction with the conference, starting with a welcome reception with local food and wine specialities, continuing with a city hall reception, which included the local band “La mal Coiffée”. The social dinner was enjoyed in a famous organic restaurant named “Saveur Bio”, and master classes in (1) traditional polyphonic singing, with Bastien Zaoui from the famous Vox Bigerri band, and (2) wine and food pairing, with Yves Cinotti, were also offered.

The success of CLEF 2015 would not have been possible without the huge effort of several people and organizations, including the CLEF Association [3], the Program Committee, the Lab Organizing Committee, the Local Organization Committee in Toulouse, the reviewers, and the many students and volunteers who contributed along the way. We would like to acknowledge the Institut de Recherche en Informatique de Toulouse UMR 5505 CNRS and its director, Prof. Michel Daydé, for the support we received, first in bidding to host the conference and then in organizing it. We also received support from the following universities and schools: Ecole supérieure du professorat et de l’éducation, Université Toulouse-Jean Jaurès, Université Paul Sabatier, and Université du Capitole. We also gratefully acknowledge the support we received from our sponsors: the ESF Research Networking Program ELIAS, ACM SIGIR, the Université Toulouse-Jean Jaurès, and the Région Midi-Pyrénées for their strong financial support; but also Springer, the Université Paul Sabatier, the Institut de Recherche en Informatique de Toulouse UMR 5505 CNRS, INFORSID, Université Toulouse Capitole, EGC, ARIA, and ACL. The level of sponsorship allowed us to offer 20 grants for students, in addition to free registration for the 25 volunteers, including 11 further students.

July 2015

Josiane Mothe
Jacques Savoy
Jaap Kamps
Karen Pinel-Sauvagnat
Gareth J.F. Jones
Eric SanJuan
Linda Cappellato
Nicola Ferro

[3] http://www.clef-initiative.eu/association


Organization

CLEF 2015, Conference and Labs of the Evaluation Forum, Experimental IR Meets Multilinguality, Multimodality, and Interaction, was organized by the University of Toulouse, France.

General Chairs

Josiane Mothe IRIT, Université de Toulouse, France
Jacques Savoy University of Neuchâtel, Switzerland

Program Chairs

Jaap Kamps University of Amsterdam, The Netherlands
Karen Pinel-Sauvagnat IRIT, Université de Toulouse, France

Lab Chairs

Gareth J.F. Jones Dublin City University, Ireland
Eric SanJuan Université d’Avignon et des Pays du Vaucluse, France

Program Committee

Maristella Agosti University of Padua, Italy
Krisztian Balog University of Stavanger, Norway
Patrice Bellot LSIS - Université de Marseille, France
Toine Bogers Aalborg University Copenhagen, Denmark
Mohand Boughanem IRIT - Université Paul Sabatier Toulouse 3, France
Guillaume Cabanac IRIT - Université Paul Sabatier Toulouse 3, France
Tiziana Catarci Università di Roma “La Sapienza”, Italy
Paul Clough University of Sheffield, UK
Nicola Ferro University of Padua, Italy
Norbert Fuhr University of Duisburg-Essen, Germany
Eric Gaussier Université Joseph Fourier (Grenoble I), France
Lorraine Goeuriot Université Joseph Fourier (Grenoble I), France
Julio Gonzalo UNED, Madrid, Spain
Allan Hanbury Vienna University of Technology, Austria
Donna Harman NIST, USA
Djoerd Hiemstra University of Twente, The Netherlands
Frank Hopfgartner University of Glasgow, UK
Gilles Hubert IRIT - Université Paul Sabatier Toulouse 3, France
Peter Ingwersen University of Copenhagen, Denmark
Alexis Joly INRIA Sophia-Antipolis, France
Gareth J.F. Jones Dublin City University, Ireland
Evangelos Kanoulas University of Amsterdam, The Netherlands
Gabriella Kazai Lumi, UK
Jaana Kekäläinen University of Tampere, Finland
Liadh Kelly Trinity College Dublin, Ireland
Benjamin Kille DAI Lab, Berlin Institute of Technology, Germany
Marijn Koolen University of Amsterdam, The Netherlands
Birger Larsen Aalborg University, Denmark
Mihai Lupu Vienna University of Technology, Austria
Thomas Mandl University of Hildesheim, Germany
Henning Müller HES-SO, University of Applied Sciences Western Switzerland, Switzerland
Jian-Yun Nie Université de Montréal, Canada
Iadh Ounis University of Glasgow, UK
Gabriella Pasi Università degli Studi di Milano Bicocca, Italy
Anselmo Peñas NLP and IR Group, UNED, Spain
Benjamin Piwowarski CNRS/Université Pierre et Marie Curie, France
Martin Potthast Bauhaus University Weimar, Germany
Paolo Rosso Technical University of Valencia, Spain
Eric SanJuan Université d’Avignon, France
Ralf Schenkel Universität Passau, Germany
Anne Schuth University of Amsterdam, The Netherlands
Efstathios Stamatatos University of the Aegean, Greece
Benno Stein Bauhaus-Universität Weimar, Germany
Lynda Tamine IRIT - Université Paul Sabatier Toulouse 3, France
Xavier Tannier LIMSI-CNRS, Université Paris-Sud, France
Theodora Tsikrika Information Technologies Institute, CERTH, Greece
Christina Unger CITEC, Universität Bielefeld, Germany
Mauricio Villegas Universitat Politècnica de València, Spain

Local Organization

Adrian Chifu IRIT, Université de Toulouse, France (Sponsoring)
Véronique Debats IRIT, France (Communication)
Marlène Giamporcaro SAIC, INP Toulouse, France (Administration and Registration)
Laure Soulier IRIT, Université de Toulouse, France (Advertising)
Nathalie Valles-Parlengeau IRIT, Université de Toulouse, France (Co-resp. for UT1-Capitole University)


Platinum Sponsors

Silver Sponsors


Bronze Sponsors


Keynotes

Personal Information Systems and Personal Semantics

Gregory Grefenstette

INRIA Saclay, France

People generally think of Big Data as something generated by machines or large communities of people interacting with the digital world. But technological progress means that each individual is currently, or soon will be, generating masses of digital data in their everyday lives. In every interaction with an application, every web page visited, every time your telephone is turned on, you generate information about yourself: Personal Big Data. With the rising adoption of quantified-self gadgets, and the foreseeable adoption of intelligent glasses capturing daily life, the quantity of personal Big Data will only grow. In this Personal Big Data, as in other Big Data, a key problem is aligning concepts in the same semantic space. While concept alignment in the public sphere is an understood, though unresolved, problem, what does ontological organization of a personal space look like? Is it idiosyncratic, or something that can be shared between people? We will describe our current approach to this problem of organizing personal data and creating and exploiting a personal semantics.

Evaluating the Search Experience: From Retrieval Effectiveness to User Engagement

Mounia Lalmas

Yahoo Labs, London, UK

Building retrieval systems that return results to users that satisfy their information need is one thing; Information Retrieval has a long history of evaluating how effective retrieval systems are. Many evaluation initiatives such as TREC and CLEF have allowed organizations worldwide to evaluate and compare retrieval approaches. Building a retrieval system that not only returns good results to users, but does so in a way that makes users want to use that system again, is something more challenging; a positive search experience has been shown to lead to users engaging long-term with the retrieval system. In this talk, I will review state-of-the-art approaches concerned with evaluating retrieval effectiveness. I will then focus on approaches aiming at evaluating user engagement, and describe current work in this area. The talk will end with the proposal of a framework incorporating effectiveness evaluation into user engagement. An important component of this framework is to consider both within- and across-search-session measurement.

Beyond Information Retrieval: When and How Not to Find Things

Douglas W. Oard

University of Maryland, USA

The traditional role of a search engine is much like the traditional role of a library: generally the objective is to help people find things. As we get better at this, however, we have been encountering an increasing number of cases in which some things that we know exist simply should not be found. Some well-known examples include the removal of improperly posted copyrighted material from search engine indexes, and the evolving legal doctrine that is now commonly referred to as the “right to be forgotten.” Some such cases are simple, relying on users to detect specific content that should be flushed from a specific index. Other cases, however, are more complex. For example, in the aspect of the civil litigation process known as e-discovery, one side may be entitled to withhold entire classes of material that may not have been labeled in advance (because of attorney-client privilege). An even more complex example is government transparency, in which for public policy reasons we may want to make some information public, despite that information being intermixed with other information that must be protected. Professional archivists have long dealt with such challenges, so perhaps we should start thinking about how to build search engines that act less like a library and more like an archive. In this talk, I will use these and other examples to introduce the idea of “search among secrets,” in which the goal is to help some users find some content while protecting some content from some users (or some uses). We’ll dive down to look at how this actually works today in a few specific cases, with particular attention to how queries are formulated and which parts of the process are, or might be, automated. With that as background, I will then offer a few initial thoughts on how we might evaluate such systems. I’ll conclude with an invitation to think together about how information retrieval researchers might, together with others, begin to tackle these challenges.

Contents

Experimental IR

Experimental Study on Semi-structured Peer-to-Peer Information Retrieval Network (p. 3)
Rami S. Alkhawaldeh and Joemon M. Jose

Evaluating Stacked Marginalised Denoising Autoencoders Within Domain Adaptation Methods (p. 15)
Boris Chidlovskii, Gabriela Csurka, and Stephane Clinchant

Language Variety Identification Using Distributed Representations of Words and Documents (p. 28)
Marc Franco-Salvador, Francisco Rangel, Paolo Rosso, Mariona Taulé, and M. Antònia Martí

Evaluating User Image Tagging Credibility (p. 41)
Alexandru Lucian Ginsca, Adrian Popescu, Mihai Lupu, Adrian Iftene, and Ioannis Kanellos

Web and Social Media

Tweet Expansion Method for Filtering Task in Twitter (p. 55)
Payam Karisani, Farhad Oroumchian, and Maseud Rahgozar

Real-Time Entity-Based Event Detection for Twitter (p. 65)
Andrew J. McMinn and Joemon M. Jose

A Comparative Study of Click Models for Web Search (p. 78)
Artem Grotov, Aleksandr Chuklin, Ilya Markov, Luka Stout, Finde Xumara, and Maarten de Rijke

Evaluation of Pseudo Relevance Feedback Techniques for Cross Vertical Aggregated Search (p. 91)
Hermann Ziak and Roman Kern

Long Papers with Short Presentation

Analysing the Role of Representation Choices in Portuguese Relation Extraction (p. 105)
Sandra Collovini, Marcelo de Bairros P. Filho, and Renata Vieira

An Investigation of Cross-Language Information Retrieval for User-Generated Internet Video (p. 117)
Ahmad Khwileh, Debasis Ganguly, and Gareth J.F. Jones

Benchmark of Rule-Based Classifiers in the News Recommendation Task (p. 130)
Tomáš Kliegr and Jaroslav Kuchař

Enhancing Medical Information Retrieval by Exploiting a Content-Based Recommender Method (p. 142)
Wei Li and Gareth J.F. Jones

Summarizing Citation Contexts of Scientific Publications (p. 154)
Sandra Mitrović and Henning Müller

A Multiple-Stage Approach to Re-ranking Medical Documents (p. 166)
Heung-Seon Oh, Yuchul Jung, and Kwang-Young Kim

Exploring Behavioral Dimensions in Session Effectiveness (p. 178)
Teemu Pääkkönen, Kalervo Järvelin, Jaana Kekäläinen, Heikki Keskustalo, Feza Baskaya, David Maxwell, and Leif Azzopardi

Short Papers

META TEXT ALIGNER: Text Alignment Based on Predicted Plagiarism Relation (p. 193)
Samira Abnar, Mostafa Dehghani, and Azadeh Shakery

Automatic Indexing of Journal Abstracts with Latent Semantic Analysis (p. 200)
Joel Robert Adams and Steven Bedrick

Shadow Answers as an Intermediary in Email Answer Retrieval (p. 209)
Alyaa Alfalahi, Gunnar Eriksson, and Eriks Sneiders

Are Topically Diverse Documents Also Interesting? (p. 215)
Hosein Azarbonyad, Ferron Saan, Mostafa Dehghani, Maarten Marx, and Jaap Kamps

Modeling of the Question Answering Task in the YodaQA System (p. 222)
Petr Baudiš and Jan Šedivý

Unfair Means: Use Cases Beyond Plagiarism (p. 229)
Paul Clough, Peter Willett, and Jessie Lim

Instance-Based Learning for Tweet Monitoring and Categorization (p. 235)
Julien Gobeill, Arnaud Gaudinat, and Patrick Ruch


Are Test Collections “Real”? Mirroring Real-World Complexity in IR Test Collections (p. 241)
Melanie Imhof and Martin Braschler

Evaluation of Manual Query Expansion Rules on a Domain Specific FAQ Collection (p. 248)
Mladen Karan and Jan Šnajder

Evaluating Learning Language Representations (p. 254)
Jussi Karlgren, Jimmy Callin, Kevyn Collins-Thompson, Amaru Cuba Gyllensten, Ariel Ekgren, David Jurgens, Anna Korhonen, Fredrik Olsson, Magnus Sahlgren, and Hinrich Schütze

Automatic Segmentation and Deep Learning of Bird Sounds (p. 261)
Hendrik Vincent Koops, Jan van Balen, and Frans Wiering

The Impact of Noise in Web Genre Identification (p. 268)
Dimitrios Pritsos and Efstathios Stamatatos

On the Multilingual and Genre Robustness of EmoGraphs for Author Profiling in Social Media (p. 274)
Francisco Rangel and Paolo Rosso

Is Concept Mapping Useful for Biomedical Information Retrieval? (p. 281)
Wei Shen and Jian-Yun Nie

Using Health Statistics to Improve Medical and Health Search (p. 287)
Tawan Sierek and Allan Hanbury

Determining Window Size from Plagiarism Corpus for Stylometric Features (p. 293)
Šimon Suchomel and Michal Brandejs

Effect of Log-Based Query Term Expansion on Retrieval Effectiveness in Patent Searching (p. 300)
Wolfgang Tannebaum, Parvaz Mahdabi, and Andreas Rauber

Integrating Mixed-Methods for Evaluating Information Access Systems (p. 306)
Simon Wakeling and Paul Clough

Teaching the IR Process Using Real Experiments Supported by Game Mechanics (p. 312)
Thomas Wilhelm-Stein and Maximilian Eibl

Tweet Contextualization Using Association Rules Mining and DBpedia (p. 318)
Meriem Amina Zingla, Chiraz Latiri, and Yahya Slimani


Best of the Labs

Search-Based Image Annotation: Extracting Semantics from Similar Images (p. 327)
Petra Budikova, Michal Batko, Jan Botorek, and Pavel Zezula

NLP-Based Classifiers to Generalize Expert Assessments in E-Reputation (p. 340)
Jean-Valère Cossu, Emmanuel Ferreira, Killian Janod, Julien Gaillard, and Marc El-Bèze

A Method for Short Message Contextualization: Experiments at CLEF/INEX (p. 352)
Liana Ermakova

Towards Automatic Large-Scale Identification of Birds in Audio Recordings (p. 364)
Mario Lasseck

Optimizing and Evaluating Stream-Based News Recommendation Algorithms (p. 376)
Andreas Lommatzsch and Sebastian Werner

Information Extraction from Clinical Documents: Towards Disease/Disorder Template Filling (p. 389)
Veera Raghavendra Chikka, Nestor Mariyasagayam, Yoshiki Niwa, and Kamalakar Karlapalem

Adaptive Algorithm for Plagiarism Detection: The Best-Performing Approach at PAN 2014 Text Alignment Competition (p. 402)
Miguel A. Sanchez-Perez, Alexander Gelbukh, and Grigori Sidorov

Question Answering via Phrasal Semantic Parsing (p. 414)
Kun Xu, Yansong Feng, Songfang Huang, and Dongyan Zhao

Labs Overviews

Overview of the CLEF eHealth Evaluation Lab 2015 (p. 429)
Lorraine Goeuriot, Liadh Kelly, Hanna Suominen, Leif Hanlen, Aurélie Névéol, Cyril Grouin, João Palotti, and Guido Zuccon

General Overview of ImageCLEF at the CLEF 2015 Labs (p. 444)
Mauricio Villegas, Henning Müller, Andrew Gilbert, Luca Piras, Josiah Wang, Krystian Mikolajczyk, Alba G. Seco de Herrera, Stefano Bromuri, M. Ashraful Amin, Mahmood Kazi Mohammed, Burak Acar, Suzan Uskudarli, Neda B. Marvasti, José F. Aldana, and María del Mar Roldán García


LifeCLEF 2015: Multimedia Life Species Identification Challenges (p. 462)
Alexis Joly, Hervé Goëau, Hervé Glotin, Concetto Spampinato, Pierre Bonnet, Willem-Pier Vellinga, Robert Planqué, Andreas Rauber, Simone Palazzo, Bob Fisher, and Henning Müller

Overview of the Living Labs for Information Retrieval Evaluation (LL4IR) CLEF Lab 2015 (p. 484)
Anne Schuth, Krisztian Balog, and Liadh Kelly

Stream-Based Recommendations: Online and Offline Evaluation as a Service (p. 497)
Benjamin Kille, Andreas Lommatzsch, Roberto Turrin, András Serény, Martha Larson, Torben Brodt, Jonas Seiler, and Frank Hopfgartner

Overview of the PAN/CLEF 2015 Evaluation Lab (p. 518)
Efstathios Stamatatos, Martin Potthast, Francisco Rangel, Paolo Rosso, and Benno Stein

Overview of the CLEF Question Answering Track 2015 (p. 539)
Anselmo Peñas, Christina Unger, Georgios Paliouras, and Ioannis Kakadiaris

Overview of the CLEF 2015 Social Book Search Lab (p. 545)
Marijn Koolen, Toine Bogers, Maria Gäde, Mark Hall, Hugo Huurdeman, Jaap Kamps, Mette Skov, Elaine Toms, and David Walsh

Author Index (p. 565)
