Evaluating Heterogeneous Information Access (Position Paper)
DESCRIPTION
Information access is becoming increasingly heterogeneous. We need to better understand the more complex user behaviour in this context so that we can properly evaluate search systems dealing with heterogeneous information. In this paper, we review the main challenges associated with evaluating search in this context and propose some avenues for incorporating user aspects into evaluation measures. Full paper at http://www.dcs.gla.ac.uk/~mounia/Papers/mube2013.pdf. To be presented at the SIGIR 2013 MUBE workshop. PhD work/reflection/conclusions of Ke (Adam) Zhou.
TRANSCRIPT
Evaluating Heterogeneous Information Access (Position Paper)
Ke Zhou1, Tetsuya Sakai2, Mounia Lalmas3, Zhicheng Dou2 and Joemon M. Jose1
1University of Glasgow 2Microsoft Research Asia 3Yahoo! Labs Barcelona
SIGIR 2013 MUBE workshop
IR Evaluation
• System-oriented Evaluation (test collection + metrics)
• User-oriented Evaluation (interactive user study)
• Current endeavors to incorporate the user into system-oriented metrics
– Time-Biased Gain (Smucker & Clarke)
– U-measure (Sakai & Dou)
– etc.
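As an illustration of how such metrics fold user behavior into a system-oriented score, here is a minimal sketch of Time-Biased Gain (Smucker & Clarke). It assumes per-rank gain values and the cumulative expected time to reach each rank are given, and uses the 224-second half-life reported in the original paper; these inputs and the parameter value are assumptions for illustration.

```python
import math

def time_biased_gain(gains, times, half_life=224.0):
    """Time-Biased Gain: discount each document's gain by an exponentially
    decaying function of the expected time at which the user reaches it.

    gains: per-rank gain values (e.g. 1.0 for relevant, 0.0 otherwise)
    times: cumulative expected time in seconds to reach each rank
    half_life: time at which the decay reaches 0.5 (224 s in the paper)
    """
    return sum(g * math.exp(-t * math.log(2) / half_life)
               for g, t in zip(gains, times))
```

For example, a relevant document reached immediately contributes its full gain, while one reached after exactly one half-life (224 s) contributes half.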
The Increasingly Heterogeneous Nature of Search
[Figure: examples of heterogeneous search result pages]
Position
• Compared with traditional homogeneous search, evaluation in the context of heterogeneous information is more challenging and requires taking into account more complex user behaviors and interactions.
Challenges
• Non-linear Traversal Browsing
• Diverse Search Tasks
• Coherence
• Diversity
• Personalization
• etc.
Various Presentation Strategies
[Figure: example result pages illustrating non-linear, blended, and tabbed presentation strategies]
User Browsing Patterns
“E” browsing pattern on an aggregated search page vs. “F” browsing pattern on an organic search page
http://searchengineland.com/eye-tracking-on-universal-and-personalized-search-12233
Non-linear Traversal Browsing
http://searchengineland.com/eye-tracking-on-universal-and-personalized-search-12233
Search Tasks: Vertical Orientation
CIKM’10 (Sushmita et al.), WWW’13 (Zhou et al.)
Search Tasks: Complexity
SIGIR’12 (Arguello et al.)
Coherence
[Figure: two image-vertical blocks compared; one contains a single off-topic “Animal” result among otherwise coherent “Car” results]
CIKM’12 (Arguello et al.)
Coherence
[Figure: two image-vertical blocks compared; one heavily mixes “Car” and “Animal” results]
CIKM’12 (Arguello et al.)
Diversity
[Figure: a result page with a single Image vertical vs. one with News + Map + Image verticals]
SIGIR’12 (Zhou et al.)
Personalization
SIGIRer vs. Average User
Avenues of Research
• Be^er understanding of users – Click models: WSDM’12 (Chen et al.), SIGIR’13 (Wang et al.)
– Ver&cal orienta&on: CIKM’10 (Sushmita et al.), WWW’13 (Zhou et al.)
– Task complexity: SIGIR’12 (Arguello et al.) – Task coherence: CIKM’12 (Arguello et al.) – Diversity: SIGIR’12 (Zhou et al.) – Personaliza&on – Non-‐linear presenta&on strategies
Avenues of Research
• Better incorporation of learned user behavior into evaluation metrics
– Follow SIGIR’13 (Chuklin et al.) and convert the obtained aggregated-search click models into system-oriented evaluation metrics.
– Model additional features within a powerful evaluation framework (e.g. TBG, U-measure, AS-metric).
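To make the click-model-to-metric conversion concrete, here is a minimal sketch in the spirit of that approach: weight each result's relevance by the probability that a user examines it, as a click model would estimate. The cascade-style examination model below is an assumed illustrative form, not the specific models used in the cited work.

```python
def expected_utility(relevances, exam_probs):
    """Expected-utility metric: sum of per-rank relevance weighted by
    the probability that the user examines that rank, as estimated by
    a click model trained on search logs.

    relevances: graded relevance per rank, mapped to [0, 1]
    exam_probs: per-rank examination probabilities from the click model
    """
    return sum(r * p for r, p in zip(relevances, exam_probs))

def cascade_exam_probs(relevances):
    """A simple cascade-style examination model (assumed form): the user
    scans top-down and examines rank k only if no earlier result
    satisfied them, stopping with probability equal to the relevance."""
    probs, p_continue = [], 1.0
    for r in relevances:
        probs.append(p_continue)
        p_continue *= (1.0 - r)  # continue only past unsatisfying results
    return probs
```

For instance, with two results of relevance 0.5 each, the second is examined with probability 0.5, so the expected utility is 0.5 * 1.0 + 0.5 * 0.5 = 0.75.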
Thank you! Questions?
Ke Zhou, [email protected]
TREC 2013 FedWeb track: https://sites.google.com/site/trecfedweb/