on enhancing the user experience in web search engines

On Enhancing the User Experience in Web Search Engines

Franco Maria Nardini

About Me

• I joined the HPC Lab in 2006
  – Master Thesis
• Ph.D. in 2011, University of Pisa
  – Thesis: “Query Log Mining to Enhance User Experience in Search Engines”
• mail: [email protected]
• web: http://hpc.isti.cnr.it/~nardini
• skype: francomaria.nardini


Query Suggestion

with Daniele Broccolo, Lorenzo Marcon, Raffaele Perego, Fabrizio Silvestri

Our Contribution: Search Shortcuts

• Search Shortcuts:
  – It uses the “happy ending” stories in the query log to help new users;
• Efficient:
  – All the “stuff” is stored in an inverted index: generating suggestions becomes a retrieval problem;
• Effective: (head, torso, tail)
  – New evaluation methodology confirming this evidence: TREC Diversity Track.

Daniele Broccolo, Lorenzo Marcon, Franco Maria Nardini, Fabrizio Silvestri, Raffaele Perego, Generating Suggestions for Queries in the Long Tail with an Inverted Index, IP&M, 2011.
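The idea of serving suggestions from an inverted index can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation from the cited paper: a "happy ending" session is assumed to be one flagged as successful, the final query of each such session is treated as a virtual document described by the terms of every query in that session, and candidates are scored by plain term overlap. The names `build_index` and `suggest` and the toy sessions are hypothetical.

```python
from collections import defaultdict

def build_index(sessions):
    """Build an inverted index: term -> set of final queries.

    `sessions` is a list of (queries, successful) pairs. Only successful
    ("happy ending") sessions are indexed; the final query of each one acts
    as a virtual document described by the terms of all queries in the session.
    """
    index = defaultdict(set)
    for queries, successful in sessions:
        if not successful or not queries:
            continue
        final_query = queries[-1]
        for query in queries:
            for term in query.lower().split():
                index[term].add(final_query)
    return index

def suggest(index, query, k=3):
    """Return up to k final queries ranked by the number of matching terms."""
    scores = defaultdict(int)
    for term in set(query.lower().split()):
        for final_query in index.get(term, ()):
            if final_query != query:
                scores[final_query] += 1
    # Sort by descending score, then alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [final_query for final_query, _ in ranked[:k]]

# Toy query log (assumed data, for illustration only).
sessions = [
    (["jaguar", "jaguar car price", "jaguar xf price list"], True),
    (["jaguar animal", "jaguar habitat"], True),
    (["jaguar os"], False),  # unsuccessful session: ignored
]
index = build_index(sessions)
suggestions = suggest(index, "jaguar price")
```

Because every lookup is a posting-list access, a query never seen before (the long tail) can still receive suggestions as long as some of its terms appear in the index.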

Some Results

What’s Next?!

• Why not use Machine Learning?
  – Machine learning is helping a lot in the IR community;
  – Better and “fine-grained” ranking, as it could take into account important signals that are not fully exploited nowadays;
  – It may help in filtering redundant suggestions and choosing the “best”, most expressive ones (for each intent).

under exploration with Marcin Sydow (PJIIT), Raffaele Perego, Fabrizio Silvestri

Signals

• Which signals would we like to capture?
  – Relevance to the given query;
  – Diversity with respect to a subtopic list;
  – Serendipity of suggestions;
  – Novelty with respect to news/trends on Twitter;
• How do we catch them?
• How do we combine them?
• The “training” set is a problem.

Query Suggestion: Ranking

• A two-step architecture
  – First step to produce a list of candidates;
  – Second step as an ML architecture composed of two different (cascaded) stages of ranking:
    • First round to rank suggestions w.r.t. the query;
    • Second round to understand “diversity”.
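The cascade can be sketched as follows. This is only an illustration of the control flow, with loud assumptions: a Jaccard term overlap stands in for the learned relevance model of the first round, the second round is a simple greedy pass that prefers suggestions covering subtopics not yet covered, and the subtopic annotations are given by hand. All function names and data are hypothetical.

```python
def relevance(query, suggestion):
    """Stand-in for a learned relevance model: Jaccard overlap of terms."""
    q, s = set(query.split()), set(suggestion.split())
    return len(q & s) / len(q | s)

def first_round(query, candidates, keep=4):
    """Rank candidates by relevance to the query and keep the top ones."""
    ranked = sorted(candidates, key=lambda c: (-relevance(query, c), c))
    return ranked[:keep]

def second_round(ranked, subtopics, k=2):
    """Greedily pick suggestions that cover subtopics not yet covered.

    Ties are broken by first-round position, so relevance still matters.
    """
    selected, covered = [], set()
    remaining = list(ranked)
    while remaining and len(selected) < k:
        best = max(
            remaining,
            key=lambda c: (len(subtopics.get(c, set()) - covered),
                           -ranked.index(c)),
        )
        selected.append(best)
        covered |= subtopics.get(best, set())
        remaining.remove(best)
    return selected

# Step one (candidate generation) is assumed to have produced this list.
query = "apple"
candidates = ["apple iphone", "apple ipad", "apple pie recipe",
              "apple store", "banana bread"]
# Hand-made subtopic labels (assumed data).
subtopics = {
    "apple iphone": {"company"},
    "apple ipad": {"company"},
    "apple store": {"company"},
    "apple pie recipe": {"fruit"},
}
stage_one = first_round(query, candidates)
stage_two = second_round(stage_one, subtopics)
```

Keeping the two rounds separate means the diversity stage only looks at a short, already relevant list, which keeps its cost low.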

Diversification of Web Search Engine Results

with Gabriele Capannini, Raffaele Perego, Fabrizio Silvestri

Our Contribution

• We design a method for efficiently diversifying results from Web search engines.
  – Same effectiveness as other state-of-the-art approaches;
  – Extremely fast in doing the “hard” work;
• Intents behind “ambiguous” queries are mined from query logs;

Capannini G., Nardini F.M., Silvestri F., Perego R., A Search Architecture Enabling Efficient Diversification of Search Results, Proc. DDR Workshop 2011.
Capannini G., Nardini F.M., Silvestri F., Perego R., Efficient Diversification of Web Search Results. Proceedings of VLDB 2011 (PVLDB), Volume 4, Issue 7.
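One way to picture mining intents from a query log is sketched below. It is a rough illustration, not the procedure of the cited papers, and rests on an assumption: an intent is approximated by a "specialization", that is, the query issued right after the ambiguous one in the same session that contains all of its terms. The relative frequencies then serve as intent probabilities. The function name `mine_intents` and the sessions are hypothetical.

```python
from collections import Counter

def mine_intents(sessions, ambiguous_query):
    """Estimate P(specialization | ambiguous_query) from a query log.

    A specialization is taken to be the query issued right after the
    ambiguous one in the same session, provided it contains all its terms.
    """
    base_terms = set(ambiguous_query.split())
    counts = Counter()
    for queries in sessions:
        for current, following in zip(queries, queries[1:]):
            is_specialization = (
                current == ambiguous_query
                and following != current
                and base_terms <= set(following.split())
            )
            if is_specialization:
                counts[following] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {query: n / total for query, n in counts.items()}

# Toy sessions (assumed data): each session is an ordered list of queries.
sessions = [
    ["apple", "apple iphone"],
    ["apple", "apple iphone", "iphone price"],
    ["apple", "apple iphone"],
    ["apple", "apple pie"],
    ["apple", "banana"],          # not a specialization of "apple"
    ["weather", "apple iphone"],  # does not follow the ambiguous query
]
intents = mine_intents(sessions, "apple")
```

Since this statistic depends only on the log, it can be computed offline, which leaves little work for query time.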

Some Results

What’s Next?

• A modern ranking architecture:
  – Effective:
    • Users should be happy with the results they receive;
  – Efficient:
    • Low response times (< 0.1 s);
  – Easy to adapt:
    • Continuous crawling from the Web;
    • Continuous users’ feedback;

with Berkant Barla Cambazoglu (Yahoo! Barcelona), Gabriele Capannini, Raffaele Perego, Fabrizio Silvestri

Let’s Plug It All Together

[Figure: two-phase ranking architecture. Labels: Query, Index, First Phase, Second Phase, BM25, Scorer_1 … Scorer_n, Scorer_div, SS, Results; callout: “Possible intents behind the query”]

• A way to efficiently diversify results for “ambiguous” queries;
• SS teaches how to “diversify” the current user query;
• Scorer_div computes the diversity “signal” of each document and reranks the final results list;
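A second-phase diversity scorer could look roughly like the sketch below. It is a generic greedy, intent-aware re-ranking, not the algorithm of the cited papers. It assumes each first-phase result carries a base score and a per-intent match value in [0, 1], that intent probabilities are available, and that the two signals are mixed linearly with a weight `lam`. All of these names and numbers are hypothetical.

```python
def rerank(first_phase, intents, lam=0.5):
    """Greedy intent-aware re-ranking of first-phase results.

    first_phase: list of (doc_id, base_score, {intent: match in [0, 1]}).
    intents: {intent: probability}, e.g. estimated from a query log.
    """
    weights = dict(intents)  # intent mass not yet covered by chosen docs
    remaining = list(first_phase)
    ranking = []
    while remaining:
        def gain(item):
            _, base_score, match = item
            diversity = sum(w * match.get(i, 0.0) for i, w in weights.items())
            return (1 - lam) * base_score + lam * diversity

        best = max(remaining, key=gain)
        ranking.append(best[0])
        remaining.remove(best)
        # Discount the intents that the chosen document already satisfies.
        for intent, value in best[2].items():
            if intent in weights:
                weights[intent] *= 1.0 - value
    return ranking

# Assumed inputs: intent probabilities and first-phase (e.g. BM25) output.
intents = {"apple iphone": 0.75, "apple pie": 0.25}
first_phase = [
    ("d1", 0.9, {"apple iphone": 1.0}),
    ("d2", 0.8, {"apple iphone": 1.0}),
    ("d3", 0.6, {"apple pie": 1.0}),
]
reranked = rerank(first_phase, intents)
```

In this toy run the document about the less likely intent moves ahead of the second document about the dominant intent, because the dominant intent is already covered once `d1` is chosen.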

Retrieval over Query Sessions

with M-Dyaa Albakour (University of Glasgow)

Main Goals

• Question 1)
  – Can Web search engines improve their performance by using previous user interactions (including previous queries, clicks on ranked results, dwell times, etc.)?
• Question 2)
  – How do we evaluate system performance over an entire query session instead of a single query?

TREC Session Track

• Two editions of the challenge: 2010, 2011
  – query, previous queries;
  – urls + docs, urls + docs + dwell time;
  – Two different evaluations: last subtopic, all subtopics.
• “Query expansion” with Search Shortcuts:
  – weighted by means of user interaction data;
  – “history-based” recommendation;
• Follow-up with tuning of the parameters.

Ibrahim Adeyanju, Franco Maria Nardini, M-Dyaa Albakour, Dawei Song, Udo Kruschwitz, RGU-ISTI-Essex at TREC 2011 Session Track, TREC Conference, 2011.
Franco Maria Nardini, M-Dyaa Albakour, Ibrahim Adeyanju, Udo Kruschwitz, Studying Search Shortcuts in a Query Log to Improve Retrieval Over Query Sessions, SIR 2012 in conjunction with ECIR 2012.
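The weighting step alone can be illustrated with a small sketch. It leaves out the Search Shortcuts recommender and is not the scheme used in the TREC runs. The assumption is that each previous query in the session is trusted in proportion to the dwell time spent on its clicked results, capped at a hypothetical `max_dwell` of 30 seconds, and that its terms are added to the current query with that weight.

```python
from collections import defaultdict

def expand_query(current_query, history, max_dwell=30.0):
    """Return {term: weight} for the current query plus session history.

    history: list of (previous_query, dwell_seconds) pairs, where
    dwell_seconds is the total time spent on results clicked for that
    query (0 if nothing was clicked). Longer dwell time means more trust.
    """
    weights = defaultdict(float)
    for term in current_query.split():
        weights[term] = 1.0  # terms of the current query get full weight
    for previous_query, dwell_seconds in history:
        trust = min(dwell_seconds, max_dwell) / max_dwell
        if trust <= 0.0:
            continue  # no interaction: the previous query adds nothing
        for term in previous_query.split():
            weights[term] = max(weights[term], trust)
    return dict(weights)

# Assumed session history: (previous query, dwell time in seconds).
history = [
    ("pisa hotels", 45.0),
    ("cheap flights", 0.0),
    ("pisa tower tickets", 15.0),
]
expanded = expand_query("pisa tower", history)
```

The cap and the linear trust function are exactly the kind of parameters that a follow-up tuning step would revisit.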

Some Results

What’s Next?

• Entity-based representation of the user session.
  – to reduce the “sparsity” of the space.

Challenges

• How do those systems really affect (and modify) the behavior of the user?
  – Is it possible to quantify it? (metrics?)
  – What do we need to observe?
• Toward the “perfect result page”:
  – accurate models for blending different sources of results.

Little Announcement

http://tf.isti.cnr.it

• Models and Techniques for Tourist Facilities
• Evaluation and Test Collections
• User Interaction and Interfaces

Paper Deadline: 06/25/2012

Questions!?!
