Complex Network Analysis Reveals Kernel-Periphery Structure in Web Search Queries
Query Representation and Understanding Workshop 2011 (QRU '11), ACM SIGIR 2011, Beijing, China
Rishiraj Saha Roy and Niloy Ganguly (IIT Kharagpur, India); Monojit Choudhury (Microsoft Research India, India); Naveen Kumar Singh (NIT Durgapur, India)


Page 1: Complex Network Analysis Reveals Kernel-Periphery Structure in Web Search Queries

Complex Network Analysis Reveals Kernel-Periphery Structure in Web Search Queries

Query Representation and Understanding Workshop 2011 (QRU '11)

ACM SIGIR 2011, Beijing, China

Rishiraj Saha Roy and Niloy Ganguly
IIT Kharagpur, India

Monojit Choudhury
Microsoft Research India, India

Naveen Kumar Singh
NIT Durgapur, India

Page 2: Complex Network Analysis Reveals Kernel-Periphery Structure in Web Search Queries

Language of Queries

Interaction between users and search engines over the years has resulted in the evolution of a distinct language for Web search queries.

gprs config samsung focus at&t

samsung focus at&t gprs config

focus config at&t gprs samsung

April 19, 2023 Query Representation and Understanding 2011 (QRU '11) 2

Page 3: Complex Network Analysis Reveals Kernel-Periphery Structure in Web Search Queries

Language of Queries

How can we begin to analyze this new language?


Page 4: Complex Network Analysis Reveals Kernel-Periphery Structure in Web Search Queries

Complex Networks

Real-life networks are not easily explained by standard topologies

Applications to linguistics – word co-occurrences, consonant inventories, syntactic and semantic features, language dynamics


Page 5: Complex Network Analysis Reveals Kernel-Periphery Structure in Web Search Queries

Complex Networks

Word co-occurrence networks: Interesting tool to discover fundamental properties of a language


Page 6: Complex Network Analysis Reveals Kernel-Periphery Structure in Web Search Queries

Data

16.7 million entries sampled from Bing query logs from Australia (February – May 2009)

Courtesy: Microsoft India Development Center


Page 7: Complex Network Analysis Reveals Kernel-Periphery Structure in Web Search Queries

Network Models for Queries

“gprs” “config” “samsung focus” “at&t”

“dell laptop” “extreme” “gaming” “config”

[Figure: network built from the two segmented queries above. Nodes: samsung focus, config, gprs, extreme, gaming, dell laptop, at&t. Labels: global co-occurrence, edge restriction, local co-occurrence.]
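The construction on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: it assumes "global co-occurrence" links every pair of segments that appear in the same query, and "local co-occurrence" links only adjacent segments. The function name `build_cooccurrence_network` is hypothetical, and the edge restriction step named on the slide is not implemented because the slide does not define it.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_network(segmented_queries, local=False):
    """Return {segment: set of neighbouring segments}.

    Assumption: local=False links all segment pairs within a query
    ("global" co-occurrence); local=True links adjacent segments only.
    """
    graph = defaultdict(set)
    for segments in segmented_queries:
        for seg in segments:
            graph[seg]  # touching the key registers segments with no edges
        if local:
            pairs = zip(segments, segments[1:])
        else:
            pairs = combinations(segments, 2)
        for a, b in pairs:
            if a != b:  # ignore self-loops from repeated segments
                graph[a].add(b)
                graph[b].add(a)
    return dict(graph)


# The two segmented queries shown on the slide.
queries = [
    ["gprs", "config", "samsung focus", "at&t"],
    ["dell laptop", "extreme", "gaming", "config"],
]
global_net = build_cooccurrence_network(queries)
local_net = build_cooccurrence_network(queries, local=True)
```

Under these assumptions, "config" is the only segment shared by both queries, so it is the node that joins the two halves of the network.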

Page 8: Complex Network Analysis Reveals Kernel-Periphery Structure in Web Search Queries

Two-regime Power Law

Two-regime power law in degree distribution

Similar coefficients for queries and English

Kernel (K-Lex) and peripheral (P-Lex) lexicon distinction
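One way to look for a two-regime power law is to fit two straight lines to the degree distribution on log-log axes and pick the breakpoint that minimizes the total squared error. The sketch below does this with plain least squares. It is an assumption-laden illustration, not the fitting procedure used in the talk: the use of the cumulative distribution, the exhaustive breakpoint search, and all function names are choices made here.

```python
import math
from collections import Counter


def cumulative_degree_distribution(degrees):
    """Return sorted (k, P(K >= k)) pairs for the positive degrees."""
    counts = Counter(d for d in degrees if d > 0)
    total = sum(counts.values())
    points, remaining = [], total
    for k in sorted(counts):
        points.append((k, remaining / total))
        remaining -= counts[k]
    return points


def fit_line(xs, ys):
    """Ordinary least squares: returns (slope, intercept, squared error)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    err = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, err


def fit_two_regimes(points):
    """Try every breakpoint; keep the split with the lowest total error."""
    if len(points) < 4:
        raise ValueError("need at least 4 distinct degrees")
    xs = [math.log(k) for k, _ in points]
    ys = [math.log(p) for _, p in points]
    best = None
    for i in range(2, len(points) - 1):  # at least 2 points per regime
        s1, _, e1 = fit_line(xs[:i], ys[:i])
        s2, _, e2 = fit_line(xs[i:], ys[i:])
        if best is None or e1 + e2 < best[0]:
            best = (e1 + e2, points[i][0], s1, s2)
    _, breakpoint_degree, slope_low, slope_high = best
    return breakpoint_degree, slope_low, slope_high


# Synthetic check (not data from the talk): exponent -1 up to k = 10, then -3.
synthetic = [(k, k ** -1.0) for k in range(1, 11)]
synthetic += [(k, 0.1 * (k / 10) ** -3.0) for k in range(11, 101)]
breakpoint_degree, slope_low, slope_high = fit_two_regimes(synthetic)
```

In this reading, nodes with degree above the breakpoint would be candidates for the kernel lexicon and the rest for the peripheral lexicon; the slide does not state the exact criterion.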

Page 9: Complex Network Analysis Reveals Kernel-Periphery Structure in Web Search Queries

Insights (1)

Differences in compositions of K-Lex and P-Lex

Heads and modifiers

K-Lex (popular segments)    P-Lex (rarer segments)
how to                      matthew brodrick
wiki                        accessories
free                        police officer
and                         who is
in australia                epson tx800
videos                      star trek next gen
real estate                 adams apple
difference between          harvard university
windows xp                  leukemia

K-Lex and P-Lex | Higher mean shortest paths | Less tight kernel | More k-p edges | Socio-cultural effects

Page 10: Complex Network Analysis Reveals Kernel-Periphery Structure in Web Search Queries

Insights (2)

Higher mean shortest path in query networks

Peripheral units can independently form queries

More difficult to understand the context of a previously unseen unit

High surprise factor

[Figure: example query network. Node labels: airedale terrier, tumor, where, download, prison break.]
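The mean shortest path mentioned on this slide can be computed with a breadth-first search from every node. The sketch below assumes an unweighted, undirected network stored as a dictionary of neighbour sets and averages only over pairs that are actually connected. The function name and the toy graph are hypothetical, and the talk's exact definition may differ.

```python
from collections import deque


def mean_shortest_path(graph):
    """Average hop distance over all ordered pairs of connected nodes."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in graph.get(node, ()):
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else 0.0


# Hypothetical toy networks, not taken from the talk's data.
chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
star = {"hub": {"x", "y", "z"}, "x": {"hub"}, "y": {"hub"}, "z": {"hub"}}
chain_mean = mean_shortest_path(chain)
star_mean = mean_shortest_path(star)
```

A network whose peripheral units attach to only a few other nodes tends to have longer average paths than one in which every unit links directly to a tightly connected core.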

Page 11: Complex Network Analysis Reveals Kernel-Periphery Structure in Web Search Queries

Insights (3)

Kernel is less tightly coupled

98% of edges run between kernel and periphery, while intra-kernel edges dominate in English

Socio-cultural factors govern kernel-periphery distinction (lyrics, movies, adelaide in K-Lex; code, accessories, delhi in P-Lex)
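A share such as the 98% figure above can be measured by classifying each edge by whether its endpoints lie in the kernel. The sketch below is a minimal illustration. Selecting the kernel by a minimum degree is an assumption made here, and the names `kernel_by_degree`, `edge_class_fractions`, and the toy graph are hypothetical.

```python
def kernel_by_degree(graph, min_degree):
    """Assumed kernel criterion: nodes with at least min_degree neighbours."""
    return {node for node, nbs in graph.items() if len(nbs) >= min_degree}


def edge_class_fractions(graph, kernel):
    """Fraction of undirected edges in each kernel/periphery class."""
    labels = ["periphery-periphery", "kernel-periphery", "kernel-kernel"]
    counts = {label: 0 for label in labels}
    seen = set()
    for a, neighbours in graph.items():
        for b in neighbours:
            edge = frozenset((a, b))
            if a == b or edge in seen:
                continue  # skip self-loops and the reverse copy of an edge
            seen.add(edge)
            # Number of endpoints inside the kernel: 0, 1, or 2.
            counts[labels[(a in kernel) + (b in kernel)]] += 1
    total = sum(counts.values())
    return {k: (v / total if total else 0.0) for k, v in counts.items()}


# Hypothetical toy network: two linked hubs, each with three leaf nodes.
toy = {
    "h1": {"h2", "p1", "p2", "p3"},
    "h2": {"h1", "p4", "p5", "p6"},
    "p1": {"h1"}, "p2": {"h1"}, "p3": {"h1"},
    "p4": {"h2"}, "p5": {"h2"}, "p6": {"h2"},
}
toy_kernel = kernel_by_degree(toy, min_degree=3)
fractions = edge_class_fractions(toy, toy_kernel)
```

In the toy network six of the seven edges join a hub to a leaf, which mirrors in miniature the kernel-periphery dominance reported for query networks.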

