ddc2011 - association


Upload: buhwan-jeong

Post on 13-Jul-2015

Category:

Technology

TRANSCRIPT

Page 1: DDC2011 - Association

Inception: How to guide users where they want to go

DATA MINING

Page 2: DDC2011 - Association

User-Intended Guide Search

Page 3: DDC2011 - Association

Web Cloud

Robot:: Search docs having ‘ranking’

Robot:: Display documents

Information Retrieval

Page 4: DDC2011 - Association

Web Cloud

Search Engine

Robot (DAUMOA)

Keyword

N

F

Crawling

Indexing

Ranking
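
A minimal sketch of the indexing and ranking steps named on this slide, assuming a toy in-memory corpus. The function names, the sample documents, and the term-frequency ranking are illustrative assumptions, not the engine described in the talk.

```python
from collections import defaultdict

def build_index(docs):
    """Indexing: map each lowercase term to the set of doc ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, docs, keyword):
    """Ranking (assumed): order matching docs by how often the keyword occurs."""
    term = keyword.lower()
    hits = index.get(term, set())
    return sorted(hits, key=lambda d: (-docs[d].lower().split().count(term), d))

# Hypothetical documents standing in for crawled pages.
docs = {
    "d1": "ranking documents by relevance",
    "d2": "crawling the web cloud",
    "d3": "ranking ranking and indexing",
}
index = build_index(docs)
results = search(index, docs, "ranking")
```

Here `results` lists `d3` before `d1` because the keyword occurs twice in `d3`.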

Page 5: DDC2011 - Association

Web Cloud
• Crawling
• Indexing
• Ranking

Search Engine

Keyword

Make users search correctly

Page 6: DDC2011 - Association

How to Search?

Page 7: DDC2011 - Association

Search by Typing

Users’ Intention

Page 8: DDC2011 - Association

Search by Clicking

Provider’s Intention

Page 9: DDC2011 - Association

Guide Query
User Query

Page 10: DDC2011 - Association

Search by Clicking?

In response to user action

Page 11: DDC2011 - Association

Guide Query
User Query

User-Intended Guide Query

Page 12: DDC2011 - Association

Why?
• Correctness
• Ease of use
• Business

Page 13: DDC2011 - Association

Suggest, Speller, Association

Page 14: DDC2011 - Association

Anatomy of Association

Page 15: DDC2011 - Association

101: Introduction to Association

Page 16: DDC2011 - Association

Abstraction

Apple

Page 17: DDC2011 - Association
Page 18: DDC2011 - Association

Clustering & Diversifying

Page 19: DDC2011 - Association
Page 20: DDC2011 - Association
Page 21: DDC2011 - Association

Plausible (Fishable?) Options

Page 22: DDC2011 - Association
Page 23: DDC2011 - Association

Association
• Associated words with the query
• Answers to the query
• Additional information for the query
• Query expansion or contraction
• Query correction/reformulation
• Query pattern
• Recent issues related to the query

Page 24: DDC2011 - Association

201: Construction of Associations

Page 25: DDC2011 - Association

Link. Sink. Rank.

Page 26: DDC2011 - Association

Link
• Keywords in sequential search
• Click keywords of same doc.
• Query keywords that display same doc.
• Keywords from same documents
• Contents & rule-based keywords

Sink
• Taboo keywords
• Incorrect/mis-typed keywords
• Morphologically-identical keywords
• Representative keywords (+)

Rank
• More connections get more relevance
• Click-through rate
• Business-intensified keywords
• Human-intervention
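
The three stages can be sketched as a small pipeline. This is an illustration under assumptions: the function names, the sample pairs, and the use of raw pair counts as the ranking score are hypothetical. The slide only states that more connections get more relevance.

```python
from collections import Counter

def link(pairs_by_source):
    """Link: pool candidate (query, associated keyword) pairs from every source."""
    counts = Counter()
    for pairs in pairs_by_source.values():
        counts.update(pairs)
    return counts

def sink(counts, taboo):
    """Sink: drop pairs touching a taboo keyword or mapping a keyword to itself."""
    return Counter({
        (q, a): n for (q, a), n in counts.items()
        if q != a and q not in taboo and a not in taboo
    })

def rank(counts, query, top_k=3):
    """Rank: more connections get more relevance (count used as score here)."""
    scored = [(a, n) for (q, a), n in counts.items() if q == query]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return [a for a, _ in scored[:top_k]]

# Hypothetical candidate pairs from three of the sources on the slide.
sources = {
    "sequential": [("apple", "iphone"), ("apple", "iphone"), ("apple", "fruit")],
    "click": [("apple", "iphone"), ("apple", "banned")],
    "query": [("apple", "fruit"), ("apple", "apple")],
}
suggestions = rank(sink(link(sources), taboo={"banned"}), "apple")
```

With this sample data, `suggestions` holds `iphone` first (three connections) and `fruit` second (two). The taboo and self-referential pairs are sunk.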

Page 27: DDC2011 - Association

Sequential Keywords

Page 28: DDC2011 - Association

Associated

Not Associated

Page 29: DDC2011 - Association

{keyword A} → {keyword B} [Korean example keywords lost to encoding]
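
A minimal sketch of mining sequential keywords, assuming each session is an ordered list of queries typed by one user. The session data and the function name are hypothetical. The slides do not say how sessions are delimited.

```python
from collections import Counter

def sequential_pairs(sessions):
    """Count (previous query -> next query) pairs within each session."""
    counts = Counter()
    for queries in sessions:
        for prev, nxt in zip(queries, queries[1:]):
            if prev != nxt:  # ignore an identical query typed twice in a row
                counts[(prev, nxt)] += 1
    return counts

# Hypothetical search sessions.
sessions = [
    ["apple", "iphone", "iphone case"],
    ["apple", "iphone"],
    ["apple", "apple", "fruit"],
]
pairs = sequential_pairs(sessions)
```

Pairs that recur across sessions, such as `("apple", "iphone")` here, become candidates for a rule of the form {A} → {B}.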

Page 30: DDC2011 - Association

Click Keywords

Page 31: DDC2011 - Association

Click Keywords
[Korean example keywords lost to encoding: the slide lists six clicked keywords followed by "...", then pairs them into sets of the form {K1, K2}, {K1, K3}, {K1, K4}, …, {K4, K6}, {K5, K6}]

Page 32: DDC2011 - Association

[Korean example keywords lost to encoding: the same pair {A, B} is shown four times, leading to the rule below]

{A} → {B}
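
A sketch of deriving click-keyword pairs, assuming a log of (keyword, clicked document) records. The log format, the `min_support` threshold, and the sample data are assumptions for illustration only.

```python
from collections import Counter
from itertools import combinations

def co_click_pairs(click_log):
    """Group keywords by clicked document, then count unordered keyword pairs."""
    by_doc = {}
    for keyword, doc_id in click_log:
        by_doc.setdefault(doc_id, set()).add(keyword)
    counts = Counter()
    for keywords in by_doc.values():
        for a, b in combinations(sorted(keywords), 2):
            counts[(a, b)] += 1
    return counts

def frequent_pairs(counts, min_support=2):
    """Keep only pairs seen on at least min_support documents (assumed cutoff)."""
    return sorted(pair for pair, n in counts.items() if n >= min_support)

# Hypothetical click log.
click_log = [
    ("apple", "doc1"), ("iphone", "doc1"),
    ("apple", "doc2"), ("iphone", "doc2"), ("fruit", "doc2"),
    ("fruit", "doc3"),
]
counts = co_click_pairs(click_log)
kept = frequent_pairs(counts)
```

Only `("apple", "iphone")` survives the cutoff because it is the only pair clicked together on two documents.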

Page 33: DDC2011 - Association

Query Keywords

Page 34: DDC2011 - Association

{keyword A} → {keyword B} [Korean example keywords lost to encoding]
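
Query keywords are described earlier in the deck as query keywords that display the same document. One way to sketch that idea is to compare result lists. The Jaccard overlap measure, the 0.5 threshold, and the sample data are assumptions, not taken from the slides.

```python
def jaccard(a, b):
    """Overlap between two result lists, from 0.0 (disjoint) to 1.0 (identical)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def related_queries(results_by_query, query, threshold=0.5):
    """Return other queries whose displayed documents overlap enough with query's."""
    base = results_by_query[query]
    return sorted(
        other for other, docs in results_by_query.items()
        if other != query and jaccard(base, docs) >= threshold
    )

# Hypothetical mapping from query to the documents it displayed.
results_by_query = {
    "apple": ["d1", "d2", "d3"],
    "apple inc": ["d1", "d2"],
    "banana": ["d9"],
}
related = related_queries(results_by_query, "apple")
```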

Page 35: DDC2011 - Association

SK : CK : QK = 70% : 10% : 20%
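
SK, CK, and QK presumably abbreviate the sequential, click, and query keywords introduced on the preceding slides. The slide gives only the ratio and does not say whether it describes each source's share of the associations or a blending weight. The sketch below assumes the second reading. The function name and the per-source scores are hypothetical.

```python
# Assumption: the 70:10:20 ratio is applied as blending weights.
WEIGHTS = {"SK": 0.7, "CK": 0.1, "QK": 0.2}

def blended_score(scores):
    """Combine per-source scores (each assumed to lie in [0, 1]) into one value."""
    return sum(WEIGHTS[source] * scores.get(source, 0.0) for source in WEIGHTS)

# A candidate seen strongly in clicks, moderately in sequences, never in queries.
score = blended_score({"SK": 0.5, "CK": 1.0, "QK": 0.0})
```

Under this reading a candidate supported only by clicks can reach at most 0.1, so sequential evidence dominates.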

Page 36: DDC2011 - Association

Filtering
• Adult keywords
• Copyright keywords
• Privacy keywords / Personal information
• Incorrect/mis-typed keywords (with Speller)
• Morphologically-identical keywords
• Same keystrokes (i.e., Korean ⬌ English)
• Guide/Operation keyword pairs
• Banned: User requests (C/S)
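
A sketch of two of these filters: banned keywords and morphologically identical keywords. It approximates "morphologically identical" as equal after removing case and whitespace, which is a simplifying assumption. The Korean ⬌ English keystroke check is not attempted. All names and data are hypothetical.

```python
def normalize(keyword):
    """Collapse case and whitespace so near-identical forms compare equal."""
    return "".join(keyword.lower().split())

def filter_associations(query, candidates, banned):
    """Drop banned candidates and those identical to the query or an earlier one."""
    banned_norm = {normalize(b) for b in banned}
    seen = {normalize(query)}
    kept = []
    for cand in candidates:
        key = normalize(cand)
        if key in banned_norm or key in seen:
            continue
        seen.add(key)
        kept.append(cand)
    return kept

kept = filter_associations(
    "apple",
    ["Apple", "i phone", "iphone", "adult site", "fruit"],
    banned={"adult site"},
)
```

Here `Apple` duplicates the query, `iphone` duplicates `i phone`, and `adult site` is banned, which leaves `i phone` and `fruit`.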

Page 37: DDC2011 - Association

Collective Intelligence

More is Better

Page 38: DDC2011 - Association

301: Advanced Topics I: Extension

Page 39: DDC2011 - Association

Extensions

Property    | Description
Symmetric   | A → B then B → A
Transitive  | A → B → C then A → C
Triangular  | (A → C) & (B → C) then A → B; (A → B) & (A → C) then B → C
Inclusive   | A ⊃ B → C then A → C; A ≈ B → C then A → C
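
A sketch of three of these extensions over a set of directed association edges. Each function returns only the newly inferred edges. The edge-set representation and the function names are assumptions. The triangular case shown is the shared-source form, (A → B) & (A → C) then B → C.

```python
def symmetric(edges):
    """A -> B then B -> A."""
    return {(b, a) for a, b in edges} - edges

def transitive(edges):
    """A -> B -> C then A -> C (one hop only, no self-loops)."""
    return {
        (a, c)
        for a, b in edges
        for b2, c in edges
        if b == b2 and a != c
    } - edges

def shared_source(edges):
    """(A -> B) & (A -> C) then B -> C.

    The premise is symmetric in B and C, so both B -> C and C -> B are emitted.
    """
    return {
        (b, c)
        for a, b in edges
        for a2, c in edges
        if a == a2 and b != c
    } - edges

chain = {("A", "B"), ("B", "C")}
fan = {("A", "B"), ("A", "C")}
new_symmetric = symmetric({("A", "B")})
new_transitive = transitive(chain)
new_triangular = shared_source(fan)
```

Inferred edges are weaker evidence than observed ones, so a real system would likely rank them lower. The slides do not say how they are weighted.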

Page 40: DDC2011 - Association

Me

C1 C2 C3 Cn

P

S1 S2 Sn

G

U1 U2 Un

Page 41: DDC2011 - Association

Contents & Properties (keep working)

Page 42: DDC2011 - Association

401: Advanced Topics II: System & Service

Page 43: DDC2011 - Association

In Service

DB

Operation: Daum Service
Analytics: SAS System

Index

Daily update 24h MNT

25M

Page 44: DDC2011 - Association
Page 45: DDC2011 - Association

Real-time Adaptive System with MapReduce
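
The slide names MapReduce without saying what is computed. As an illustration, the sketch below counts sequential query pairs with explicit map, shuffle, and reduce phases in plain Python. The job itself, the function names, and the data are assumptions. A real deployment would run on a MapReduce framework rather than in one process.

```python
from collections import defaultdict

def map_phase(session):
    """Map: emit ((previous query, next query), 1) for each consecutive pair."""
    for prev, nxt in zip(session, session[1:]):
        yield (prev, nxt), 1

def shuffle(mapped):
    """Shuffle: group emitted values by key."""
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: sum the counts for one key."""
    return key, sum(values)

def run(sessions):
    mapped = (kv for session in sessions for kv in map_phase(session))
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())

# Hypothetical sessions.
counts = run([["apple", "iphone"], ["apple", "iphone", "case"]])
```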

Page 46: DDC2011 - Association

CARTS

• Coverage
• Accuracy
• Robustness
• Timeliness
• Serendipity

Page 47: DDC2011 - Association

4M