ecir 2013 keynote - time for events

Post on 28-Nov-2014


time for events: telling the world’s stories from social media

Mor Naaman, Rutgers SC&I & Mahaya, Inc.

@informor

enter: social media

(JCDL 2007)

(JCDL 2007)

yes.

(SIGIR 2007)

organize the world’s memories

people, together

BYOBW

outside lands festival

organize the world’s memories

objectives: detect, identify, organize

objectives: detect (ICWSM 2011a, JASIST 2011, WebDB 2009, SIGIR 2007)

objectives: identify (WSDM 2012, ICWSM 2011b, WSDM 2010)

objectives: organize (ICMR 2012, CHI 2012, CSCW 2012, MTAP 2012, VAST 2010, WWW 2009)

today: detect; organize (Vox, Multiplayer); identify (multi-site)

overview: Vox Civitas, Multiplayer, Multi-site content

[with Hila Becker, Luis Gravano]

goal: effectively retrieve social media content for known events from multiple services

challenges
- event descriptor not well-formed
- brief textual descriptors
- noise
- formats/conventions/metadata differ

approach: two-step query formulation (precision-based, then recall-based)

validate queries based on known/extracted event model

step 1: term extraction from event descriptors generates “high precision” queries, e.g. “andrew bird, opening gala, celebrate brooklyn, prospect park”
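A minimal sketch of how step 1 could look, assuming the event descriptor arrives as a small set of named text fields. The function name `precision_queries`, the field names, and the quoted-phrase query syntax are illustrative assumptions, not the method used in the talk.

```python
from itertools import combinations


def precision_queries(descriptor, min_terms=2):
    """Build strict queries that require several descriptor phrases at once.

    `descriptor` is a hypothetical dict of field name -> text. Requiring
    more phrases per query trades recall for precision.
    """
    # Normalize whitespace and case, and drop repeated phrases (order kept).
    phrases = []
    for value in descriptor.values():
        phrase = " ".join(value.lower().split())
        if phrase and phrase not in phrases:
            phrases.append(phrase)

    # Emit the strictest combinations first (all phrases), then smaller ones.
    queries = []
    for size in range(len(phrases), min_terms - 1, -1):
        for combo in combinations(phrases, size):
            queries.append(" ".join(f'"{p}"' for p in combo))
    return queries


event = {
    "performer": "Andrew Bird",
    "series": "Celebrate Brooklyn",
    "venue": "Prospect Park",
}
for query in precision_queries(event):
    print(query)
```

Each query is a conjunction of quoted phrases, so a service that supports phrase search would only return posts mentioning all of them together.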

step 2: use the “high precision” corpus to generate more general queries that improve recall, e.g. “andrew bird concert”, “state farm insurance”
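One way to picture step 2, as a sketch only: count short word sequences that recur across the documents retrieved by the high-precision queries and propose the most frequent ones as broader queries. The stopword list, the n-gram lengths, and the document-frequency ranking are assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the talk).
STOPWORDS = {"the", "a", "an", "at", "in", "of", "to", "and", "is", "was", "for", "on"}


def recall_queries(documents, n_values=(2, 3), min_docs=2, top_k=5):
    """Propose broader queries from n-grams that recur across documents."""
    doc_freq = Counter()
    for text in documents:
        tokens = re.findall(r"[a-z0-9']+", text.lower())
        seen = set()  # count each n-gram at most once per document
        for n in n_values:
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                # Skip n-grams that start or end with a stopword.
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                seen.add(" ".join(gram))
        doc_freq.update(seen)

    frequent = [(g, c) for g, c in doc_freq.items() if c >= min_docs]
    frequent.sort(key=lambda gc: (-gc[1], gc[0]))  # most frequent first
    return [g for g, _ in frequent[:top_k]]


corpus = [
    "Andrew Bird concert at Prospect Park tonight",
    "loved the andrew bird concert",
    "Andrew Bird concert was amazing",
]
print(recall_queries(corpus))
```

Frequent phrases are not guaranteed to be about the event (a sponsor name can recur just as often as the performer), which is why the generated queries still need validation.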

recall-oriented queries

Benefits:
- Works cross-site
- Works with short content

Challenges:
- Introduces noise
- Potentially large set of queries

post-filtering: use the known event model (topics, time, location); keep queries whose result set matches the known model
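The time dimension of post-filtering can be sketched as follows, assuming each candidate query comes with the timestamps of the documents it retrieved. The one-day window, the 0.5 threshold, and the function names are hypothetical choices, and the sample timestamps are made up for the example.

```python
from datetime import datetime, timedelta


def fraction_in_window(timestamps, event_time, window=timedelta(days=1)):
    """Share of results posted within `window` of the known event time."""
    if not timestamps:
        return 0.0
    inside = sum(1 for t in timestamps if abs(t - event_time) <= window)
    return inside / len(timestamps)


def filter_queries(results_by_query, event_time, min_fraction=0.5):
    """Keep queries whose results concentrate around the event time."""
    return [
        query
        for query, timestamps in results_by_query.items()
        if fraction_in_window(timestamps, event_time) >= min_fraction
    ]


event_time = datetime(2011, 6, 10, 20)
results = {
    # Made-up data: results clustered around the event.
    "andrew bird concert": [
        datetime(2011, 6, 10, 21),
        datetime(2011, 6, 10, 22),
        datetime(2011, 6, 11, 9),
        datetime(2011, 6, 7, 12),
    ],
    # Made-up data: results spread evenly over the week.
    "state farm insurance": [datetime(2011, 6, d, 12) for d in range(7, 14)],
}
print(filter_queries(results, event_time))
```

A fuller version would apply the same idea to topics and location, comparing each result set against every part of the known event model.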

for example...

[Figure: time series over 6/7/11 to 6/13/11 for the queries [andrew bird concert] and [state farm insurance]; y-axis from 0 to 120]

[Figure: NDCG vs. number of documents k (0 to 25); series: Precision, Twitter-MS, YouTube-MS]

evaluation: query generation; relevance of retrieved documents
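The relevance evaluation is reported as NDCG over the top k documents. For reference, here is a sketch of the standard NDCG@k computation with graded relevance labels. It assumes the ideal ranking is the same list sorted by relevance, which only holds when every judged document appears in the list.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k relevance labels."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the best possible ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Relevance labels in ranked order (made-up example).
ranking = [1, 0, 1, 1, 0]
for k in (1, 3, 5):
    print(k, round(ndcg_at_k(ranking, k), 3))
```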

takeaways: can aggregate content fragmented across platforms; improve recall without relying on site-specific features

overview: Vox Civitas, Multiplayer, Multi-site content (WSDM 2012)

[with postdoctoral fellow Nick Diakopoulos]

research questions
- can Twitter content around broadcast news events inform journalistic inquiry?
- what insights and analyses can we enable through visual analytic tools?

direct attention to relevant information

automatic content analysis for filtering

– relevance

– uniqueness / novelty

– sentiment

– keyword extraction

supporting analysis

how to evaluate?
- directly evaluate the output of the algorithms (quantitative)
- deep, extensive evaluation of users’ interaction with the system (qualitative)

read more: Olsen (UIST ’07) Naaman (MTAP ’12)

Vox evaluation goals

•  How effective for generating story ideas?

•  What kinds of insights/analyses are supported?

•  Shortcomings and how features are used?

takeaways: can extract reliable event structure from social media

overview: Vox Civitas (VAST 2010), Multiplayer, Multi-site content

what the hell?

[with: Lyndon Kennedy, Dan Ellis, Kai Su]

supporting analysis

extract the signal from people’s attention:
- find overlapping moments
- compute and rank scenes
- extract scene descriptors

audio fingerprinting

Wang et al. (ISMIR ’03)

two clips, aligned [Figure: two clips on a shared timeline; time marks 0:00, 0:18, 2:32, 3:32]
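Once fingerprints are available for two clips, alignment can be sketched as a vote over time differences between matching fingerprint hashes, in the spirit of the fingerprinting approach cited above. The sketch assumes the hashes have already been extracted as `(hash, time_in_seconds)` pairs. The function name, the bin size, and the sample values are illustrative assumptions.

```python
from collections import Counter, defaultdict


def estimate_offset(prints_a, prints_b, bin_size=0.1):
    """Estimate where clip B starts on clip A's timeline.

    Each input is a list of (hash, time_in_seconds) pairs. Matching hashes
    vote for the difference t_a - t_b, and the most popular difference wins.
    Returns (offset_seconds, vote_count), or (None, 0) without any match.
    """
    times_by_hash = defaultdict(list)
    for h, t in prints_a:
        times_by_hash[h].append(t)

    votes = Counter()
    for h, t_b in prints_b:
        for t_a in times_by_hash.get(h, ()):
            # Quantize the difference so near-identical offsets share a bin.
            votes[round((t_a - t_b) / bin_size)] += 1

    if not votes:
        return None, 0
    best_bin, count = votes.most_common(1)[0]
    return best_bin * bin_size, count


# Made-up fingerprints: three consistent matches and one spurious match.
clip_a = [("h1", 18.0), ("h2", 19.5), ("h3", 21.0), ("h9", 5.0)]
clip_b = [("h1", 0.0), ("h2", 1.5), ("h3", 3.0), ("h9", 4.0)]
print(estimate_offset(clip_a, clip_b))
```

Spurious matches scatter across many bins, so a true overlap shows up as a single bin with a clearly higher vote count.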

a story of n clips [Figure: n clips laid out along a time axis]

from clips to scenes [Figure: scenes along the time axis, labeled: Happy Birthday, Birthday; Higher Ground; Encore]
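After alignment, every clip covers an interval on a shared timeline, and the moments that many people chose to record stand out as overlaps. A minimal sweep-line sketch, assuming clips are given as `(start, end)` pairs in seconds. The function name and the ranking by number of simultaneous clips are assumptions for illustration.

```python
def rank_moments(intervals, min_clips=2):
    """Find timeline segments covered by several clips at once.

    `intervals` holds (start, end) pairs on a shared timeline. Returns
    (start, end, clip_count) segments, most-covered first.
    """
    events = []
    for start, end in intervals:
        events.append((start, 1))   # a clip begins
        events.append((end, -1))    # a clip ends
    # At equal times, ends sort before starts, so touching clips do not overlap.
    events.sort()

    segments = []
    active = 0
    prev = None
    for time, delta in events:
        if prev is not None and time > prev and active >= min_clips:
            segments.append((prev, time, active))
        active += delta
        prev = time

    segments.sort(key=lambda s: (-s[2], s[0]))
    return segments


# Made-up aligned clips (seconds on the shared timeline).
clips = [(0, 60), (18, 90), (50, 120)]
print(rank_moments(clips))
```

Adjacent segments with the same clip count are left unmerged here. Grouping them into scenes and attaching descriptors would be a further step.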

evaluation
- quantitative: evaluated matching, scene extraction…
- qualitative: evaluated deployment scenario/task

takeaways: can create an event presentation that gets better the more content is added

overview: Vox Civitas, Multiplayer (NM&S 2012, ICMR 2012, MTAP 2012, WWW 2009), Multi-site content

towards better models of large-scale human attention

printing press → knowledge archive

digital documents → digital archive

the web → networked archive

social media → experience archive

new methods?

search by subject code?

explore.
- new information seeking tasks (and models)
- new applications for social media content

explore.
- beyond real-time
- personal and social

mor@rutgers.edu @informor

http://mornaaman.com

questions?

Luis Gravano, Hila Becker, Nick Diakopoulos, Kai Su, Dan Ellis, Munmun de Choudhury, Tarikh Korula …

thanks
