Mor Naaman, Yee Jiun Song, Andreas Paepcke, Hector Garcia-Molina. Digital Library Project, Database Group, Stanford University. Automatic Organization for Digital Photographs with Geographic Coordinates


Mor Naaman, Yee Jiun Song, Andreas Paepcke, Hector Garcia-Molina

Digital Library Project, Database Group

Stanford University

Automatic Organization for Digital Photographs with Geographic Coordinates


JCDL 2004 2

Geo-Referenced Photos

April 8th, 2004 1:20:02pm

Latitude: N34.3121

Longitude: W122.234
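A geo-referenced photo, as on the slide above, is a timestamp plus a latitude and longitude. The sketch below shows one way such a record could be held in code. The names `GeoPhoto` and `parse_coordinate` are illustrative assumptions, not part of the authors' system, and the hemisphere-prefix format is inferred only from the slide's example values.

```python
from dataclasses import dataclass
from datetime import datetime


def parse_coordinate(text: str) -> float:
    """Convert a hemisphere-prefixed value such as 'N34.3121' to signed degrees."""
    hemisphere, magnitude = text[0].upper(), float(text[1:])
    if hemisphere not in "NSEW":
        raise ValueError(f"unknown hemisphere prefix in {text!r}")
    # South and West are negative by the usual convention.
    return -magnitude if hemisphere in "SW" else magnitude


@dataclass(frozen=True)
class GeoPhoto:
    """Hypothetical record type: capture time plus signed decimal degrees."""
    taken_at: datetime
    latitude: float
    longitude: float


# The example values shown on the slide.
photo = GeoPhoto(
    taken_at=datetime(2004, 4, 8, 13, 20, 2),
    latitude=parse_coordinate("N34.3121"),
    longitude=parse_coordinate("W122.234"),
)
```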


Geo-Photography Technology

[Figure: two technology options, labeled 1) and 2); images not preserved]


Personal Photo Libraries

• Searching/browsing very difficult

• Little discernible structure to photo collections


Managing Personal Photos

• Content-based retrieval
  – Basic, primitive (far from semantic)

• Manual labeling
  – Improved, yet cumbersome

• Visual methods for fast scanning (Zoom)
  – Don’t scale well


Our Approach

• Absolutely no human effort required

• Utilize time and location
  – Automatically captured
  – Easy to get


Automatic Organization


Outline

• Requirements and challenges

• The algorithms

• Sample output

• Experiment results


Browsing by Location/Time

• Use a map/calendar
  – wwmx.org from MSR

• Map issues
  – Lots of screen space
  – Sparse
  – Limited interaction?
  – Not intuitive for some


Using Hierarchies

Location:

United States
  Around: San Francisco, Berkeley, Sonoma, CA
    San Francisco, Golden Gate Park, CA
    Berkeley, Oakland, CA
    ……
  Yosemite N.P., Yosemite Valley, CA
  Seattle, WA

Time:

2003-01-01: Yosemite N.P. (2 Days)

2003-01-18: San Francisco (1 hour)


Challenges

• Locations should be intuitive

• Events are tricky
  – 3-day trip to NYC
  – The kid’s soccer game, followed by a birthday party

• Good names are important.


Outline

• Requirements and challenges

• The algorithms

• Sample output

• Experiment results


Process Diagram


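Discovering Structure

[Diagram of the Automatic Organization pipeline. Steps, in the order the talk covers them: Initial Event Segmentation, Location Clustering, Final Event Segmentation. Other boxes: Location Hierarchy, Event Hierarchy. Highlighted: Initial Event Segmentation.]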


Initial Event Segmentation

• Photos occur in bursts

• Identify bursts: semantically “connected”
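As a rough illustration of what identifying bursts could look like, the sketch below groups photo timestamps whenever consecutive photos are close in time. This is a simplification under stated assumptions: the fixed one-hour gap and the name `split_into_bursts` are hypothetical, and the method the talk refers to (Graham et al., JCDL 2002) may differ.

```python
from datetime import datetime, timedelta


def split_into_bursts(timestamps, max_gap=timedelta(hours=1)):
    """Group timestamps; a gap longer than max_gap starts a new burst.

    The fixed max_gap is an illustrative assumption, not the talk's rule.
    """
    bursts = []
    for ts in sorted(timestamps):
        if bursts and ts - bursts[-1][-1] <= max_gap:
            bursts[-1].append(ts)   # close enough to the previous photo
        else:
            bursts.append([ts])     # first photo, or a long pause
    return bursts


start = datetime(2004, 4, 8, 13, 20, 2)
stream = [
    start,
    start + timedelta(minutes=2),
    start + timedelta(minutes=5),
    start + timedelta(hours=5),     # long pause: begins a second burst
]
bursts = split_into_bursts(stream)
```

With this input the first three photos form one burst and the last photo forms another.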


Initial Event Segmentation

[Figure: stream of photos]

More details:
• Graham et al, JCDL 2002
• Tomorrow
• Proceedings


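Discovering Structure

[Diagram of the Automatic Organization pipeline. Steps, in the order the talk covers them: Initial Event Segmentation, Location Clustering, Final Event Segmentation. Other boxes: Location Hierarchy, Event Hierarchy. Highlighted: Location Clustering.]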


Location Clusters

• Cluster the bursts into locations

• A. Gionis and H. Mannila. Finding recurrent sources in sequences. In Proceedings, Computational molecular biology 2003.
  – Minimize: number of clusters
  – Minimize: error (distance to cluster centers)
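The two goals above pull in opposite directions: more clusters lower the error but raise the cluster count. One common way to express such a trade-off is a single penalized cost, sketched below. This is not the Gionis and Mannila algorithm; the function `clustering_cost`, the per-cluster penalty, and the planar treatment of coordinates are all assumptions made for illustration.

```python
import math


def clustering_cost(points, centers, cluster_penalty):
    """Distance of every point to its nearest center, plus a charge per cluster.

    Points and centers are (x, y) pairs treated as planar coordinates,
    which is a simplifying assumption for geographic data.
    """
    error = sum(min(math.dist(p, c) for c in centers) for p in points)
    return error + cluster_penalty * len(centers)


# Two tight groups of photo locations, far apart (made-up coordinates).
points = [(0, 0), (0, 2), (10, 0), (10, 2)]

one_cluster = clustering_cost(points, [(5, 1)], cluster_penalty=3)
two_clusters = clustering_cost(points, [(0, 1), (10, 1)], cluster_penalty=3)
```

Here the two-cluster solution is cheaper despite paying the penalty twice, because each point sits much closer to its center.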


Location Clusters: 2-D View

[Figure: 2-D plot; legend: photo location]


2-D View: with Bursts


Location Clusters

[Figure: rows labeled Location4, Location3, Location2, Location1]


Location Clusters (breakdown)

• Some clusters may be overloaded:
  – Many bursts / picture-taking days in one location

[Figure: rows labeled Location4, Location3, Location2, Location1; callout: San Francisco]


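Discovering Structure

[Diagram of the Automatic Organization pipeline. Steps, in the order the talk covers them: Initial Event Segmentation, Location Clustering, Final Event Segmentation. Other boxes: Location Hierarchy, Event Hierarchy. Highlighted: Final Event Segmentation.]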


Final Event Segmentation

• Again scan sequence, new events detected:
  – Whenever location context changes
  – In the same location, use adaptive time threshold
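The two rules above can be sketched as a single pass over the photo sequence. How the talk's system adapts the time threshold is not stated on the slide, so the sketch assumes a hypothetical per-location gap table (`gap_by_location`) with a default; `segment_events` is likewise an illustrative name.

```python
from datetime import datetime, timedelta


def segment_events(photos, gap_by_location, default_gap=timedelta(hours=6)):
    """Split time-sorted (timestamp, location_id) pairs into events.

    A new event starts when the location changes, or when the pause within
    one location exceeds that location's threshold (an assumed stand-in
    for the adaptive threshold mentioned on the slide).
    """
    events = []
    for ts, loc in photos:
        if events:
            prev_ts, prev_loc = events[-1][-1]
            threshold = gap_by_location.get(loc, default_gap)
            if loc == prev_loc and ts - prev_ts <= threshold:
                events[-1].append((ts, loc))
                continue
        events.append([(ts, loc)])
    return events


photos = [
    (datetime(2003, 1, 1, 10, 0), "home"),
    (datetime(2003, 1, 1, 10, 30), "home"),
    (datetime(2003, 1, 1, 15, 0), "yosemite"),
    (datetime(2003, 1, 2, 9, 0), "yosemite"),   # overnight pause, same trip
    (datetime(2003, 1, 3, 20, 0), "home"),
]
gaps = {"home": timedelta(hours=2), "yosemite": timedelta(hours=24)}
events = segment_events(photos, gaps)
```

A long threshold for a remote location keeps an overnight trip together as one event, while a short threshold at home separates unrelated occasions.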


Final Event Segmentation

Overnight trip to Yosemite

Soccer game and dinner


Next - names

• Detected location and event structure

• Need to choose names for each node


Assigning Names

[Figure: map of photo locations with regions labeled Stanford, Palo Alto City Park, Palo Alto, and Butano State Park]

Stanford 42

Palo Alto 30

Butano 10

P.A. park 8


Assigning Names – Nearby?

What if photos occur sparsely within cities or parks?

San Jose, 20 miles

San Francisco, 30 miles


Assigning Names - Nearby

Which city has stronger “gravity”?


Assigning Names - Nearby

San Jose is closer


Assigning Names - Nearby

San Jose is bigger*

* larger population


Assigning Names - Nearby

But San Fran is more important!*

* greater Google count

Final name for location cluster:

“Stanford, 30 miles South of SF”
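The "gravity" idea on these slides weighs a city's importance against its distance. The sketch below is one possible reading, not the authors' formula: the inverse-square form, the `importance` values, and the function names are assumptions, and the importance numbers are made up purely so the example runs. Only the distances and the final label come from the slides.

```python
def gravity(importance, distance_miles):
    """Hypothetical pull of a city: grows with importance, falls with distance."""
    return importance / distance_miles ** 2


def name_by_nearby_city(place, cities):
    """Label a place relative to the candidate city with the largest gravity."""
    best = max(cities, key=lambda c: gravity(c["importance"], c["distance_miles"]))
    return f'{place}, {best["distance_miles"]} miles {best["direction"]} of {best["name"]}'


# Distances are from the slides; importance values are invented for illustration
# (the talk mentions population and Google count as possible importance signals).
cities = [
    {"name": "San Jose", "distance_miles": 20, "importance": 1.0, "direction": "Northwest"},
    {"name": "SF", "distance_miles": 30, "importance": 5.0, "direction": "South"},
]
label = name_by_nearby_city("Stanford", cities)
```

With these invented weights the farther but more important city wins, which matches the outcome shown on the slide.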


Assigning Names - Alexandria

• Using polygon-based dataset of administrative areas

• Alexandria gazetteer can be used for other prominent geographic features


Outline

• The requirement and challenges of automatic organization

• The algorithms

• Sample output

• Experiment results


Location Hierarchy

Photoshop Album (at least 4 man-hours)

Our system (about 0 man-seconds)


Location Hierarchy (US)

+ San Francisco, Berkeley, Sonoma, CA
- Stanford, Mountain View, Monterey, CA
    • Monterey (58 miles S of San Jose)
    • Mountain View (4 miles NW of San Jose)
    • Stanford
- Colorado (219 miles W of Denver)
- Long Beach (35 miles S of Los Angeles, CA)
- Philadelphia, PA
- Seattle, WA
- Sequoia N.P. (153 miles E of Fresno, CA)
- South lake Tahoe; Bear Valley, CA
- Yosemite N.P.; Yosemite Valley, CA


Events

Photoshop Album (at least 4 man-hours)

Our system (about 0 man-seconds):

...

2003-06-28: Long Beach,CA (3 days)

2003-07-04: San Francisco,CA (3 hours)

2003-07-10: Colorado (3 days)

2003-07-15: San Francisco,CA (1 hours)

2003-07-18: Mountain View,CA (5 hours)

2003-07-27: San Francisco,CA (1 hours)

2003-09-28: Philadelphia,PA (1 hours)

2003-10-03: Sequoia NP (3 days)

...


Event Names

• LOCALE: share automatically

• Check personal calendar

• Event Gazetteer

• Easy interface


Experiment

• Tested on 3 real-world geo-referenced photo collections

• Our system automatically generated the structure and names

• Tested with the owners


Experiment - Locations

• Accepted the automatic hierarchy

• Only minor edits requested
  – Merge/split a few of the locations


Experiment - Events

• Compared to events as annotated by users

• 80-85% in both recall and precision

• Other metrics proposed (see paper)
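For readers unfamiliar with the two measures, the sketch below computes precision and recall for detected events against user annotations. It assumes events are compared by the index of the photo that starts each event, which may not be how the paper defines a match; the boundary values are made up.

```python
def precision_recall(detected, annotated):
    """Compare detected event boundaries with user-annotated ones.

    Boundaries are photo indices at which a new event starts (an assumed
    representation; the paper's matching rule may differ).
    """
    detected, annotated = set(detected), set(annotated)
    hits = len(detected & annotated)
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(annotated) if annotated else 0.0
    return precision, recall


# Illustrative data: four of five boundaries agree.
precision, recall = precision_recall(
    detected=[5, 12, 20, 31, 40],
    annotated=[5, 12, 21, 31, 40],
)
```

Precision is the share of detected boundaries that the user also marked; recall is the share of the user's boundaries that the system found.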


Experiment - Naming

• Naming location clusters
  – For 76% of clusters, system and users pick at least one name in common
  – For the rest, “automatic” name was useful
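The "at least one name in common" measure can be sketched as a set overlap per cluster. The function names and the sample name lists below are hypothetical; they only illustrate how such a rate might be computed, not the study's data.

```python
def share_a_name(system_names, user_names):
    """True when the system and the user picked at least one identical name."""
    return bool(set(system_names) & set(user_names))


def agreement_rate(clusters):
    """Fraction of clusters where system and user names overlap."""
    matches = sum(share_a_name(system, user) for system, user in clusters)
    return matches / len(clusters)


# Made-up (system names, user names) pairs for four clusters.
clusters = [
    (["Stanford", "Palo Alto"], ["Stanford"]),
    (["Seattle"], ["Seattle", "Pike Place"]),
    (["Long Beach"], ["Los Angeles"]),
    (["Yosemite N.P."], ["Yosemite N.P."]),
]
rate = agreement_rate(clusters)
```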


Not yet published:

• Paid 13 participants to “geo-reference” their photos

• Loaded to WWMX and our browser
  – Most liked the map better, but…
  – Performed the same for search/browse tasks
  – Event notion helps overcome location handicap
  – Organization “made sense”

P.S. Some didn’t touch the map, yet used our location hierarchy.

P.S.2 This was on a BIG screen!


Thank You!

More details:

Proceedings

Google: Mor Naaman

[email protected]

http://www-db.stanford.edu/~mor/


Future Work

• User interface

• PDA

• Integrate with map

• Global photo libraries


Remember The Bursts?