Page 1: Motivation - University of Washingtongrail.cs.washington.edu/projects/office/doc/rtv4hci_dtracking_talk.pdfFile1.2.3.4.5.6.pdf • Match against PDF image database • Performance

Motivation

Where is my W-2 Form?

Video-based Tracking

Camera view of the desk

Camera

Overhead video camera

Example

40 minutes, 1024x768 @ 15 fps

System Overview

[Diagram components: User, Input Video, Video Analysis, Vision Engine, Internal Representation (Desk at T, Desk at T+1), Query Interface]

Query: "Where is my W-2 form?"

Answer
Vision Problem

[Figure: input frames (… …) around an Event, with the corresponding Scene Graph (DAG). Nodes: tut-article.pdf, sanders01.pdf, objectspaces.pdf, kidd94.pdf, lowe04sift.pdf, and Desk.]
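
One way to picture the scene graph is as a small DAG of document nodes. The sketch below is illustrative only: it assumes an edge from A to B means "A lies directly on top of B", which the slides do not spell out, and the `SceneGraph` class and its method names are hypothetical.

```python
from collections import defaultdict


class SceneGraph:
    """Toy scene graph: nodes are documents, and each node records the
    items directly beneath it (an assumed meaning for the DAG edges)."""

    def __init__(self):
        self.below = defaultdict(set)  # node -> nodes directly underneath
        self.below["Desk"]             # the desk itself is always present

    def place_on_top(self, doc, under=("Desk",)):
        """Add `doc` so that it rests directly on the nodes in `under`."""
        self.below[doc] = set(under)

    def covered_by(self, doc):
        """Return every node that has `doc` somewhere beneath it."""
        found = set()
        frontier = [doc]
        while frontier:
            target = frontier.pop()
            for node, under in self.below.items():
                if target in under and node not in found:
                    found.add(node)
                    frontier.append(node)
        return found

    def topmost(self):
        """Documents that nothing else rests on."""
        covered = set().union(*self.below.values())
        return set(self.below) - covered


graph = SceneGraph()
graph.place_on_top("kidd94.pdf")
graph.place_on_top("lowe04sift.pdf", under=["kidd94.pdf"])
graph.place_on_top("sanders01.pdf")
```

With this toy layout, `graph.covered_by("kidd94.pdf")` reports what would have to be lifted to reach that paper, and `graph.topmost()` lists the documents that could move under the "only topmost document can move" assumption.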

Assumptions

• Simplifying
– Corresponding electronic copy exists
– 3 event types: move/entry/exit
– One document at a time
– Only topmost document can move
– No duplicate copies of same document

• Other
– Desk need not be initially empty

Algorithm Overview

Input Frames … …

• Event Detection ("before" and "after" frames)
• Event Interpretation ("A document moved from (x1,y1) to (x2,y2)")
• Document Recognition (File1.pdf, File2.pdf, File3.pdf)
• Scene Graph Update

Event Detection

[Plot: frame difference over time, compared against a threshold to select the event frames.]

[Figure: the frames before and after the event frames.]
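
The frame-difference-and-threshold idea can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the mean absolute difference measure, the threshold value, and the helper names `frame_difference` and `detect_events` are all assumptions, and frames are plain nested lists of grayscale values.

```python
def frame_difference(frame_a, frame_b):
    """Mean absolute pixel difference between two equally sized grayscale frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count


def detect_events(frames, threshold):
    """Return (before_index, after_index) pairs, one per detected event.

    An event is a maximal run of consecutive frame pairs whose difference
    exceeds `threshold`. "before" is the last frame ahead of the run and
    "after" is the first frame once the scene is still again. A run that
    is still in progress when the video ends is ignored.
    """
    events = []
    start = None
    for i in range(1, len(frames)):
        moving = frame_difference(frames[i - 1], frames[i]) > threshold
        if moving and start is None:
            start = i - 1                  # last still frame before the motion
        elif not moving and start is not None:
            events.append((start, i - 1))  # first still frame after the motion
            start = None
    return events


# Tiny 2x2 "video": still desk, a hand passing through, then a new still state.
still = [[0, 0], [0, 0]]
hand = [[255, 255], [0, 0]]
moved = [[0, 0], [0, 255]]
frames = [still, still, hand, moved, moved]
events = detect_events(frames, threshold=10)
```

Here `events` holds a single pair of indices pointing at the last still frame before the motion and the first still frame after it, which play the role of the "before" and "after" frames passed on to event interpretation.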

Event Interpretation

Move
Entry
Exit

Motion: (x,y,θ)

[Figure: before and after frames for each event type]

1. Move vs. Entry/Exit
2. Entry vs. Exit

Event Interpretation

• Use SIFT [Lowe 99]
– Scale Invariant Feature Transform
– Distinctive feature descriptor
– Reliable object recognition

Move vs. Entry/Exit

before after

Motion: (x,y,θ)
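
The slides do not say how the motion (x,y,θ) is computed. As a hedged illustration, the sketch below assumes that feature matching has already paired up point locations in the before and after frames, and fits a rigid 2D transform to them with the standard closed-form least-squares solution. The function name `estimate_motion` and the sample points are hypothetical.

```python
import math


def estimate_motion(before_pts, after_pts):
    """Least-squares rigid fit: returns (tx, ty, theta) such that rotating
    a 'before' point by theta and then translating it by (tx, ty) lands it
    on the matching 'after' point."""
    n = len(before_pts)
    bx = sum(x for x, _ in before_pts) / n
    by = sum(y for _, y in before_pts) / n
    ax = sum(x for x, _ in after_pts) / n
    ay = sum(y for _, y in after_pts) / n

    # Accumulate dot and cross products of the centred point pairs.
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(before_pts, after_pts):
        px, py, qx, qy = px - bx, py - by, qx - ax, qy - ay
        dot += px * qx + py * qy
        cross += px * qy - py * qx
    theta = math.atan2(cross, dot)

    # Translation maps the rotated 'before' centroid onto the 'after' centroid.
    c, s = math.cos(theta), math.sin(theta)
    tx = ax - (c * bx - s * by)
    ty = ay - (s * bx + c * by)
    return tx, ty, theta


# Three matched points: a quarter turn followed by a shift of (2, 3).
before = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
after = [(2.0, 3.0), (2.0, 4.0), (1.0, 3.0)]
tx, ty, theta = estimate_motion(before, after)
```

With real matches, outliers would need to be rejected before a fit like this (for example with a robust estimator); the sketch skips that step.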

Entry vs. Exit

before after

Example 1 (entry)

File1.pdf File2.pdf File3.pdf File4.pdf File5.pdf File6.pdf

Example 2 (entry)

File1.pdf File2.pdf File3.pdf File4.pdf File5.pdf File6.pdf

?

[Plot: amount of change over time]

Document Recognition

File1.pdf File2.pdf File3.pdf File4.pdf File5.pdf File6.pdf

• Match against PDF image database

• Performance
– Can differentiate between ~100 (or more) documents
– ~200x300 pixels per document for reliable match
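
Matching against a database can be illustrated with a nearest-neighbour vote over feature descriptors. This is a simplified sketch under stated assumptions: descriptors are short toy tuples rather than real SIFT vectors, the voting scheme and the `recognize` function are hypothetical, and the 0.8 distance-ratio check follows the ratio test described in Lowe's SIFT paper, applied here across the whole database for brevity.

```python
import math
from collections import Counter


def recognize(query_descriptors, database, ratio=0.8):
    """Vote for the database document whose descriptors best match the query.

    `database` maps a file name to a list of descriptor vectors and must
    hold at least two descriptors in total. A query descriptor votes for
    the document owning its nearest neighbour, but only if that neighbour
    is clearly closer than the second nearest one (ratio test).
    """
    labelled = [(name, d) for name, descs in database.items() for d in descs]
    votes = Counter()
    for q in query_descriptors:
        ranked = sorted(labelled, key=lambda item: math.dist(q, item[1]))
        (best_name, best), (_, second) = ranked[0], ranked[1]
        if math.dist(q, best) < ratio * math.dist(q, second):
            votes[best_name] += 1
    return votes.most_common(1)[0][0] if votes else None


# Toy 2-D descriptors standing in for features extracted from page images.
database = {
    "File1.pdf": [(0.0, 0.0), (1.0, 0.0)],
    "File2.pdf": [(10.0, 10.0), (11.0, 10.0)],
}
match = recognize([(10.1, 10.0), (10.9, 10.1)], database)
no_match = recognize([(5.0, 5.0)], database)
```

The sorted scan is linear in the database size per query descriptor, which is fine for a toy example; a real system with on the order of 100 documents would likely use an approximate nearest-neighbour index instead.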

Scene Graph Update

before after

Motion: (x,y,θ)

Desk

Scene Graph Update

Desk Desk Desk Desk

Photo Sorting Example

Current Directions

• Handle more realistic desktops
• Speed up processing time
• Other useful functionalities
– Written annotation
– Version management
– Bookmark
– Multi-user queries

For More Information

• Our publications
– Jiwon Kim, Steven M. Seitz, Maneesh Agrawala. The Office of the Past. IEEE Workshop on Real-time Vision for HCI, 2004.
– Jiwon Kim, Steven M. Seitz, Maneesh Agrawala. Video-based Document Tracking: Unifying Your Physical and Electronic Desktops. To appear in Proceedings of UIST, 2004.

• Other related work
– David G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 2004.
– David G. Lowe. Object recognition from local scale-invariant features. International Conference on Computer Vision, 1999.
– Pierre Wellner. Interacting with paper on the DigitalDesk. Communications of the ACM, 36(7):86-97, 1993.
– D. Rus and P. deSantis. The self-organizing desk. In Proceedings of International Joint Conference on Artificial Intelligence, 1997.