RealityFlythrough: Harnessing Ubiquitous Video
Neil J. McCurdy
Department of Computer Science and Engineering
University of California, San Diego
What is ubiquitous video?
• Cameras fading into the woodwork (evokes [Weiser])
• Networked cameras in “the wild”
• Can these cameras support remote exploration? RealityFlythrough makes it possible.
• Example applications: virtual mobility for the disabled, any-angle stadium views, pre-drive driving directions, virtual shopping, my-day diaries
The need for an abstraction
To make remote exploration feel like local exploration, we need a camera at every position and orientation.

Move the Camera
• “Telepresence” does this by moving the camera
• Mimics walking through space – think Mars Explorer
Challenge: Mobility is limited by the physicality of the robot.
Telepresence solutions
• Hall, Trivedi (Mobile Interactive Environments)
• Paulos, Canny
• NASA JPL
Interpolate (Tele-reality)
• If there are enough cameras, construct “novel views”
• Reconstructs scene geometry using vision techniques – think “Matrix Revolutions”
Challenge: Requires precise knowledge of camera locations and optics properties, and it is slow.
Tele-reality solutions
• Zitnick, Kang, …, Szeliski (2004), Microsoft Research
• Kanade (1997)
Use panoramic cameras
• 360° view from a static location
• Virtual pan/tilt/zoom
Challenge: How do you stitch together multiple panoramas?
Panoramic camera solutions
• Chen (1995), QuickTime VR
• Hall, Trivedi (2002)
Combine VR and Reality
• Pre-acquire a model
• Project live video onto the model
Challenge: What happens when the model is no longer accurate? What can realistically be modeled?
Augmented virtual reality solutions
• Neumann, et al. (2003), USC
Challenges of ubiquitous video
• Camera density is low
• Environment is dynamic
 – People and objects are moving
 – Sensors are moving
• Environment is uncalibrated
 – Geometry of the environment is not known
 – Sensors are inaccurate
• Need data live and in real-time
[Photo: San Diego MMST Drill, May 12, 2005]
Roadmap
The need for an abstraction
• Need a camera at every point in space
• Challenges of ubiquitous video
Building the RealityFlythrough abstraction
• Motion as a substitute for ∞ cameras
• Choosing what (not) to show
• Handling dynamic environments
• Archiving live imagery
Evaluating the abstraction
• Usability
• Robustness to change
• Scalability
Motion as a substitute for ∞ cameras

Simplifying 3d space
• We know the location and orientation of each camera
• From a corresponding location in virtual space we project the camera’s image onto a virtual wall
• When the user’s virtual position is the same as the camera’s, the entire screen is filled with the image
• This results in a 2d simplification of 3d space
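The virtual-wall idea above can be sketched in a few lines. This is a minimal illustration with hypothetical names (`Pose`, `wall_corners`, `fills_screen` are not the system's identifiers); the actual engine's projection is of course richer.

```python
# Minimal sketch of the "virtual wall" projection (names hypothetical).
import math
from dataclasses import dataclass

@dataclass
class Pose:
    x: float        # metres
    y: float        # metres
    heading: float  # radians

def wall_corners(cam: Pose, fov: float, dist: float = 1.0):
    """Endpoints of the virtual wall: a segment `dist` metres in front
    of the camera, wide enough to span its field of view."""
    half = dist * math.tan(fov / 2.0)
    cx = cam.x + dist * math.cos(cam.heading)
    cy = cam.y + dist * math.sin(cam.heading)
    px, py = -math.sin(cam.heading), math.cos(cam.heading)  # wall direction
    return (cx - half * px, cy - half * py), (cx + half * px, cy + half * py)

def fills_screen(user: Pose, cam: Pose, eps: float = 1e-6) -> bool:
    """When the user's virtual pose coincides with the camera's pose,
    the projected wall spans the entire viewport."""
    return (abs(user.x - cam.x) < eps and abs(user.y - cam.y) < eps
            and abs(user.heading - cam.heading) < eps)
```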
The transition
• A transition between cameras is achieved by moving the user’s location from the point of view of the source camera to the point of view of the destination camera
• The virtual walls are shown in perspective
• Overlapping portions of images are alpha-blended
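A transition can be sketched as linear interpolation of the user's pose plus alpha-blending where the projected images overlap. This is a minimal illustration; heading wrap-around and the perspective projection of the walls are omitted, and all names are assumptions.

```python
# Sketch of a camera-to-camera transition. Poses are (x, y, heading)
# tuples; t sweeps from 0 (source camera) to 1 (destination camera).
def lerp(a: float, b: float, t: float) -> float:
    return a + (b - a) * t

def transition_pose(src, dst, t):
    """The user's virtual pose partway through the flythrough."""
    return tuple(lerp(s, d, t) for s, d in zip(src, dst))

def blend_pixel(src_px: float, dst_px: float, t: float) -> float:
    """Alpha-blend where the two projected images overlap: the source
    fades out as the destination fades in."""
    return (1.0 - t) * src_px + t * dst_px
```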
Why transitions are effective
• Humans commit closure [McCloud]
 – The visual cortex automatically makes sense of incomplete information
 – e.g. blind spots
• Transitions reveal rather than conceal inaccuracies
 – Overlaps help us make sense of the imagery
 – Orientation accuracy is important
• Transitions provide the following cues: motion, speed, filler images, grid-lines
Key assumption: The user is largely content to directly view a camera’s image, or is in transition to another camera.
Non-intersecting camera views
• Pacing and gridlines help
• Intervening space can be filled with other camera views – either other live cameras or archived imagery (discussed in a moment)
Choosing what (not) to show
• How do we decide which cameras to choose?
• There are no obvious choices along the path
• What if we just show all of them?
• We project where we will be in the future
• We choose the best camera at that location
 – Fitness functions: proximity, screen fill, liveness, recency
• The trick is to limit what is displayed
Heuristics for choosing cameras
• The current image should stay in view for as long as possible
• Once the destination image is visible, choose it
• There should be a minimum duration for subtransitions
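The fitness-function idea can be sketched as a weighted score per candidate camera. The weights, field names, and scoring formulas below are illustrative assumptions, not the system's actual values; only the four criteria (proximity, screen fill, liveness, recency) come from the slides.

```python
# Sketch of best-camera selection via weighted fitness functions.
import math

def fitness(cam, future_pos, now,
            w_prox=0.4, w_fill=0.3, w_live=0.2, w_recent=0.1):
    """Score a candidate camera against the user's projected future
    position. Higher is better; each term is normalised to 0..1."""
    dist = math.hypot(cam["x"] - future_pos[0], cam["y"] - future_pos[1])
    proximity = 1.0 / (1.0 + dist)
    screen_fill = cam["screen_fill"]          # 0..1, from the projection
    liveness = 1.0 if cam["live"] else 0.0    # live feed vs archived still
    recency = 1.0 / (1.0 + (now - cam["timestamp"]))
    return (w_prox * proximity + w_fill * screen_fill
            + w_live * liveness + w_recent * recency)

def best_camera(cameras, future_pos, now):
    """Pick the fittest camera for the projected future viewpoint."""
    return max(cameras, key=lambda c: fitness(c, future_pos, now))
```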
Handling dynamic environments

The destination camera moved!
Computing the path and the cameras to display at the start of the transition does not work:
• Problem 1: The destination may be a moving target
• Problem 2: Intervening cameras may not be optimal
• Step 1: Paths need to be dynamic
• Step 2: Cameras need to be selected just-in-time
There are still some problems
What we tried:
• Paths need to be dynamic
• Cameras need to be selected just-in-time
Remaining problems:
• Problem 1: Course correction is too disorienting
• Problem 2: Too many dimensions of movement
 – User’s movement (x, y, z)
 – Camera’s movement
 – Scene movement
Our current approach
Problem 1: Course correction is too disorienting.
Problem 2: Too many dimensions of movement.
Solutions:
• First move to where the camera was, then quickly capture the moving target
• Pause the live video whenever it’s visible and play at increased speed until we’re back to live action
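The pause-and-catch-up trick implies a small piece of arithmetic: playing back at k× real time clears one second of backlog for every k−1 wall-clock seconds, because new footage keeps arriving while we catch up. A hedged sketch (the function name and default speedup are hypothetical):

```python
def time_to_rejoin_live(backlog_s: float, speedup: float = 2.0) -> float:
    """Wall-clock seconds of sped-up playback needed to rejoin live
    action after pausing for `backlog_s` seconds: each wall-clock
    second consumes `speedup` seconds of footage while one new second
    arrives, so the backlog shrinks at (speedup - 1) seconds/second."""
    assert speedup > 1.0
    return backlog_s / (speedup - 1.0)
```

For example, a 3-second pause played back at 2× takes another 3 seconds to clear, which is why the speedup has to be noticeably greater than 1 to feel responsive.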
Archiving live imagery
Why do it?
• Still-images generated from live video feeds increase camera density
• They help us create the illusion of infinite camera coverage
Competing desires:
• Maximal camera density
• Quality images – still-images act as the anchors in a sea of confusing movement
[Figure legend – pink: live cameras; gray: still-images]
How do we do it?
• Each frame from every camera is considered
• Sensor data (location, orientation) is validated for accuracy
• Images are assigned a quality based on possible blurriness (e.g. a high position delta)
What is stored:
• The most recent highest-quality image for a particular location (e.g. a 1 m² cell with a 15° arc)
• The image is treated as if it were a non-moving camera
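The archive policy can be sketched as a spatial hash. The bin sizes (roughly 1 m² of position with a 15-degree orientation arc) come from the slide; the quality/recency tie-breaking rule and all names here are assumptions.

```python
# Sketch of the still-image archive as a spatial hash.
import math

def bin_key(x: float, y: float, heading_deg: float):
    """Quantise a pose into an archive cell: ~1 m^2 of position with
    a 15-degree orientation arc (bin sizes from the slide)."""
    return (math.floor(x), math.floor(y), int(heading_deg // 15) % 24)

class Archive:
    def __init__(self):
        self.cells = {}  # bin_key -> (quality, timestamp, frame)

    def offer(self, pose, frame, quality: float, timestamp: float):
        """Keep one frame per cell: highest quality wins, with recency
        as the tie-breaker. `quality` penalises likely blur, e.g. a
        high position delta between sensor readings."""
        key = bin_key(*pose)
        best = self.cells.get(key)
        if best is None or (quality, timestamp) >= (best[0], best[1]):
            self.cells[key] = (quality, timestamp, frame)
```

Each stored frame can then be handed to the renderer as if it were a non-moving camera at the cell's pose.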
Scalability
[Architecture diagram: cameras (image capture, sensor capture) → stream combine (352×288 video resolution) → 802.11 → H.323 video-conferencing streams → server → MCU (Multipoint Control Unit) → RFT Engine]
• Bottleneck 1: 10-stream maximum (fewer with higher FPS)
• Bottleneck 2: 112-stream maximum decode at the MCU
Scalability (continued)
[Diagram: MCU (Multipoint Control Unit) → RFT Engine, shown with and without the image archive]
• Bottleneck 2: 112-stream maximum decode at the MCU
• Bottleneck 3: 15-stream maximum at the RFT Engine without the image archive; with the archive, 550 archived “cameras” are available
Conclusion: It is the number of live cameras, not the total number of cameras, that is the immediate bottleneck.
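A back-of-the-envelope reading of these numbers (the constants are taken from the slides; combining them this way is illustrative, not a claim from the talk): live feeds are capped by the tightest bottleneck in the pipeline, while archived stills add viewpoints without consuming decode bandwidth.

```python
# Constants are taken from the slides; the combination is illustrative.
LIVE_STREAM_MAX = 15    # RFT Engine limit without the image archive
MCU_DECODE_MAX = 112    # MCU decode limit
ARCHIVE_CAMERAS = 550   # archived still-image "cameras"

def total_viewpoints(live_cameras: int) -> int:
    """Viewpoints available to the user: live feeds are capped by the
    tightest bottleneck, while archived stills add coverage for free."""
    return min(live_cameras, LIVE_STREAM_MAX, MCU_DECODE_MAX) + ARCHIVE_CAMERAS
```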
Recap
• RealityFlythrough creates the illusion of infinite cameras
• This is possible despite the challenges of ubiquitous video: low camera density, a dynamic environment, an uncalibrated environment, and the need for live, real-time data
• We do this by:
 – Using motion as a substitute for ∞ cameras
 – Choosing the best imagery to show
 – Selecting imagery and path just-in-time
 – Using archived live imagery to increase camera density
Research questions
• Can we harness ubiquitous video cameras live and in real-time?
• How well does the RealityFlythrough illusion work?
• Why does the illusion work?
• Can RealityFlythrough be a real solution for real problems?
• Can RealityFlythrough be a general solution?
How well does the RealityFlythrough illusion work?
1st attempt (CHI 2005)
• Subjects were shown 10 short transitions and 10 image pairs without transitions, and had to select a bird’s-eye depiction that best represented their position in the space
• 86.67% of subjects had a greater or equal score on transition questions
• The success rate on transition questions increased as subjects saw more transitions
 – Subjects also answered faster
 – Comprehension for “expert users” was shown to approach 100%
 – In the location familiar to the subjects, the second-to-last and last questions were answered correctly 93.33% and 100% of the time, respectively
 – This demonstrates that transition comprehension is a learnable skill
Take 2
• We should be able to show close to 100% comprehension
 – Have subjects act out movement
 – Remove the requirement of translating from the 3d view to a 2d bird’s-eye representation
• Once we can do this, we can answer other questions
• Can we generate an experience that is better than being there?
 – As in Jim Hollan and Scott Stornetta’s Beyond Being There
 – Does the increased speed of movement and lack of physical boundaries actually make understanding a large area easier? Faster?
 – If we overlay simple meta-data (compass directions, degree separations, etc.), does this affect the way people understand a space?
Why does the illusion work?
• Is closure really what is happening?
• What is closure?
• What are the limits of closure?
 – How can we overcome these limitations?
• Do people experience RealityFlythrough closure differently?
 – Is it a learnable skill?
• What effect does prior knowledge of a space have?
• What spatial skills are required?
• How much does a map help with comprehension?
Can RealityFlythrough be a real solution for real problems?
Disaster response incident command support
• Continue our work with the San Diego MMST (Nov 15 drill)
• Communication challenges
 – Access point handoff
 – Automatic re-establishment of communication while maintaining state
 – Have RTP streams play nicely in the network
  • Exponential backoff, as with TCP
  • Do archive calculations on the client and only transmit archive-ready frames
 – Store and forward?
 – Integration of full-fidelity streams when cameras return to base
• User interface challenges
 – Generate a UI that is usable by a novice so we can have a real user
  • A less desirable alternative is to have the user direct an expert
 – Generate a UI that targets the requirements of incident command
Evaluation is the biggest challenge
Evaluation of system metrics
• How well did the system perform?
• Were we able to achieve the desired frame rates?
• How long did it take to construct a 3d representation of the space?
• How recent were the images on average?
Evaluation of the experience
• How do we know that we have succeeded?
• What will be possible to measure?
 – The MMST drills are not controlled environments
 – Most likely we will only have an n of 1
• First show that we have perturbed the incident command’s practice
• We then want to show that the system can be better
 – User reaction
 – Time to understand the environment
 – Fidelity of understanding
Is RealityFlythrough a general solution?
• In my research exam I argued that there is no such thing as a general presence solution
 – Presence (i.e. first-person immersion) may be a desirable interface, but the requirements of the task and the user should dictate the quantity and quality of presence
 – More presence is not necessarily always better
  • The telephone is in some ways better than face-to-face communication
• RealityFlythrough is not a general presence solution, but it may be a general remote exploration solution
• To show this, I will look at three distinct application domains
 – Disaster response (SWAT, military, surveillance)
 – Semi-live entertainment/convenience (tourism, pre-drive, shopping, sports)
 – Static space-browsing (real estate, emergency room orientation)
Bottom line
Paper 1
• MobiSys 2005 (completed)
Paper 2
• CHI 2006 (Sep 23, 2005) or UIST 2006 (April 1, 2006)
• How well does the illusion work, along with an analysis of why it works
 – Closure explored
• Spin: We have a novel user interface for harnessing ubiquitous video that offloads processing requirements to the human visual cortex
Paper 3
• CSCW 2006 (~Mar 2006)
• Evaluation of RealityFlythrough as a tool for disaster response
Thesis defense
• Summer 2006
Research questions:
• Can we harness ubiquitous video cameras live and in real-time?
• How well does the RealityFlythrough illusion work?
• Why does the illusion work?
• Can RealityFlythrough be a real solution for real problems?
• Can RealityFlythrough be a general solution?
Possible additional work
• Implement a “virtual camera metaphor”
 – Contrasts with the hitch-hiking metaphor described so far
 – The abstraction is stretched to support “best” views from any point in space
 – Novel views, but dynamically updated
• Integrate high-level information that is present in the bird’s-eye view into the first-person view
• Support sound
• Scale to multiple viewers with multiple servers
[Diagram labels: user’s desired view; archived imagery]
Possible additional work (continued)
• Navigate through time as well as space
• Use space as an index into time
• Web-based client
Questions?
Evaluation is the biggest challenge
Do we have to build the system?
• Could a contrived experiment do the same, where a camera operator (acting as a robot) moves where the user desires?
• This could imitate telepresence, but what about the affordances non-telepresence solutions offer?
 – How are boundaries crossed?
 – How do we move instantly (or rapidly) across space?
 – How could we add augmented reality qualities to the system?
• The sum of the parts does not equal the whole
• We can create a much richer experience than what telepresence can offer
The illusion of infinite camera coverage
[Class diagram of the Model in MVC: Camera, PhysicalCamera, VirtualCamera, CameraWithState, PositionSource, ImageSource, EnvironmentState, with 1 and * multiplicities]
RealityFlythrough Engine
[Component diagram: EnvironmentState (Model), Views, Controller, CameraRepository, StillImageGen, TransitionPlanner, TransitionExecuter, H323ConnectionManager, connected by “uses” relationships]
Transitions
[Class diagram: TransitionPlanner (specialized by PlannerSimple and PlannerBestFit) generates Paths (PathStraight); TransitionExecuter; FitnessFunctor with OverallFitness over many kinds of fitness (ProximityFitness, LivenessFitness, …)]
Related Work
• Use a model to approximate photorealism
 – Pre-acquired model
  • Neumann, et al. [1] with Augmented Virtual Environments; only works with static structures
 – Acquire the model from image data
  • Preprocess still imagery
   – Szeliski [2]
   – Chen [3] with QuickTime VR
  • Know exact camera locations
   – Kanade [4] with Virtualized Reality
How are these images related?
Like this!