RealityFlythrough: Harnessing Ubiquitous Video
Neil J. McCurdy
Department of Computer Science and Engineering
University of California, San Diego
What is ubiquitous video?
• Cameras fading into the woodwork (evokes [Weiser])
• Networked cameras in “the wild”
• Can these cameras support remote exploration? RealityFlythrough makes it possible.
• Example applications: virtual mobility for the disabled, any-angle stadium views, pre-drive driving directions, virtual shopping, my-day diaries
The need for an abstraction
To make remote exploration feel like local exploration, we need a camera at every position and orientation.

Move the Camera
• “Telepresence” does this by moving the camera
• Mimics walking through space – think Mars Explorer
Challenge: Mobility is limited by the physicality of the robot.
Telepresence solutions
• Hall, Trivedi (Mobile Interactive Environments)
• Paulos, Canny
• NASA JPL
Interpolate (Tele-reality)
• If there are enough cameras, construct “novel views”
• Reconstructs scene geometry using vision techniques – think “Matrix Revolutions”
Challenge: Requires precise knowledge of camera locations and optics properties, and it is slow.
Tele-reality solutions
• Zitnick, Kang, …, Szeliski (2004), Microsoft Research
• Kanade (1997)
Use panoramic cameras
• 360° view from a static location
• Virtual pan/tilt/zoom
Challenge: How do you stitch together multiple panoramas?
Panoramic camera solutions
• Chen (1995), QuickTime VR
• Hall, Trivedi (2002)
Combine VR and Reality
• Pre-acquire a model
• Project live video onto the model
Challenge: What happens when the model is no longer accurate? What can realistically be modeled?
Augmented virtual reality solutions
• Neumann, et al. (2003), USC
Challenges of ubiquitous video
• Camera density is low
• Environment is dynamic
 – People and objects are moving
 – Sensors are moving
• Environment is uncalibrated
 – Geometry of the environment is not known
 – Sensors are inaccurate
• Need data live and in real-time
[Photo: San Diego MMST Drill, May 12, 2005]
Roadmap
The need for an abstraction
• Need a camera at every point in space
• Challenges of ubiquitous video
Building the RealityFlythrough abstraction
• Motion as a substitute for ∞ cameras
• Choosing what (not) to show
• Handling dynamic environments
• Archiving live imagery
Evaluating the abstraction
• Usability
• Robustness to change
• Scalability
Motion as a substitute for ∞ cameras

Simplifying 3d space
• We know the location and orientation of each camera
• From a corresponding location in virtual space we project the camera’s image onto a virtual wall
• When the user’s virtual position is the same as the camera’s, the entire screen is filled with the image
• This results in a 2d simplification of 3d space
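The virtual-wall idea above can be sketched in a few lines. This is a minimal illustration with hypothetical names (`Pose`, `wall_corners`, `fills_screen` are not the system's identifiers); the actual engine's projection is of course richer.

```python
# Minimal sketch of the "virtual wall" projection (names hypothetical).
import math
from dataclasses import dataclass

@dataclass
class Pose:
    x: float        # metres
    y: float        # metres
    heading: float  # radians

def wall_corners(cam: Pose, fov: float, dist: float = 1.0):
    """Endpoints of the virtual wall: a segment `dist` metres in front
    of the camera, wide enough to span its field of view."""
    half = dist * math.tan(fov / 2.0)
    cx = cam.x + dist * math.cos(cam.heading)
    cy = cam.y + dist * math.sin(cam.heading)
    px, py = -math.sin(cam.heading), math.cos(cam.heading)  # wall direction
    return (cx - half * px, cy - half * py), (cx + half * px, cy + half * py)

def fills_screen(user: Pose, cam: Pose, eps: float = 1e-6) -> bool:
    """When the user's virtual pose coincides with the camera's pose,
    the projected wall spans the entire viewport."""
    return (abs(user.x - cam.x) < eps and abs(user.y - cam.y) < eps
            and abs(user.heading - cam.heading) < eps)
```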
The transition
• A transition between cameras is achieved by moving the user’s location from the point of view of the source camera to the point of view of the destination camera
• The virtual walls are shown in perspective
• Overlapping portions of images are alpha-blended
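A transition can be sketched as linear interpolation of the user's pose plus alpha-blending where the projected images overlap. This is a minimal illustration; heading wrap-around and the perspective projection of the walls are omitted, and all names are assumptions.

```python
# Sketch of a camera-to-camera transition. Poses are (x, y, heading)
# tuples; t sweeps from 0 (source camera) to 1 (destination camera).
def lerp(a: float, b: float, t: float) -> float:
    return a + (b - a) * t

def transition_pose(src, dst, t):
    """The user's virtual pose partway through the flythrough."""
    return tuple(lerp(s, d, t) for s, d in zip(src, dst))

def blend_pixel(src_px: float, dst_px: float, t: float) -> float:
    """Alpha-blend where the two projected images overlap: the source
    fades out as the destination fades in."""
    return (1.0 - t) * src_px + t * dst_px
```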
Why transitions are effective
• Humans commit closure [McCloud]
 – The visual cortex automatically makes sense of incomplete information
 – e.g. blind spots
• Transitions reveal rather than conceal inaccuracies
 – Overlaps help us make sense of the imagery
 – Orientation accuracy is important
• Transitions provide the following cues: motion, speed, filler images, grid-lines
Key assumption: The user is largely content to directly view a camera’s image, or is in transition to another camera.
Non-intersecting camera views
• Pacing and gridlines help
• Intervening space can be filled with other camera views – either other live cameras or archived imagery (discussed in a moment)
Choosing what (not) to show
• How do we decide which cameras to choose?
• There are no obvious choices along the path
• What if we just show all of them?
• We project where we will be in the future
• We choose the best camera at that location
 – Fitness functions: proximity, screen fill, liveness, recency
• The trick is to limit what is displayed
Heuristics for choosing cameras
• The current image should stay in view for as long as possible
• Once the destination image is visible, choose it
• There should be a minimum duration for subtransitions
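The fitness-function idea can be sketched as a weighted score per candidate camera. The weights, field names, and scoring formulas below are illustrative assumptions, not the system's actual values; only the four criteria (proximity, screen fill, liveness, recency) come from the slides.

```python
# Sketch of best-camera selection via weighted fitness functions.
import math

def fitness(cam, future_pos, now,
            w_prox=0.4, w_fill=0.3, w_live=0.2, w_recent=0.1):
    """Score a candidate camera against the user's projected future
    position. Higher is better; each term is normalised to 0..1."""
    dist = math.hypot(cam["x"] - future_pos[0], cam["y"] - future_pos[1])
    proximity = 1.0 / (1.0 + dist)
    screen_fill = cam["screen_fill"]          # 0..1, from the projection
    liveness = 1.0 if cam["live"] else 0.0    # live feed vs archived still
    recency = 1.0 / (1.0 + (now - cam["timestamp"]))
    return (w_prox * proximity + w_fill * screen_fill
            + w_live * liveness + w_recent * recency)

def best_camera(cameras, future_pos, now):
    """Pick the fittest camera for the projected future viewpoint."""
    return max(cameras, key=lambda c: fitness(c, future_pos, now))
```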
Handling dynamic environments

The destination camera moved!
Computing the path and the cameras to display at the start of the transition does not work:
• Problem 1: The destination may be a moving target
• Problem 2: Intervening cameras may not be optimal
• Step 1: Paths need to be dynamic
• Step 2: Cameras need to be selected just-in-time
There are still some problems
What we tried:
• Paths need to be dynamic
• Cameras need to be selected just-in-time
Remaining problems:
• Problem 1: Course correction is too disorienting
• Problem 2: Too many dimensions of movement
 – User’s movement (x, y, z)
 – Camera’s movement
 – Scene movement
Our current approach
Problem 1: Course correction is too disorienting.
Problem 2: Too many dimensions of movement.
Solutions:
• First move to where the camera was, then quickly capture the moving target
• Pause the live video whenever it’s visible and play at increased speed until we’re back to live action
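The pause-and-catch-up trick implies a small piece of arithmetic: playing back at k× real time clears one second of backlog for every k−1 wall-clock seconds, because new footage keeps arriving while we catch up. A hedged sketch (the function name and default speedup are hypothetical):

```python
def time_to_rejoin_live(backlog_s: float, speedup: float = 2.0) -> float:
    """Wall-clock seconds of sped-up playback needed to rejoin live
    action after pausing for `backlog_s` seconds: each wall-clock
    second consumes `speedup` seconds of footage while one new second
    arrives, so the backlog shrinks at (speedup - 1) seconds/second."""
    assert speedup > 1.0
    return backlog_s / (speedup - 1.0)
```

For example, a 3-second pause played back at 2× takes another 3 seconds to clear, which is why the speedup has to be noticeably greater than 1 to feel responsive.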
Archiving live imagery
Why do it?
• Still-images generated from live video feeds increase camera density
• They help us create the illusion of infinite camera coverage
Competing desires:
• Maximal camera density
• Quality images – still-images act as the anchors in a sea of confusing movement
[Figure legend – pink: live cameras; gray: still-images]
How do we do it?
• Each frame from every camera is considered
• Sensor data (location, orientation) is validated for accuracy
• Images are assigned a quality based on possible blurriness (e.g. a high position delta)
What is stored:
• The most recent highest-quality image for a particular location (e.g. a 1 m² cell with a 15° arc)
• The image is treated as if it were a non-moving camera
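The archive policy can be sketched as a spatial hash. The bin sizes (roughly 1 m² of position with a 15-degree orientation arc) come from the slide; the quality/recency tie-breaking rule and all names here are assumptions.

```python
# Sketch of the still-image archive as a spatial hash.
import math

def bin_key(x: float, y: float, heading_deg: float):
    """Quantise a pose into an archive cell: ~1 m^2 of position with
    a 15-degree orientation arc (bin sizes from the slide)."""
    return (math.floor(x), math.floor(y), int(heading_deg // 15) % 24)

class Archive:
    def __init__(self):
        self.cells = {}  # bin_key -> (quality, timestamp, frame)

    def offer(self, pose, frame, quality: float, timestamp: float):
        """Keep one frame per cell: highest quality wins, with recency
        as the tie-breaker. `quality` penalises likely blur, e.g. a
        high position delta between sensor readings."""
        key = bin_key(*pose)
        best = self.cells.get(key)
        if best is None or (quality, timestamp) >= (best[0], best[1]):
            self.cells[key] = (quality, timestamp, frame)
```

Each stored frame can then be handed to the renderer as if it were a non-moving camera at the cell's pose.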
Scalability
[Architecture diagram: cameras (image capture, sensor capture) → stream combine (352×288 video resolution) → 802.11 → H.323 video-conferencing streams → server → MCU (Multipoint Control Unit) → RFT Engine]
• Bottleneck 1: 10-stream maximum (fewer with higher FPS)
• Bottleneck 2: 112-stream maximum decode at the MCU
Scalability (continued)
[Diagram: MCU (Multipoint Control Unit) → RFT Engine, shown with and without the image archive]
• Bottleneck 2: 112-stream maximum decode at the MCU
• Bottleneck 3: 15-stream maximum at the RFT Engine without the image archive; with the archive, 550 archived “cameras” are available
Conclusion: It is the number of live cameras, not the total number of cameras, that is the immediate bottleneck.
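A back-of-the-envelope reading of these numbers (the constants are taken from the slides; combining them this way is illustrative, not a claim from the talk): live feeds are capped by the tightest bottleneck in the pipeline, while archived stills add viewpoints without consuming decode bandwidth.

```python
# Constants are taken from the slides; the combination is illustrative.
LIVE_STREAM_MAX = 15    # RFT Engine limit without the image archive
MCU_DECODE_MAX = 112    # MCU decode limit
ARCHIVE_CAMERAS = 550   # archived still-image "cameras"

def total_viewpoints(live_cameras: int) -> int:
    """Viewpoints available to the user: live feeds are capped by the
    tightest bottleneck, while archived stills add coverage for free."""
    return min(live_cameras, LIVE_STREAM_MAX, MCU_DECODE_MAX) + ARCHIVE_CAMERAS
```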
Recap
• RealityFlythrough creates the illusion of infinite cameras
• This is possible despite the challenges of ubiquitous video: low camera density, a dynamic environment, an uncalibrated environment, and the need for live, real-time data
• We do this by:
 – Using motion as a substitute for ∞ cameras
 – Choosing the best imagery to show
 – Selecting imagery and path just-in-time
 – Using archived live imagery to increase camera density
Research questions
• Can we harness ubiquitous video cameras live and in real-time?
• How well does the RealityFlythrough illusion work?
• Why does the illusion work?
• Can RealityFlythrough be a real solution for real problems?
• Can RealityFlythrough be a general solution?
How well does the RealityFlythrough illusion work?
1st attempt (CHI 2005)
• Subjects were shown 10 short transitions and 10 image pairs without transitions, and had to select a bird’s-eye depiction that best represented their position in the space
• 86.67% of subjects had a greater or equal score on transition questions
• The success rate on transition questions increased as subjects saw more transitions
 – Subjects also answered faster
 – Comprehension for “expert users” was shown to approach 100%
 – In the location familiar to the subjects, the second-to-last and last questions were answered correctly 93.33% and 100% of the time, respectively
 – This demonstrates that transition comprehension is a learnable skill
Take 2
• We should be able to show close to 100% comprehension
 – Have subjects act out movement
 – Remove the requirement of translating from the 3d view to a 2d bird’s-eye representation
• Once we can do this, we can answer other questions
• Can we generate an experience that is better than being there?
 – As in Jim Hollan and Scott Stornetta’s Beyond Being There
 – Does the increased speed of movement and lack of physical boundaries actually make understanding a large area easier? Faster?
 – If we overlay simple meta-data (compass directions, degree separations, etc.), does this affect the way people understand a space?
Why does the illusion work?
• Is closure really what is happening?
• What is closure?
• What are the limits of closure?
 – How can we overcome these limitations?
• Do people experience RealityFlythrough closure differently?
 – Is it a learnable skill?
• What effect does prior knowledge of a space have?
• What spatial skills are required?
• How much does a map help with comprehension?
Can RealityFlythrough be a real solution for real problems?
Disaster response incident command support
• Continue our work with the San Diego MMST (Nov 15 drill)
• Communication challenges
 – Access point handoff
 – Automatic re-establishment of communication while maintaining state
 – Have RTP streams play nicely in the network
  • Exponential backoff, as with TCP
  • Do archive calculations on the client and only transmit archive-ready frames
 – Store and forward?
 – Integration of full-fidelity streams when cameras return to base
• User interface challenges
 – Generate a UI that is usable by a novice so we can have a real user
  • A less desirable alternative is to have the user direct an expert
 – Generate a UI that targets the requirements of incident command
Evaluation is the biggest challenge
Evaluation of system metrics
• How well did the system perform?
• Were we able to achieve the desired frame rates?
• How long did it take to construct a 3d representation of the space?
• How recent were the images on average?
Evaluation of the experience
• How do we know that we have succeeded?
• What will be possible to measure?
 – The MMST drills are not controlled environments
 – Most likely we will only have an n of 1
• First show that we have perturbed the incident command’s practice
• We then want to show that the system can be better
 – User reaction
 – Time to understand the environment
 – Fidelity of understanding
Is RealityFlythrough a general solution?
• In my research exam I argued that there is no such thing as a general presence solution
 – Presence (i.e. first-person immersion) may be a desirable interface, but the requirements of the task and the user should dictate the quantity and quality of presence
 – More presence is not necessarily always better
  • The telephone is in some ways better than face-to-face communication
• RealityFlythrough is not a general presence solution, but it may be a general remote exploration solution
• To show this, I will look at three distinct application domains
 – Disaster response (SWAT, military, surveillance)
 – Semi-live entertainment/convenience (tourism, pre-drive, shopping, sports)
 – Static space-browsing (real estate, emergency room orientation)
Bottom line
Paper 1
• MobiSys 2005 (completed)
Paper 2
• CHI 2006 (Sep 23, 2005) or UIST 2006 (April 1, 2006)
• How well does the illusion work, along with an analysis of why it works
 – Closure explored
• Spin: We have a novel user interface for harnessing ubiquitous video that offloads processing requirements to the human visual cortex
Paper 3
• CSCW 2006 (~Mar 2006)
• Evaluation of RealityFlythrough as a tool for disaster response
Thesis defense
• Summer 2006
Research questions:
• Can we harness ubiquitous video cameras live and in real-time?
• How well does the RealityFlythrough illusion work?
• Why does the illusion work?
• Can RealityFlythrough be a real solution for real problems?
• Can RealityFlythrough be a general solution?
Possible additional work
• Implement a “virtual camera metaphor”
 – Contrasts with the hitch-hiking metaphor described so far
 – The abstraction is stretched to support “best” views from any point in space
 – Novel views, but dynamically updated
• Integrate high-level information that is present in the bird’s-eye view into the first-person view
• Support sound
• Scale to multiple viewers with multiple servers
[Diagram labels: user’s desired view; archived imagery]
Possible additional work (continued)
• Navigate through time as well as space
• Use space as an index into time
• Web-based client
Questions?
Evaluation is the biggest challenge
Do we have to build the system?
• Could a contrived experiment do the same, where a camera operator (acting as a robot) moves where the user desires?
• This could imitate telepresence, but what about the affordances non-telepresence solutions offer?
 – How are boundaries crossed?
 – How do we move instantly (or rapidly) across space?
 – How could we add augmented reality qualities to the system?
• The sum of the parts does not equal the whole
• We can create a much richer experience than what telepresence can offer
The illusion of infinite camera coverage
[Class diagram of the Model in MVC: Camera, PhysicalCamera, VirtualCamera, CameraWithState, PositionSource, ImageSource, EnvironmentState, with 1 and * multiplicities]
RealityFlythrough Engine
[Component diagram: EnvironmentState (Model), Views, Controller, CameraRepository, StillImageGen, TransitionPlanner, TransitionExecuter, H323ConnectionManager, connected by “uses” relationships]
Transitions
[Class diagram: TransitionPlanner (specialized by PlannerSimple and PlannerBestFit) generates Paths (PathStraight); TransitionExecuter; FitnessFunctor with OverallFitness over many kinds of fitness (ProximityFitness, LivenessFitness, …)]
Related Work
• Use a model to approximate photorealism
 – Pre-acquired model
  • Neumann, et al. [1] with Augmented Virtual Environments; only works with static structures
 – Acquire the model from image data
  • Preprocess still imagery
   – Szeliski [2]
   – Chen [3] with QuickTime VR
  • Know exact camera locations
   – Kanade [4] with Virtualized Reality
How are these images related?
Like this!