TRANSCRIPT
Source: perso.limsi.fr/vezien/ra_orleans/intro_ra_vezien.pdf
“Imagerie Opérationnelle”
Polytech’Orleans 2012
Augmented Reality: an introduction
Jean-Marc Vezien
Plan of the lecture
1. Augmented reality: what and why (with examples!)
2. User tracking
3. Real world 3D
4. Graphics
5. Augmentation
Wikipedia says…
Augmented reality (AR) is a term for a live direct or indirect view of a physical, real-world environment whose elements are augmented by computer-generated sound, video, graphics, haptics or GPS data.
Augmentation is conventionally in real time and in semantic context with environmental elements, such as sports scores on TV during a match.
With the help of advanced AR technology (computer vision and object recognition), the information about the surrounding world becomes interactive: artificial information about the environment can be overlaid on it.
The term "augmented reality" was coined by Thomas Caudell, working at Boeing, in 1990.
(Image: ARToolkit, Kato & Billinghurst, 2001)
Real/virtual Continuum (Milgram 1994)
AR: examples
Not a new idea…
• User position and gaze provide context, sometimes with help (markers)
• Audio is nice (non-obtrusive), can be stopped anytime
• No computer involved
• No sensing involved: limited interaction
Now associated with head-tracking…
Started in 1957 (Roosevelt home)
(Image: Principio system, 2007)
AR: examples
See-through augmented vision: "classic AR"
Tourism: Archeoguide (2002)
AR: examples
Assembly / Maintenance / Repair
Maintenance: help technician with
contextualized content
BMW, 2010
Matris project, 2007
(Figure callouts: fiducial, text, action graphics, icons)
AR: examples
Way-finding
iPhone app: NY Nearest Subway (many cities and airports have their own apps)
Augmented Car Finder
Note:
• Data is collected off-line
• Database access is native on mobile phones (HTTP protocol)
War "games"
Track target w.r.t. weapon
Coupled with (off-line) Geographical Information
system
… and strategic realtime info (GPS, detectors)
gun-mounted display
MR: examples
Games and promotional contents
EyePet for PSP (2009)
Topps 3D baseball cards (2009)
Note: often Webcam + computer (not wearable)
PSVita (2012)
Other examples
Medical: Surgery planning, nurse and student training…
University of North Carolina
Augmented virtuality (presented on screen)
Track hand-held device or body
Coupled with (off-line) anatomic data
• Military: "future soldier", BARS
• Medical: surgical assistance
• Tourism
• Customization (fashion)
Virtual try-ons and customization
Augmented virtuality or AR
Track the user's anatomy and motion (better still)
Coupled with external data on-line
(Images: Webcam Social Shopper by Zugara; Augment for iPad, 2012)
MR: the SACARI example
Mixed virtuality: the user does not move with the camera; indirect view of the real world
Tele-immersion: provide a sense of presence in a remote environment
Internship available! PhD position!
Blurred boundaries: augmented movies
Is this real? Augmented? Virtual?
On-line? Off-line?
Blurred boundaries
Is this still AR?
AR will be everywhere the moment efficient, cheap see-through displays become available.
Google motto: The World's Information in Context.
Street View is close to Augmented Virtuality.
Sergey Brin (2012)
The Ingredients for AR
Capture real world
+
Capture virtual world (Computer Graphics)
+
Present to user (Augmentation)
Real 3D world
Two ingredients necessary for successful AR:
• Real-time 3D tracking of the user viewpoint w.r.t. the world
• 3D scene analysis, for realistic augmentation
(Images: WorldViz; Shadow Zone)
Real-Time tracking
Many means to compute positioning information:
• Geolocalization
• Electro-magnetic
• Acoustic
• Inertial
• Vision-based (active or passive)
GPS triangulation
Global Positioning System = array of synchronized satellites: triangulate position from radio signal travel times (strictly, trilateration).
The receiver has poor synchronization: this affects precision.
Differential GPS provides extra accuracy.
Typical: 50 cm accuracy at best.
Provides position only: couple with a gyroscope or compass for orientation (see below).
Electro-magnetic tracking
Three mutually orthogonal coils; each transmitter coil is activated serially, and the current induced in the receiver coils is measured.
The induced current varies with:
• the distance from the transmitter (cubically), and
• the orientation relative to the transmitter (cosine of the angle between the coil axis and the local magnetic field direction).
Three measurements apiece (three receiver coils) give a nine-element measurement for the 6D pose.
Sensitive to magnetic interference, and wired!
(Source: SIGGRAPH 2001 Course 11, slides by Allen, Bishop, Welch)
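For intuition, a minimal coupling sketch (assuming an ideal magnetic-dipole source, which is what the cubic falloff above describes): the signal induced in one receiver coil behaves as

$$ s \;\propto\; \frac{1}{r^{3}}\,\cos\theta, $$

where $r$ is the transmitter-receiver distance and $\theta$ the angle between the receiver-coil axis and the local field direction; the 3 × 3 transmitter/receiver coil combinations yield the nine measurements used to solve for the 6D pose.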
Acoustic tracking: triangulation of sound sources (strictly, trilateration from time-of-flight ranges; a least-squares sketch follows below).
• The intersection of two spheres is a circle.
• The intersection of three spheres is two points.
• One of the two points can easily be eliminated.
Ultrasonic: 40 kHz typical.
Good precision, but still wired.
From [1]
Intersense IS-600 Mark 2
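The sphere-intersection argument translates directly into code. A minimal NumPy sketch of trilateration, assuming four or more non-coplanar beacons with known positions and measured ranges (with only three beacons the two-point ambiguity of the slide remains):

```python
import numpy as np

def trilaterate(beacons, dists):
    """Least-squares position from beacon ranges.

    beacons: (N, 3) known emitter positions, N >= 4 and non-coplanar.
    dists:   (N,) measured ranges to the unknown point.
    """
    beacons = np.asarray(beacons, float)
    dists = np.asarray(dists, float)
    p0, d0 = beacons[0], dists[0]
    # Subtracting the first sphere equation from the others linearizes
    # |x - p_i|^2 = d_i^2 into A x = b.
    A = 2.0 * (beacons[1:] - p0)
    b = (d0**2 - dists[1:]**2
         + np.sum(beacons[1:]**2, axis=1) - p0 @ p0)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x
```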
Image-based 3D tracking
• A special case of adaptive rendering (for HMDs)
• Environment control is a plus
• Markers are still needed (often)
• A special case of structure/motion estimation, known as motion estimation / match moving
• AR meets the movie industry: the director is the user!
Inside-out vs. Outside-in (figures from [3])
• Inside-out: the user wears the camera. Outside-in: a camera observes the user.
• Inside-out is better at estimating relative rotation.
• Outside-in is much more convenient: no camera to wear. Special case: console gaming.
• Inside-out is easier for wearable AR: mobility, ego-centered reference frame. Needed for future nomadic apps.
• Cameras are becoming small, commodity items.
The Maths
Projection equation: m = P · M (motion + structure), relating an image point m = (u, v) to a 3D world point M = (X, Y, Z):

$$ s \begin{pmatrix} u \\ v \\ 1 \end{pmatrix} = \underbrace{\begin{pmatrix} \alpha_u & 0 & c_u \\ 0 & \alpha_v & c_v \\ 0 & 0 & 1 \end{pmatrix}}_{\text{sensor image (pixels)}} \underbrace{\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}}_{3\text{D} \rightarrow 2\text{D projection}} \underbrace{\begin{pmatrix} \mathbf{i} & T_x \\ \mathbf{j} & T_y \\ \mathbf{k} & T_z \\ \mathbf{0} & 1 \end{pmatrix}}_{\text{world} \rightarrow \text{camera (camera position)}} \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix} $$

$(\mathbf{i}, \mathbf{j}, \mathbf{k}, T_x, T_y, T_z)$ can be computed if m and M are known.
Not linear in the motion parameters!
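As a numerical companion, a minimal sketch of m = P · M in NumPy; all intrinsic and extrinsic values below are made-up placeholders, not taken from the slides:

```python
import numpy as np

# Intrinsics (hypothetical values): focal lengths in pixels, principal point.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

# Extrinsics: rows of R are the camera axes (i, j, k); T = (Tx, Ty, Tz).
R = np.eye(3)
T = np.array([0.0, 0.0, 2.0])

P = K @ np.hstack([R, T[:, None]])   # 3x4 projection matrix P = K [R | T]

M = np.array([0.1, 0.2, 0.5, 1.0])   # homogeneous 3D world point
m = P @ M
u, v = m[:2] / m[2]                  # the division by s is what makes the
                                     # problem non-linear in the motion parameters
print(u, v)
```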
Dementhon-Davis (1992)
Several methods to recover calibration exist (Tsai, Lowe…). "Model-Based Object Pose in 25 Lines of Code" is a simple yet elegant method to obtain the pose rapidly if a rigid 3D model is provided.

Write each model point $P_i$ relative to a reference point whose translation $T = (T_x, T_y, T_z)$ projects to $(x_0, y_0)$. With focal length $f$, perspective projection gives

$$ x_i = \frac{f\,(P_i \cdot \mathbf{i} + T_x)}{P_i \cdot \mathbf{k} + T_z}, \qquad y_i = \frac{f\,(P_i \cdot \mathbf{j} + T_y)}{P_i \cdot \mathbf{k} + T_z}. $$

The relative depth around $Z_0 = T_z$ is captured by $\varepsilon_i = \dfrac{P_i \cdot \mathbf{k}}{T_z}$, so that $Z_i = T_z(1 + \varepsilon_i)$.
The idea consists in linearizing $x_i$ and $y_i$ to compute $\mathbf{I}$, $\mathbf{J}$ and $\varepsilon_i$.
Geometrical interpretation: linear approximation of perspective.
[Figure: camera center O and image plane; a model point P and the reference plane $Z_0 = T_z$ through $P_0$; P projects perspectively through O, while its orthographic projection onto the reference plane gives the scaled-orthographic approximation.]

$$ x_i^{o} = x_0 + P_i \cdot \mathbf{I}, \qquad y_i^{o} = y_0 + P_i \cdot \mathbf{J}, \qquad x_i = \frac{x_i^{o}}{1 + \varepsilon_i}, \quad y_i = \frac{y_i^{o}}{1 + \varepsilon_i}. $$
POSIT algorithm
To summarize:

$$ x_i(1 + \varepsilon_i) - x_0 = P_i \cdot \mathbf{I} \quad \text{with } \mathbf{I} = \frac{f}{T_z}\,\mathbf{i}, \qquad y_i(1 + \varepsilon_i) - y_0 = P_i \cdot \mathbf{J} \quad \text{with } \mathbf{J} = \frac{f}{T_z}\,\mathbf{j}, \qquad \varepsilon_i = \frac{P_i \cdot \mathbf{k}}{T_z}. $$

• If $\varepsilon_i$ is known, then $\mathbf{I}, \mathbf{J}, T_z, X_0 = x_0 T_z, Y_0 = y_0 T_z$ can be computed.
• If $\mathbf{I}, \mathbf{J}, T_z$ are known, then $\mathbf{k}$ is known, hence $\varepsilon_i$.
• Iterate, starting with $\varepsilon_i = 0$. Converges rapidly.
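A compact NumPy sketch of the iteration above. It assumes at least four non-coplanar model points expressed relative to the reference point, a known focal length in pixels, and noise-free correspondences; it illustrates the idea rather than reproducing DeMenthon's exact 25-line implementation:

```python
import numpy as np

def posit(model_pts, image_pts, f, n_iter=10):
    """POSIT-style pose iteration (sketch).

    model_pts: (N, 3) object points relative to the reference point P0,
               so model_pts[0] == [0, 0, 0]; N >= 4, non-coplanar.
    image_pts: (N, 2) pixel coordinates; image_pts[0] is (x0, y0), the image of P0.
    f:         focal length in pixels.
    """
    P = np.asarray(model_pts, float)
    image_pts = np.asarray(image_pts, float)
    x, y = image_pts[:, 0], image_pts[:, 1]
    x0, y0 = image_pts[0]
    eps = np.zeros(len(P))                 # start from the SOP assumption eps_i = 0
    P_pinv = np.linalg.pinv(P)
    for _ in range(n_iter):
        I = P_pinv @ (x * (1 + eps) - x0)  # solve the linearized equations for I, J
        J = P_pinv @ (y * (1 + eps) - y0)
        s = 0.5 * (np.linalg.norm(I) + np.linalg.norm(J))   # s = f / Tz
        i, j = I / np.linalg.norm(I), J / np.linalg.norm(J)
        k = np.cross(i, j)
        Tz = f / s
        eps = (P @ k) / Tz                 # refined relative depths
    R = np.vstack([i, j, k])
    T = np.array([x0 * Tz / f, y0 * Tz / f, Tz])
    return R, T
```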
3D tracking: off-the-shelf solution
• Kato & Billinghurst (HIT Lab, University of Washington) introduced in 1999 a tool for teleconferencing: "Marker Tracking and HMD Calibration for a video-based Augmented Reality Conferencing System."
• It soon became ARToolkit.
ARToolkit: main characteristics
• Fast and cheap 6D marker tracking.
• Distributed with complete source code: open source under the GPL license for noncommercial usage.
• Multiplatform (Linux, MacOS and Windows).
• Multiple input sources (USB, Firewire) and multiple formats (RGB, YUV) supported.
• Multiple-camera tracking supported.
• GUI initialization interface.
• Easy calibration routine.
• Fast rendering based on OpenGL.
• 3D VRML support.
• Simple and modular API (in C and C++).
• Complete set of samples and utilities.
• Supports both video and optical see-through AR.
ARToolkit: main processing loop
Since v4, pose estimation is performed using the Iterative Closest Point (ICP) algorithm (Besl, 1992). A marker-tracking loop in the same spirit is sketched below.
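ARToolkit's own API is C; as a rough modern stand-in, the same detect-then-pose loop can be sketched with OpenCV's ArUco module (the pre-4.7 contrib API). The camera intrinsics and marker size below are hypothetical placeholders, not values from the slides:

```python
import cv2
import numpy as np

camera_matrix = np.array([[800.0, 0, 320],      # hypothetical calibration
                          [0, 800.0, 240],
                          [0, 0, 1.0]])
dist_coeffs = np.zeros(5)
marker_len = 0.05                               # marker side in metres (placeholder)
dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    corners, ids, _ = cv2.aruco.detectMarkers(frame, dictionary)
    if ids is not None:
        rvecs, tvecs, _ = cv2.aruco.estimatePoseSingleMarkers(
            corners, marker_len, camera_matrix, dist_coeffs)
        # rvecs/tvecs give each marker's pose in the camera frame:
        # this is where the CG overlay would be rendered.
    if cv2.waitKey(1) == 27:                    # Esc to quit
        break
```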
Important details
Virtual objects appear only when complete markers are visible. Detection depends on:
• Size
• Movement
• Orientation
• Lighting conditions

| Pattern size (cm) | Usable range (cm) |
|---|---|
| 6.98 | 40.64 |
| 8.89 | 63.5 |
| 10.79 | 86.36 |

In practice: close range only!
3D real world analysis
Two components: light and geometry.
(Figure labels: lighting coherency, motion coherency, geometric coherency, light probe)
Light Probe
Aim: capture the light coming from light sources in the real world, from all directions at a given point.
10,000:1 is the (static) human eye's ratio between the brightest and darkest shades: it cannot be represented on 8 bits.
Stored in High Dynamic Range images (the .hdr image format).
Light Probe (2)
Angular map format: a direction vector in the world $(D_x, D_y, D_z)$ maps to image coordinates $u = r\,D_x$ and $v = r\,D_y$, with

$$ r = \frac{1}{\pi}\,\frac{\cos^{-1}(D_z)}{\sqrt{D_x^2 + D_y^2}}. $$

Conversely, for $(u, v) \in [-1, 1]^2$, the unit vector pointing in the direction $(u, v)$ is obtained by rotating $(0, 0, -1)$ by:
1) $\pi \sqrt{u^2 + v^2}$ around the y (up) axis,
2) $\tan^{-1}(v/u)$ around the -z (forward) axis.

Note: advanced graphics engines can use HDR images (Half-Life 2, normal vs. HDR rendering; Unreal Engine).
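A minimal sketch of the forward mapping above (world direction to angular-map coordinates), assuming a unit-length direction vector:

```python
import numpy as np

def direction_to_angular_map(Dx, Dy, Dz):
    """Map a unit direction to light-probe angular-map coordinates (u, v)."""
    denom = np.hypot(Dx, Dy)
    if denom == 0.0:
        return 0.0, 0.0        # straight along the optical axis maps to the center
    r = np.arccos(Dz) / (np.pi * denom)
    return r * Dx, r * Dy
```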
World content
Understand the 3D environment in terms of:
• Geometry: 3D reconstruction (what is where?)
• Semantics: augmentation of context (object recognition, interpretation of content)
(Image: http://www.truevisionsys.com)
Geometry: 3D reconstruction
Markers cannot always be present!
If the structure is known: a priori model tracking (see before); an object is a (collection of) markers.
Remember: projection combines structure and motion (m = P.M).
Idea: recover both simultaneously!
+ gives tracking and reconstruction at the same time
- highly non-linear: requires iterations + data filtering (removing outliers)
The factorization method was introduced by Tomasi and Kanade (1992).
Factorization method
Each point i observed in frame j obeys $m_{ij} = P_j \cdot M_i$ (under orthography, a 2x1 vector = a 2x3 matrix times a 3x1 vector).
Consider P points projecting in F frames. Compact representation, stacking all the images:

$$ \underbrace{\begin{pmatrix} x_{11} & \cdots & x_{1P} \\ y_{11} & \cdots & y_{1P} \\ \vdots & & \vdots \\ x_{F1} & \cdots & x_{FP} \\ y_{F1} & \cdots & y_{FP} \end{pmatrix}}_{m\;(2F \times P)} = \underbrace{\begin{pmatrix} P_1 \\ \vdots \\ P_F \end{pmatrix}}_{\text{motion}\;(2F \times 3)} \underbrace{\begin{pmatrix} X_1 & \cdots & X_P \\ Y_1 & \cdots & Y_P \\ Z_1 & \cdots & Z_P \end{pmatrix}}_{\text{structure}\;(3 \times P)} $$

m is big, but of rank 3: m = P.M.
There is an infinity of possible (P, M) pairs! Solved via SVD decomposition (but not uniquely; see the sketch below).

Carlo Tomasi and Takeo Kanade (November 1992), "Shape and motion from image streams under orthography: a factorization method," International Journal of Computer Vision, 9(2): 137-154.
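A minimal NumPy sketch of this rank-3 factorization via SVD. Registration (subtracting each row's centroid) is assumed already done, and the metric upgrade that resolves the remaining ambiguity is omitted:

```python
import numpy as np

def factorize(W):
    """Tomasi-Kanade style rank-3 factorization (sketch).

    W: (2F, P) registered measurement matrix (two rows per frame, x then y,
       each row with its centroid subtracted).
    Returns motion (2F, 3) and structure (3, P). They are defined only up to
    an invertible 3x3 matrix A, since (motion @ A, inv(A) @ structure) fits
    equally well; metric constraints on the camera rows would fix A.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    sq = np.sqrt(s[:3])
    motion = U[:, :3] * sq          # split the singular values between factors
    structure = sq[:, None] * Vt[:3]
    return motion, structure
```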
Factorization: results
(Results: Hanno Ackermann, University of Hannover, 2008)
S. Christy and R. Horaud, "Euclidean Shape and Motion from Multiple Perspective Views by Affine Iterations," IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(11): 1098-1104, November 1996.
Image-based 3D reconstruction
Computer vision is good at locating features:
• points
• regions
• textures
• contours
… at different scales! (see the sketch below)
(Examples: Harris corners, 1988; SURF detector (OpenCV), Bay, 2006; 2D region detection by region-growing)
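For illustration, the detectors named above are available in OpenCV; SURF sits in the patented opencv-contrib module, so this sketch uses Harris plus ORB as a freely available stand-in. The input filename is a placeholder:

```python
import cv2
import numpy as np

img = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)   # placeholder input image

# Harris corner response map (Harris 1988).
harris = cv2.cornerHarris(np.float32(img), blockSize=2, ksize=3, k=0.04)
corners = harris > 0.01 * harris.max()                # thresholded corner mask

# Multi-scale keypoints + descriptors; ORB stands in for SURF (Bay 2006),
# which requires the non-free opencv-contrib build.
orb = cv2.ORB_create(nfeatures=1000)
keypoints, descriptors = orb.detectAndCompute(img, None)
```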
Markerless tracking
Problems:
• robustness of detection
• matching (in space/time)
Will become prevalent eventually. For now: limited use, with constraints (model-based).
(Example: Wang & Popovic, MIT, 2010)
Commercial solutions
Boujou: http://vicon.com/boujou/
Pricey (~$10,000)
Image-based 3D tracking
New real-time sensors: depth cameras (Zcam) infer depth for ALL pixels.
Early systems: laser, stereo (Perceptron LIDAR, 1995; Devernay, 1994).
Image-based 3D tracking
Now: structured lighting, a Zcam for 300 € (Kinect, 2010)!
Solves the "where" but not the "what": objects must still be identified (segmented).
Calibration, then segment by depth (a minimal sketch follows).
(Video: Augmented Reality Magic Mirror using the Kinect, Tobias Blum)
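A minimal sketch of "segment by depth" on a Kinect-style depth image; the band limits are made-up example values:

```python
import numpy as np

def segment_by_depth(depth_mm, near=500, far=1200):
    """Mask of pixels inside a depth band of interest (millimetres).

    0 encodes "no measurement" on Kinect-class sensors, so it is excluded.
    Identifying *what* the resulting blob is still requires recognition.
    """
    valid = depth_mm > 0
    return valid & (depth_mm >= near) & (depth_mm <= far)
```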
Stereo for AR: augmentation
3D reconstruction is necessary for realistic CG blending:
• object location (table…)
• object occlusion
• shadows and light interactions
In real time!!
Will soon happen in the movie industry ($$$).
(X3D consortium, web)
Augmentation: CG blending
3D analysis of real images (camera + reconstruction, match move)
+ CG graphics
+ compositing masks (shadows, occlusions…)
= augmented image
Example of augmentation pipeline (1): image analysis
Matching constraints:
• geometry (epipolar)
• photometry
(Example: region matching, 1992)
H. Jin, P. Favaro, and S. Soatto, "A Semi-direct Approach to Structure From Motion," The Visual Computer, 19(6): 377-394, October 2003.
Example of augmentation pipeline (2): 3D reconstruction
3D reconstruction of regions based on planar equations. Hypothesis: the world is piece-wise planar.
Example of augmentation pipeline (3): 3D reconstruction
Prior information:
• explicit 3D model
• motion constraints
3D registration by ICP = Iterative Closest Point (Besl 92):
• always converges (to a local minimum)
• model points do not coincide with reconstruction points
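A compact NumPy/SciPy sketch of the ICP loop (Besl 92): alternate closest-point matching with the best rigid motion for those matches. Real pipelines add outlier rejection; this shows the skeleton only:

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(src, dst, n_iter=30):
    """Rigidly align point set src to dst (both (N, 3) arrays). Returns R, t."""
    R, t = np.eye(3), np.zeros(3)
    tree = cKDTree(dst)
    for _ in range(n_iter):
        cur = src @ R.T + t
        _, idx = tree.query(cur)            # closest-point correspondences
        d = dst[idx]
        # Best rigid motion for these matches (Procrustes / SVD step).
        mu_s, mu_d = cur.mean(0), d.mean(0)
        U, _, Vt = np.linalg.svd((cur - mu_s).T @ (d - mu_d))
        Rk = Vt.T @ U.T
        if np.linalg.det(Rk) < 0:           # avoid a reflection solution
            Vt[-1] *= -1
            Rk = Vt.T @ U.T
        tk = mu_d - Rk @ mu_s
        R, t = Rk @ R, Rk @ t + tk          # compose with the running transform
    return R, t
```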
Example of augmentation pipeline (4): 3D virtual content
Two steps:
• Geometric modeling (sometimes based on 3D scans): Maya, Blender, 3DSMAX, Sketchup, etc.
• Photo-realistic rendering: textures, lights, reflectance, shadows, etc.
Note: light probes enable environment mapping.
Light interaction: a must
Geometric coherency is good, but… light is essential!
Shadows in real time = geometry + light.
Standard technique for CG renderers: attenuate each real pixel by the ratio of the irradiance it receives with and without the virtual occluders,

$$ c' = c\,\frac{E'}{E}, \qquad E = \int_{S} L_d\,(\mathbf{n} \cdot \mathbf{d})\,d\omega, $$

where the integral runs over the light sources S and E' is the same integral restricted to unblocked directions d.
(Figure: virtual PSP casting a shadow on the real ground.)
Shadow projection techniques
Many different techniques!

| Technique | Speed | Detail | Self-shadowing | Shadow receivers |
|---|---|---|---|---|
| Plane Projected Shadows | Quick, not much calculation | High detail | No | No |
| Projected Shadows | Quick, almost no calculation | Depends on texture | No | Yes |
| Depth Shadow Mapping | Very quick, not much calculation | Depends on texture | Yes | Yes |
| Vertex Projection | Slow with high-res meshes | High detail | No | No |
| Shadow Volumes | Slow, lots of calculations | High detail | Yes | Yes |
Shadow mapping
Shadow volume with stencil drawing:
M. Haller, S. Drab, and W. Hartmann, "A real-time shadow approach for an augmented reality application using shadow volumes," in VRST '03: Proceedings of the ACM Symposium on Virtual Reality Software and Technology, New York, NY, USA, 2003, pp. 56-65.
Example of augmentation pipeline (5): animations
Two steps:
• Virtual motions must be coherent with the real ones: match-moving. Motion is camera-centric.
• The virtual camera must be identical to the real one (off-line calibration is necessary). Accommodate zoom!
Example of augmentation pipeline (6): step-by-step rendering
• Occlusion mask
• Virtual object rendering (alone)
Example of augmentation pipeline (7): step-by-step rendering
• Real objects: reference white (albedo)
• "Black" virtual objects: shadow computation
Example of augmentation pipeline (8): step-by-step rendering
Final composition:

$$ \text{final} = \text{real image} \times (1 - \text{mask}) \times \text{attenuation} \;+\; \text{virtual objects} \times \text{mask} \;+\; \text{reflections} \times (1 - \text{mask}) $$
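The composition formula translates directly into array arithmetic; a minimal sketch assuming float images in [0, 1] with matching shapes:

```python
import numpy as np

def composite(real, virtual, mask, attenuation, reflections):
    """Blend per the pipeline formula above.

    mask = 1 where virtual geometry covers the pixel; `attenuation`
    darkens real pixels that fall inside virtual shadows.
    """
    return (real * (1.0 - mask) * attenuation
            + virtual * mask
            + reflections * (1.0 - mask))
```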
Other example (1)
Deformable objects + real/virtual occlusions
Other example (2)
Reference white
Shadows of virtual
objects on the real
scene
Conclusion
o Augmented reality is a reality.
o It is a complicated process that requires a lot of expertise, hardware and software.
o In-depth coherency of geometry and photometry is needed.
o Real-time challenges:
  - Elaborate rendering (GPU)
  - Match moving (markerless)
  - 3D reconstruction (occlusions)
  - User interaction: hand and body tracking
Human vision is good at 3D but…
Mantis shrimp: the best eyes in the world.
The animal with the most sophisticated vision is thought to be the mantis shrimp. Humans have only three kinds of light receptors, while mantis shrimp have ten, allowing them to see not only visible light but infrared and ultraviolet light as well. They are the only invertebrates that visually recognize members of their own species. While humans see three pigments (red, green and blue), mantis shrimp see up to sixteen. Mantis shrimp also have polarized filters, and some can even produce signals detectable only with a polarized filter. Mantis shrimp are also able to see in stereo with each eye individually, which means that if they lose one eye, they can still see just as well.