Robot Vision SS 2005, Matthias Rüther

ROBOT VISION Lesson 7/8: Shape from X

Matthias Rüther


Shape from Monocular Images

Various approaches:

Shape from Shading
– Strip method
– Photometric stereo

Shape from Texture
– Statistical methods
– Structural methods

Geometric scene properties

2D + "?" => 3D

A priori knowledge

Golf ball


Overview Shape from X techniques

– Shape from Shading
– Shape from Specularities
– Three-Dimensional Information from Shadows
– Shape from Multiple Light Sources, Photometric Stereo
– Shape from Laser Ranging and Structured Light Images
– Shape from Texture
– Shape and Structure from Perspective Effects, Vanishing Points
– Three-Dimensional Reconstruction from Different Views
– Depth from Focus, Changing Camera Parameters
– Surface and Shape from Contours

http://iris.usc.edu/Vision-Notes/bibliography/shapefrom374.html#Shape%20from%20Shading






Shape from Shading

Surface shading influences the perception of depth:
– E.g.: makeup does not only change the surface texture
– it creates highlights and shadows => perception of depth


Shape from Shading

Introduced to computer vision by Horn [Horn77]:
– The surface reflection of untextured objects contains depth information


Surface Boundaries

Surface boundaries play a decisive role in human interpretation:

Boundary lines are interpreted as the border to the background:

=> same shading, different shape


Shape Reconstruction

For simplicity, orthographic projection is always assumed:
– => the object is far away and close to the optical axis


Reflectance Map

Reflectance properties are a central component of Shape from Shading.

Illumination is a function of direction, not of distance:
– The light source is far away

There are no cast shadows: the object is unoccluded with respect to the light source, only self-shadowing occurs.

No secondary illumination: reflection from other objects onto the object is excluded.


Reflectance Map

Lambertian surfaces:
– Brightness depends only on the illumination direction, not on the viewing direction
– A surface point has the same brightness in all viewing directions
– No mirror-like (total) reflection

Reflectance map of a surface: 2D plot over gradient space (p,q) of the normalized image brightness of a surface as a function of surface orientation


Gradient Space

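Surface: $z = f(x,y)$

Gradient: $p = \frac{\partial z}{\partial x}$, $q = \frac{\partial z}{\partial y}$

Surface normal: $\mathbf{n} = \frac{1}{\sqrt{p^2 + q^2 + 1}} \, (-p, -q, 1)^T$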


Reflectance Map Equations

[Reflectance map equations in terms of E, L, n and s]


REFLECTANCE MODELS

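[Figure labels: albedo, diffuse albedo, specular albedo]

PHONG MODEL

$L = E \, (a \cos\theta + b \cos^{n}\alpha)$

($\theta$: angle of incidence; $\alpha$: angle between the mirror direction and the viewing direction)

[Figure panels: a=0.3, b=0.7, n=2 and a=0.7, b=0.3, n=0.5]

LAMBERTIAN MODEL

$L = E \cos\theta$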
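The two reflectance models can be compared numerically. The following is a minimal sketch, assuming that the angle in both models is the angle of incidence `theta` and that the second Phong angle `alpha` is measured between the mirror direction and the viewing direction; the function and parameter names are illustrative and do not come from the slides.

```python
import math

def lambertian_radiance(E, theta):
    """Lambertian model: L = E * cos(theta), zero for surfaces facing away."""
    return E * max(0.0, math.cos(theta))

def phong_radiance(E, a, b, n, theta, alpha):
    """Phong model: L = E * (a*cos(theta) + b*cos(alpha)**n).

    a weights the diffuse part, b the specular part, n controls
    how sharp the specular lobe is (assumed interpretation).
    """
    diffuse = a * max(0.0, math.cos(theta))
    specular = b * max(0.0, math.cos(alpha)) ** n
    return E * (diffuse + specular)

# Parameter sets of the two figure panels, evaluated 30 degrees
# away from the mirror direction at normal incidence.
glossy = phong_radiance(1.0, 0.3, 0.7, 2, 0.0, math.radians(30))
matte = phong_radiance(1.0, 0.7, 0.3, 0.5, 0.0, math.radians(30))
```

With a large specular weight b the radiance drops quickly away from the mirror direction, while a large diffuse weight a keeps it nearly constant.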


Surface Reflectance Map


Reflectance Map

2D plot over gradient space (p,q) of the normalized image brightness I of a surface as a function of surface orientation
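For a Lambertian surface the reflectance map can be written down directly. The sketch below assumes the normal convention n = (-p, -q, 1) / sqrt(p² + q² + 1) and describes the distant light source by its own gradient (ps, qs); both the convention and the function names are assumptions for illustration.

```python
import math

def surface_normal(p, q):
    """Unit normal of z = f(x, y) with gradient (p, q)."""
    norm = math.sqrt(p * p + q * q + 1.0)
    return (-p / norm, -q / norm, 1.0 / norm)

def lambertian_reflectance(p, q, ps, qs):
    """R(p, q): normalized brightness for a distant source with gradient (ps, qs).

    Equals the cosine of the incidence angle, clamped to 0 in self-shadow.
    """
    n = surface_normal(p, q)
    s = surface_normal(ps, qs)
    cos_i = sum(a * b for a, b in zip(n, s))
    return max(0.0, cos_i)

# Sample the map on a coarse (p, q) grid for a source at (ps, qs) = (1, 0).
grid = [[lambertian_reflectance(p, q, 1.0, 0.0) for p in (-2, -1, 0, 1, 2)]
        for q in (-2, -1, 0, 1, 2)]
```

In this model the brightness vanishes where 1 + p·ps + q·qs = 0, which is a straight line in gradient space.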


Reflectance Map

A straight line in gradient space separates illuminated from shadowed regions: => terminator

Drawback of the model: perfectly matte surfaces do not exist.

Perfect mirror: light is reflected in one specific direction only (angle of incidence = angle of reflection), no reflection in any other direction.


Reflectance Map

In reality: a combination of matte and specular reflection properties:
– Weighted average of the diffuse and specular components of a surface

Reflectance maps for a particular surface must be determined experimentally:
– not possible for general Shape from Shading
– already determined for some surfaces, however => normal form


Normal Form of the Reflectance Map

– Specular reflection appears as an ellipse in the reflectance map

– Transition specular - diffuse as a parabola

– Diffuse part: hyperbola

– Boundary light - shadow (terminator) = line


Shape from Specularity

Suitable for highly reflective Surfaces

Specular Reflection map of a single point source forms a sharp peak (Specular model, Phong model)


Shape from Specularity

Principle:
– If a reflection is seen by the camera and the position of the point source is known, the surface normal can be determined.
– => use several point sources with known position: structured highlight inspection
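For a perfect mirror the normal at a highlight bisects the directions to the light source and to the camera. A minimal sketch of this principle, assuming the 3D positions of the surface point, the point source and the camera are known (for distant sources and cameras only the directions matter); all names are illustrative.

```python
import math

def unit(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def specular_normal(point, light_pos, camera_pos):
    """Surface normal at a highlight: bisector of light and viewing direction."""
    to_light = unit(tuple(l - p for l, p in zip(light_pos, point)))
    to_camera = unit(tuple(c - p for c, p in zip(camera_pos, point)))
    # Angle of incidence = angle of reflection, so the normal is the half vector.
    return unit(tuple(a + b for a, b in zip(to_light, to_camera)))

normal = specular_normal((0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (-1.0, 0.0, 1.0))
```

The sketch fails with a division by zero if light and camera lie in exactly opposite directions, a configuration in which no highlight is observable anyway.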


Shape From Shadow

Also: Shape from Darkness

Reconstruct surface topography from self-occlusion

E.g. building reconstruction in SAR images, terrain reconstruction in remote sensing


Shape From Shadow

A static camera C observes a scene.

Light source L travels over the scene x; the position of L is given by the angle $\theta$.

L and C are an infinite distance away (orthographic projection).

Shadowgram: binary function $f(x, \theta)$, stating whether scene point x was shadowed at light position $\theta$.
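A shadowgram can be simulated for a known one-dimensional height profile, which helps to see what the binary function encodes. The sketch below assumes unit sample spacing, a light source infinitely far away on the left, and a light angle measured as elevation above the horizon; these conventions are assumptions, not taken from the slides.

```python
import math

def shadowgram(heights, thetas):
    """Return f[x][k] = 1 if sample x is shadowed for light elevation thetas[k].

    A sample is shadowed if any sample to its left rises above the
    light ray that passes through it.
    """
    result = []
    for x, hx in enumerate(heights):
        row = []
        for theta in thetas:
            slope = math.tan(theta)
            shadowed = any(heights[xo] > hx + (x - xo) * slope
                           for xo in range(x))
            row.append(1 if shadowed else 0)
        result.append(row)
    return result

profile = [0, 2, 0, 0, 0, 0]          # a single post of height 2
angles = [math.radians(30), math.radians(80)]
f = shadowgram(profile, angles)
```

At 30 degrees the post shadows the three samples behind it, at 80 degrees nothing is shadowed; reconstruction works in the opposite direction, from such shadow boundaries back to heights.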


Photometric Stereo

Multiple images, static camera, different illumination directions

– At least three images
– Known illumination direction
– Known reflection model (Lambert)
– Object may be textured


Photometric Stereo

Reflection model:

$I(x,y) = \rho(x,y) \, R[p(x,y), q(x,y)]$

with albedo (reflectivity) $\rho(x,y)$

3 unknowns per pixel: $p(x,y)$, $q(x,y)$, $\rho(x,y)$

=> at least three illumination directions


Photometric Stereo

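For illumination directions $\mathbf{L}_i$, $i = 1 \ldots K$, and a Lambertian surface:

$I_i(x,y) = \rho(x,y) \, R_i(p,q) = \rho \, \mathbf{L}_i \cdot \mathbf{n}$

Stacking the measurements, with $\mathbf{E} = (I_1, \ldots, I_K)^T$ and $\mathbf{L} = (\mathbf{L}_1, \ldots, \mathbf{L}_K)^T$:

$\mathbf{E} = \rho \, \mathbf{L} \, \mathbf{n}$

$\rho \, \mathbf{n} = \mathbf{L}^{-1} \mathbf{E}$

$\rho = \| \mathbf{L}^{-1} \mathbf{E} \|$

$\mathbf{n} = \frac{\mathbf{L}^{-1} \mathbf{E}}{\| \mathbf{L}^{-1} \mathbf{E} \|}$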
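Per pixel, photometric stereo with exactly three lights reduces to one 3x3 linear system. The following is a minimal sketch for a single pixel, assuming a Lambertian surface, unit light direction vectors stored as matrix rows, and no shadows; the helper names are hypothetical, and more than three lights would need a least-squares solve instead.

```python
import math

def solve3(A, b):
    """Solve the 3x3 system A x = b by Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    d = det(A)
    if abs(d) < 1e-12:
        raise ValueError("light directions must not be coplanar")
    x = []
    for col in range(3):
        # Replace one column by b and take the determinant ratio.
        m = [[b[r] if c == col else A[r][c] for c in range(3)]
             for r in range(3)]
        x.append(det(m) / d)
    return x

def photometric_stereo_pixel(lights, intensities):
    """Recover (albedo, unit normal) of one pixel from three intensities."""
    g = solve3(lights, intensities)          # g = albedo * normal
    albedo = math.sqrt(sum(c * c for c in g))
    normal = [c / albedo for c in g]
    return albedo, normal

lights = [[0.0, 0.0, 1.0], [0.6, 0.0, 0.8], [0.0, 0.6, 0.8]]
albedo, normal = photometric_stereo_pixel(lights, [0.5, 0.4, 0.4])
```

A pixel that is dark in all three images yields albedo 0 and must be skipped before normalizing.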


Photometric Stereo: Example

[Figure from Forsyth & Ponce: input images, recovered albedo, recovered normals]


Range Finder

Range Finder principles:

• Runtime Range Finder
  • Optical Range Finder
  • Ultrasound Range Finder

• Triangulation Range Finder
  - Sender and receiver with known position, triangulation similar to stereo principle
  - e.g.: Spot Projectors, Moiré Range Finder, Structured Light Range Finder, Pattern (stripe) projection

1 image + receiver/sender position => depth information

2 dim + geometry => 3 dim


Runtime Range Finder

Determine sensor-object distance by measuring radiation runtime:

1. Sender (coherent light)
2. Scanning Unit
3. Receiver (light detector)
4. Phase detector

Alt. Method: send light pulses => LIDAR (Light Radar), defense industry

Problem: generating pulses, measuring runtime (both very short)
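Both runtime variants convert a measured quantity into a distance with one line of arithmetic. A minimal sketch, assuming the signal travels to the object and back and, for the phase variant, that the light is amplitude-modulated at a known frequency; the modulation frequency and all names are assumptions for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def distance_from_pulse(round_trip_time_s, speed=C):
    """Pulse method: the signal covers the distance twice."""
    return speed * round_trip_time_s / 2.0

def distance_from_phase(phase_shift_rad, modulation_freq_hz, speed=C):
    """Phase method: distance from the phase shift of a modulated signal.

    Unambiguous only within half a modulation wavelength.
    """
    wavelength = speed / modulation_freq_hz
    return (phase_shift_rad / (2.0 * math.pi)) * wavelength / 2.0

d_pulse = distance_from_pulse(1e-8)              # 10 ns round trip
d_phase = distance_from_phase(math.pi, 10e6)     # half a cycle at 10 MHz
d_sound = distance_from_pulse(0.01, speed=343.0) # ultrasound in air
```

The 10 ns round trip corresponds to only about 1.5 m, which illustrates why pulse timing with light is demanding, while the same formula with the speed of sound gives comfortable time scales.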


Ultrasound Range Finder

Used in commercial cameras (autofocus spot)

Advantages:
• Independent of surrounding light
• Slow propagation speed of the signal (runtime is easy to measure)

Applications:
• Obstacle detection, e.g. "car parking radar"
• Level measurement: silos, tanks, …
• Underwater ranging (sonar), ...

Typical specifications:
• Range: 5cm to 1-5m
• Accuracy: +-3mm

Disadvantages:
• Coarse resolution
• Bad accuracy
• Pointwise, scanner necessary
• Multiple reflections/echoes


Spot Projection

Determine sensor-object distance pointwise

1. Sender (Laser beam)
2. Scanning unit
3. Receiver (CCD Camera)

Laser Tracker: [Ishii76], early systems 1968 [Forsen68]

Distance by triangulation


Triangulation Range Finder

Principle:

Projection of a plane onto a plane -> intersection is line (stripe)

Distortion of stripe gives object depth by Triangulation


Structured Light Range Finder

1. Sender (projects plane)
2. Receiver (CCD Camera)

[Figure labels: geometry, sensor image, X-direction, Z-direction]

1 plane -> 1 object profile

To get a 3D profile:
• Move the object
• Scanning unit for the projected plane
• Move the sensor

Move the object, e.g. on a conveyor belt:
=> synchronization: measure distance along conveyor
=> y-accuracy determined by distance measurement

Scanning units (e.g. rotating mirror) are rare: accurate measurement of mirror motion is hard, and a small inaccuracy there causes a large inaccuracy in geometry.

Move the sensor: e.g. railways: sensor in wagon coupled to speed measurement


Stripe Projection

Determine object structure by projecting multiple stripes simultaneously and subsequent triangulation

1. Sender (planes)
2. Receiver (Camera)


Projector

[Figure labels: lamp, lens system, LCD shutter, pattern structure]

Example:
– Focusing lens (e.g. 150mm)
– Line projector (e.g. LCD-640)


Pattern projection

Camera: IMAG CCD, Res: 750x590, f: 16 mm

Projector: Liquid Crystal Display (LCD 640), f: 200mm, distance to object plane: 120cm

[Figure labels: projected light stripes, range image]


Moiré Range Finder

Project line structure, observe line structure through a grid

1. Sender (projector with lines)
2. Receiver (CCD Camera with line filter)

Problem: identification of line ordering possible but hard, unsharp lines => inaccurate results

[Figure labels: Moiré image, Moiré pattern]


Triangulation Principle

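Known parameters:
– $\alpha$: angle between baseline and light ray
– $\beta$: angle between baseline and principal direction
– b: baseline, distance projector-camera

[Figure labels: baseline b, depth Z, camera, CCD, object point]

With $\gamma = 180^\circ - \alpha - \beta$ the angle at the object point and a the distance camera-object point, the law of sines gives:

$\frac{a}{\sin\alpha} = \frac{b}{\sin\gamma} \Rightarrow a = \frac{b \sin\alpha}{\sin\gamma}$

$Z = a \sin\beta$

$Z = \frac{b \sin\alpha \sin\beta}{\sin\gamma} = \frac{b \sin\alpha \sin\beta}{\sin(180^\circ - \alpha - \beta)} = \frac{b \sin\alpha \sin\beta}{\sin(\alpha + \beta)}$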
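The triangulation principle can be checked with a few lines of code. This is a minimal sketch, assuming `alpha` is the angle between baseline and light ray at the projector, `beta` the angle between baseline and viewing ray at the camera, and Z the perpendicular distance of the object point from the baseline; the function name is illustrative.

```python
import math

def triangulate_depth(b, alpha, beta):
    """Depth Z over a baseline of length b from the two base angles (radians)."""
    gamma = math.pi - alpha - beta      # angle at the object point
    if alpha <= 0 or beta <= 0 or gamma <= 0:
        raise ValueError("angles must form a valid triangle")
    # Law of sines gives the camera-object distance a; Z = a * sin(beta).
    a = b * math.sin(alpha) / math.sin(gamma)
    return a * math.sin(beta)

z_right = triangulate_depth(1.0, math.pi / 4, math.pi / 4)
z_equilateral = triangulate_depth(1.0, math.pi / 3, math.pi / 3)
```

As the rays become nearly parallel (alpha + beta close to 180 degrees), sin(gamma) approaches zero and small angle errors produce large depth errors.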


Light Spot Technique

Simplest method for measuring distances: pointwise illumination of the object surface

[Figure: spatial case of a projected light spot]

– Laser: position on the x-axis
– known direction angle of the laser relative to the camera
– known swivel angle of the laser
– R: intersection of the swivel plane with the optical axis
– P: object point to be measured
– p(x,y): known projection of object point P
– Z-axis: optical axis of the camera

For a camera-centered coordinate system:

$\frac{X_0}{x} = \frac{Y_0}{y} = \frac{Z_0}{f}$


Light Spot Stereo Analysis

Combination of the light spot technique with methods of static stereo analysis: a laser spot is projected onto an arbitrary location on the object to be measured and captured by two cameras.

Distance calculation: from the position of the laser spot, analogous to static stereo analysis.

Advantage over the light spot technique: no elaborate calibration, since the orientation of the laser beam does not enter the calculation.

[Figure: basic setup for a light spot stereo analysis method]


Light Sectioning Technique

1) Projection of a light plane onto the object (using lasers or projectors)
2) The intersection of the light plane with the object surface is visible as a light stripe
3) Localization of the light stripe by edge detection and segmentation

Distance calculation:
– Calibration: b, f, and the angles of the setup
– Generation of a look-up table of distance values

In general: the depth resolution depends on both the resolution of the image sensor and the width of the generated light plane.

For simplicity, the light plane is perpendicular to the reference plane.


Simultaneous Projection of Multiple Light Sections

Instead of one light plane, several light planes are projected onto the object to reduce the number of images to be captured.

Distance calculation: as with a single light plane, but every light stripe must be uniquely identifiable in the image.

Problem: due to occlusions, individual stripes are only partially visible in the camera image or not at all -> no unique identification of the light stripes

Application: smoothness inspection of planar surfaces without computing depth values.


Binary-Coded Light Sectioning

Motivation:
– Light spot or light stripe methods require a large number of images
– Several simultaneously projected light patterns cannot be uniquely identified

Coding of the light patterns: to distinguish the light planes, a temporal sequence of different binary-coded light patterns is generated.

Coding:
– Advantage: far fewer images are needed for the same number of measured object points (log n)
– The light stripes are coded using a Gray code
– Every light plane is coded => every pixel can be assigned to a light plane
– Depth is computed using a look-up table: the triangulation is precomputed for each light plane (2D)


Depth Calculation for a Stripe Projector

1) Light stripes of different widths are projected into the scene one after another and captured by the camera.
2) For every image, it is determined for every pixel whether it is illuminated or not.
3) This information is stored for every pixel and every image in the so-called bit-plane stack.
4) If the bit-plane stack shows for a pixel that it was, e.g., bright, dark, dark, bright in the images (code 1001), it follows that this pixel is illuminated by the fourth light stripe.
5) A unique assignment "light stripe - pixel" is possible.

Different light stripes are necessary to determine the corresponding light stripe for every pixel.

The temporal sequence of images makes it possible to identify every light stripe in the camera image.
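The coding and decoding of stripes can be sketched in a few lines. The example assumes a reflected binary Gray code, stripes numbered from 0, and exposures ordered from the most significant bit to the least significant bit; the numbering used on the slide may differ, and all function names are illustrative.

```python
def gray_encode(index):
    """Stripe index -> Gray code (neighbouring stripes differ in one bit)."""
    return index ^ (index >> 1)

def gray_decode(code):
    """Gray code -> stripe index."""
    index = 0
    while code:
        index ^= code
        code >>= 1
    return index

def stripe_patterns(num_bits):
    """patterns[k][s] == 1 if stripe s is lit in exposure k (MSB first)."""
    n = 1 << num_bits
    return [[(gray_encode(s) >> (num_bits - 1 - k)) & 1 for s in range(n)]
            for k in range(num_bits)]

def decode_pixel(bits):
    """bits: bright/dark observations of one pixel over all exposures."""
    code = 0
    for bit in bits:
        code = (code << 1) | bit
    return gray_decode(code)

patterns = stripe_patterns(4)   # 4 exposures distinguish 16 stripes
```

Reading one column of `patterns` gives the entry of the bit-plane stack that a pixel on that stripe would record, and `decode_pixel` maps it back to the stripe index used for the triangulation look-up.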


Occlusions (Shadowing)

– Laser reaches the surface point, camera sees the surface point: depth information
– Laser reaches the surface point, camera does not see the surface point: no depth information
– Laser does not reach the surface point, camera sees the surface point: no depth information
– Laser does not reach the surface point, camera does not see the surface point: no depth information

[Figure labels: laser occlusion; no laser occlusion, no camera occlusion]


Occlusions (Shadowing)

2 projectors create 2 light planes = ambiguous

2 virtual cameras


Depth Image (Range Image)

[Figure: acquisition of an intensity image, acquisition of a depth image]

Comparison intensity image - depth image


Examples (1)

Depth image of a fish can and a cigarette pack

[Figure panels: depth image, intensity image; depth image, edge image]


Examples (2)

[Figure panels: unfiltered, filtered]

Depth image of a plane and a toy truck

Plane in the camera coordinate system (KKS)


Examples (3)

Depth image of an eyeball (model) with tumor; depth image of an eyeball (model) without tumor

Edge image of an eyeball (model) with tumor; edge image of an eyeball (model) without tumor


Shape from Texture

A cat sitting on a table edge, depth cue is the change of texture size

Discrimination of depth levels by size of boxes.

Depth information within texture

What is texture?


Texture

– Principle of Texture: repetition of a basic pattern.

– Basic pattern not deterministic for natural objects

– Basic pattern approximately deterministic for man-made objects

– Repetition of pattern neither regular nor deterministic

– Only statistical regularity = uniform distribution

– Object shape can be recovered from geometric distortion of texture [Julesz75]

– Texture is not a local property; it is characteristic of an area much larger than the basic pattern.

=> Only basic shapes can be reconstructed (planes, spheres, …)


Texture

– Texture is the abstraction of a statistical homogeneity in a part of the observer's field of view which contains much more information than the observer can handle.

– Texture is scale dependent: many texture elements need to be observed, but their number must not exceed a certain maximum, so that their size always stays above the sensor resolution.


Statistical Texture Analysis

Suitable for natural textures, since they are not deterministic but follow statistical rules.
– First order grey value statistics: statistics of single pixels, as opposed to pairs, triples, quadruples etc.
– Normalized texture histogram: grey value distribution function
– Disadvantage of first order statistics: insensitive to permutations => a checkerboard may have the same statistics as salt-and-pepper noise.
– Second order grey value statistics: distribution of neighborhood relations (co-occurrence matrix)
– E.g.: relation "right of": black-white, grey-white, white-white
– More suited for texture classification, because no geometrical properties are modeled.
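The difference between first and second order statistics shows up on two tiny binary images. A minimal sketch, assuming images stored as lists of rows and the single neighborhood relation "right of"; the function names and the two test images are made up for illustration.

```python
from collections import Counter

def histogram(image):
    """First order statistics: how often each grey value occurs."""
    return Counter(v for row in image for v in row)

def cooccurrence_right_of(image):
    """Second order statistics: counts of (value, value of right neighbour)."""
    pairs = Counter()
    for row in image:
        for left, right in zip(row, row[1:]):
            pairs[(left, right)] += 1
    return pairs

# Same grey values, different spatial arrangement.
checkerboard = [[(x + y) % 2 for x in range(4)] for y in range(4)]
halves = [[0, 0, 1, 1] for _ in range(4)]

same_histogram = histogram(checkerboard) == histogram(halves)
same_cooccurrence = (cooccurrence_right_of(checkerboard)
                     == cooccurrence_right_of(halves))
```

Both images contain eight dark and eight bright pixels, so their histograms agree, while the co-occurrence counts separate them: the checkerboard has only mixed neighbour pairs.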


Structural Texture Analysis

For deterministic textures:
– Man-made objects usually have deterministic textures
– Assumes that texture is generated by a basic element ("texel") which is repeated in a regular fashion.
– Perspective distortion of texels gives surface orientation/shape
– Characterization by invariant features (circles -> oriented ellipses)


Structural Texture Analysis

Slant

Tilt

Distortion of a planar circle => local surface orientation
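For a circular texel, slant and tilt follow from the observed ellipse. The sketch below assumes orthographic (or weak perspective) projection, under which a circle is foreshortened by cos(slant) along the tilt direction; the ellipse parameters are taken as given, and the function name is illustrative.

```python
import math

def slant_tilt_from_ellipse(major, minor, major_axis_angle):
    """Local surface orientation from the image of a planar circle.

    major, minor: axis lengths of the observed ellipse
    major_axis_angle: image orientation of the major axis in radians
    Returns (slant, tilt) in radians; tilt is ambiguous by 180 degrees.
    """
    if not 0 < minor <= major:
        raise ValueError("need 0 < minor <= major")
    slant = math.acos(minor / major)      # foreshortening = cos(slant)
    tilt = (major_axis_angle + math.pi / 2.0) % math.pi  # minor axis direction
    return slant, tilt

slant, tilt = slant_tilt_from_ellipse(2.0, 1.0, 0.0)
```

An ellipse half as high as it is wide corresponds to a slant of 60 degrees; a circle that still appears circular gives slant 0 and leaves the tilt undefined.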


Texel

– Texels are not unique

– Different texels => identical resulting shape

– Exact description of texels necessary to model their distortion

– Simple, geometrical texels easier to handle



Texel

– Circle = simple texel

– Same problem as shading: Object border including texture determines the shape

– Shape reconstruction based on perspective projection

=> Texture is a 2D surface pattern, not a 3D property. E.g. brick wall, printed cloth

Aloimonos & Swain: Shape from Shading Shape from Texture


Shape from Texel

Based on local perspective distortion

– Texels must be identifiable
– Texels must not overlap
– All texels must have the same spatial dimension
– Texels are "small", locally planar and have a unique surface normal


Shape calculation

Determine the position of a plane using its texture (slant, tilt):


Examples

Tree

Sunflower

Audience

Artificial examples using black circles

Algorithms are usable only in a limited fashion on "real" scenes.


Examples

Golfball


Geometrical Scene Constraints

– Reconstruct shape from geometrical relations (similar to human vision).
– E.g.: parallel lines confine a plane
– Planes are oriented by their vanishing points

Vanishing points and vanishing line give the plane orientation.

Problem: find vanishing points


Geometrical Scene Constraints

e.g. intersecting area of parallel lines, symmetry axis of arbitrary objects


Shape from Parallel Lines


Shape From Focus

Recover the shape of surfaces from limited depth of field.
– Requires visibly rough surfaces
– Typical application: optical microscopy


Shape From Focus

Visibly Rough Surfaces
– Optical roughness: the smallest spatial variations are much larger than the wavelength of the incident electromagnetic wave.
– Visible roughness: the smallest spatial variations are comparable to the viewing area of discrete elements (pixels).
– Magnification:
  • Multi-facet level: w1 >> wf, smooth texture
  • Facet level: w2 ~= wf, rough texture


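Shape From Focus

Focused / Defocused images

– Focused: $\frac{1}{o} + \frac{1}{i} = \frac{1}{f}$

– Defocused: object point is mapped to a spot with radius $r = \frac{\delta R}{i}$

(o: object distance, i: image distance, f: focal length, R: aperture radius, $\delta$: displacement of the sensor from the focused image plane)

=> defocusing is equivalent to convolution with a low pass kernel (pillbox function):

$p(x,y) = \begin{cases} \frac{1}{\pi r^2} & \text{if } x^2 + y^2 \le r^2 \\ 0 & \text{otherwise} \end{cases}$

$I_d(x,y) = p(x,y) * I_f(x,y)$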


Shape From Focus Changing Focus

– Displacing the sensor: changes sharp region, magnification and brightness

– Moving the lens: changes sharp region, magnification and brightness

– Moving the object: changes sharp region only

=> Object is moved in front of static camera


Shape From Focus Overview:

– At facet level magnification, rough surfaces give texture-rich images

– A defocused image is equivalent to a low-passed image

– As S moves towards the focused plane, its focus increases. When S is best focused, $d_s = d_f - d$

– Challenges:
  • How to measure focus?
  • How to find best focus from a finite number of measurements?


Shape From Focus

Focus measure operator
– Purpose:
  • respond to high frequency variations in image intensity within a small image area
  • produce maximum response when image area is perfectly focused
– Possible solution:
  • determine high frequency content using Fourier transform (slow)
– Alternative:
  • High-pass image using Laplacian (problem: second derivatives in x and y can have opposite signs and cancel each other)
  • Modified Laplacian


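Shape From Focus

• Sum Modified Laplacian

• Tenengrad Focus Measure

$S_x = \begin{pmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{pmatrix}, \quad S_y = \begin{pmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{pmatrix}$

$S(m,n) = \sqrt{[S_x(m,n)]^2 + [S_y(m,n)]^2}$

$TEN(I) = \sum_{m}^{M} \sum_{n}^{N} [S(m,n)]^2$

$I_{N \times M}$ … local intensity function (image window)

• Alternatives: variance of intensities, variance of gradients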
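The Tenengrad measure can be tried on two small windows with the same contrast but different sharpness. A minimal sketch, assuming the standard Sobel operators, an image window stored as a list of rows, and evaluation on interior pixels only (no border handling); the names are illustrative.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def apply_kernel(image, kernel, m, n):
    """Response of a 3x3 kernel centred on pixel (row m, column n)."""
    return sum(kernel[i][j] * image[m - 1 + i][n - 1 + j]
               for i in range(3) for j in range(3))

def tenengrad(image):
    """Sum of squared Sobel gradient magnitudes over the interior pixels."""
    total = 0.0
    for m in range(1, len(image) - 1):
        for n in range(1, len(image[0]) - 1):
            sx = apply_kernel(image, SOBEL_X, m, n)
            sy = apply_kernel(image, SOBEL_Y, m, n)
            total += sx * sx + sy * sy   # squared magnitude, no sqrt needed
    return total

sharp = [[0, 0, 10, 10, 10] for _ in range(5)]        # step edge
blurred = [[0, 2.5, 5, 7.5, 10] for _ in range(5)]    # same edge, smeared
focus_sharp = tenengrad(sharp)
focus_blurred = tenengrad(blurred)
```

Both windows span the same intensity range, but the step edge scores higher, which is the behaviour a focus measure needs.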


Shape From Focus

[Figure: example with infinite-DOF image and DEM]


Shape From Focus

Sampling the focus measure function
– Consider a single image point (x,y)
– Focus measure F is a function of depth d: F(d)
– Goal: find F_peak from a finite number of samples F1…F8


Shape from focus

Sampling the focus measure function
– Possibility 1: find highest discrete sample


Shape from focus

Sampling the focus measure function
– Possibility 2: Gaussian interpolation

Fit Gauss function to three strongest samples
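Fitting a Gaussian to three samples has a closed form, because the logarithm of a Gaussian is a parabola. The sketch below assumes equally spaced depth samples, strictly positive focus values, and takes the strongest sample together with its two neighbours as the three samples; these are assumptions about details the slide leaves open.

```python
import math

def gaussian_peak(depths, focus):
    """Depth of the focus peak from a Gaussian through three samples."""
    m = max(range(len(focus)), key=focus.__getitem__)
    if m == 0 or m == len(focus) - 1:
        return depths[m]                  # no neighbour on one side
    step = depths[m + 1] - depths[m]
    lo, mid, hi = (math.log(focus[k]) for k in (m - 1, m, m + 1))
    denom = lo - 2.0 * mid + hi
    if denom >= 0:
        return depths[m]                  # samples do not form a peak
    # Vertex of the parabola through the three log-samples.
    return depths[m] + step * (lo - hi) / (2.0 * denom)

depths = [float(d) for d in range(8)]
focus = [5.0 * math.exp(-0.5 * ((d - 3.3) / 1.2) ** 2) for d in depths]
d_peak = gaussian_peak(depths, focus)
```

For samples drawn from an exact Gaussian, the interpolation recovers the peak between the sample positions, here at depth 3.3 although the strongest sample lies at 3.0.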


Shape from X Demos

– Camera Calibration
– Shape from Stereo point triangulation
– RANSAC plane fitting
– Photometric stereo
– Shading Illusions
