© SRI International Sarnoff© SRI International

Alignment & Its Applications in Computer Vision

PU COS 429, October 22, 2015

Harpreet S. Sawhney

CTO-Vision Technologies (Technical Director, Vision & Learning)

SRI International, Princeton, NJ

SRI Information and Computing Sciences

• 250 researchers
• $100M revenue
• >75% is USG business

• Artificial Intelligence Center
– Virtual Personal Assistance
– Large Scale Text Understanding
– Multi-INT, Large Scale, Data Organization
– Knowledge Representation
– Automated Reasoning

• Computer Science Laboratory
– Cyber Security
– Information Security
– Smart Grid Technologies
– High Assurance Systems

• Speech Technology and Research Lab
– Speech Recognition and Translation
– Natural Language Understanding
– SIGINT Exploitation
– Speech Analytics/Information Extraction
– Social Media Sentiment Analysis

• Vision Technologies (Vision & Robotics, Vision & Learning, Vision Systems)
– Security and Surveillance
– Real-Time Video Processing
– Low Power Embedded Systems
– Adaptive & Cognitive Training
– Human-Machine Interaction
– Large-scale Image and Video Search
– UAS ISR and Geo-registration
– GPS-denied Navigation
– Mixed & Augmented Reality

Technology for K-12 and higher education

Technology Spin-off Ventures

Growth opportunities that bring SRI innovations to market

Panoramic image editing software*

Drug dispensing system*

Anti-counterfeiting systems

Customer service tools*

Surgical robotics

Portable power systems**

(formerly Rosedale Medical)

Glucose monitoring system

LCD technology*

Iris biometric identification*

Drug discovery

Disposable hearing aid*

Video-on-demand services**

Wireless mesh networks*

Publicly Traded

Information Technology

Materials Biomedical

Speech recognition for customer service

Electronic signature solutions

Digital color printing applications*

Super-bright LED light engines*

Video enhancement systems

Electroactive polymers*

Optical network components

Virtual personal assistant for the iPhone*

Enterprise social media technology

Metal “print and plate” manufacturing process

DNA testing services*

* Acquired or merged; ** Dissolved

Stray voltage detection services

Environmentally friendly light products*

Real-time web video streaming and sharing

Robotics

Next-generation personalized web search tool

Travel search and planning

Educational gaming platform

Innovative robots for manufacturing/service

Electroadhesion for materials handling

Digital imaging system

Smart calendar for iPhone

SRI Center for Vision Technologies

• 90 staff members
• 30 year history in Real Time Computer Vision
• 150+ patents

First real time AR broadcast on live TV, 1994: Ads in Baseball Games >> 10 Yard Line in Football

Live traffic monitoring, deployed all over the country

VideoBrush: First ever live Video Mosaicing (now part of all Android phones)

IED Detection: Currently saving lives in theatre

Breast Cancer: MRI based Tumor

Some Accomplishments

SRI Center for Vision Technologies

• Computational Sensing
– Embedded Vision
• 2D/3D reasoning
– GPS denied navigation
– 3D mapping
– Augmented reality
– Aerial Surveillance
• Vision analytics
– Video understanding
– Image search
• Human behavior modeling

Search based on image/video content

First ever Augmented Reality binoculars

Human Behavior Modeling: Social interaction and communication with computers

GPS Denied Navigation (Dismount, Robots, Vehicles, Aerial, Naval etc.)

Leading Platforms

SRI Vision Technology Algorithm Portfolio

• Non-uniform correction (sensor defect)

• AGC / color correction

• Extreme low light

• High frame rate capture

• Motion-adaptive processing

• Multi-spectral / VNIR/ SWIR

• Pan-Tilt control loop

• Image enhancement

• Stabilization

• Motion tracking

• Image fusion

• Mosaics

• Depth of field extension

• Dynamic range extension

• Vision guided prefiltering

• Super-resolution

• Dense stereo

• Face and Body detection

• Head/Face/Gaze tracking

• Landmark matching

• Moving Target Indication

• Multi Target Tracking

• Image based geo-location

• 3D LiDAR for SLAM

• Visual Odometry

• Robotic Navigation

• Geo-registration

• Visual Search / fast indexing

• Image Geo-Location

• Image and Video data mining

• Object Recognition

• Activity detection

• Wide area surveillance

• Occlusion reasoning

• Gesture recognition

• Human State Estimation

(not a complete set)

Alignment for Change Detection : Tampering

Alignment for Change Detection

Alignment for Change Detection : Tampering

Alignment for Change Detection

Alignment for Mosaicing

Alignment for Moving Object Detection

Alignment for GeoSpatial Information

Alignment for Augmented Reality

Alignment in 3D

Alignment for Augmented Reality

Guide User through Emergency Response Procedures

User can ask questions and interactively diagnose problems

Display overlaid animations with directions

Automatically observe user actions and state of equipment and provide warnings and feedback

Alignment for Special Effects: MatchMove

Pin-hole Camera Model

[Figure: pin-hole geometry showing f, Z, Y, and y]

$$y = f\,\frac{Y}{Z}, \qquad p \approx f P$$

Camera Rotation (Pan)

[Figure: rotated camera showing f, Z′, Y′, and y′]

$$y' = f\,\frac{Y'}{Z'}, \qquad p' \approx f P', \qquad P' = R' P$$

$$p' \approx R' p$$

Camera Rotation (Pan)

[Figure: rotated camera showing f, Z′′, Y′′, and y′′]

$$y'' = f\,\frac{Y''}{Z''}, \qquad p'' \approx f P'', \qquad P'' = R'' P$$

$$p'' \approx R'' p$$

Image Motion due to Rotations does not depend on the depth / structure of the scene

Verify the same for a 3D scene and 2D camera
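A quick numerical check of this claim for a 3D scene and a 2D image, as a minimal sketch. The helper names `project` and `rot_y`, the focal length, and the 10 degree pan angle are illustrative assumptions, not taken from the slides. Two points on the same viewing ray, at very different depths, should land on the same pixel before and after a pure rotation.

```python
import numpy as np

def project(P, f=1.0):
    """Pin-hole projection of 3D points (rows of P) to 2D image coordinates."""
    return f * P[:, :2] / P[:, 2:3]

def rot_y(angle):
    """Rotation about the Y axis (a pan) by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

# Two 3D points on the same viewing ray, at depths 2 and 50.
ray = np.array([0.3, -0.2, 1.0])
P = np.stack([2.0 * ray, 50.0 * ray])

R = rot_y(np.deg2rad(10.0))
before = project(P)          # both points share one image location
after = project(P @ R.T)     # P' = R P applied to every point

print(before)
print(after)
```

Both rows of `before` agree, and both rows of `after` agree, so the image displacement caused by the rotation is the same regardless of depth.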

Camera Translation (Ty)

[Figure: translated camera showing f, Z, Y, y, and scene points X]

$$y' = f\,\frac{Y'}{Z'}, \qquad p' \approx f P', \qquad P' = P + T'$$

Translational Displacement

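$$y' = f\,\frac{Y'}{Z'} = f\,\frac{Y + T_y}{Z}, \qquad y' - y = f\,\frac{T_y}{Z}$$

$$y' = f\,\frac{Y'}{Z'} = f\,\frac{Y}{Z + T_z}, \qquad y' - y = -\,y'\,\frac{T_z}{Z}$$

Image Motion due to Translation is a function of the depth of the scene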
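The two displacement formulas, y′ − y = f·Ty/Z for a sideways translation and y′ − y = −y′·Tz/Z for a forward translation, can be checked numerically. This is a small sketch; the values of `f`, `Y`, `Ty`, `Tz` and the depths are arbitrary illustrative assumptions.

```python
import numpy as np

f, Y, Ty, Tz = 1.0, 0.5, 0.1, 0.2        # illustrative values only
depths = np.array([2.0, 5.0, 20.0])      # the same Y seen at three depths Z

y = f * Y / depths                       # image coordinate before motion

# Translation along Y: y' = f (Y + Ty) / Z
y_ty = f * (Y + Ty) / depths
shift_ty = y_ty - y

# Translation along Z: y' = f Y / (Z + Tz)
y_tz = f * Y / (depths + Tz)
shift_tz = y_tz - y

print(shift_ty)   # shrinks as depth grows
print(shift_tz)
```

Unlike the rotation case, the displacement here changes with Z, which is why translational motion carries information about scene depth.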

Alignment Accounts for Motions…

• Motion Models
– 2D

• Homography is the most general 2D model.
– Includes all the transformations as special cases.
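To make the "special cases" statement concrete, here is a minimal sketch that writes translation, similarity, and affine motions as 3x3 homography matrices. The function names and the sample matrix `H_general` are hypothetical, chosen only for illustration.

```python
import numpy as np

def translation(tx, ty):
    return np.array([[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]])

def similarity(scale, angle, tx, ty):
    c, s = scale * np.cos(angle), scale * np.sin(angle)
    return np.array([[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]])

def affine(A, t):
    H = np.eye(3)
    H[:2, :2] = A      # 2x2 linear part
    H[:2, 2] = t       # translation part
    return H

def apply_h(H, pts):
    """Apply a 3x3 homography to an (N, 2) array of points."""
    ph = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return ph[:, :2] / ph[:, 2:3]   # divide out the projective scale

# A general homography has a non-trivial last row; the models above do not.
H_general = np.array([[1.0, 0.1, 5.0], [0.0, 1.2, -3.0], [1e-3, 2e-3, 1.0]])
corners = np.array([[0.0, 0.0], [100.0, 0.0], [100.0, 100.0], [0.0, 100.0]])
print(apply_h(H_general, corners))
```

Each simpler model is a homography whose last row is fixed to (0, 0, 1), so a single warping routine such as `apply_h` can serve all of them.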

Parameterization

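[Figure labels: Images, COP, P, P′]

Rotations/Homographies: Plane Projective Transformations

$$P' = R P, \qquad p'_c \approx R\,p_c$$

$$K'^{-1} p' \approx R\,K^{-1} p$$

$$p' \approx K' R\,K^{-1} p$$

$$p' \approx H_\infty\,p$$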
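The relation p′ ≈ K′ R K⁻¹ p can be verified by comparing a homography warp against direct projection of rotated 3D points. This is a sketch under assumed values: the intrinsics (focal lengths 500 and 650, principal point at 320, 240), the 5 degree pan, and the random scene are all made up for illustration.

```python
import numpy as np

def intrinsics(f, cx, cy):
    return np.array([[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]])

def to_pixels(K, P):
    """Project 3D points (rows of P, in camera coordinates) to pixels."""
    ph = P @ K.T
    return ph[:, :2] / ph[:, 2:3]

K1 = intrinsics(500.0, 320.0, 240.0)
K2 = intrinsics(650.0, 320.0, 240.0)     # second view with a different focal length
angle = np.deg2rad(5.0)
c, s = np.cos(angle), np.sin(angle)
R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

H_inf = K2 @ R @ np.linalg.inv(K1)       # p' ~ K' R K^{-1} p

rng = np.random.default_rng(0)
P = np.column_stack([rng.uniform(-1, 1, 20),
                     rng.uniform(-1, 1, 20),
                     rng.uniform(2, 30, 20)])   # depths from 2 to 30

p = to_pixels(K1, P)                     # pixels in the first view
p_direct = to_pixels(K2, P @ R.T)        # project the rotated points directly
ph = np.hstack([p, np.ones((len(p), 1))]) @ H_inf.T
p_warp = ph[:, :2] / ph[:, 2:3]          # warp first-view pixels with H_inf

print(np.abs(p_warp - p_direct).max())
```

The warp needs only pixel coordinates and never touches the depths in `P`, which matches the earlier observation that rotation-induced motion is depth independent.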

3D Motion…

$$p \approx K \begin{bmatrix} R & T \\ 0 & 1 \end{bmatrix} P$$

Alignment Methods

Acknowledgement: Adapted slides from http://slazebni.cs.illinois.edu/

Direct Methods for Visual Motion Estimation

Employ Models of Motion and Estimate Visual Motion through Image Alignment

Direct Methods : The How

Alignment of spatio-temporal images is a means of obtaining: Dense Representations, Parametric Models

Direct Method based Alignment

Formulation of Direct Model-based Image Alignment

[Figure: images $I_1(p')$ and $I_2(p)$, with $p' = p - u(p)$]

Model image transformation as:

$$I_2(p) = I_1(p - u(p; \Theta))$$

Images separated by time, space, sensor types

Labels: Reference Coordinate System ($p$), Generalized pixel Displacement ($u$), Model Parameters ($\Theta$)

Formulation of Direct Model-based Image Alignment

[Figure: images $I_1(p')$ and $I_2(p)$, with $p' = p - u(p)$]

Compute the unknown parameters and correspondences while aligning images using optimization:

$$\min_{\Theta} \sum_i \rho(r_i; \sigma), \qquad r_i = I_2(p_i) - I_1(p_i - u(p_i; \Theta))$$

What all can be varied?

• Filtered Image Representations (to account for illumination changes, multi-modalities)
• Model Parameters
• Measuring mismatches (SSD, Correlations)
• Optimization Function

How do we solve for the motion ?

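$$I_2(p) = I_1(p - u(p)) = I_1(p')$$

Use Taylor Series Expansion:

$$I_2(p) = I_1(p) - \nabla I_1^T u(p) + O(2)$$

where $\nabla I_1$ is the Image Gradient.

Convert constraint into an objective function:

$$E_{SSD}(u) = \sum_{p \in R} \left( \nabla I_1^T u(p) + \delta I(p) \right)^2, \qquad \delta I(p) = I_2(p) - I_1(p)$$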
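For the simplest motion model, a single global translation u, the linearized SSD objective has a closed-form minimizer: solve (Σ ∇I₁ ∇I₁ᵀ) u = −Σ ∇I₁ δI with δI = I₂ − I₁, under the convention I₂(p) = I₁(p − u). The sketch below performs one such step on a synthetic image. The function names, the test image, and the true shift are assumptions made for illustration, and a practical implementation would iterate with warping.

```python
import numpy as np

def estimate_translation(I1, I2):
    """One linearized SSD step for a global translation u = (ux, uy),
    under the convention I2(p) = I1(p - u)."""
    Iy, Ix = np.gradient(I1)               # derivatives along rows (y) and columns (x)
    dI = I2 - I1                           # delta I(p) = I2(p) - I1(p)
    inner = (slice(1, -1), slice(1, -1))   # skip one-sided border derivatives
    g = np.stack([Ix[inner].ravel(), Iy[inner].ravel()], axis=1)
    A = g.T @ g                            # sum over pixels of grad * grad^T
    b = -g.T @ dI[inner].ravel()           # minus sum over pixels of grad * delta I
    return np.linalg.solve(A, b)

def smooth_image(x, y):
    """Synthetic smooth test pattern (illustrative only)."""
    return np.sin(0.2 * x) + np.cos(0.15 * y)

yy, xx = np.mgrid[0:64, 0:64].astype(float)
u_true = np.array([0.3, -0.2])             # sub-pixel shift (ux, uy)
I1 = smooth_image(xx, yy)
I2 = smooth_image(xx - u_true[0], yy - u_true[1])

u_est = estimate_translation(I1, I2)
print(u_est)
```

The estimate is only approximate because the Taylor expansion drops second-order terms, so it is reliable for small displacements on smooth images.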

Optical Flow Constraint Equation

$$I_2(p) = I_1(p) - \nabla I_1^T u(p) + O(2)$$

$$\nabla I_1^T u(p) + \delta I(p) \approx 0$$

At a Single Pixel

Leads to

$$I_x u_x + I_y u_y + I_t = 0$$

Normal Flow: $-\dfrac{I_t}{|\nabla I|}$
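A single pixel gives one equation in two unknowns, so only the flow component along the image gradient is recoverable there. A minimal sketch of that normal-flow computation follows, assuming I_t is approximated by the frame difference I₂ − I₁. The function name, the `eps` guard, and the ramp test image are illustrative assumptions.

```python
import numpy as np

def normal_flow(I1, I2, eps=1e-6):
    """Per-pixel normal flow: signed speed -I_t / |grad I| along the unit gradient."""
    Iy, Ix = np.gradient(I1)
    It = I2 - I1                                   # temporal derivative as a frame difference
    mag = np.sqrt(Ix**2 + Iy**2)
    safe = np.maximum(mag, eps)                    # avoid division by zero in flat regions
    speed = np.where(mag > eps, -It / safe, 0.0)
    direction = np.stack([Ix, Iy], axis=-1) / safe[..., None]
    return speed, direction

# A linear ramp I1 = 2x + y, moved by u = (1, 0.5) so that I2(p) = I1(p - u).
yy, xx = np.mgrid[0:16, 0:16].astype(float)
I1 = 2.0 * xx + yy
I2 = 2.0 * (xx - 1.0) + (yy - 0.5)

speed, direction = normal_flow(I1, I2)
print(speed[0, 0], direction[0, 0])
```

For this ramp the recovered speed equals the projection of the true motion onto the gradient direction; the component perpendicular to the gradient is not observable from one pixel.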

Generalized M-Estimation

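$$\min_{\Theta} \sum_i \rho(r_i; \sigma), \qquad r_i = I_2(p_i) - I_1(p_i - u(p_i; \Theta))$$

• Given a solution $\Theta^{(m)}$ at the m-th iteration, find $\delta\Theta$ by solving:

$$\sum_l \sum_i \frac{\dot\rho(r_i)}{r_i}\,\frac{\partial r_i}{\partial \theta_k}\,\frac{\partial r_i}{\partial \theta_l}\,\delta\theta_l = -\sum_i \frac{\dot\rho(r_i)}{r_i}\,r_i\,\frac{\partial r_i}{\partial \theta_k} \qquad \forall k$$

• $w_i = \dot\rho(r_i)/r_i$ is a weight associated with each measurement. Can be varied to provide robustness to outliers.

Choices of the $\rho(r_i; \sigma)$ function:

$$\rho_{SS} = \frac{r^2}{2\sigma^2}, \qquad \frac{\dot\rho_{SS}(r)}{r} = \frac{1}{\sigma^2}$$

$$\rho_{GM} = \frac{r^2/\sigma^2}{1 + r^2/\sigma^2}, \qquad \frac{\dot\rho_{GM}(r)}{r} = \frac{2\sigma^2}{(\sigma^2 + r^2)^2}$$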
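The two loss functions and their weights w = ρ̇(r)/r are easy to tabulate. The sketch below assumes the forms ρ_SS = r²/(2σ²) and ρ_GM = (r²/σ²)/(1 + r²/σ²); the function names and the sample residuals are hypothetical.

```python
import numpy as np

def rho_ss(r, sigma):
    """Sum-of-squares loss."""
    return r**2 / (2.0 * sigma**2)

def weight_ss(r, sigma):
    """Weight rho'(r) / r for sum-of-squares: constant for every residual."""
    return np.full_like(np.asarray(r, dtype=float), 1.0 / sigma**2)

def rho_gm(r, sigma):
    """Geman-McClure loss: saturates towards 1 for large residuals."""
    return (r**2 / sigma**2) / (1.0 + r**2 / sigma**2)

def weight_gm(r, sigma):
    """Weight rho'(r) / r for Geman-McClure: decays for large residuals."""
    return 2.0 * sigma**2 / (sigma**2 + r**2) ** 2

sigma = 1.0
r = np.array([0.1, 1.0, 5.0, 20.0])
print(weight_ss(r, sigma))   # identical weights, so outliers count fully
print(weight_gm(r, sigma))   # large residuals are strongly down-weighted
```

In an iteratively reweighted scheme these weights would be recomputed from the current residuals at each iteration, which is how the Geman-McClure choice suppresses outliers.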

Optimization Functions & their Corresponding Weight Plots

Geman-McClure, Sum-of-squares

Model-based Coarse-to-fine Image Alignment

Pyramid Processing and Alignment

$$\min_{\Theta} \sum_p \left( I_1(p) - I_2(p + u(p; \Theta)) \right)^2$$

Motion models (figure labels): { R, T, d(p) }, { H, e, k(p) }, { dx(p), dy(p) }

[Diagram labels: d(p), Warper]
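The coarse-to-fine idea can be illustrated with the simplest possible model. The sketch below estimates an integer translation by exhaustive SSD search in a small window at each pyramid level, doubling the estimate when moving to the next finer level. It is a simplified stand-in for the gradient-based parametric alignment on the slide: the 2x2 block-average pyramid, wrap-around borders, and all function names are assumptions for illustration.

```python
import numpy as np

def downsample(img):
    """Halve resolution by averaging 2x2 blocks (a crude stand-in for a Gaussian pyramid)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def ssd(I1, I2, dy, dx):
    """SSD between I2 and I1 shifted by (dy, dx), with wrap-around borders."""
    return np.sum((I2 - np.roll(I1, (dy, dx), axis=(0, 1))) ** 2)

def coarse_to_fine_shift(I1, I2, levels=3, radius=3):
    """Estimate an integer shift (dy, dx) such that I2(p) ~ I1(p - shift)."""
    pyr1, pyr2 = [I1], [I2]
    for _ in range(levels - 1):
        pyr1.append(downsample(pyr1[-1]))
        pyr2.append(downsample(pyr2[-1]))
    dy = dx = 0
    for level in reversed(range(levels)):          # coarsest level first
        a, b = pyr1[level], pyr2[level]
        # Search a small window around the estimate propagated from the coarser level.
        candidates = [(ssd(a, b, dy + i, dx + j), dy + i, dx + j)
                      for i in range(-radius, radius + 1)
                      for j in range(-radius, radius + 1)]
        _, dy, dx = min(candidates)
        if level > 0:
            dy, dx = 2 * dy, 2 * dx                # a shift doubles at the next finer level
    return dy, dx

rng = np.random.default_rng(0)
I1 = rng.random((64, 64))
I2 = np.roll(I1, (8, -12), axis=(0, 1))            # known shift of 8 rows, -12 columns
shift = coarse_to_fine_shift(I1, I2)
print(shift)
```

A shift of 12 pixels lies outside the 3-pixel search radius at full resolution, but becomes 3 pixels at the coarsest of three levels, which is the reason the pyramid extends the capture range of a local search.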

Application : Object Insertion/Deletion with Layers

Panels: Original Video; Video Stream with deleted moving object; Dynamic Mosaic Video

1D vs. 2D SCANNING

• 1D : The topology of frames is a ribbon or a string. Frames overlap only with their temporal neighbors.

• 2D : The topology of frames is a 2D graph. Frames overlap with neighbors on many sides.

(A 300x332 mosaic captured by mosaicing a 1D sequence of 6 frames)

1D vs. 2D SCANNING

The 1D scan scaled by 2 to 600x692 A 2D scanned mosaic of size 600x692

FRAME-TO-FRAME VS. LOCAL-TO-GLOBAL ALIGNMENT

Frame-to-frame:

• Uses limited 2D spatial context

• Causal commitment to parameters cannot be corrected

• Demands large overlap between frames

Local-to-global:

• Uses all the available frame-to-frame constraints

• Global solution is optimal subject to local frame-to-frame constraints

• Works even with small overlap between frames

CHOICE OF 2D MANIFOLD

Plane Cylinder Cone

Sphere Arbitrary

PROBLEM FORMULATION

Given an arbitrary scan of a scene

Create a globally aligned mosaic by minimizing

$$\min_{\{P_i\}} E = \sum_{ij \in G} E_{ij} + \sum_i E_i + \sigma^2\,(\text{Area of the mosaic})$$

Create a compact appearance while being geometrically consistent

Loss Function to be Minimized

$$\min_{\{P_i\}} E = \sum_{ij \in G} E_{ij} + \sum_i E_i + \sigma^2\,(\text{Area of the mosaic})$$

where

$P_i$: Reference-to-image mapping, $u = P_i X$

$E_{ij}$: Any measure of alignment error between neighbors $i$ and $j$

$G$: Graph that represents the neighborhood relations

$E_i$: Frame to reference error term to allow for a priori criterion like least distortion transformation

GLOBALLY CONSISTENT ALIGNMENT: Bundle Adjustment

• Given: arcs ij in graph G of neighbors

• The local alignment parameters, Qij, help establish feature correspondence between i and j

• If uil and ujl are corresponding points in frames i, j, then

$$E_{ij} = \left| P_i^{-1}(u_{il}) - P_j^{-1}(u_{jl}) \right|^2$$

• Incrementally adjust poses Pi to minimize

$$\min_{\{P_i\}} E = \sum_{ij \in G} E_{ij} + \sum_i E_i$$

[Figure labels: ui, uj, Eij]
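A heavily simplified sketch of global consistency: each pose P_i is reduced to a 2D translation t_i, each arc contributes a measured offset d_ij ≈ t_j − t_i, and frame 0 is pinned to the origin as a stand-in for the E_i term. The real problem on the slide uses projective poses and point correspondences; the translation-only model, the function name, and the four-frame loop are assumptions made for illustration.

```python
import numpy as np

def global_translations(n_frames, edges, anchor_weight=1.0):
    """Least-squares frame positions t_i from pairwise offsets d_ij ~ t_j - t_i.

    edges: list of (i, j, d_ij), with d_ij a length-2 offset.
    Frame 0 is softly pinned to the origin, playing the role of an E_i term.
    """
    rows, rhs = [], []
    for i, j, d in edges:
        row = np.zeros(n_frames)
        row[i], row[j] = -1.0, 1.0
        rows.append(row)
        rhs.append(d)
    anchor = np.zeros(n_frames)
    anchor[0] = anchor_weight
    rows.append(anchor)
    rhs.append([0.0, 0.0])
    A, b = np.array(rows), np.array(rhs, dtype=float)
    t, *_ = np.linalg.lstsq(A, b, rcond=None)
    return t

# Four frames scanned around a loop; the 3 -> 0 arc closes the loop.
true_t = np.array([[0.0, 0.0], [10.0, 0.0], [10.0, 8.0], [0.0, 8.0]])
arcs = [(0, 1), (1, 2), (2, 3), (3, 0)]
edges = [(i, j, true_t[j] - true_t[i]) for i, j in arcs]
edges[1] = (1, 2, edges[1][2] + np.array([0.0, 1.0]))  # inject a 1-pixel error on one arc

t_chain = global_translations(4, edges[:3])   # frame-to-frame chain only
t_loop = global_translations(4, edges)        # with the loop-closing arc

print(np.abs(t_chain - true_t).max(), np.abs(t_loop - true_t).max())
```

With only the chain, the injected error propagates unchanged to every later frame; adding the loop-closing arc spreads it over all arcs, which is the benefit of using every available neighbor constraint.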

LOCAL TO GLOBAL MOSAIC ALGORITHM

[Block diagram]

Input: Images or Video

Blocks: Topology Determination; Temporal Coarse Registration; Local Coarse & Fine Registration; Global Consistency; Color Matching/Blending; Mosaic Representation

Uses: Panoramic Visualization; Virtual Reality; Other Applications

PLANAR TOPOLOGY EVOLUTION

Whiteboard Video Sequence, 75 frames

PLANAR TOPOLOGY EVOLUTION

FINAL MOSAIC

SPHERICAL MOSAICS

Sarnoff Library Video

Captures almost the complete sphere with 380 frames

SPHERICAL TOPOLOGY EVOLUTION

SPHERICAL MOSAIC: Sarnoff Library

SPHERICAL MOSAIC: Sarnoff Library

NEW SYNTHESIZED VIEWS

FINAL MOSAIC: Princeton University Courtyard

VIDEO MOSAIC EXAMPLE

Princeton Chapel Video Sequence, 54 frames

UNBLENDED CHAPEL MOSAIC

Image Merging with Laplacian Pyramids

Image 1 Image 2

1 2

Combined Seamless Image

VORONOI TESSELLATIONS W/ L1 NORM

BLENDED CHAPEL MOSAIC

Applications of 2D/3D Alignment

High Dynamic Range Management

Improve overall driving experience, see under adverse conditions

Today’s imagers can’t image full outdoor scene dynamic range

Real-time, low latency high dynamic range sensor processing

• Robust motion adaptive frame to frame alignment
• Local contrast enhancement for deep pixel range
• Tight sensor exposure management

Extreme blur reduction examples

Eye Chart

Aerial

SRI’s image processing (MASI) for extreme camera motion

Temporal Image Enhancement and Haze Reduction

Challenges:
- Robust under low SNR conditions
- Difficult temporal registration
- Moving platforms
- Low feature content

Panels: Raw Low SNR video; Multi-frame Temporal Alignment and Fusion

Dehaze and Enhancement for Submarine Periscope Video

Panels: Original Imagery; Dehaze; Dehaze and CN

Contrast Normalization for Wide Dynamic Range

Three Band Fusion with Contrast Normalization

VIS SWIR

LWIR Fused

High Quality Stereo Sequence Synthesis (IMAX 3D Content Creation)

Live Action Content

• Camera is very large.

• Requires two strips of large format film.

• Size of camera and cost of film limits production.

15 perforations

70 mm

Live Action Sequence

Live Action : Hybrid Input

Left

Right

Synthesized Output

Left

Right

Hybrid Stereo Camera

... pure upsampling is not an option ...

INPUT: Left Eye (1.5K), Right Eye (6K)

OUTPUT: Left Eye (6K), Right Eye (6K)

1:16

How can the Hybrid Camera be Realized?

Render the High-Res content into the coordinate system of the Low-Res Frame!

[Figure: Left Eye and Right Eye frame sequences at times t-2, t-1, t, t+1, t+2]

Approach: Convergence of Computer Vision & IBR

• Compute stereo disparities at lo-res.

• Compute motion (Optical Flow) at lo-res.

• Compute quality map at lo-res.

• Synthesize hi-res frame.

• Fill-in and color correct mis-matched pixels.

• Temporal de-scintillation.

Correspondences by Coarse-to-fine Model-based Image Alignment: A Primer

Synthesis vs. Up-resing : Live Action

Synthesis vs. Up-resing : CG Animation
