computer vision - a modern approach set: pyramids and texture slides by d.a. forsyth scaled...

53
Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little bars are both interesting Stripes and hairs, say Inefficient to detect big bars with big filters And there is superfluous detail in the filter kernel Alternative: Apply filters of fixed size to images of different sizes Typically, a collection of images whose edge length changes by a factor of 2 (or root 2) This is a pyramid (or Gaussian pyramid) by visual analogy

Post on 21-Dec-2015

212 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Scaled representations

• Big bars (resp. spots, hands, etc.) and little bars are both interesting– Stripes and hairs, say

• Inefficient to detect big bars with big filters– And there is superfluous detail

in the filter kernel

• Alternative:– Apply filters of fixed size to

images of different sizes

– Typically, a collection of images whose edge length changes by a factor of 2 (or root 2)

– This is a pyramid (or Gaussian pyramid) by visual analogy

Page 2: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

A bar in the big images is a hair on the zebra’s nose; in smaller images, a stripe; in the smallest, the animal’s nose

Page 3: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Aliasing

• Can’t shrink an image by taking every second pixel

• If we do, characteristic errors appear – In the next few slides

– Typically, small phenomena look bigger; fast phenomena can look slower

– Common phenomenon

• Wagon wheels rolling the wrong way in movies

• Checkerboards misrepresented in ray tracing

• Striped shirts look funny on colour television

Page 4: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Resample the checkerboard by taking one sample at each circle. In the case of the top left board, new representation is reasonable. Top right also yields a reasonable representation. Bottom left is all black (dubious) and bottom right has checks that are too big.

Page 5: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Constructing a pyramid by taking every second pixel leads to layers that badly misrepresent the top layer

Page 6: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Open questions

• What causes the tendency of differentiation to emphasize noise?

• In what precise respects are discrete images different from continuous images?

• How do we avoid aliasing?

• General thread: a language for fast changes The Fourier Transform

Page 7: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

F g x,y u,v g x, y e i2 uxvy dxdyR 2

The Fourier Transform

• Represent function on a new basis– Think of functions as vectors,

with many components

– We now apply a linear transformation to transform the basis

• dot product with each basis element

• In the expression, u and v select the basis element, so a function of x and y becomes a function of u and v

• basis elements have the form

e i2 uxvy

Page 8: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

To get some sense of what basis elements look like, we plot a basis element --- or rather, its real part ---as a function of x,y for some fixed u, v. We get a function that is constant when (ux+vy) is constant. The magnitude of the vector (u, v) gives a frequency, and its direction gives an orientation. The function is a sinusoid with this frequency along the direction, and constant perpendicular to the direction.

Page 9: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Here u and v are larger than in the previous slide.

Page 10: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

And larger still...

Page 11: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Phase and Magnitude

• Fourier transform of a real function is complex– difficult to plot, visualize

– instead, we can think of the phase and magnitude of the transform

• Phase is the phase of the complex transform

• Magnitude is the magnitude of the complex transform

• Curious fact– all natural images have about

the same magnitude transform

– hence, phase seems to matter, but magnitude largely doesn’t

• Demonstration– Take two pictures, swap the

phase transforms, compute the inverse - what does the result look like?

Page 12: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Page 13: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

This is the magnitude transform of the cheetah pic

Page 14: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

This is the phase transform of the cheetah pic

Page 15: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Page 16: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

This is the magnitude transform of the zebra pic

Page 17: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

This is the phase transform of the zebra pic

Page 18: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Reconstruction with zebra phase, cheetah magnitude

Page 19: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Reconstruction with cheetah phase, zebra magnitude

Page 20: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Various Fourier Transform Pairs

• Important facts– The Fourier transform is linear– There is an inverse FT– if you scale the function’s

argument, then the transform’s argument scales the other way. This makes sense --- if you multiply a function’s argument by a number that is larger than one, you are stretching the function, so that high frequencies go to low frequencies

– The FT of a Gaussian is a Gaussian.

• The convolution theorem– The Fourier transform of the

convolution of two functions is the product of their Fourier transforms

– The inverse Fourier transform of the product of two Fourier transforms is the convolution of the two inverse Fourier transforms

• There’s a table in the book.

Page 21: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Sampling in 1D takes a continuous function and replaces it with a vector of values, consisting of the function’s values at a set of sample points. We’ll assume that these sample points are on a regular grid, and can place one at each integer for convenience.

Page 22: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Sampling in 2D does the same thing, only in 2D. We’ll assume that these sample points are on a regular grid, and can place one at each integer point for convenience.

Page 23: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

A continuous model for a sampled function

• We want to be able to approximate integrals sensibly

• Leads to– the delta function

– model on right

Sample2D f (x,y) f (x, y) (x i, y j)i

i

f (x,y) (x i, y j)i

i

Page 24: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

The Fourier transform of a sampled signal

F Sample2D f (x,y) F f (x, y) (x i,y j)i

i

F f (x,y) **F (x i, y j)i

i

F u i,v j j

i

Page 25: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Page 26: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Page 27: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Smoothing as low-pass filtering

• The message of the FT is that high frequencies lead to trouble with sampling.

• Solution: suppress high frequencies before sampling

– multiply the FT of the signal with something that suppresses high frequencies

– or convolve with a low-pass filter

• A filter whose FT is a box is bad, because the filter kernel has infinite support

• Common solution: use a Gaussian– multiplying FT by Gaussian is

equivalent to convolving image with Gaussian.

Page 28: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Sampling without smoothing. Top row shows the images,sampled at every second pixel to get the next; bottom row shows the magnitude spectrum of these images.

Page 29: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Sampling with smoothing. Top row shows the images. Weget the next image by smoothing the image with a Gaussian with sigma 1 pixel,then sampling at every second pixel to get the next; bottom row shows the magnitude spectrum of these images.

Page 30: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Sampling with smoothing. Top row shows the images. Weget the next image by smoothing the image with a Gaussian with sigma 1.4 pixels,then sampling at every second pixel to get the next; bottom row shows the magnitude spectrum of these images.

Page 31: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Applications of scaled representations

• Search for correspondence– look at coarse scales, then refine with finer scales

• Edge tracking– a “good” edge at a fine scale has parents at a coarser scale

• Control of detail and computational cost in matching– e.g. finding stripes

– terribly important in texture representation

Page 32: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

The Gaussian pyramid

• Smooth with gaussians, because– a gaussian*gaussian=another gaussian

• Synthesis – smooth and sample

• Analysis– take the top image

• Gaussians are low pass filters, so repn is redundant

Page 33: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Page 34: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Texture

• Key issue: representing texture– Texture based matching

• little is known

– Texture segmentation

• key issue: representing texture

– Texture synthesis

• useful; also gives some insight into quality of representation

– Shape from texture

• cover superficially

Page 35: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Page 36: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Representing textures

• Textures are made up of quite stylised subelements, repeated in meaningful ways

• Representation:– find the subelements, and

represent their statistics

• But what are the subelements, and how do we find them?– recall normalized correlation

– find subelements by applying filters, looking at the magnitude of the response

• What filters?– experience suggests spots and

oriented bars at a variety of different scales

– details probably don’t matter

• What statistics?– within reason, the more the

merrier.

– At least, mean and standard deviation

– better, various conditional histograms.

Page 37: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Page 38: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Page 39: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Page 40: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Page 41: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Gabor filters at differentscales and spatial frequencies

top row shows anti-symmetric (or odd) filters, bottom row thesymmetric (or even) filters.

Page 42: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

The Laplacian Pyramid

• Synthesis– preserve difference between upsampled Gaussian pyramid level

and Gaussian pyramid level

– band pass filter - each level represents spatial frequencies (largely) unrepresented at other levels

• Analysis– reconstruct Gaussian pyramid, take top layer

Page 43: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Page 44: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Page 45: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Page 46: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Oriented pyramids

• Laplacian pyramid is orientation independent

• Apply an oriented filter to determine orientations at each layer– by clever filter design, we can simplify synthesis

– this represents image information at a particular scale and orientation

Page 47: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Reprinted from “Shiftable MultiScale Transforms,” by Simoncelli et al., IEEE Transactionson Information Theory, 1992, copyright 1992, IEEE

Page 48: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Analysis

Page 49: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

synthesis

Page 50: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Final texture representation

• Form an oriented pyramid (or equivalent set of responses to filters at different scales and orientations).

• Square the output

• Take statistics of responses– e.g. mean of each filter output (are there lots of spots)

– std of each filter output

– mean of one scale conditioned on other scale having a particular range of values (e.g. are the spots in straight rows?)

Page 51: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Texture synthesis

• Use image as a source of probability model

• Choose pixel values by matching neighbourhood, then filling in

• Matching process – look at pixel differences

– count only synthesized pixels

Page 52: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Figure from Texture Synthesis by Non-parametric Sampling, A. Efros and T.K. Leung, Proc. Int. Conf. Computer Vision, 1999 copyright 1999, IEEE

Page 53: Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little

Computer Vision - A Modern ApproachSet: Pyramids and Texture

Slides by D.A. Forsyth

Variations

• Texture synthesis at multiple scales

• Texture synthesis on surfaces

• Texture synthesis by tiles

• “Analogous” texture synthesis