vision, video and virtual reality feature extraction lecture 9 image segmentation csc 59866cd fall...

Vision, Video and Virtual Reality

Feature Extraction

Lecture 9

Image Segmentation

CSC 59866CD, Fall 2004

Zhigang Zhu, NAC 8/203A
http://www-cs.engr.ccny.cuny.edu/~zhu/Capstone2004/Capstone_Sequence2004.html

Finding Circles

If we don't know r, the accumulator array is 3-dimensional.
If edge directions are known, computational complexity is reduced.

Suppose there is a known error limit on the edge direction (say ±10°): how does this affect the search?

Hough can be extended in many ways; see, for example:
Ballard, D. H., Generalizing the Hough Transform to Detect Arbitrary Shapes, Pattern Recognition 13:111-122, 1981.
Illingworth, J. and J. Kittler, Survey of the Hough Transform, Computer Vision, Graphics, and Image Processing, 44(1):87-116, 1988.
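The 3-dimensional accumulator mentioned on this slide can be sketched as follows. This is a minimal illustration, not the lecture's implementation: the function name `hough_circles`, the parameter `n_angles`, and the choice to let each edge point vote at most once per (a, b, r) cell are all assumptions. If edge directions were known, the inner loop over angles could be restricted to a small range around the gradient direction.

```python
import math
from collections import Counter

def hough_circles(edge_points, radii, n_angles=360):
    """Vote in a 3-D (a, b, r) accumulator: centre (a, b) and radius r."""
    acc = Counter()
    for x, y in edge_points:
        for r in radii:
            cells = set()  # each point votes at most once per cell
            for k in range(n_angles):
                t = 2 * math.pi * k / n_angles
                cells.add((round(x - r * math.cos(t)),
                           round(y - r * math.sin(t))))
            for a, b in cells:
                acc[(a, b, r)] += 1
    return acc
```

Peaks in the returned counter are candidate circles; the cost grows with the number of radii tried, which is why the unknown-radius case is more expensive.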

Region Segmentation

Partitioning of an image into different regions (connected components), each having uniform properties in some (set of) image feature(s):
- gray value
- color value(s)
- textural qualities
- local gradient
- motion
- shape info
- ... etc.

Goal of Segmentation

Segment a scene into image elements which may correspond to meaningful scene elements

High Level Interpretations

Objects

Scene Elements

Image Elements

Raw Images

Primary Goal of Segmentation

“Segmenting an image into image elements which may correspond to meaningful scene elements”

What sort of image elements may correspond to meaningful scene elements?

Answer depends on type and complexity of images: Less constrained scenes must be segmented more conservatively.

Segmentation is not a well defined problem.

Region Segmentation Example

Color Image Segmentation

Given a grayscale image, how do we generate a region segmentation?
In general, regions can be formed from the original image data or from 'derived' images:
- color images from R, G, B
- textural images
- displacement images from motion analyses
- 3D depth images

Problems with Segmentation

In general, high level contextual knowledge is required for successful segmentation

Formal Definition of Regions

A region segmentation of an image I is a partition of the set of pixels of I into a set of K regions $\{R_j\}$, $1 \le j \le K$, such that:

1. $I = \bigcup_{j=1}^{K} R_j$ (every pixel belongs to a region)

2. $R_i \cap R_j = \emptyset$ for $i \ne j$ (no pixel belongs to more than one region)

3. p is connected to p' for all p, p' in $R_j$ (spatial coherence)

4. For some predicate P (feature coherence):
   $P(R_i)$ is TRUE for $i = 1, 2, \dots, K$
   $P(R_i \cup R_j)$ is FALSE for $R_i$, $R_j$ adjacent and $i \ne j$

Representing Regions

Region Occupancy Map: a set of region labels in registration with image I, specifying the region association for each pixel.

[Figure: image occupancy map, or label plane, in which every pixel carries the number (1 to 8) of the region it belongs to.]

Contour Representation

[Figure: image divided into regions R1 to R8, separated by contour segments C1 to C20.]

R1: {C1, C8, C11}
R8: {C1, C5, C17}
...

Chain Code

The chain code representation of a boundary is found by 'walking' counterclockwise around the boundary and recording the direction to turn to stay on the border.

Direction code:

3 2 1
4 . 0
5 6 7

Example:

CC = (i,j) {5 5 6 6 6 0 0 0 0 0 0 0 1 1 2 2 2 4 4 5 4 3 4 4}
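To make the direction code concrete, the sketch below decodes a chain code back into boundary pixel coordinates. It assumes the usual 8-direction layout (0 = east, numbered counterclockwise), that (i,j) is the starting pixel in (row, column) form, and that rows grow downward; the names `STEPS` and `decode_chain_code` are illustrative only.

```python
# (row, col) offsets for direction codes 0..7, counterclockwise from "east".
# Rows grow downward, so a step "up" decreases the row index.
STEPS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]

def decode_chain_code(start, codes):
    """Return the list of boundary pixels visited, beginning at `start`."""
    points = [start]
    r, c = start
    for code in codes:
        dr, dc = STEPS[code]
        r, c = r + dr, c + dc
        points.append((r, c))
    return points
```

Under these assumptions, decoding the example chain code from any starting pixel returns to that pixel, which is a quick sanity check that a chain code describes a closed boundary.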

Basic Approaches

- Generalized thresholding
- Region growing
- Region merging
- Region splitting
- Split and merge
- Extensions to split and merge
- K-means clustering
- Watershed algorithms

Partitioning methods
Grouping methods

Partitioning Methods

Partitioning:
Given: a large data set.
Goal: carve it up according to some notion of the association between items inside the set. We would like to decompose it into pieces that are “good” according to our model.

For example, we might:
- decompose an image into regions which have coherent color and/or texture inside them;
- take a video sequence and decompose it into shots (segments of video showing about the same stuff from about the same viewpoint);
- decompose a video sequence into motion blobs, consisting of regions that have coherent color, texture and motion.

Grouping Methods

Grouping:
Given: a set of distinct data items.
Goal: we wish to collect together sets of data items that “look similar” according to our model.
Effects like occlusion mean that image components that belong to the same object are often separated.

Examples of grouping include:
- collecting together tokens that, taken together, form an interesting object
- collecting together tokens that seem to be moving together
- collecting together regions that have similar color and/or texture

Two Basic Techniques

Region Merging
- START with many trivial regions (each pixel?)
- MERGE regions into larger regions based on some similarity criteria
- CONTINUE merging until no further merging is possible

Region Splitting
- START with a single large region (entire image?)
- SPLIT into several smaller regions based on a 'splitting' criterion
- CONTINUE until no further splitting is possible (regions are 'uniform')

Generalized Thresholding

$\mathrm{RLP}(i,j) = k$ if $T_{k-1} \le I(i,j) < T_k$, $k = 1, \dots, m$

- $T_k$ are thresholds; m is the number of distinct thresholds.
- RLP (Region Label Plane) may contain significantly more than m 'regions', hence connected components must be found and the regions relabeled with distinct label numbers.

The thresholds $T_k$ may depend on:
- the entire image I(i,j): GLOBAL THRESHOLDS
- N(i,j) (local neighborhood): LOCAL THRESHOLDS
- I(i,j) and N(i,j): DYNAMIC THRESHOLDS

To apply thresholding, we must determine m, the number of thresholds, and $T_k$, the threshold values.
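The two steps described here, threshold labeling followed by connected-component relabeling, can be sketched as below. This is an illustrative version only: the function names are made up, `thresholds` is assumed to hold the sorted interior thresholds, and 4-connectivity is an arbitrary choice.

```python
from bisect import bisect_right
from collections import deque

def threshold_labels(image, thresholds):
    """Class k for each pixel: the number of thresholds <= value, plus 1."""
    return [[bisect_right(thresholds, v) + 1 for v in row] for row in image]

def connected_components(rlp):
    """Relabel so that each 4-connected run of equal classes gets its own id."""
    rows, cols = len(rlp), len(rlp[0])
    regions = [[0] * cols for _ in range(rows)]
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            if regions[r][c]:
                continue
            next_id += 1
            regions[r][c] = next_id
            queue = deque([(r, c)])
            while queue:
                i, j = queue.popleft()
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < rows and 0 <= nj < cols
                            and not regions[ni][nj]
                            and rlp[ni][nj] == rlp[i][j]):
                        regions[ni][nj] = next_id
                        queue.append((ni, nj))
    return regions
```

With a single threshold an image may still yield many regions, for example a bright stripe separating two dark areas gives two classes but three connected regions.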

Threshold Selection

- Manual: try one and see if it looks good
- Histogram analysis

Strategies:
- Search for the minimum between peaks P1 and P2 (search for minima between several peaks: multi-thresholding)
- Fit a second-order equation and differentiate to find the minimum
- Smooth the image and/or the histogram first
- Histogram only those points that are not on edges, or down-weight the contribution of pixels having high gradient magnitude

[Figure: histogram p(g) against gray level g, with the threshold T placed in the valley between peaks P1 and P2.]
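The "minimum between P1 and P2" strategy can be sketched as follows. The function name `valley_threshold` is hypothetical, and the simple local-maximum test assumes the histogram has already been smoothed so that only genuine peaks remain.

```python
def valley_threshold(hist):
    """Gray level of the lowest bin between the two highest histogram peaks."""
    # Interior local maxima; noisy histograms should be smoothed beforehand.
    peaks = [g for g in range(1, len(hist) - 1)
             if hist[g - 1] < hist[g] >= hist[g + 1]]
    if len(peaks) < 2:
        raise ValueError("histogram does not have two peaks")
    # Keep the two tallest peaks, ordered by gray level.
    p1, p2 = sorted(sorted(peaks, key=lambda g: hist[g], reverse=True)[:2])
    return min(range(p1, p2 + 1), key=lambda g: hist[g])
```

For multi-thresholding, the same valley search could be repeated between each pair of neighboring peaks.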

Threshold Selection

- Gaussian fitting: intensity distribution for objects assumed to be normally distributed; minimize false positives/false negatives
- Match properties of the binary and gray-level image, e.g. moments
- Choose the threshold to maximize/minimize some function of the total edge gradient
- Maximize a property of the image or histogram, e.g. entropy
- ... and probably lots of others

Optimal Thresholding

Approximate the histogram using a weighted sum of two or more probability densities with normal distribution.
The threshold is set at the gray level closest to the minimum probability between the maxima of two or more normal distributions.
- Results in minimum error segmentation
- Need to estimate the parameters of the density functions, which implies optimization

[Figure: probability distributions of background and object, and the corresponding histograms and optimal thresholds.]

Local or Adaptive Thresholding

Intensity values can vary as a function of lighting (for example). Such images are very difficult to threshold using conventional methods.

Local or Adaptive Thresholding

Each pixel in the image needs a unique threshold. Two basic approaches:

Chow and Kaneko Adaptive Thresholding (1972)
- Compute thresholds in a local window at sampled locations using a histogram technique
- Interpolate local thresholds across the image

Local Thresholding
- Examine statistically the pixel values in a local neighborhood around the pixel to be thresholded
- Use a local statistic as the threshold
- Possibilities include mean, median, or mean of max and min value

[Figure: result using the mean of a 7x7 neighborhood.]

Improved Results

Results can be improved if the threshold employed is not the mean, but (mean - C), where C is a constant.

Using this statistic, all pixels which exist in a uniform neighborhood (e.g. along the margins) are set to background.

[Figure: results for mean, 7x7 neighborhood, C=7; mean, 75x75 neighborhood, C=10; median, 7x7 neighborhood, C=4.]
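A minimal sketch of local thresholding with the (mean - C) statistic is given below. It assumes dark objects on a light background (a pixel is foreground when it is darker than its local mean by more than C) and clips the window at the image border; the function name and parameters are illustrative.

```python
def local_mean_threshold(image, size=7, c=7):
    """Mark a pixel 1 (foreground) if it is below (local mean - c), else 0."""
    rows, cols = len(image), len(image[0])
    half = size // 2
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Window is clipped at the image border.
            window = [image[r][s]
                      for r in range(max(0, i - half), min(rows, i + half + 1))
                      for s in range(max(0, j - half), min(cols, j + half + 1))]
            mean = sum(window) / len(window)
            out[i][j] = 1 if image[i][j] < mean - c else 0
    return out
```

In a uniform neighborhood every pixel is close to the local mean, so none falls below (mean - C) and the whole neighborhood is set to background, as the slide describes.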

Local Histogram Thresholding

Region Growing

Goal: segment the image by repeatedly starting from a particular pixel, called a "seed" point, and growing it into a region by iteratively adding neighboring points while some similarity criterion is met.

Region growing is a set of algorithms to group pixels with similar attributes together.

GENERAL IDEA

A pixel is added to a partially grown region if two conditions are satisfied:
- The pixel must be adjacent to the region, and
- The pixel must be "similar enough" to pixels already in the region.

The process continues until no further points can be added.
Some other seed point, not already in any region, is chosen and the process is repeated until the entire image is segmented.

Region Growing

Use a region label plane RLP:
- same size as the original image
- contains a 1 if the corresponding image point is in the region, 0 otherwise
- initially contains a 1 corresponding to the seed point

32 15 13 11 12 10 13 9

12 12 11 11 13 11 8 10

15 13 10 10 12 14 16 9

14 12 18 17 11 14 20 19

13 11 16 17 9 11 18 17

13 11 16 17 9 11 17 19

12 10 18 16 10 11 16 20

15 12 17 19 11 10 18 22

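[Figure: the image (the 8x8 array of gray values above) and its region label plane, which initially contains a single 1 at the seed point.]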

Similarity Criteria

When is a pixel "similar enough" to be added to the region? Different choices:
- statistical population tests
- simple feature differences

General idea: apply iteratively to all pixels until no more pixels are added.

IF RLP(i,j) = 0
AND RLP(i-n, j-m) = 1 for some n,m = -1 to 1
AND |I(i-n, j-m) - I(i,j)| < T (a predefined threshold)
THEN RLP(i,j) = 1
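The growing rule above can be sketched as a breadth-first fill, which reaches the same fixed point as applying the rule repeatedly to all pixels. The 8x8 array is the image from the earlier Region Growing slide; the seed position (4, 2) and T = 3 in the usage note are arbitrary choices for illustration, not values given in the lecture.

```python
from collections import deque

IMAGE = [
    [32, 15, 13, 11, 12, 10, 13, 9],
    [12, 12, 11, 11, 13, 11, 8, 10],
    [15, 13, 10, 10, 12, 14, 16, 9],
    [14, 12, 18, 17, 11, 14, 20, 19],
    [13, 11, 16, 17, 9, 11, 18, 17],
    [13, 11, 16, 17, 9, 11, 17, 19],
    [12, 10, 18, 16, 10, 11, 16, 20],
    [15, 12, 17, 19, 11, 10, 18, 22],
]

def grow_region(image, seed, T):
    """Return the region label plane grown from `seed` with threshold T."""
    rows, cols = len(image), len(image[0])
    rlp = [[0] * cols for _ in range(rows)]
    rlp[seed[0]][seed[1]] = 1
    frontier = deque([seed])
    while frontier:
        i, j = frontier.popleft()
        for n in (-1, 0, 1):
            for m in (-1, 0, 1):
                a, b = i + n, j + m
                # Add an unlabeled 8-neighbor that differs from the region
                # pixel next to it by less than T.
                if (0 <= a < rows and 0 <= b < cols and rlp[a][b] == 0
                        and abs(image[a][b] - image[i][j]) < T):
                    rlp[a][b] = 1
                    frontier.append((a, b))
    return rlp
```

For example, `grow_region(IMAGE, (4, 2), 3)` grows the block of values 16 to 19 in columns 2 and 3. Because each pixel is compared only with its neighbor in the region, a slow intensity ramp can be absorbed entirely even when its two ends differ by much more than T.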

Example: Simple Region Growing

[Figure: seed point, and the regions grown from it with T=16, T=32, T=64 and T=128.]

Problems

- What's the similarity criterion?
- How do I select the threshold?
- Why does the algorithm 'leak'?

Because of leaks, features in a region can vary arbitrarily.
Fixed thresholds do not take into account characteristics of global spatial distribution.

Pixel sequence $p_1, \dots, p_k$ with $p_j$, $p_{j+1}$ neighbors: $|p_j - p_{j+1}| < T$ for every j, but $|p_1 - p_k| > T$.

Region Merging

Define a distance function between regions. General form:

$d_{ij} = D(R_i, R_j) > 0$

Typically D is a distance measure in feature space and is a function only of the feature vectors associated with regions $R_i$ and $R_j$:

$d_{ij} = D(f_i, f_j) > 0$

- Merge regions with minimum distance
- Need to define some kind of termination criterion

Region Merging

Define a distance measure $d_{ij} = D(f(R_i), f(R_j)) > 0$

Algorithm:

{
  While termination condition false
    Determine the minimum distance regions
      $\{i^*, j^*\} = \arg\min_{i,j:\ R_i, R_j \in L} d_{ij}$
    Merge the minimum distance regions
      $R_{i^*} \leftarrow R_{i^*} \cup R_{j^*}$
    Remove merged region from region list
      $L \leftarrow L - \{R_{j^*}\}$
    Compute termination condition
}
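A small runnable sketch of this merging loop follows. Several choices here are assumptions rather than part of the slide: every pixel starts as its own region, the feature is the region mean, the distance is the absolute difference of means, only 4-adjacent regions are candidates for merging, and merging stops when the smallest distance exceeds `threshold`.

```python
def merge_regions(image, threshold):
    """Greedy merging of adjacent regions; returns the final label plane."""
    rows, cols = len(image), len(image[0])
    label = [[r * cols + c for c in range(cols)] for r in range(rows)]
    total = {r * cols + c: float(image[r][c])
             for r in range(rows) for c in range(cols)}
    count = {k: 1 for k in total}
    adj = {k: set() for k in total}  # region adjacency graph
    for r in range(rows):
        for c in range(cols):
            for rr, cc in ((r, c + 1), (r + 1, c)):
                if rr < rows and cc < cols:
                    adj[label[r][c]].add(label[rr][cc])
                    adj[label[rr][cc]].add(label[r][c])

    def dist(a, b):
        return abs(total[a] / count[a] - total[b] / count[b])

    while True:
        pairs = [(dist(a, b), a, b) for a in adj for b in adj[a] if a < b]
        if not pairs:
            break
        d, a, b = min(pairs)
        if d > threshold:          # termination condition
            break
        total[a] += total.pop(b)   # merge region b into region a
        count[a] += count.pop(b)
        for n in adj.pop(b):
            adj[n].discard(b)
            if n != a:
                adj[n].add(a)
                adj[a].add(n)
        for r in range(rows):
            for c in range(cols):
                if label[r][c] == b:
                    label[r][c] = a
    return label
```

Recomputing every pairwise distance on each pass keeps the sketch short; a practical version would keep the candidate pairs in a priority queue.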

Merging Hierarchy

Algorithm generates a binary tree

Merging can be terminated when the minimum distance exceeds a threshold:

$d_{i^*,j^*} > T$: stop merging

Different thresholds produce different segmentations

Fisher's Criterion

Mean and variance are good features to use for merging, especially if the data is distributed normally, e.g. is modeled by the Gaussian distribution $p(x) = a\, e^{-(x - \mu)^2 / (2\sigma^2)}$.
- The peak occurs at $\mu$ (the mean) and has a value $a$
- $a$ is $1 / (2\pi\sigma^2)^{1/2}$ and ensures that the area under the curve = 1

For modeling histograms:
- Compute the area under the histogram and divide by $(2\pi\sigma^2)^{1/2}$
- The parameter $\sigma^2$ is called the variance
- $\sigma = \sqrt{\sigma^2}$ is the standard deviation; it measures the 'flatness' of the distribution

Histogram Model

Fisher's Criterion

Discrimination between regions of different means and standard deviations can be done using Fisher's criterion:

$\dfrac{|\mu_1 - \mu_2|}{\sqrt{\sigma_1^2 + \sigma_2^2}} > \lambda$

where $\lambda$ is a threshold.

If two regions have good separation in the means and low variance, then we can separate them.

Uniformity

Thus, the merging threshold for the mean intensity of two adjacent regions should vary depending on the expected uniformity of the merged region.

- Less uniform regions will require a lower threshold to prevent under merging
- Uniformity is a function of both the intensity mean and the variance of the region

Combine them (heuristic) as

Uniformity $= 1 - \sigma^2 / \mu^2$

which is in the range 0-1 for the case where the samples are all positive.

The threshold value decreases with the decrease in uniformity as

$\lambda = (1 - \sigma^2 / \mu^2)\, \lambda_0$

The user need supply only one threshold, $\lambda_0$.
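The sketch below combines the last two slides into one merge test. It assumes Fisher's criterion has the form |mean1 - mean2| / sqrt(var1 + var2) compared against a threshold, that the uniformity is 1 - variance / mean^2 of the would-be merged region, and that two regions are merged when the ratio does not exceed the scaled threshold. These readings, and all function names, are assumptions made for illustration.

```python
import math

def mean_var(values):
    """Population mean and variance of a list of pixel values."""
    mean = sum(values) / len(values)
    return mean, sum((v - mean) ** 2 for v in values) / len(values)

def fisher_ratio(mean1, var1, mean2, var2):
    """Separation of the means relative to the pooled spread."""
    spread = math.sqrt(var1 + var2)
    if spread == 0:
        return 0.0 if mean1 == mean2 else math.inf
    return abs(mean1 - mean2) / spread

def adaptive_threshold(values, lam0):
    """lam = (1 - var / mean^2) * lam0; assumes positive pixel values."""
    mean, var = mean_var(values)
    return (1 - var / mean ** 2) * lam0

def should_merge(region1, region2, lam0):
    """Merge when the regions are NOT separable under the scaled threshold."""
    m1, v1 = mean_var(region1)
    m2, v2 = mean_var(region2)
    lam = adaptive_threshold(region1 + region2, lam0)
    return fisher_ratio(m1, v1, m2, v2) <= lam
```

Only `lam0` has to be supplied by the user; the effective threshold shrinks automatically when the merged region would be less uniform.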

Region Splitting

Region splitting:

(1) Start with a single large region (initially, the entire image).

(2) Recursively split it into smaller regions.

(3) Continue splitting until each region is uniform (no further splits are possible).

A simple approach: global thresholding
- Define a global threshold T
- Apply to every pixel in the image I:

RLP(i,j) = 0 if I(i,j) < T

RLP(i,j) = 1 if I(i,j) ≥ T

Region Splitting: Multiple Features

Old algorithm: Ohlander-Price
- Basically a sequential histogram-based multiple threshold algorithm.
- Features used: RGB, HSI, and YIQ (9 images)

General idea:

0. Start with the entire image as the initial region.
1. Get the next region to be segmented; if none, the segmentation is complete.
2. Compute the set of one-dimensional histograms.
3. Select the "best" peak and find valleys on either side; if none, the region is "done": put it on the finished list.
4. Apply the threshold to the region and determine connected components.
5. Add these regions to the list of regions to be further segmented and go to step 1.

Ohlander-Price Data

Ohlander-Price Result

Region Splitting

Peak Selection Criteria

0  Intensity peak in 0-60 or 200-255 ranges; best is closest to end
1  Both minima < 10% of highest value; max/min ratio > 4; another peak exists with max/min ratio > 2
2  Both minima < 25% of the peak value; max/min ratio > 2; another peak with max/min ratio > 2
3  Max/min ratio > 2; another peak with max/min ratio > 2; if maxima are within 10%, then both are acceptable (a bimodal distribution)
4  (Saturation only) Minima in 0-200 (lowest 20%) < 25% of the peak value; max/min ratio > 2; specified minima must separate peak with max/min ratio > 1.2
5  Minima < 10% of highest value; 10% of all points must be outside the peak
6  Minima < 70% of highest value; max/min ratio > 1.7

Hybrid Technique

Split and merge combination: splits followed by merges (or vice versa)

Split and merge decisions can be either
- local: a pixel and its immediate neighbors, or a region and its immediate neighbors
- global: on the basis of a large number of pixels scattered throughout the image

Split and Merge

General idea:
- Begin with an arbitrary region decomposition in a quadtree plane (initial decomposition = entire image?)
- Split each region which violates a uniformity predicate into its 4 quadtree children
- Merge (recursively) all regions which jointly satisfy a uniformity criterion

Supporting data structure: region adjacency graph

[Figure: quadtree in which region R is split into R1, R2, R3, R4 and R4 is split further into R41, R42, R43, R44, shown next to the corresponding image decomposition.]
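The split phase can be sketched as a recursive quadtree decomposition. This covers only splitting, not the merge phase, and it rests on assumptions: the image is square with a power-of-two side, and the uniformity predicate is "gray-level range within the block is at most `max_range`". The function name is illustrative.

```python
def split(image, r0, c0, size, max_range=0):
    """Return uniform quadtree blocks as (row, col, size) tuples."""
    block = [image[r][c]
             for r in range(r0, r0 + size) for c in range(c0, c0 + size)]
    # A single pixel, or a block satisfying the predicate, is a leaf.
    if size == 1 or max(block) - min(block) <= max_range:
        return [(r0, c0, size)]
    half = size // 2
    leaves = []
    for dr in (0, half):
        for dc in (0, half):
            leaves += split(image, r0 + dr, c0 + dc, half, max_range)
    return leaves
```

After splitting, a merge pass over the region adjacency graph would recombine neighboring leaves that jointly satisfy the predicate, for example equal-valued blocks that fall in different quadrants.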

Split and Merge Example

Segmentation by k-means clustering

Assume we know the data has k clusters (k known).
Each cluster is assumed to have a center $c_i$.

The jth element to be clustered is described by a feature vector $x_j$.
- For scattered points, x would be coordinate(s)
- For an intensity image, x might be the intensity at a pixel

Define an objective function that measures how good the clustering result is.

Develop an algorithm to optimize the objective function.

Objective Function

Assume that elements are close to the center of their cluster.
One possible objective function is the sum of squared distances from each element to its cluster center:

$\Phi(\text{clusters}, \text{data}) = \sum_{i \in \text{clusters}} \; \sum_{j \in \text{cluster } i} (x_j - c_i)^T (x_j - c_i)$

- Note that if the allocation of points to clusters is known, we can compute the best center easily.
- There are far too many associations of points to clusters to search this space for a minimum.

Instead, define an algorithm that alternates:
- Assume centers are known, allocate points
- Assume allocation is known, compute centers

k-means algorithm

Choose k data points to act as cluster centers
- Random selection
- First k data points

Until the cluster centers are unchanged
  Allocate each data point to the cluster whose center is nearest
  Now ensure that every cluster has at least one data point; possible techniques for doing this include supplying empty clusters with a point chosen at random from points far from their cluster center.
  Replace the cluster centers with the mean of the elements in their clusters.
end

Apply connected components algorithm to generate regions
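The loop above can be sketched for one-dimensional features such as pixel intensities. Two simplifications are assumed: the centers are initialized with the first k data points, and an empty cluster is reseeded deterministically with the point farthest from its current center, instead of a random far point. The name `kmeans_1d` and the `max_iter` safeguard are illustrative.

```python
def kmeans_1d(values, k, max_iter=100):
    """Return (centers, labels) for scalar features such as intensities."""
    centers = [float(v) for v in values[:k]]  # "first k data points"
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Allocate each point to the cluster whose center is nearest.
        labels = [min(range(k), key=lambda i: abs(v - centers[i]))
                  for v in values]
        new_centers = []
        for i in range(k):
            members = [v for v, lab in zip(values, labels) if lab == i]
            if members:
                new_centers.append(sum(members) / len(members))
            else:
                # Empty cluster: reseed with the point farthest from its center.
                far = max(range(len(values)),
                          key=lambda j: abs(values[j] - centers[labels[j]]))
                new_centers.append(float(values[far]))
        if new_centers == centers:  # centers unchanged: converged
            break
        centers = new_centers
    return centers, labels
```

For segmentation, the labels would be written back into the image grid and passed through a connected components step, since one cluster may cover several disjoint regions.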

k-means Clustering

Example

Assume k=5.

- Each pixel is represented by the mean value of the cluster to which the pixel belongs
- A connected components algorithm must be applied to make these true region segmentations

[Figure: original image, k-means on intensity, and k-means on color, with k=5.]

Results, k=11

Sample clusters with k-means clustering based on color

Other Distance Measures

Suppose we want to have compact regions.
New feature space: 5D (2 spatial coordinates, 3 color components)

Points close in this space are close both in color and in actual proximity.

Problem with simple Euclidean distance:
- What if coordinates range from 0-1000 but colors only range from 0-255?
- Depending on how things are scaled, gives different weight to different kinds of data

Weighted Euclidean distance: adjust weights to emphasize different dimensions

$\|x - y\|^2 = \sum_i c_i (x_i - y_i)^2$
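As a small illustration of the weighted distance, the sketch below takes 5D feature vectors of the form (row, column, R, G, B). The function name and the example weights are assumptions; in practice the weights would be tuned to balance spatial compactness against color similarity.

```python
import math

def weighted_distance(x, y, weights):
    """Square root of sum_i c_i * (x_i - y_i)^2."""
    return math.sqrt(sum(c * (a - b) ** 2 for c, a, b in zip(weights, x, y)))
```

Setting the two spatial weights to 0 recovers a pure color distance, while large spatial weights favor compact regions.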

Hybrid Edge-Region Approaches

- Same idea as region merging: start with an oversegmented image.
- Use edge detection information as well.
- Merge not based on region statistics but on weak boundaries.
- Usually use the fraction of edge pixels along the shared boundary that are below some threshold.
- Often used as post-processing for edge-based segmentation.

Watershed Algorithms

A gray scale image can be viewed as a topographic relief map where the intensity function of the image represents the altitude.

- A watershed region or catchment basin is defined as the region over which all points flow “downhill” to a common point.
- Points at which water would be equally likely to fall to more than one such minimum are called watersheds or watershed lines.
- Watersheds of gradient magnitude make useful region-based segmentation primitives.
- Boundaries of watersheds are one way to define ridges: this is basically the same idea as finding a ridge of gradient magnitude.

Watersheds

Example on Gray-Scale Image

Immersion Algorithm

Start with all pixels with the lowest possible value (grad. mag.); these form the basis for the initial watersheds.

For each intensity level k:
  For each group of pixels of intensity k:
    If adjacent to exactly one existing region, add these pixels to that region
    Else if adjacent to more than one existing region, mark as boundary
    Else start a new region

Can use a histogram-like structure to keep lists of all pixels with each intensity level k.
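The outline above can be turned into a small sketch. It follows the slide's three-way rule literally and is not a full watershed implementation (it does not treat plateaus the way published immersion algorithms do). Assumptions: 4-connectivity, integer levels, region labels starting at 1, and -1 as the boundary marker; the function name is illustrative.

```python
from collections import defaultdict, deque

BOUNDARY = -1

def immersion(levels):
    """Flood a 2-D list of integer levels; returns a label plane."""
    rows, cols = len(levels), len(levels[0])
    label = [[0] * cols for _ in range(rows)]  # 0 = not flooded yet
    by_level = defaultdict(list)               # histogram-like pixel lists
    for r in range(rows):
        for c in range(cols):
            by_level[levels[r][c]].append((r, c))

    def neighbors(r, c):
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            if 0 <= r + dr < rows and 0 <= c + dc < cols:
                yield r + dr, c + dc

    next_label = 1
    for k in sorted(by_level):
        pending = set(by_level[k])
        while pending:
            # Collect one connected group of level-k pixels.
            start = pending.pop()
            group, queue = [start], deque([start])
            while queue:
                for q in neighbors(*queue.popleft()):
                    if q in pending:
                        pending.remove(q)
                        group.append(q)
                        queue.append(q)
            touching = {label[a][b] for p in group
                        for a, b in neighbors(*p) if label[a][b] > 0}
            if len(touching) == 1:
                new = touching.pop()       # extend the single adjacent region
            elif touching:
                new = BOUNDARY             # meets several regions
            else:
                new = next_label           # a new catchment basin
                next_label += 1
            for r, c in group:
                label[r][c] = new
    return label
```

On a one-row profile such as 0, 1, 2, 1, 0, the two minima start separate basins, their slopes join them, and the crest in the middle is marked as boundary.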

Watershed Problems

- Oversegmentation: watershed from markers
- Computationally intensive: graph algorithm and appropriate data structures
- Gray level might not be the optimal choice as the local similarity measure: other local features (statistical, edge-enhanced image, distance-transformed image, ...)

Over-segmentation Problem

Oversegmentation due to noise and other local irregularities of the gradient

Solution: markers (of ‘object’ locations)

Tobogganing

- Again work on the gradient magnitude image.
- Link each pixel to the smallest of its neighbors.
- If a pixel has no smaller neighbors, it becomes a region “seed”.
- All pixels that “flow” downhill (smallest neighbor) to the same point form a single region.
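A minimal sketch of tobogganing is shown below. It assumes 8-connectivity and breaks ties between equally small neighbors by scan order; on flat minima every pixel becomes its own seed, which a practical implementation would handle separately. The function name is illustrative.

```python
def toboggan(grad):
    """Label each pixel with the region of the seed it slides down to."""
    rows, cols = len(grad), len(grad[0])

    def lowest_neighbor(r, c):
        best_value, best = grad[r][c], (r, c)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (0 <= rr < rows and 0 <= cc < cols
                        and grad[rr][cc] < best_value):
                    best_value, best = grad[rr][cc], (rr, cc)
        return best  # the pixel itself if no neighbor is smaller

    label = [[None] * cols for _ in range(rows)]
    n_regions = 0
    for r in range(rows):
        for c in range(cols):
            path, p = [], (r, c)
            # Slide downhill until a labeled pixel or a seed is reached.
            while label[p[0]][p[1]] is None:
                path.append(p)
                q = lowest_neighbor(*p)
                if q == p:  # no smaller neighbor: p is a region seed
                    label[p[0]][p[1]] = n_regions
                    n_regions += 1
                    break
                p = q
            region = label[p[0]][p[1]]
            for pr, pc in path:
                label[pr][pc] = region
    return label
```

Each link goes to a strictly smaller value, so every path ends at a seed and the labeling always terminates.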

Multispectral Segmentation

Given N multi-spectral images:

1. Independently segment each multi-spectral feature image using the localized histogram algorithm.

2. Intersect the N segmentations to create a new, combined segmentation.

3. Merge using a region merging algorithm.

See J. Beveridge, J. Griffith, R. Kohler, A. Hanson, and E. Riseman, “Segmenting Images Using Localized Histograms and Region Merging”, IJCV 2(3), January 1989, pp. 311-347.

Color Image

[Figure: intensity, red, green, and blue images.]

Segmentations

[Figure: segmentations of the intensity, red, green, and blue images.]

Three Color Union

Note: This result was obtained by taking the union of three color segmentations obtained with a more sensitive set of parameters than those shown on the previous page and consequently has more boundaries.

Final Segmentation: After Merging

Segmentation obtained from the over-segmented image on the previous slide after applying a rule-based merging strategy using region similarity, size, and connectivity.

House Results

[Figure: house image segmentations: three color union, and intensity only.]

Region Features

Many features can be used to characterize a region and its properties.
Supports many tasks, including object recognition.

Example features:
- area, height, and width
- perimeter, bounding box, area of bounding box
- centroid
- orientation
- compactness
- moments
- ... etc.

Region Features

Basics:
- Perimeter
- Area
- Orientation

Rotation/translation/scale invariant:
- Compactness = perimeter² / area
- Rectangularity = Area_Rect / Area_Object (next slide)
- Euler number = #regions - #holes
- Convexity = Area_ConvexHull / Area_Object

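Rectangularity

- W and L are a function of $\theta$, the orientation of the bounding rectangle
- Rectangularity = W×L / A
- Choose $\theta$ to get W×L minimum: this is called the 'minimum bounding rectangle'
- Minimized for rectangular objects

[Figure: object of area A enclosed in a bounding rectangle of width W and length L.]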

Perimeter

[Figure: region plane and its boundary.]

P = number of pixels on boundary

or

P = number of horizontal steps + number of vertical steps + √2 × number of diagonal steps
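The step-counting formula maps directly onto a chain code. The sketch assumes the 8-direction code used earlier in the lecture, where even codes are horizontal or vertical steps and odd codes are diagonal steps, and it weights diagonal steps by the square root of 2 (the length of a unit diagonal); the function name is illustrative.

```python
import math

def chain_code_perimeter(codes):
    """Perimeter length from an 8-direction chain code."""
    diagonal = sum(1 for c in codes if c % 2 == 1)  # codes 1, 3, 5, 7
    straight = len(codes) - diagonal                # codes 0, 2, 4, 6
    return straight + math.sqrt(2) * diagonal
```

Together with the region area, this value gives the compactness feature, perimeter squared divided by area.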

Moments

Centroid: center of mass

$\bar{r} = \dfrac{1}{A} \sum_r \sum_c r\, I(r,c)$

$\bar{c} = \dfrac{1}{A} \sum_r \sum_c c\, I(r,c)$

Higher order moments

$m_{pq} = \sum_r \sum_c r^p c^q\, I(r,c)$

Note: $A = m_{00}$; $\bar{r} = m_{10} / m_{00}$; $\bar{c} = m_{01} / m_{00}$
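These definitions translate almost line for line into code. The sketch assumes I(r,c) is a binary region mask (1 inside the region, 0 outside) stored as a list of rows, with r and c starting at 0; the function names are illustrative.

```python
def moment(image, p, q):
    """m_pq = sum over r, c of r^p * c^q * I(r, c)."""
    return sum((r ** p) * (c ** q) * v
               for r, row in enumerate(image) for c, v in enumerate(row))

def centroid(image):
    """(r_bar, c_bar) = (m10 / m00, m01 / m00)."""
    area = moment(image, 0, 0)
    return moment(image, 1, 0) / area, moment(image, 0, 1) / area
```

For a binary mask, `moment(image, 0, 0)` is simply the pixel count, that is, the area A.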

Central Moments

Moments around the center of mass

$\mu_{pq} = \sum_r \sum_c (r - \bar{r})^p (c - \bar{c})^q\, I(r,c)$

Note that $A = \mu_{00}$; $\mu_{01} = 0$; $\mu_{10} = 0$

Higher order (2nd order and higher) moments are used heavily in object recognition

Orientation

Let $\theta$ be the region orientation with respect to the r axis.

Can be shown that

$\tan 2\theta = \dfrac{2\mu_{11}}{\mu_{20} - \mu_{02}}$
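The orientation formula can be evaluated from the second-order central moments as sketched below. Using `atan2` rather than a plain arctangent is an implementation choice made here to avoid dividing by zero when the two second-order moments are equal; the function names and the binary-mask input are assumptions.

```python
import math

def central_moment(image, p, q):
    """mu_pq: moment of order (p, q) taken about the centroid."""
    area = sum(map(sum, image))
    rbar = sum(r * v for r, row in enumerate(image) for v in row) / area
    cbar = sum(c * v for row in image for c, v in enumerate(row)) / area
    return sum((r - rbar) ** p * (c - cbar) ** q * v
               for r, row in enumerate(image) for c, v in enumerate(row))

def orientation(image):
    """theta (radians, relative to the r axis) from tan(2*theta)."""
    mu11 = central_moment(image, 1, 1)
    mu20 = central_moment(image, 2, 0)
    mu02 = central_moment(image, 0, 2)
    return 0.5 * math.atan2(2 * mu11, mu20 - mu02)
```

A region lying along the diagonal r = c gives an angle of 45 degrees, and a region lying along a single row gives 90 degrees from the r axis.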

Summary

Covered only basic approaches:
- simple region growing
- split, merge, and split and merge algorithms
- generalized thresholds
- ...

Many approaches to region segmentation:
- statistical techniques: parameter estimation, mode estimation, clustering, decision theoretic methods, ...
- surface fits to the intensity surface: constant, plane, or bivariate polynomial, ...
- relaxation: traditional, stochastic (simulated annealing)
- Markov random field methods
- context sensitive and knowledge-driven methods
- combinations with edge detection techniques
- optimization and learning methods
- multi-resolution

Literature is enormous

Next Topic

Next: Camera Model
