Low-level vision – a tutorial

Professor Roy Davies

Overview and further reading

© E.R. Davies, 2014

Abstract

This tutorial aims to help those with some experience of vision to obtain a more in-depth understanding of the problems of low-level vision. As it is not possible to cover everything in the space of 90 minutes, a carefully chosen series of topics is presented. This section of the notes is a commentary to accompany the slides in the presentation.

1. The nature of vision

Vision is a complex process and has long been modelled, rather loosely, as a cascade of low, intermediate and high-level stages following image acquisition. However, this model is somewhat impoverished in that it ignores the possibility of downward as well as upward flow of information, which can help with interpretation. In addition, there is need for considerable complexity and sophistication – so much so that the problems of low-level vision are often felt to be unimportant and are forgotten. Yet it remains the case that information that is lost at low level is never regained, while distortions that are introduced at low level can cause undue trouble at higher levels [1]. Furthermore, image acquisition is equally important. Thus, simple measures to arrange suitable lighting can help to make the input images easier and less ambiguous to interpret, and can result in much greater reliability and accuracy in applications such as automated inspection [1]. Nevertheless, in applications such as surveillance, algorithms should be made as robust as possible so that the vagaries of ambient illumination are rendered relatively unimportant.

2. Low-level vision

This tutorial is concerned with low-level vision, and aims to indicate how some of its problems and limitations can be solved. This is a huge subject, covering image acquisition, noise suppression, segmentation, colour and texture analysis, shape analysis, object recognition, and many other topics. Hence it is not possible to do justice to it in the space of one and a half hours. It is therefore assumed that participants have a reasonable concept of the subject area, and the tutorial focusses on a number of topics that have been chosen because they involve important issues. The topics fall into five broad categories:

• Feature detection and sensitivity
• Image filters and morphology
• Robustness of object location
• Validity and accuracy in shape analysis
• Scale and affine invariance.

In what follows, references are given for the topics falling under each of these categories, and for other topics that could not be covered in depth in the tutorial.

3. Feature detection and sensitivity

General [1]. Edge detection [2, 3, 4]. Line segment detection [5, 6, 7]. Corner and interest point detection [1, 8, 9]. General feature mask design [10, 11]. The value of thresholding [1, 12, 13].

4. Image filters and morphology

General [1]. Effect of applying rank-order filters [1, 14]. Mathematical morphology [1, 15].

5. Robustness of object location

General [1, 16]. Centroidal profiles v. Hough transforms [1, 17–19]. Fast object location by selective scanning [1, 20]. Robust statistics [1, 21, 22]. The RANSAC approach [23].


6. Validity and accuracy in shape analysis

Shape analysis [1, 24]. Object labelling [1]. Distance transforms and skeletons [1, 25, 26]. Boundary distance measures [27, 28]. Symmetry detection [29, 30].

7. Scale and affine invariant features

At this point it will be useful to consider a topic that has been developing for just over a decade, starting with papers [31, 32]. It arose largely because of difficulties in wide baseline stereo work, and with tracking object features over many video frames – because features change their character over time and correspondences are easily lost. To overcome this problem it was necessary first to eliminate the relatively simple problem of features changing in size, thereby necessitating ‘scale invariance’ (it being implicit that translation and rotation invariance have already been dealt with). Later, improvements became necessary to cope with ‘affine invariance’ (which also covers shear and skew invariance). Lindeberg’s pioneering theory [31] was soon followed by Lowe’s work [32, 33] on the ‘Scale Invariant Feature Transform’ (SIFT). This was followed by affine invariant methods [34–37]. In parallel with these developments, work was proceeding on maximally stable extremal regions [38] and other extremal methods [39, 40] (the latter concept has also been followed in a different context [13]).

Much of this work employs the interest point methods of Harris and Stephens [9], and is underpinned by careful in-depth experimental investigations and comparisons [37, 41, 42]. Finally, there are signs that the tide may be turning in other directions, specifically the design of feature detectors that concentrate on especially robust forms of scale invariance – as in the case of the ‘Speeded Up Robust Features’ (SURF) approach [43, 44]. See also the fuller review article [45]. Note that reference [1] contains a full and up-to-date appraisal of this important area.

8. Concluding remarks

The topics presented in this tutorial necessarily give an incomplete picture because of the short span of time available. Nevertheless, they provide interesting lessons, and demonstrate important factors – specifically, the need for sensitivity, accuracy, robustness, reliability and validity. Further factors enter into the equation when practical systems are being designed – speed and cost of real-time hardware being amongst the most relevant. Overall, it is clear that low-level vision is an essential ingredient of the vision hierarchy and that its problems are fundamental to the whole of vision. Although the focus of attention in the subject may have moved on to more modern topics, it is definitely not the case that this side of the subject is played out and that everything is now known about it. While data sets change and specifications for low-level algorithms remain incomplete, workers will have to go on developing algorithms to cope with the specific idiosyncrasies of their data to make the most of it in all the ways outlined above.

9. Further reading

Participants may wish to explore further the material covered in this tutorial. The author’s book [1] covers many of the topics in fair depth; particular attention is drawn to the following chapters:

• Chapter 3, for median and mode filters, including edge shifts;
• Chapter 5, for edge detection;
• Chapters 9 and 10, for shape analysis – especially thinning and centroidal profiles;
• Chapter 12, for Hough-based methods of circular object location;
• Appendix on robust statistics.

The book also contains a considerable amount of detailed information on 3D interpretation, invariants, texture analysis, automated inspection and practical issues.

Acknowledgements

The author would like to credit the following sources for permission to reproduce material from his earlier publications:


• Elsevier for permission to reprint text and figures from [1, 2, 20, 26];

• IEE/IET for permission to reprint text and figures from [5, 7, 8, 11, 13, 14, 16, 28];

• IFS Publications Ltd. for permission to reprint figures from [18];

• Woodhead Publishing Ltd. for permission to reprint figures from [46].

References

1. Davies, E.R. (2012) Computer and Machine Vision: Theory, Algorithms, Practicalities (4th edition), Academic Press, Oxford, UK

2. Davies, E.R. (1986) Constraints on the design of template masks for edge detection. Pattern Recogn. Lett. 4, no. 2, 111–120

3. Canny, J. (1986) A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. 8, 679–698

4. Petrou, M. and Kittler, J. (1988) On the optimal edge detector. Proc. 4th Alvey Vision Conf., Manchester (31 Aug.–2 Sept.), pp. 191–196

5. Davies, E.R. (1997) Designing efficient line segment detectors with high orientation accuracy. Proc. 6th IEE Int. Conf. on Image Processing and its Applications, Dublin (14–17 July), IEE Conf. Publication no. 443, pp. 636–640

6. Davies, E.R. (1997) Vectorial strategy for designing line segment detectors with high orientation accuracy. Electronics Lett. 33, no. 21, 1775–1777

7. Davies, E.R., Bateman, M., Mason, D.R., Chambers, J. and Ridgway, C. (1999) Detecting insects and other dark line features using isotropic masks. Proc. 7th IEE Int. Conf. on Image Processing and its Applications, Manchester (13–15 July), IEE Conf. Publication no. 465, pp. 225–229

8. Davies, E.R. (2005) Using an edge-based model of the Plessey operator to determine localisation properties, Proc. IEE Int. Conference on Visual Information Engineering (VIE 2005), University of Strathclyde, Glasgow (4–6 April), pp. 385–391

9. Harris, C. and Stephens, M. (1988) A combined corner and edge detector. Proc. Alvey Vision Conf., Manchester University, UK, pp. 147–151

10. Davies, E.R. (1992) Optimal template masks for detecting signals with varying background level. Signal Process. 29, no. 2, 183–189

11. Davies, E.R. (1999) Designing optimal image feature detection masks: equal area rule. Electronics Lett. 35, no. 6, 463–465

12. Hannah, I., Patel, D. and Davies, E.R. (1995) The use of variance and entropic thresholding methods for image segmentation. Pattern Recogn. 28, no. 8, 1135–1143

13. Davies, E.R. (2008) Stable bi-level and multi-level thresholding of images using a new global transformation. IET Computer Vision 2, no. 2, Special Issue on Visual Information Engineering, Ed. Velastin, S., pp. 60–74

14. Davies, E.R. (1999) Image distortions produced by mean, median and mode filters. IEE Proc. – Vision, Image and Signal Processing 146, no. 5, 279–285

15. Bangham, J.A. and Marshall, S. (1998) Image and signal processing with mathematical morphology. IEE Electronics and Commun. Journal 10, no. 3, 117–128

16. Davies, E.R. (2000) Low-level vision requirements. Electron. Commun. Eng. J. 12, no. 5, 197–210

17. Hough, P.V.C. (1962) Method and means for recognising complex patterns. US Patent 3069654

18. Davies, E.R. (1984) Design of cost-effective systems for the inspection of certain food products during manufacture. In Pugh, A. (ed.), Proc. 4th Conf. on Robot Vision and Sensory Controls, London (9–11 Oct.), pp. 437–446

19. Ballard, D.H. (1981) Generalizing the Hough transform to detect arbitrary shapes. Pattern Recogn. 13, 111–122

20. Davies, E.R. (1987) A high speed algorithm for circular object location. Pattern Recogn. Lett. 6, no. 5, 323–333

21. Meer, P., Mintz, D., Rosenfeld, A. and Kim, D.Y. (1991) Robust regression methods for computer vision: a review. Int. J. Comput. Vision 6, 59–70

22. Kim, D.Y., Kim, J.J., Meer, P., Mintz, D. and Rosenfeld, A. (1989) Robust computer vision: a least median of squares based approach. Proc. DARPA Image Understanding Workshop, Palo Alto, CA (23–26 May), pp. 1117–1134

23. Fischler, M.A. and Bolles, R.C. (1981) Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Comm. ACM 24, no. 6, 381–395

24. Hu, M.K. (1962) Visual pattern recognition by moment invariants. IRE Trans. Inf. Theory 8, 179–187

25. Rosenfeld, A. and Pfaltz, J.L. (1968) Distance functions on digital pictures. Pattern Recogn. 1, 33–61

26. Davies, E.R. and Plummer, A.P.N. (1981) Thinning algorithms: a critique and a new methodology. Pattern Recogn. 14, 53–63

27. Koplowitz, J. and Bruckstein, A.M. (1989) Design of perimeter estimators for digitized planar shapes. IEEE Trans. Pattern Anal. Mach. Intell. 11, 611–622

28. Davies, E.R. (1991) Insight into operation of Kulpa boundary distance measure. Electronics Lett. 27, no. 13, 1178–1180

29. Kuehnle, A. (1991) Symmetry-based recognition of vehicle rears. Pattern Recogn. Lett. 12, 249–258

30. Cho, M. and Lee, K.M. (2009) Bilateral symmetry detection via symmetry-growing. Proc. British Machine Vision Conf. (BMVC), paper 286, pp. 1–11

31. Lindeberg, T. (1998) Feature detection with automatic scale selection. Int. J. Computer Vision 30, no. 2, 79–116

32. Lowe, D.G. (1999) Object recognition from local scale-invariant features. Proc. 7th Int. Conf. on Computer Vision (ICCV), Corfu, Greece, pp. 1150–1157

33. Lowe, D. (2004) Distinctive image features from scale-invariant keypoints. Int. J. Computer Vision 60, 91–110

34. Tuytelaars, T. and Van Gool, L. (2000) Wide baseline stereo matching based on local, affinely invariant regions. Proc. British Machine Vision Conf. (BMVC), Bristol University, UK, pp. 412–422

35. Mikolajczyk, K. and Schmid, C. (2002) An affine invariant interest point detector. Proc. European Conf. on Computer Vision (ECCV), Copenhagen, Denmark, pp. 128–142

36. Mikolajczyk, K. and Schmid, C. (2004) Scale and affine invariant interest point detectors. Int. J. Computer Vision 60, no. 1, 63–86

37. Mikolajczyk, K., Tuytelaars, T., Schmid, C., Zisserman, A., Matas, J., Schaffalitzky, F., Kadir, T. and Van Gool, L. (2005) A comparison of affine region detectors. Int. J. Computer Vision 65, 43–72

38. Matas, J., Chum, O., Urban, M. and Pajdla, T. (2002) Robust wide baseline stereo from maximally stable extremal regions. Proc. British Machine Vision Conf. (BMVC), Cardiff University, UK, pp. 384–393

39. Kadir, T. and Brady, M. (2001) Scale, saliency and image description. Int. J. Computer Vision 45, no. 2, 83–105

40. Kadir, T., Brady, M. and Zisserman, A. (2004) An affine invariant method for selecting salient regions in images. Proc. 8th European Conf. on Computer Vision (ECCV), pp. 345–457

41. Schmid, C., Mohr, R. and Bauckhage, C. (2000) Evaluation of interest point detectors. Int. J. Computer Vision 37, no. 2, 151–172

42. Mikolajczyk, K. and Schmid, C. (2005) A performance evaluation of local descriptors. IEEE Trans. Pattern Anal. Mach. Intell. 27, no. 10, 1615–1630

43. Bay, H., Tuytelaars, T. and Van Gool, L. (2006) SURF: speeded up robust features. Proc. 9th European Conf. on Computer Vision (ECCV), Springer LNCS volume 3951, part 1, pp. 404–417

44. Bay, H., Ess, A., Tuytelaars, T. and Van Gool, L. (2008) Speeded-up robust features (SURF). Computer Vision Image Understanding 110, no. 3, 346–359

45. Tuytelaars, T. and Mikolajczyk, K. (2008) Local invariant feature detectors: a survey. Foundations and Trends in Computer Graphics and Vision 3, no. 3, 177–280

46. Davies, E.R. (2012) Computer vision for automatic sorting in the food industry. Chapter 6 In: Sun, D.-W. (ed.), Computer Vision Technology in the Food and Beverage Industries, Woodhead Publishing Ltd, Cambridge, UK, ISBN 978-0-85709-036-2, pp. 150–180.

http://www.woodheadpublishing.com/en/book.aspx?bookID=2323

Slide 1: Low-level vision – a tutorial, by Professor Roy Davies, Royal Holloway, University of London.

Slide 2: The role of low-level vision*

[Diagram: image acquisition → low-level vision (image preprocessing and feature detection) → intermediate-level vision → high-level vision.]

*A somewhat simplified and naïve schema.

Slide 3: An original image.

Slide 4: Result of applying a median filter.

Slide 5: Result of applying a Harris corner detector.

Slide 6: Result of applying a Sobel edge detector.

Slide 7: Edge detection

Objectives: to demonstrate the design of efficient edge detectors.

This section contains copyright material reproduced from [1, 2] with permission of Elsevier.

Slide 8: Theory of 3 × 3 template operators

First we assume that eight masks are to be used, for angles differing by 45°. Four of the masks differ from the others only in sign, and we are left with two paradigm masks [not reproduced here]. The all-too-ready assumption that C = A and D = B is by no means confirmed by theory, as we shall see.

Slide 9: Let us apply these masks to the pixel window [not reproduced]: estimation of the 0°, 90° and 45° components of gradient by the earlier general masks then gives the expressions shown on the slide.

Slide 10: If vector addition is to be valid [equation not reproduced], equating coefficients of a, b, c, d, e, f, g, h, i leads to a self-consistent pair of conditions. Insisting that the masks give equal responses at 22.5° leads to the final formula [not reproduced].

Slide 11: We have now obtained the template masks for edge detection [not reproduced]. It is also possible to use the first two of these masks for detecting and orientating edge segments, taking them to provide vector components of edge intensity: gx, gy. Thus we have derived the Sobel operator masks in a principled, non-ad hoc way.

Slide 12: Basic theory of edge detection and orientation

edge intensity: g = (gx^2 + gy^2)^1/2
edge orientation: θ = arctan(gy/gx)
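To make this concrete, here is a minimal Python/NumPy sketch (not part of the original notes) that applies the two Sobel masks and combines their responses into edge intensity and orientation as above; the grey-scale input array img and the use of scipy.ndimage.convolve are illustrative assumptions.

import numpy as np
from scipy.ndimage import convolve

def sobel_edges(img):
    """Return edge intensity g and orientation theta (radians) for a
    grey-scale image, using the two Sobel masks derived above."""
    img = img.astype(float)
    sx = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]])          # responds to horizontal gradient gx
    sy = sx.T                            # responds to vertical gradient gy
    gx = convolve(img, sx)
    gy = convolve(img, sy)
    g = np.sqrt(gx**2 + gy**2)           # edge intensity
    theta = np.arctan2(gy, gx)           # edge orientation
    return g, theta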

Slide 13: The Canny operator

The aim of the Canny edge detector is to be far more accurate than basic edge detectors such as the Sobel. To achieve this, a number of processes are applied in turn:

• Smooth the image using a 2D Gaussian.
• Differentiate using 1D derivative functions, e.g. Sobel.
• Perform non-maximum suppression to thin the edges (resample to achieve sub-pixel accuracy).
• Perform hysteresis thresholding.
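As a hedged illustration of this pipeline (not from the notes), the following sketch uses OpenCV; the file name input.png, the kernel size and the threshold values are illustrative assumptions, and cv2.Canny internally performs the differentiation, non-maximum suppression and hysteresis steps.

import cv2

# Hypothetical file name; any grey-scale image will do.
img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)

# Step 1: Gaussian smoothing (kernel size and sigma are illustrative).
smoothed = cv2.GaussianBlur(img, (5, 5), 1.4)

# Steps 2-4: cv2.Canny differentiates, applies non-maximum suppression
# and hysteresis thresholding with the two thresholds given.
edges = cv2.Canny(smoothed, 50, 150)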

Slide 14: Canny in action. © Elsevier 2012

Slide 15: Line segment detection

Objectives: to demonstrate the design of efficient line segment detectors; to show how high orientation accuracy can be achieved.

This section contains copyright material reproduced from [5] and [7] with permission of the IET.

Slide 16: Insects can be modelled as dark rectangular bars. Arbitrary orientation means that detection could involve considerable computation.

Slide 17: Applying the vectorial approach to line segment detection

This is difficult, as line segments have a rotational period of π, apparently ruling out the vectorial approach. The key is to refrain from considering line segments as local lines of the form shown [not reproduced]; instead we consider line segments as intensity maps in the form of alternating quadrants.

Slide 18: The result of this strategy leads to masks of the forms shown [not reproduced]. However, the last two masks are sign-inverted versions of the first two: this gives just two masks – just the number required for a vectorial computation.

Slide 19: This leads to an effective gradient magnitude:

g = [g0^2 + g45^2]^1/2

an apparent orientation:

ψ = arctan(g45/g0)

and finally to an actual line orientation of:

θ = ψ/2 = ½ arctan(g45/g0)

where θ is defined only within the range 0 ≤ θ < π.
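A minimal sketch of the vectorial combination (not from the notes): the mask responses g0 and g45 are assumed to have been computed with the two masks of [5], whose coefficients are not reproduced here.

import numpy as np

def line_strength_and_orientation(g0, g45):
    """Combine the responses of the two line-segment masks vectorially.
    g0, g45 : responses of the 0-degree and 45-degree masks (arrays or
    scalars); the exact mask coefficients are given in [5]."""
    g = np.sqrt(g0**2 + g45**2)            # effective gradient magnitude
    psi = np.arctan2(g45, g0)              # apparent orientation
    theta = np.mod(0.5 * psi, np.pi)       # line orientation, periodic in pi
    return g, theta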

Slide 20: [Graph: estimated v. actual orientation (axes from −45° to 45°) for thin line segments (w = 0) and for wide line segments; curves 1, 2, 3 correspond to widths w = 1.0, 1.4, 2.0.] © IEE 1997

Slide 21: Line segment detector locating insects. © IEE 1999

Slide 22: Mask design geometry.

Slide 23: Corner and interest point detection

Objectives: to demonstrate the principle of the Harris corner detector; to show how the signal depends on corner geometry.

This section contains copyright material reproduced from [1, 8] with permission of Elsevier and the IET.

Slide 24: Mathematical definition of the Harris operator [not reproduced]. Note: 1. the need for averaging; 2. the built-in rotational invariance.
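Since the slide's own equation is not reproduced, here is a sketch of the standard Harris formulation of [9]: the locally averaged structure tensor followed by the response det − k·trace². The Gaussian window width sigma and the constant k = 0.04 are conventional assumptions, not values taken from the notes.

import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def harris_response(img, sigma=2.0, k=0.04):
    """Standard Harris corner response C = det(A) - k * trace(A)^2,
    where A is the locally averaged structure tensor.  The Gaussian
    window supplies the averaging the slide refers to; det and trace
    are both rotationally invariant."""
    img = img.astype(float)
    Ix = sobel(img, axis=1)
    Iy = sobel(img, axis=0)
    Axx = gaussian_filter(Ix * Ix, sigma)   # averaging step
    Axy = gaussian_filter(Ix * Iy, sigma)
    Ayy = gaussian_filter(Iy * Iy, sigma)
    det = Axx * Ayy - Axy**2
    trace = Axx + Ayy
    return det - k * trace**2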

Slide 25: General corner in a circular window; single straight edge in a circular window. © IEE 2005

Slide 26:

Case of a corner: det = l1 l2 g^4 sin^2 θ and trace = (l1 + l2) g^2 (θ being the angle between the two edges bounding the corner).

Case of a straight edge: det = 0, so C = 0.

Slide 27: Possible geometries for sharp corners. (c) is the case of maximum signal (corner shift = a). © IEE 2005

Slides 28–32: Harris interest points + non-maximum suppression (sequence of example images). © Elsevier 2012

Slide 33: General feature mask design

Objectives: to show how sensitivity can be optimised using the spatial matched filter concept; to demonstrate a limitation on the sizes of template masks.

This section contains copyright material reproduced from [11] with permission of the IET.

Slide 34: Optimisation calculation

When performing feature detection in saturated grey-scale images, we shall take the object feature being detected as region 2, and its background as region 1. © IEE 1999

Slide 35: On this model we have to calculate optimal values for the mask weighting factors w1 and w2 and for the region areas A1 and A2. First we write the total signal from a template mask as:

S = w1 A1 S1 + w2 A2 S2

and the total noise power as:

N^2 = w1^2 A1 N1^2 + w2^2 A2 N2^2

Slide 36: Thus we obtain a power SNR, ρ^2 = S^2/N^2. It is easy to see that if both mask regions are increased in size by the same factor η, ρ^2 will also be increased by this factor. This makes it interesting to optimise the mask by adjusting the relative values of A1, A2, leaving the total area A unchanged. Let us first eliminate w2 using the zero-mean condition:

w1 A1 + w2 A2 = 0

The signal and noise equations now become:

S = w1 A1 (S1 – S2)

N^2 = w1^2 A1 N1^2 + w1^2 A2 (A1/A2)^2 N2^2

Slide 37: Optimising the SNR now leads to:

A1/A2 = N1/N2

Taking N1 = N2, we obtain an important result – the equal area rule:

A1 = A2 = A/2

Finally, when the equal area rule applies, the zero-mean rule takes the form:

w1 = –w2

Note that many cases, such as those arising when the foreground and background have different textures, can be modelled by writing N1 ≠ N2; in that case the equal area rule will not apply.
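A tiny numeric sketch (an illustration, not from the notes) of constructing a zero-mean mask under the equal area rule; the 1-D layout and the mask size are arbitrary assumptions.

import numpy as np

# Build a 1-D zero-mean template mask obeying the equal area rule,
# A1 = A2, with w1 = -w2 (region sizes are illustrative).
A = 8                           # total number of mask elements
A1 = A2 = A // 2                # equal area rule
w2 = 1.0                        # weight over the feature region (region 2)
w1 = -w2 * A2 / A1              # zero-mean condition: w1*A1 + w2*A2 = 0
mask = np.concatenate([np.full(A1, w1), np.full(A2, w2)])
assert abs(mask.sum()) < 1e-12  # the mask is zero-mean as required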

Slide 38: Applying the equal area rule. (a) Small feature, such as a hole. (b) Thick line segment. (c) Large blob. (d) Corner of a large object. (e) Corner detected by a specially shaped mask. (f) Sharp corner detected away from its apex. © IEE 1999

Slide 39: The value of thresholding

Here the aims are to note that:

• Where applicable, thresholding can be extremely effective.
• Considerable advances are still being made in this area.
• It is silly to regard thresholding as ‘old hat’: the old adage ‘horses for courses’ should be borne in mind.

This section contains copyright material reproduced from [1, 13] with permission of Elsevier and the IET.

Slide 40: Selecting the most suitable thresholds

When considering whether a minimum is ‘significant’, it seems best to judge it in relation to global rather than local (noisy) maxima.

[Graph: intensity distribution f(I) for dark foreground objects on a light background, with additional peaks from irrelevant ‘clutter’.]
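A generic sketch of threshold selection between histogram modes, judged against the global peaks. This is an illustrative stand-in under stated assumptions (8-bit images, Gaussian histogram smoothing) and does not reproduce the specific global transformation of [13].

import numpy as np
from scipy.ndimage import gaussian_filter1d

def threshold_between_modes(img, sigma=3.0):
    """Pick a threshold at the minimum of the smoothed grey-level
    histogram between its two highest peaks (the global modes)."""
    hist, _ = np.histogram(img.ravel(), bins=256, range=(0, 256))
    h = gaussian_filter1d(hist.astype(float), sigma)
    peaks = [i for i in range(1, 255) if h[i] > h[i - 1] and h[i] > h[i + 1]]
    p1, p2 = sorted(sorted(peaks, key=lambda i: h[i])[-2:])  # two global modes
    return p1 + int(np.argmin(h[p1:p2 + 1]))                 # minimum between them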

Slide 41: Use the arithmetic mean (AM) or the geometric mean (GM) for finding significant minima? [Images: original and intermediate results y1, y2, comparing AM and GM.]

Slide 42: Intensity matching to locate the road surface. [Images: original, GM, smoothed GM.] © IET 2007

Slide 43: Finding cars from their shadows, within the road region. [Images: original, GM, smoothed GM.] © IET 2007

Slide 44: Intensity matching to locate ergot contaminants amongst wheat grains. [Images: original, GM, smoothed GM.] © IET 2007

Slide 45: Image filters and morphology

Objectives: to model the effect of applying rank-order filters; to highlight the shifts produced by median filters; to show how rank-order filters can be used for morphology (see the sketch below).

This section contains copyright material reproduced from [14, 46] with permission of the IET and Woodhead Publishing Ltd.
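A minimal sketch (not from the notes) showing how rank-order filtering subsumes grey-scale erosion, dilation and the median, using scipy.ndimage.rank_filter; the window size is an arbitrary assumption.

import numpy as np
from scipy.ndimage import rank_filter

def rank_order(img, rank_fraction, size=5):
    """Apply a rank-order filter in a size x size window.
    rank_fraction = 0 gives grey-scale erosion (minimum filter),
    rank_fraction = 1 gives grey-scale dilation (maximum filter),
    rank_fraction = 0.5 gives the median filter."""
    n = size * size
    rank = int(round(rank_fraction * (n - 1)))
    return rank_filter(img, rank, size=size)

# Morphology from rank-order filters (illustrative usage):
# opened = rank_order(rank_order(img, 0.0), 1.0)   # erosion then dilation
# closed = rank_order(rank_order(img, 1.0), 0.0)   # dilation then erosion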

Slide 46: Edge shifts D obtained on applying rank-order filters with various rank-order coefficients at locations of various curvatures relative to the window radius a. [Graph: shift D against rank-order coefficient (−1 to 1), for a straight edge and for curved boundaries.]

Slide 47: Geometry for estimation of contour shifts. This figure shows idealised intensity variations within a circular neighbourhood C of radius a. © IEE 1999

Slide 48: Optimal segmentation of rat droppings. [Images: original, thresholded, opened, closed.] © Woodhead 2012

Slide 49: Optimal segmentation of rat droppings (continued). [Images: median; median, eroded; opened and eroded; closed and eroded; one variant marked as the best option.] © Woodhead 2012

Slide 50: Centroidal profiles v. Hough transforms

Objectives: to show that the centroidal profile approach is non-robust; to demonstrate the robustness of the Hough transform.

This section contains copyright material reproduced from [1, 16, 18] with permission of Elsevier and IFS.

Slide 51: (a) A broken circular object, with centroid indicated. (b) Centroidal profile of an ideal round object. (c) Centroidal profile of broken object (a). Note the difficulties caused in (c) by the shift of the centroid – the reference point for the computation. © IEE 2000
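To clarify the computation being criticised, here is a sketch (not from the notes) of a centroidal profile; the boundary_points input is an assumed representation of the tracked object boundary.

import numpy as np

def centroidal_profile(boundary_points):
    """Compute the centroidal (theta, r) profile of a shape from its
    boundary points (N x 2 array of x, y coordinates).  If part of the
    boundary is missing, the centroid shifts and every r(theta) value
    changes - the source of the non-robustness discussed above."""
    pts = np.asarray(boundary_points, dtype=float)
    centroid = pts.mean(axis=0)            # reference point for the profile
    d = pts - centroid
    r = np.hypot(d[:, 0], d[:, 1])         # radial distance from centroid
    theta = np.arctan2(d[:, 1], d[:, 0])   # polar angle of each point
    order = np.argsort(theta)
    return theta[order], r[order]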

Slide 52: Robust circle detection by the Hough transform: candidate circle centre locations are shown being accumulated in parameter space. © IEE 2000
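A sketch (not from the notes) of centre accumulation for circles of known radius R: each edge pixel votes at distance R along its gradient direction. The sign convention along the gradient depends on contrast polarity and is an assumption here.

import numpy as np

def hough_circle_centres(edge_mask, theta, R):
    """Accumulate candidate circle centres for circles of known radius R.
    edge_mask : boolean array marking edge pixels.
    theta     : edge orientation at each pixel (e.g. from a Sobel operator).
    Peaks in the returned accumulator mark likely centres."""
    acc = np.zeros(edge_mask.shape)
    ys, xs = np.nonzero(edge_mask)
    for y, x in zip(ys, xs):
        cx = int(round(x + R * np.cos(theta[y, x])))
        cy = int(round(y + R * np.sin(theta[y, x])))
        if 0 <= cy < acc.shape[0] and 0 <= cx < acc.shape[1]:
            acc[cy, cx] += 1               # one vote per edge pixel
    return acc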

Slide 53: Locating three biscuits from incomplete edges. © IFS 1984

Slide 54: Location of broken and overlapping biscuits, showing the robustness of the Hough transform. Location accuracy is indicated by the black dots. © IFS 1984

Slide 55: Iris location using the Hough transform.

Slide 56: Fast object location by selective scanning

Objectives: to show that selective scanning locates objects rapidly; to show that selective scanning is subject to graceful degradation.

This section contains copyright material reproduced from [20] with permission of Elsevier.

Slide 57: Object location using the chord bisection algorithm, using a step size of 8 pixels. The black dots show the positions of the horizontal and vertical chord bisectors, and the white dot shows the position found for the centre. © Elsevier 1987
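A sketch of the chord bisection idea (an interpretation of [20], not the original code): chords are sampled every step rows and columns of a binary image and their midpoints combined; using the median as the combining step is an assumption made here for robustness.

import numpy as np

def chord_bisection_centre(binary, step=8):
    """Estimate the centre of a (roughly circular) object in a binary
    image by bisecting horizontal and vertical chords sampled every
    `step` rows and columns."""
    xs, ys = [], []
    for r in range(0, binary.shape[0], step):      # horizontal chords
        cols = np.nonzero(binary[r])[0]
        if cols.size:
            xs.append((cols[0] + cols[-1]) / 2.0)  # chord midpoint (x)
    for c in range(0, binary.shape[1], step):      # vertical chords
        rows = np.nonzero(binary[:, c])[0]
        if rows.size:
            ys.append((rows[0] + rows[-1]) / 2.0)  # chord midpoint (y)
    return float(np.median(xs)), float(np.median(ys))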

Slide 58: Location of a broken object using the chord bisection algorithm when about a quarter of the boundary is missing: this gives a good indication of the limitations of the technique. © Elsevier 1987

Slide 59: Robust statistics

Objectives: to demonstrate the problems of line fitting; to outline M-estimator and LMedS techniques.

This section contains copyright material reproduced from [1] with permission of Elsevier.

Slide 60: (b) shows the ‘averaging in’ effect that results from application of a mean filter. (c) shows that it is not evident for a median filter. © IEE 2000

Slide 61: Fitting data points to straight lines. Rogue (‘outlier’) points make the problem far from straightforward. © Elsevier 2005

Slide 62: Influence functions, e.g. the Tukey biweight and the Hampel 3-part re-descending function. The idea of influence functions such as these is to limit the influence of outlier residuals ri by applying a variable weighting which tends to zero for large values. © Elsevier 2005
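For concreteness, a sketch of the Tukey biweight weighting function in its standard form; the tuning constant c = 4.685 is a conventional choice assumed here, not a value taken from the notes.

import numpy as np

def tukey_biweight(r, c=4.685):
    """Tukey biweight weighting: w(r) = (1 - (r/c)^2)^2 for |r| < c,
    and 0 for larger residuals, so gross outliers lose all influence."""
    u = np.asarray(r, dtype=float) / c
    w = (1 - u**2)**2
    return np.where(np.abs(u) < 1, w, 0.0)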

Slide 63: The least median of squares (LMedS) technique finds the narrowest parallel-sided strip that includes half the population of the distribution. The RANSAC (random sample consensus) technique applies a number of narrow parallel-sided strips, searching for the one that best matches the data. © Elsevier 2005
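A minimal RANSAC line-fitting sketch (not from the notes) following this strip-matching description; the tolerance and iteration count are illustrative assumptions.

import numpy as np

def ransac_line(points, tol=1.0, n_iter=200, rng=None):
    """Minimal RANSAC line fit: repeatedly pick two points, form the
    line through them, and keep the line whose tolerance band (a narrow
    parallel-sided strip of half-width tol) contains the most points."""
    rng = np.random.default_rng(rng)
    pts = np.asarray(points, dtype=float)
    best, best_inliers = None, 0
    for _ in range(n_iter):
        p, q = pts[rng.choice(len(pts), size=2, replace=False)]
        d = q - p
        norm = np.hypot(*d)
        if norm == 0:
            continue
        n = np.array([-d[1], d[0]]) / norm          # unit normal to the line
        dist = np.abs((pts - p) @ n)                # point-line distances
        inliers = int((dist < tol).sum())
        if inliers > best_inliers:
            best, best_inliers = (p, d / norm), inliers
    return best, best_inliers                       # (point, direction), support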

Slide 64: RANSAC in action.

Slide 65: Use of RANSAC to locate lines in a medical application. [Video of laparoscopic tool location.]

Slide 66: Shape analysis

Objectives: to demonstrate the complexities of object labelling; to indicate the problems of using simple shape descriptors; to show that skeleton formation should be mediated by global operations.

This section contains copyright material reproduced from [1, 26] with permission of Elsevier.

Slide 67: Object labelling. This diagram shows how a basic labelling algorithm copes with a simple scene: two objects have label clashes, which may be corrected by a reverse scan. © IEE 2000
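A sketch of basic two-stage labelling (not the notes' own algorithm): a forward raster scan assigns provisional labels and records clashes, which are then resolved via an equivalence table, one alternative to the reverse-scan correction mentioned above.

import numpy as np

def label_objects(binary):
    """Basic connected-component labelling (4-connected)."""
    labels = np.zeros(binary.shape, dtype=int)
    parent = [0]                                  # equivalence table
    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i
    for y in range(binary.shape[0]):
        for x in range(binary.shape[1]):
            if not binary[y, x]:
                continue
            above = labels[y - 1, x] if y else 0
            left = labels[y, x - 1] if x else 0
            if above and left:
                a, b = find(above), find(left)
                parent[max(a, b)] = min(a, b)     # record the label clash
                labels[y, x] = min(a, b)
            elif above or left:
                labels[y, x] = above or left
            else:
                parent.append(len(parent))        # new provisional label
                labels[y, x] = len(parent) - 1
    # Second pass: replace each provisional label by its root label.
    roots = np.array([find(i) for i in range(len(parent))])
    return roots[labels]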

Slide 68: Distance function of a simple shape. © Elsevier 1981

Slide 69: Guided thinning: here a thinning operation is applied in which the local maxima of the distance function must not be removed. Finally, excess local maxima points are eliminated. © Elsevier 1981

Slide 70: Boundary distance measures

Objectives: to show that boundary distance measurement has limited accuracy; to calculate a boundary distance correction factor.

This section contains copyright material reproduced from [28] with permission of the IET.

Slide 71: Introduction

Many recognition schemes involve tracking around the boundaries of objects and measuring boundary distance s or perimeter P. To do this, the most obvious method is to assume that adjacent pixels are unit distance from the current pixel. Clearly, a better measure assigns distance 1 to horizontal and vertical moves and √2 to diagonal moves.
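A sketch (not from the notes) of boundary length estimation from an 8-connected chain code using these distances; the correction factor 0.948 applied below is the commonly quoted Kulpa value discussed in [28], included here as an assumption.

import numpy as np

def perimeter_from_chain(chain, corrected=True):
    """Estimate boundary length from an 8-connected chain code
    (directions 0-7; even codes are horizontal/vertical moves, odd
    codes are diagonal moves).  With corrected=True the result is
    scaled by the correction factor discussed on slide 74."""
    chain = np.asarray(chain)
    nh = int((chain % 2 == 0).sum())      # unit-distance moves
    nd = int((chain % 2 == 1).sum())      # diagonal (sqrt(2)) moves
    L = nh + np.sqrt(2) * nd
    if corrected:
        L *= 0.948                        # Kulpa's correction factor [28]
    return L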

Slide 72: Near 0° and 45°, boundary coding leads to an over-estimate of the boundary length by a factor of up to √2. © IEE 1991

Slide 73: Geometry for calculating the error in boundary length estimation. Take a boundary segment to consist of horizontal and diagonal sections, with overall horizontal and vertical displacements a and b, as shown in the figure. © IEE 1991

Slide 74: Summary

Optimisation of accuracy involves rigorous estimation, rather than the all too ready assumption that quick fixes will lead to good results. Note also that rigorous estimation of boundary length involves at its heart the estimation of the length of the original analogue curve, rather than some chance version that results when the boundary is digitised. Finally, note that the final form L = ξ(nh + √2 nd), where ξ is the correction factor just calculated, still has limited accuracy for individual orientations – a factor that can only be improved by shape modelling over more than adjacent pixel separation distances.

Slide 75: Symmetry detection

Objectives: to note the wide applicability of this approach to object detection; to note that it is in principle achievable by using 1D or 2D Hough transforms.

Slide 76: Potential applications of symmetry detection.

Slide 77: Scale and affine invariance

Objectives: to consider the need for scale-invariant and affine-invariant feature detectors; to outline recent approaches to the problem.

(This section will merely aim to provide a lead-in to the lecture of Krystian Mikolajczyk.)

Slide 78: Recent important feature detectors

SIFT = Scale Invariant Feature Transform (Lowe 1999, 2004).

Similarity transform (4 DoF); affine transform (6 DoF) [transform equations not reproduced].

Affine transforms have the following properties (illustrated in the sketch below):

• They convert parallel lines to parallel lines.
• They preserve ratios of distances on straight lines.
• They cover the transforms produced by weak perspective projection (WPP).
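A small numeric check (not from the notes) of the first two properties for an arbitrary affine map; the matrix and offset values are illustrative assumptions.

import numpy as np

# Illustrative 6-DoF affine transform: x' = A x + t (values are arbitrary).
A = np.array([[1.3, 0.4],
              [-0.2, 0.9]])
t = np.array([5.0, -2.0])

def affine(p):
    return p @ A.T + t

# Parallel lines stay parallel: two lines with the same direction d ...
d = np.array([2.0, 1.0])
p, q = np.array([0.0, 0.0]), np.array([3.0, 7.0])
d1 = affine(p + d) - affine(p)
d2 = affine(q + d) - affine(q)
assert abs(d1[0] * d2[1] - d1[1] * d2[0]) < 1e-9   # ... map to the same direction

# Ratios of distances on a straight line are preserved:
a, b, c = p, p + d, p + 3 * d                      # b divides ac in ratio 1 : 2
fa, fb, fc = affine(a), affine(b), affine(c)
r = np.linalg.norm(fb - fa) / np.linalg.norm(fc - fb)
assert np.isclose(r, 0.5)                          # ratio 1 : 2 preserved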

Slide 79: The 4th edition of this book [1] was published in 2012. It covers:

• the whole of this talk;
• detailed discussions of scale- and affine-invariant features;
• new chapters on surveillance and in-vehicle vision systems;
• over 1000 references, including ~500 since 2000.