Low-level vision – a tutorial
TRANSCRIPT
Low-level vision tutorial 1
© E.R. Davies, 2014
Professor Roy Davies
Overview and further reading
Abstract
This tutorial aims to help those with some experience of vision to obtain a more in-depth understanding of the problems of low-level vision. As it is not possible to cover everything in the space of 90 minutes, a carefully chosen series of topics is presented. This section of the notes is a commentary to accompany the slides in the presentation.

1. The nature of vision
Vision is a complex process and has long been modelled, rather loosely, as a cascade of low, intermediate and high-level stages following image acquisition. However, this model is somewhat impoverished in that it ignores the possibility of downward as well as upward flow of information, which can help with interpretation. In addition, there is need for considerable complexity and sophistication at the higher levels – so much so that the problems of low-level vision are often felt to be unimportant and are forgotten. Yet it remains the case that information that is lost at low level is never regained, while distortions that are introduced at low level can cause undue trouble at higher levels [1]. Furthermore, image acquisition is equally important. Thus, simple measures to arrange suitable lighting can help to make the input images easier and less ambiguous to interpret, and can result in much greater reliability and accuracy in applications such as automated inspection [1]. Nevertheless, in applications such as surveillance, algorithms should be made as robust as possible so that the vagaries of ambient illumination are rendered relatively unimportant.

2. Low-level vision
This tutorial is concerned with low-level vision, and aims to indicate how some of its problems and limitations can be solved. This is a huge subject covering image acquisition, noise suppression, segmentation, colour and texture analysis, shape analysis, object recognition, and many other topics. Hence it is not possible to do justice to it in the space of one and a half hours. It is therefore assumed that participants have a reasonable concept of the subject area, and the tutorial focusses on a number of topics that have been chosen because they involve important issues. The topics fall into five broad categories:
• Feature detection and sensitivity
• Image filters and morphology
• Robustness of object location
• Validity and accuracy in shape analysis
• Scale and affine invariance.
In what follows, references are given for the topics falling under each of these categories, and for other topics that could not be covered in depth in the tutorial.

3. Feature detection and sensitivity
General [1]. Edge detection [2, 3, 4]. Line segment detection [5, 6, 7]. Corner and interest point detection [1, 8, 9]. General feature mask design [10, 11]. The value of thresholding [1, 12, 13].

4. Image filters and morphology
General [1]. Effect of applying rank-order filters [1, 14]. Mathematical morphology [1, 15].

5. Robustness of object location
General [1, 16]. Centroidal profiles v. Hough transforms [1, 17–19]. Fast object location by selective scanning [1, 20]. Robust statistics [1, 21, 22]. The RANSAC approach [23].
6. Validity and accuracy in shape analysis
Shape analysis [1, 24]. Object labelling [1]. Distance transforms and skeletons [1, 25, 26]. Boundary distance measures [27, 28]. Symmetry detection [29, 30].

7. Scale and affine invariant features
At this point it will be useful to consider a topic that has been developing for just over a decade, starting with papers [31, 32]. It arose largely because of difficulties in wide baseline stereo work, and with tracking object features over many video frames – because features change their character over time and correspondences are easily lost. To overcome this problem it was necessary first to eliminate the relatively simple problem of features changing in size, thereby necessitating ‘scale invariance’ (it being implicit that translation and rotation invariance have already been dealt with). Later, improvements became necessary to cope with ‘affine invariance’ (which also covers shear and skew invariance). Lindeberg’s pioneering theory [31] was soon followed by Lowe’s work [32, 33] on the ‘Scale Invariant Feature Transform’ (SIFT). This was followed by affine invariant methods [34–37]. In parallel with these developments, work was proceeding on maximally stable extremal regions [38] and other extremal methods [39, 40] (the latter concept has also been followed in a different context [13]).
Much of this work employs the interest point methods of Harris and Stephens [9], and is underpinned by careful in-depth experimental investigations and comparisons [37, 41, 42]. Finally, there are signs that the tide may be turning in other directions, specifically the design of feature detectors that concentrate on especially robust forms of scale invariance – as in the case of the ‘Speeded Up Robust Features’ (SURF) approach [43, 44]. See also the fuller review article [45]. Note that reference [1] contains a full and up-to-date appraisal of this important area.

8. Concluding remarks
The topics presented in this tutorial necessarily give an incomplete picture because of the short span of time available. Nevertheless, they provide interesting lessons, and demonstrate important factors – specifically, the need for sensitivity, accuracy, robustness, reliability and validity. Further factors enter into the equation when practical systems are being designed – speed and cost of real-time hardware being amongst the most relevant. Overall, it is clear that low-level vision is an essential ingredient of the vision hierarchy and that its problems are fundamental to the whole of vision. Although the focus of attention in the subject may have moved on to more modern topics, it is definitely not the case that this side of the subject is played out and that everything is now known about it. While data sets change and specifications for low-level algorithms remain incomplete, workers will have to go on developing algorithms to cope with the specific idiosyncrasies of their data to make the most of it in all the ways outlined above.

9. Further reading
Participants may wish to further explore the material covered in this tutorial. The author’s book [1] covers many of the topics in fair depth; particular attention is drawn to the following chapters:
• Chapter 3, for median and mode filters, including edge shifts;
• Chapter 5, for edge detection;
• Chapters 9 and 10, for shape analysis – especially thinning and centroidal profiles;
• Chapter 12, for Hough-based methods of circular object location;
• Appendix on robust statistics.
The book also contains a considerable amount of detailed information on 3D interpretation, invariants, texture analysis, automated inspection and practical issues.

Acknowledgements
The author would like to credit the following sources for permission to reproduce material from his earlier publications:
• Elsevier for permission to reprint text and figures from [1, 2, 20, 26];
• IEE/IET for permission to reprint text and figures from [5, 7, 8, 11, 13, 14, 16, 28];
• IFS Publications Ltd. for permission to reprint figures from [18];
• Woodhead Publishing Ltd, for permission to reprint figures from [46].
References
1. Davies, E.R. (2012) Computer and Machine Vision: Theory, Algorithms, Practicalities, 4th edition, Academic Press, Oxford, UK
2. Davies, E.R. (1986) Constraints on the design of template masks for edge detection. Pattern Recogn. Lett. 4, no. 2, 111–120
3. Canny, J. (1986) A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. 8, 679–698
4. Petrou, M. and Kittler, J. (1988) On the optimal edge detector. Proc. 4th Alvey Vision Conf., Manchester (31 Aug.–2 Sept.), pp. 191–196
5. Davies, E.R. (1997) Designing efficient line segment detectors with high orientation accuracy. Proc. 6th IEE Int. Conf. on Image Processing and its Applications, Dublin (14–17 July), IEE Conf. Publication no. 443, pp. 636–640
6. Davies, E.R. (1997) Vectorial strategy for designing line segment detectors with high orientation accuracy. Electronics Lett. 33, no. 21, 1775–1777
7. Davies, E.R., Bateman, M., Mason, D.R., Chambers, J. and Ridgway, C. (1999) Detecting insects and other dark line features using isotropic masks. Proc. 7th IEE Int. Conf. on Image Processing and its Applications, Manchester (13–15 July), IEE Conf. Publication no. 465, pp. 225–229
8. Davies, E.R. (2005) Using an edge-based model of the Plessey operator to determine localisation properties, Proc. IEE Int. Conference on Visual Information Engineering (VIE 2005), University of Strathclyde, Glasgow (4–6 April), pp. 385–391
9. Harris, C. and Stephens, M. (1988) A combined corner and edge detector. Proc. Alvey Vision Conf., Manchester University, UK, pp. 147–151
10. Davies, E.R. (1992) Optimal template masks for detecting signals with varying background level. Signal Process. 29, no. 2, 183–189
11. Davies, E.R. (1999) Designing optimal image feature detection masks: equal area rule. Electronics Lett. 35, no. 6, 463–465
12. Hannah, I., Patel, D. and Davies, E.R. (1995) The use of variance and entropic thresholding methods for image segmentation. Pattern Recogn. 28, no. 8, 1135–1143
13. Davies, E.R. (2008) Stable bi-level and multi-level thresholding of images using a new global transformation. IET Computer Vision 2, no. 2, Special Issue on Visual Information Engineering, Ed. Velastin, S., pp. 60–74
14. Davies, E.R. (1999) Image distortions produced by mean, median and mode filters. IEE Proc. – Vision Image and Signal Processing 146, no. 5, 279–285
15. Bangham, J.A. and Marshall, S. (1998) Image and signal processing with mathematical morphology. IEE Electronics and Commun. Journal 10, no. 3, 117–128
16. Davies, E.R. (2000) Low-level vision requirements. Electron. Commun. Eng. J. 12, no. 5, 197–210
17. Hough, P.V.C. (1962) Method and means for recognising complex patterns. US Patent 3069654
18. Davies, E.R. (1984) Design of cost-effective systems for the inspection of certain food products during manufacture. In Pugh, A. (ed.), Proc. 4th Conf. on Robot Vision and Sensory Controls, London (9–11 Oct.), pp. 437–446
19. Ballard, D.H. (1981) Generalizing the Hough transform to detect arbitrary shapes. Pattern Recogn. 13, 111–122
20. Davies, E.R. (1987) A high speed algorithm for circular object location. Pattern Recogn. Lett. 6, no. 5, 323–333
21. Meer, P., Mintz, D., Rosenfeld, A. and Kim, D.Y. (1991) Robust regression methods for computer vision: a review. Int. J. Comput. Vision 6, 59–70
22. Kim, D.Y., Kim, J.J., Meer, P., Mintz, D. and Rosenfeld, A. (1989) Robust computer vision: a least median of squares based approach. Proc. DARPA Image
Understanding Workshop, Palo Alto, CA (23–26 May), pp. 1117–1134
23. Fischler, M.A. and Bolles, R.C. (1981) Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Comm. ACM 24, no. 6, 381–395
24. Hu, M.K. (1962) Visual pattern recognition by moment invariants. IRE Trans. Inf. Theory 8, 179–187
25. Rosenfeld, A. and Pfaltz, J.L. (1968) Distance functions on digital pictures. Pattern Recogn. 1, 33–61
26. Davies, E.R. and Plummer, A.P.N. (1981) Thinning algorithms: a critique and a new methodology. Pattern Recogn. 14, 53–63
27. Koplowitz, J. and Bruckstein, A.M. (1989) Design of perimeter estimators for digitized planar shapes. IEEE Trans. Pattern Anal. Mach. Intell. 11, 611–622
28. Davies, E.R. (1991) Insight into operation of Kulpa boundary distance measure. Electronics Lett. 27, no. 13, 1178–1180
29. Kuehnle, A. (1991) Symmetry-based recognition of vehicle rears. Pattern Recogn. Lett. 12, 249–258
30. Cho, M. and Lee, K.M. (2009) Bilateral symmetry detection via symmetry-growing. Proc. British Machine Vision Conf. (BMVC), paper 286, pp. 1–11
31. Lindeberg, T. (1998) Feature detection with automatic scale selection. Int. J. Computer Vision 30, no. 2, 79–116
32. Lowe, D.G. (1999) Object recognition from local scale-invariant features. Proc. 7th Int.
Conf. on Computer Vision (ICCV), Corfu, Greece, pp. 1150–1157
33. Lowe, D. (2004) Distinctive image features from scale-invariant keypoints. Int. J. Computer Vision 60, 91–110
34. Tuytelaars, T. and Van Gool, L. (2000) Wide baseline stereo matching based on local, affinely invariant regions. Proc. British Machine Vision Conf. (BMVC), Bristol University, UK, pp. 412–422
35. Mikolajczyk, K. and Schmid, C. (2002) An affine invariant interest point detector. Proc. European Conf. on Computer Vision (ECCV), Copenhagen, Denmark, pp. 128–142
36. Mikolajczyk, K. and Schmid, C. (2004) Scale and affine invariant interest point detectors. Int. J. Computer Vision 60, no. 1, 63–86
37. Mikolajczyk, K., Tuytelaars, T., Schmid, C., Zisserman, A., Matas, J., Schaffalitzky, F., Kadir, T. and Van Gool, L. (2005) A comparison of affine region detectors. Int. J. Computer Vision 65, 43–72
38. Matas, J., Chum, O., Urban, M. and Pajdla, T. (2002) Robust wide baseline stereo from maximally stable extremal regions. Proc. British Machine Vision Conf. (BMVC), Cardiff University, UK, pp. 384–393
39. Kadir, T. and Brady, M. (2001) Scale, saliency and image description. Int. J. Computer Vision 45, no. 2, 83–105
40. Kadir, T., Brady, M. and Zisserman, A. (2004) An affine invariant method for selecting salient regions in images. Proc. 8th
European Conf. on Computer Vision (ECCV), pp. 345–457
41. Schmid, C., Mohr, R. and Bauckhage, C. (2000) Evaluation of interest point detectors. Int. J. Computer Vision 37, no. 2, 151–172
42. Mikolajczyk, K. and Schmid, C. (2005) A performance evaluation of local descriptors. IEEE Trans. Pattern Anal. Mach. Intell. 27, no. 10, 1615–1630
43. Bay, H., Tuytelaars, T. and Van Gool, L. (2006) SURF: speeded up robust features. Proc. 9th European Conf. on Computer Vision (ECCV), Springer LNCS volume 3951, part 1, pp. 404–417
44. Bay, H., Ess, A., Tuytelaars, T., Van Gool, L. (2008) Speeded-up robust features (SURF). Computer Vision Image Understanding 110, no. 3, 346–359
45. Tuytelaars, T. and Mikolajczyk, K. (2008) Local invariant feature detectors: a survey. Foundations and Trends in Computer Graphics and Vision 3, no. 3, 177–280
46. Davies, E.R. (2012) Computer vision for automatic sorting in the food industry. Chapter 6 In: Sun, D.-W. (ed.), Computer Vision Technology in the Food and Beverage Industries, Woodhead Publishing Ltd, Cambridge, UK, ISBN 978-0-85709-036-2, pp. 150–180.
http://www.woodheadpublishing.com/en/book.aspx?bookID=2323
Slide 1
Low-level vision – a tutorial
by Professor Roy Davies
of Royal Holloway, University of London
Slide 2: The role of low-level vision*
[Block diagram: image acquisition → low-level vision (image preprocessing, feature detection) → intermediate-level vision → high-level vision.]
*A somewhat simplified and naïve schema.
Slide 3: An original image
Slide 4: Result of applying a median filter
Slide 5: Result of applying a Harris corner detector
Slide 6: Result of applying a Sobel edge detector
Slide 7: Edge detection
Objectives: To demonstrate the design of efficient edge detectors.
This section contains copyright material reproduced from [1,2] with permission of Elsevier.
Slide 8: Theory of 3 × 3 template operators
First we assume that eight masks are to be used, for angles differing by 45°. Four of the masks differ from the others only in sign, and we are left with two paradigm masks (reconstructed here from the derivation that follows – coefficients A, B for the 0° mask and C, D for the 45° mask):

0° mask:          45° mask:
−A  0  A            0   C   D
−B  0  B           −C   0   C
−A  0  A           −D  −C   0

The all-too-ready assumption that C = A and D = B is by no means confirmed by theory, as we shall see.
Slide 9
Let us apply these masks to the 3 × 3 window:

a b c
d e f
g h i

Then estimation of the 0°, 90° and 45° components of gradient by the earlier general masks gives:
g0 = A(c + i − a − g) + B(f − d)
g90 = A(a + c − g − i) + B(b − h)
g45 = C(b + f − d − h) + D(c − g)
Slide 10
If vector addition is to be valid:
g45 = (g0 + g90)/√2
Equating coefficients of a, b, c, d, e, f, g, h, i leads to the self-consistent pair of conditions:
C = B/√2 and D = √2 A
Insisting that the masks give equal responses at 22.5° leads to the final formula for the ratio B/A; the detailed calculation yields a value close to 2 – essentially the Sobel ratio.
Slide 11
We have now obtained the following template masks for edge detection (taking A = 1, B = 2):

−1  0  1           1   2   1
−2  0  2           0   0   0
−1  0  1          −1  −2  −1

It is also possible to use the first two of these masks for detecting and orientating edge segments, taking them to provide vector components of edge intensity: gx, gy.
Thus we have derived the Sobel operator masks in a principled, non-ad hoc way.
Slide 12: Basic theory of edge detection and orientation
edge intensity: g = (gx² + gy²)^(1/2)
edge orientation: θ = arctan(gy/gx)
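The derivation can be exercised directly; here is a minimal NumPy sketch (the loop-based convolution and the test image are my own illustrative choices):

```python
import numpy as np

# Sobel masks as derived (A = 1, B = 2); x runs to the right, y downwards
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def sobel_gradients(img):
    """Edge intensity g and orientation theta at each interior pixel."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            win = img[y - 1:y + 2, x - 1:x + 2]
            gx[y, x] = np.sum(win * SOBEL_X)
            gy[y, x] = np.sum(win * SOBEL_Y)
    g = np.hypot(gx, gy)         # g = (gx^2 + gy^2)^(1/2)
    theta = np.arctan2(gy, gx)   # theta = arctan(gy/gx), quadrant-correct
    return g, theta

# vertical step edge: the gradient should point along +x
step = np.tile([0.0, 0.0, 1.0, 1.0], (4, 1))
g, theta = sobel_gradients(step)
```

Note that arctan2 rather than plain arctan is used so that the full 360° of edge orientation is recovered from the signs of gx and gy.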
Slide 13: The Canny operator
The aim of the Canny edge detector is to be far more accurate than basic edge detectors such as the Sobel. To achieve this, a number of processes are applied in turn:
• smooth the image using a 2D Gaussian;
• differentiate using 1D derivative functions, e.g. Sobel;
• perform non-maximum suppression to thin the edges (resampling to achieve sub-pixel accuracy);
• perform hysteresis thresholding.
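The last two stages are the less standard ones; below is a minimal sketch of non-maximum suppression and hysteresis thresholding (the quantised-direction NMS and the array-based growth loop are my own illustrative simplifications, not the exact published procedure):

```python
import numpy as np

def non_max_suppress(g, theta):
    """Keep a pixel only if g is a local maximum along the gradient
    direction, quantised to 0, 45, 90 or 135 degrees."""
    h, w = g.shape
    out = np.zeros_like(g)
    offs = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ang = (np.degrees(theta[y, x]) + 180.0) % 180.0
            d = min(offs, key=lambda a: min(abs(ang - a), 180 - abs(ang - a)))
            dy, dx = offs[d]
            if g[y, x] >= g[y + dy, x + dx] and g[y, x] >= g[y - dy, x - dx]:
                out[y, x] = g[y, x]
    return out

def hysteresis(g, lo, hi):
    """Seed edges from pixels above hi, then grow through 8-connected
    neighbours that are above lo."""
    weak = g >= lo
    edges = g >= hi
    while True:
        p = np.pad(edges, 1)
        grown = np.zeros_like(edges)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                if dy or dx:
                    grown |= p[1 + dy:1 + dy + g.shape[0],
                               1 + dx:1 + dx + g.shape[1]]
        new = edges | (grown & weak)
        if (new == edges).all():
            return edges
        edges = new

# ridge with one strong pixel: hysteresis keeps the weak flanks too
ridge = np.zeros((3, 5))
ridge[1] = [0.0, 1.0, 3.0, 1.0, 0.0]
thinned = non_max_suppress(ridge, np.zeros_like(ridge))
edges = hysteresis(ridge, 0.5, 2.5)
```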
Slide 14: Canny in action. © Elsevier 2012
Slide 15: Line segment detection
Objectives: To demonstrate the design of efficient line segment detectors. To show how high orientation accuracy can be achieved.
This section contains copyright material reproduced from [5] and [7] with permission of the IET.
Slide 16
Insects can be modelled as dark rectangular bars. Arbitrary orientation means that detection could involve considerable computation.
Slide 17: Applying the vectorial approach to line segment detection
This is difficult as line segments have a rotational period of π, apparently ruling out the vectorial approach. The key is to refrain from considering line segments as local lines; instead we consider line segments as intensity maps in the form of alternating quadrants:
Slide 18
The result of this strategy leads to masks of the forms shown. However, the last two masks are sign-inverted versions of the first two: this gives just two masks – just the number required for a vectorial computation.
Slide 19
This leads to an effective gradient magnitude:
g = (g0² + g45²)^(1/2)
an apparent orientation:
ψ = arctan(g45/g0)
and finally to an actual line orientation of:
φ = ψ/2 = ½ arctan(g45/g0)
where φ is defined only within the range 0 to π.
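The angle-doubling arithmetic can be checked numerically. In the sketch below the two mask responses are simulated as cos 2φ and sin 2φ – an assumption consistent with the period-π behaviour described above, not taken from the original papers:

```python
import numpy as np

def line_orientation(g0, g45):
    """Recover line orientation phi (defined modulo pi) from the two
    mask responses, assumed to vary as cos(2*phi) and sin(2*phi)."""
    return (0.5 * np.arctan2(g45, g0)) % np.pi

# synthetic responses for a line at 60 degrees
phi_true = np.deg2rad(60.0)
g0, g45 = np.cos(2 * phi_true), np.sin(2 * phi_true)
phi_est = line_orientation(g0, g45)
```

The halving of the arctan argument is what maps the period-2π vector (g0, g45) back onto the period-π space of line orientations.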
Slide 20: Estimated v. actual orientation for wide line segments
[Graph: estimated orientation against actual orientation, both over 0–45°, for thin line segments (w = 0) and wide line segments: curve 1, w = 1.0; curve 2, w = 1.4; curve 3, w = 2.0.]
© IEE 1997
Slide 21: Line segment detector locating insects. © IEE 1999
Slide 22: Mask design geometry
Slide 23: Corner and interest point detection
Objectives: To demonstrate the principle of the Harris corner detector. To show how the signal depends on corner geometry.
This section contains copyright material reproduced from [1,8] with permission of Elsevier and the IET.
Slide 24: Mathematical definition of the Harris operator
Note: 1. The need for averaging. 2. The built-in rotational invariance.
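As a lead-in, the operator can be sketched numerically. In this minimal NumPy illustration the derivative scheme, the 3 × 3 averaging window and the det/trace form of the response C are my assumptions; the classical det − k·trace² form is offered as an alternative:

```python
import numpy as np

def harris_response(img, k=None):
    """Harris-style response from the averaged structure tensor.
    k=None gives C = det/trace; otherwise C = det - k*trace^2."""
    img = np.asarray(img, dtype=float)
    Ix = np.zeros_like(img)
    Iy = np.zeros_like(img)
    Ix[:, 1:-1] = (img[:, 2:] - img[:, :-2]) / 2.0   # central differences
    Iy[1:-1, :] = (img[2:, :] - img[:-2, :]) / 2.0

    def box3(a):
        # average over a 3x3 window -- the crucial averaging step
        p = np.pad(a, 1)
        h, w = a.shape
        return sum(p[dy:dy + h, dx:dx + w]
                   for dy in range(3) for dx in range(3)) / 9.0

    Ixx, Iyy, Ixy = box3(Ix * Ix), box3(Iy * Iy), box3(Ix * Iy)
    det = Ixx * Iyy - Ixy ** 2
    tr = Ixx + Iyy
    return det / (tr + 1e-12) if k is None else det - k * tr ** 2

# L-shaped corner: strong response at the corner, none on a straight edge
img = np.zeros((9, 9))
img[4:, 4:] = 1.0
C = harris_response(img)
```

Without the averaging step the tensor at each single pixel would be rank one and det would vanish everywhere, which is why the averaging is essential.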
Slide 25: General corner in a circular window; single straight edge in a circular window. © IEE 2005
Slide 26
Case of a corner: det = λ1λ2 g⁴ sin²θ and trace = (λ1 + λ2) g².
Case of a straight edge: det = 0, so C = 0.
Slide 27: Possible geometries for sharp corners. (c) is the case of maximum signal (corner shift = a). © IEE 2005
Slides 28–32: Harris interest points + non-maximum suppression (a sequence of example images). © Elsevier 2012
Slide 33: General feature mask design
Objectives: To show how sensitivity can be optimised using the spatial matched filter concept. To demonstrate a limitation on the sizes of template masks.
This section contains copyright material reproduced from [11] with permission of the IET.
Slide 34: Optimisation calculation
When performing feature detection in saturated grey-scale images, we shall take the object feature being detected as region 2, and its background as region 1.
© IEE 1999
Slide 35
On this model we have to calculate optimal values for the mask weighting factors w1 and w2 and for the region areas A1 and A2. First we write the total signal from a template mask as:
S = w1A1S1 + w2A2S2
and the total noise power as:
N² = w1²A1N1² + w2²A2N2²
Slide 36
Thus we obtain a power SNR:
ρ² = S²/N² = (w1A1S1 + w2A2S2)² / (w1²A1N1² + w2²A2N2²)
It is easy to see that if both mask regions are increased in size by the same factor η, ρ² will also be increased by this factor. This makes it interesting to optimise the mask by adjusting the relative values of A1, A2, leaving the total area A unchanged. Let us first eliminate w2 using the zero-mean condition:
w1A1 + w2A2 = 0
The signal and noise equations now become:
S = w1A1(S1 − S2)
N² = w1²A1N1² + w1²A2(A1/A2)²N2²
Slide 37
Optimising the SNR now leads to:
A1/A2 = N1/N2
Taking N1 = N2, we obtain an important result – the equal area rule:
A1 = A2 = A/2
Finally, when the equal area rule applies, the zero-mean rule takes the form:
w1 = −w2
Note that many cases, such as those arising when the foreground and background have different textures, can be modelled by writing N1 ≠ N2. In that case the equal area rule will not apply.
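The optimisation can be verified numerically. A small sketch with illustrative values (the choices of S1, S2, N1, N2 and total area A are arbitrary), showing that with N1 = N2 the power SNR peaks at the equal-area split:

```python
import numpy as np

def power_snr(A1, A, S1, S2, N1, N2):
    """Power SNR of a zero-mean mask after eliminating w2
    (the remaining weight w1 cancels out of the ratio)."""
    A2 = A - A1
    return (S1 - S2) ** 2 / (N1 ** 2 / A1 + N2 ** 2 / A2)

# scan the area split for equal noise levels N1 = N2
A = 10.0
A1 = np.linspace(0.5, 9.5, 181)
snr = power_snr(A1, A, S1=1.0, S2=0.0, N1=1.0, N2=1.0)
best_A1 = A1[np.argmax(snr)]   # expected optimum: A1 = A/2
```

Repeating the scan with N1 ≠ N2 moves the optimum to A1/A2 = N1/N2, as stated above.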
Slide 38: Applying the equal area rule. (a) Small feature, such as a hole. (b) Thick line segment. (c) Large blob. (d) Corner of a large object. (e) Corner detected by a specially shaped mask. (f) Sharp corner detected away from its apex. © IEE 1999
Slide 39: The value of thresholding
Here the aims are to note that:
• Where applicable, thresholding can be extremely effective.
• Considerable advances are still being made in this area.
• It is silly to regard thresholding as ‘old hat’: the old adage ‘horses for courses’ should be borne in mind.
This section contains copyright material reproduced from [1,13] with permission of Elsevier and the IET.
Slide 40: Selecting the most suitable thresholds
When considering whether a minimum is ‘significant’, it seems best to judge it in relation to global rather than local (noisy) maxima.
[Graph: distribution of intensities f(I) v. I for dark objects on a light background, showing peaks from the dark foreground objects, the light background, and irrelevant ‘clutter’.]
Slide 41: Use the Arithmetic Mean (AM) or the Geometric Mean (GM) for finding significant minima?
[Plots: original distribution with AM and GM curves y1, y2.]
Slide 42: Intensity matching to locate the road surface. [Panels: original; GM; smoothed GM.] © IET 2007
Slide 43: Finding cars from their shadows, within the road region. [Panels: original; GM; smoothed GM.] © IET 2007
Slide 44: Intensity matching to locate ergot contaminants amongst wheat grains. [Panels: original; GM; smoothed GM.] © IET 2007
Slide 45: Image filters and morphology
Objectives: To model the effect of applying rank-order filters. To highlight the shifts produced by median filters. To show how rank-order filters can be used for morphology.
This section contains copyright material reproduced from [14,46] with permission of the IET and Woodhead Publishing Ltd.
Slide 46: Edge shifts D obtained on applying rank-order filters with various rank-order coefficients at locations of various curvatures relative to the window radius a. [Graph residue omitted.]
Slide 47: Geometry for estimation of contour shifts. This figure shows idealised intensity variations within a circular neighbourhood C of radius a. © IEE 1999
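Rank-order filters, and the grey-scale morphology built from them, can be sketched as follows (the square window and the isolated-spot example are my own illustrative choices):

```python
import numpy as np

def rank_filter(img, rank, radius=1):
    """Rank-order filter over a (2*radius+1)^2 window: rank 0 is a min
    filter (erosion), the middle rank a median, the top rank a max
    filter (dilation). Borders are left unfiltered."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    out = img.copy()
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            win = np.sort(img[y - radius:y + radius + 1,
                              x - radius:x + radius + 1], axis=None)
            out[y, x] = win[rank]
    return out

def opening(img, radius=1):
    """Grey-scale opening = erosion followed by dilation."""
    n = (2 * radius + 1) ** 2
    return rank_filter(rank_filter(img, 0, radius), n - 1, radius)

# an isolated bright pixel is removed by opening (and by the median)
spot = np.zeros((7, 7))
spot[3, 3] = 1.0
opened = opening(spot)
```

Choosing the rank thus moves the operator continuously between erosion, median filtering and dilation, which is the link between rank-order filters and mathematical morphology drawn above.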
Slide 48: Optimal segmentation of rat droppings. [Panels: original; opened; thresholded; closed.] © Woodhead 2012
Slide 49: Optimal segmentation of rat droppings (continued). [Panels: closed and eroded; median, eroded; best option; median; opened and eroded.] © Woodhead 2012
Slide 50: Centroidal profiles v. Hough transforms
Objectives: To show that the centroidal profile approach is non-robust. To demonstrate the robustness of the Hough transform.
This section contains copyright material reproduced from [1,16,18] with permission of Elsevier and IFS.
Slide 51: (a) A broken circular object, with centroid indicated. (b) Centroidal profile of an ideal round object. (c) Centroidal profile of the broken object (a). Note the difficulties caused in (c) by the shift of the centroid – the reference point for the computation. © IEE 2000
Slide 52: Robust circle detection by the Hough transform: candidate circle centre locations are shown being accumulated in parameter space. © IEE 2000
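For a known radius R the accumulation scheme is easily sketched (the synthetic edge points and normals are illustrative):

```python
import numpy as np

def hough_circle_centres(edge_pts, normals, R, shape):
    """Each edge point votes at a position R along its unit edge
    normal; the accumulator peak gives the circle centre."""
    acc = np.zeros(shape, dtype=int)
    for (y, x), (ny, nx) in zip(edge_pts, normals):
        cy, cx = int(round(y + R * ny)), int(round(x + R * nx))
        if 0 <= cy < shape[0] and 0 <= cx < shape[1]:
            acc[cy, cx] += 1
    return acc

# synthetic circle of radius 5 centred at (10, 10), with inward normals
R, cy0, cx0 = 5.0, 10, 10
angles = np.linspace(0.0, 2 * np.pi, 40, endpoint=False)
pts = [(cy0 + R * np.sin(a), cx0 + R * np.cos(a)) for a in angles]
normals = [(-np.sin(a), -np.cos(a)) for a in angles]
acc = hough_circle_centres(pts, normals, R, (21, 21))
```

The robustness shown in the following slides comes directly from this voting structure: deleting part of the boundary merely lowers the peak, it does not move it.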
Slide 53: Locating three biscuits from incomplete edges. © IFS 1984
Slide 54: Location of broken and overlapping biscuits, showing the robustness of the Hough transform. Location accuracy is indicated by the black dots. © IFS 1984
Slide 55: Iris location using the Hough transform
Slide 56: Fast object location by selective scanning
Objectives: To show that selective scanning locates objects rapidly. To show that selective scanning is subject to graceful degradation.
This section contains copyright material reproduced from [20] with permission of Elsevier.
Slide 57: Object location using the chord bisection algorithm, using a step size of 8 pixels. The black dots show the positions of the horizontal and vertical chord bisectors, and the white dot shows the position found for the centre. © Elsevier 1987
Slide 58: Location of a broken object using the chord bisection algorithm when about a quarter of the boundary is missing: this gives a good indication of the limitations of the technique. © Elsevier 1987
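A minimal version of the chord bisection idea is sketched below; taking the median of the chord midpoints is my own robustness choice for the illustration, and the sampling step is arbitrary:

```python
import numpy as np

def chord_bisect_centre(mask, step=2):
    """Estimate a circle centre from midpoints of horizontal and
    vertical chords sampled every `step` rows/columns; the median of
    the midpoints gives some tolerance to missing boundary."""
    xs, ys = [], []
    for y in range(0, mask.shape[0], step):
        cols = np.nonzero(mask[y])[0]
        if cols.size:
            xs.append((cols[0] + cols[-1]) / 2.0)   # horizontal midpoint
    for x in range(0, mask.shape[1], step):
        rows = np.nonzero(mask[:, x])[0]
        if rows.size:
            ys.append((rows[0] + rows[-1]) / 2.0)   # vertical midpoint
    return float(np.median(ys)), float(np.median(xs))

# disc of radius 6 centred at (15, 12)
yy, xx = np.mgrid[0:31, 0:31]
disc = (yy - 15) ** 2 + (xx - 12) ** 2 <= 36
cy, cx = chord_bisect_centre(disc)
```

Since only a sparse set of scan lines is examined, the method is very fast; as the slides note, it degrades gracefully rather than catastrophically when part of the boundary is missing.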
Slide 59: Robust statistics
Objectives: To demonstrate the problems of line fitting. To outline M-estimator and LMedS techniques.
This section contains copyright material reproduced from [1] with permission of Elsevier.
Slide 60: (b) shows the ‘averaging in’ effect that results from application of a mean filter. (c) shows that it is not evident for a median filter. © IEE 2000
Slide 61: Fitting data points to straight lines. Rogue (‘outlier’) points make the problem far from straightforward. © Elsevier 2005
Slide 62: The idea of influence functions such as these is to limit the influence of outlier residuals ri by applying a variable weighting which tends to zero for large values. [Examples shown: Tukey biweight; Hampel 3-part re-descending.] © Elsevier 2005
Slide 63: Random Sampling Consensus
The least median of squares technique finds the narrowest parallel-sided strip that includes half the population of the distribution. The RANSAC technique applies a number of narrow parallel-sided strips, searching for the one that best matches the data. © Elsevier 2005
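RANSAC itself is easily sketched for line fitting; the pair sampling, inlier tolerance and iteration count below are illustrative choices:

```python
import numpy as np

def ransac_line(pts, n_iter=200, tol=0.5, rng=None):
    """Random sample consensus for a 2D line: repeatedly fit a line to
    a random pair of points and keep the hypothesis with most inliers."""
    rng = np.random.default_rng(0) if rng is None else rng
    pts = np.asarray(pts, dtype=float)
    best = None
    for _ in range(n_iter):
        i, j = rng.choice(len(pts), size=2, replace=False)
        p, d = pts[i], pts[j] - pts[i]
        norm = np.hypot(d[0], d[1])
        if norm == 0.0:
            continue
        # perpendicular distance of all points from the candidate line
        dist = np.abs((pts[:, 0] - p[0]) * d[1]
                      - (pts[:, 1] - p[1]) * d[0]) / norm
        inliers = dist < tol
        if best is None or inliers.sum() > best.sum():
            best = inliers
    return best

# points on y = 2x + 1 with two gross outliers
x = np.arange(10, dtype=float)
pts = np.stack([x, 2 * x + 1], axis=1)
pts[3] = (3.0, 40.0)
pts[7] = (7.0, -20.0)
inlier_mask = ransac_line(pts)
```

A final least-squares fit restricted to the returned inliers would normally follow; the sampling stage exists purely to find a hypothesis uncontaminated by outliers.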
Slide 64: RANSAC in action
Slide 65: Use of RANSAC to locate lines in a medical application. [Video of laparoscopic tool location.]
Slide 66: Shape analysis
Objectives: To demonstrate the complexities of object labelling. To indicate the problems of using simple shape descriptors. To show that skeleton formation should be mediated by global operations.
This section contains copyright material reproduced from [1,26] with permission of Elsevier.
Slide 67: Object labelling. This diagram shows how a basic labelling algorithm copes with a simple scene: two objects have label clashes which may be corrected by a reverse scan. © IEE 2000
Slide 68: Distance function of a simple shape. © Elsevier 1981
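In the spirit of Rosenfeld and Pfaltz [25], the distance function can be computed with two raster passes; this city-block sketch (the 7 × 7 test shape is illustrative) shows the idea:

```python
import numpy as np

def distance_function(mask):
    """City-block distance function of a binary shape via the classic
    two-pass (forward then reverse) raster scan."""
    h, w = mask.shape
    INF = h + w
    d = np.where(mask, INF, 0).astype(int)
    for y in range(h):                      # forward pass
        for x in range(w):
            if d[y, x]:
                up = d[y - 1, x] if y > 0 else INF
                left = d[y, x - 1] if x > 0 else INF
                d[y, x] = min(d[y, x], 1 + min(up, left))
    for y in range(h - 1, -1, -1):          # reverse pass
        for x in range(w - 1, -1, -1):
            if d[y, x]:
                down = d[y + 1, x] if y < h - 1 else INF
                right = d[y, x + 1] if x < w - 1 else INF
                d[y, x] = min(d[y, x], 1 + min(down, right))
    return d

# 5x5 square: distance rises to 3 at the centre
sq = np.zeros((7, 7), dtype=bool)
sq[1:6, 1:6] = True
dist = distance_function(sq)
```

The local maxima of this function are exactly the points that must be preserved in the guided thinning of the next slide.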
Slide 69: Guided thinning: here a thinning operation is applied in which the local maxima of the distance function must not be removed. Finally, excess local maxima points are eliminated. © Elsevier 1981
Slide 70: Boundary distance measures
Objectives: To show that boundary distance measurement has limited accuracy. To calculate a boundary distance correction factor.
This section contains copyright material reproduced from [28] with permission of the IET.
Slide 71: Introduction
Many recognition schemes involve tracking around the boundaries of objects and measuring boundary distance s or perimeter P. To do this the most obvious method is to assume that adjacent pixels are unit distance from the current pixel. Clearly, a better measure assigns distance 1 to horizontal and vertical neighbours and √2 to diagonal neighbours.
Slide 72
Between 0° and 45°, boundary coding leads to an over-estimate of the boundary length, by a factor of up to √2.
© IEE 1991
Slide 73: Geometry for calculating the error in boundary length estimation. Take a boundary segment to consist of horizontal and diagonal sections, with overall horizontal and vertical displacements a and b, as shown above. © IEE 1991
Slide 74: Summary
Optimisation of accuracy involves rigorous estimation, rather than the all-too-ready assumption that quick fixes will lead to good results. Note also that rigorous estimation of boundary length involves at its heart the estimation of the length of the original analogue curve rather than some chance version that results when the boundary is digitised. Finally, note that the final form:
L = ξ(nh + √2 nd)
where ξ is a fixed correction factor, still has limited accuracy for individual orientations – a factor that can only be improved by shape modelling over more than adjacent pixel separation distances.
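The corrected measure is trivial to compute; in this sketch ξ is left as a parameter, since its value is not given in these notes:

```python
import numpy as np

def boundary_length(nh, nd, xi=1.0):
    """Boundary length from counts of horizontal/vertical steps (nh)
    and diagonal steps (nd); xi is an optional global correction
    factor (xi = 1 gives the plain nh + sqrt(2)*nd measure)."""
    return xi * (nh + np.sqrt(2.0) * nd)

# a digitised 45-degree line: all ten steps are diagonal
est = boundary_length(nh=0, nd=10)
true_len = 10 * np.sqrt(2.0)
```

At exactly 0° and 45° the plain measure is exact, as here; the residual error at intermediate orientations is what the global correction factor ξ is designed to reduce on average.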
Slide 75: Symmetry detection
Objectives: To note the wide applicability of this approach to object detection. To note that it is in principle achievable by using 1D or 2D Hough transforms.

Slide 76: Potential applications of symmetry detection
Slide 77: Scale and affine invariance
Objectives: To consider the need for scale-invariant and affine-invariant feature detectors. To outline recent approaches to the problem.
(This section will merely aim to provide a lead-in to the lecture of Krystian Mikolajczyk.)
Slide 78: Recent important feature detectors
SIFT = Scale Invariant Feature Transform (Lowe 1999, 2004)
Similarity transform (4 DoF): rotation, uniform scale and 2D translation.
Affine transform (6 DoF): a general 2 × 2 linear part plus 2D translation.
Affine transforms have the properties:
• They convert parallel lines to parallel lines.
• They preserve ratios of distances on straight lines.
• They cover the transforms produced by Weak Perspective Projection (WPP).
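The first two properties can be checked numerically for an arbitrary affine transform (the matrix, translation and test points below are illustrative):

```python
import numpy as np

A = np.array([[2.0, 0.5],
              [0.3, 1.5]])        # arbitrary non-singular 2x2 part
t = np.array([4.0, -1.0])         # translation

def affine(p):
    return A @ p + t

# ratios of distances along a straight line are preserved
p, q, r = np.array([0.0, 0.0]), np.array([1.0, 1.0]), np.array([3.0, 3.0])
P, Q, R = affine(p), affine(q), affine(r)
ratio_before = np.linalg.norm(q - p) / np.linalg.norm(r - q)
ratio_after = np.linalg.norm(Q - P) / np.linalg.norm(R - Q)

# parallel directions stay parallel (zero cross product after mapping)
d1, d2 = A @ np.array([1.0, 2.0]), A @ np.array([2.0, 4.0])
cross = d1[0] * d2[1] - d1[1] * d2[0]
```

Both properties follow from linearity of the 2 × 2 part: direction vectors transform independently of position, so parallelism and ratios along a line survive even though angles and absolute lengths do not.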
Slide 79
The 4th edition of this book was published in 2012. It covers:
• the whole of this talk;
• detailed discussions of scale- and affine-invariant features;
• new chapters on surveillance and in-vehicle vision systems;
• over 1000 references, including ~500 since 2000.