
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 4, APRIL 2011 435

Robust Low Complexity Corner Detector

Pradip Mainali, Student Member, IEEE, Qiong Yang, Member, IEEE, Gauthier Lafruit, Member, IEEE, Luc Van Gool, Member, IEEE, and Rudy Lauwereins, Senior Member, IEEE

Abstract—Corner feature point detection with both high speed and high quality is still very demanding for many real-time computer vision applications. The Harris and Kanade-Lucas-Tomasi (KLT) algorithms are widely adopted, good-quality corner feature point detectors due to their invariance to rotation, noise, illumination, and limited viewpoint change. Nevertheless, their applications are rather limited because their high complexity prevents real-time performance. In this paper, we redesign the Harris and KLT algorithms to reduce the complexity of each stage of the algorithm: Gaussian derivative, cornerness response, and non-maximum suppression (NMS). The complexity of the Gaussian derivative and cornerness stages is reduced by using an integral image. In the NMS stage, we replace a highly complex sort-then-suppress scheme by efficient NMS followed by sorting of the surviving points. The detected feature points are further interpolated for sub-pixel accuracy of the feature point location. Our experimental results on publicly available evaluation data-sets for feature point detectors show that our low complexity corner detector is both very fast and similar in feature point detection quality to the original algorithms. We achieve a complexity reduction by a factor of 9.8 and attain 50 f/s processing speed for images of size 640×480 on a commodity central processing unit with 2.53 GHz and 3 GB random access memory.

Index Terms—Corner, feature point, Harris, interest point, KLT.

I. Introduction

FEATURE point detection at a low computational cost is a fundamental problem for many real-time computer vision applications, such as image matching, robot navigation, video stabilization, video frame mosaicing, and others. The usefulness of a feature point detector in real-world applications is mainly determined by its feature point detection quality, often measured by repeatability [1], and its execution time. Feature point detectors mainly detect corners [2]–[4] or blobs [5], [6], which are typical salient feature point regions in an image that can be detected repeatedly on images

Manuscript received June 7, 2010; revised September 3, 2010; accepted October 2, 2010. Date of publication March 10, 2011; date of current version April 1, 2011. This paper was recommended by Associate Editor J. Zhang.

P. Mainali and R. Lauwereins are with the Department of Electrical Engineering, Katholieke Universiteit Leuven, and Interuniversitair Micro-Electronica Centrum VZW, Leuven 3001, Belgium (e-mail: [email protected]; [email protected]).

G. Lafruit and Q. Yang are with the Digital Component Group, Interuniversitair Micro-Electronica Centrum VZW, Leuven 3001, Belgium (e-mail: [email protected]; [email protected]).

L. Van Gool is with the Department of Electrical Engineering, Katholieke Universiteit Leuven, Leuven 3001, Belgium (e-mail: [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TCSVT.2011.2125411

taken with different viewing conditions. Corner detectors have wide applications because corner regions are suitable for aligning images with a small baseline, by iteratively tracking feature points [3], and with a wide baseline, by feature point matching [7].

The Harris and KLT are the two most important corner detectors. However, both algorithms are computationally expensive and achieve only 5 f/s for images of size 640×480 on a commodity central processing unit (CPU) running at 2.53 GHz [8], thus limiting their usefulness for real-time applications. Therefore, the Harris and KLT corner detectors were implemented on a graphics processing unit (GPU) to achieve real-time performance [9], [10], as the algorithms were parallelized to utilize the large number of parallel processors present in a GPU. However, a GPU is not always applicable for many real-time applications due to its high power consumption. For example, achieving real-time performance for applications such as video stabilization and video image mosaicing still remains challenging on constrained setups, such as a digital signal processor in surgery rooms without active cooling, which require low- to moderate-power processing architectures. Hence, for embedded applications, the Harris and KLT algorithms need to be redesigned to reduce the number of operations and achieve high-speed feature point detection at similar quality.

We introduced a low complexity corner detector, named LOCOCO, by reducing the complexity of both the Harris and KLT in our previous paper [11]. In this paper, we provide a more in-depth study and further improve the stability of LOCOCO. To improve the stability and performance of the algorithm, the corner feature points extracted by LOCOCO are further interpolated for sub-pixel accuracy. With our modifications, we achieve 50 f/s processing speed with no drop in quality for images of size 640×480 on a commodity CPU running at 2.53 GHz.

The key idea behind our method to reduce the complexity of the Harris and KLT corner detectors originates from two observations: the computations among the pixels within the integration window overlap, and the sorting operation used for non-maximum suppression (NMS) is highly complex [2], [3]. There are three major steps involved in the original algorithms to find the corner feature points. First, the image gradients are computed by convolving the image with the first-order Gaussian derivative kernel. Second, the cornerness response is computed. Last, Quick-sort and NMS are conducted to retain a single maximum point for each corner region. The complexity is reduced in each stage

1051-8215/$26.00 © 2011 IEEE


of the algorithm. First, the first-order Gaussian derivative kernel is approximated by a Box kernel, which can be calculated quickly in combination with the integral image. Second, the computation of the cornerness response of each pixel creates a sliding-window effect with many calculations overlapping over successive pixels, which can be reduced drastically by using the integral image. Last, Quick-sort was used to rank the responses before NMS. However, the ordering is not important, as NMS will always find the local maximum no matter which pixel we start from. Additionally, the comparison results of NMS windows can be shared across successive pixels. So, the complexity is reduced by using the efficient NMS [12], which optimally reuses the results of successive pixels. Once the maximum points are extracted, they are sorted according to their cornerness response.

The remainder of this paper is organized as follows. Section II surveys the literature on corner detectors. In Section III, we briefly discuss the Harris and KLT corner detectors. In Section IV, we present our new approach to reduce the complexity of these corner detectors with sub-pixel accuracy, named S-LOCOCO, and Section V describes memory complexity. In Section VI, we briefly describe the repeatability evaluation criteria for feature point detectors. Section VII provides experimental results. Section VIII provides an evaluation in image registration applications, and Section IX concludes this paper.

II. Related Work

In the past, several corner detectors were proposed to improve quality or execution time. Corner detectors can be broadly categorized into two classes: contour-based and intensity-based. The contour-based corner detectors first detect the edges in the image. The edge map is then used to locate corner feature points at intersections of line segments [13], at high curvatures formed by ridges and valleys [14], and at T-junctions [15]. However, methods in this category do not detect every corner region in an image. In particular, they fail to detect corner regions formed by texture. The other category of corner detectors therefore operates on the intensity of each pixel. The intensity-based techniques can be further subdivided into those operating directly on intensity and those using derivatives of intensity. The former category operates directly on the intensities in the image patch. The corner detectors in this category are SUSAN [16] and FAST [4]. SUSAN detects a corner point if the number of pixels in a circular disc that have the same brightness as the center pixel is below some threshold. FAST also uses a circular window to compare darker and brighter pixels against the center pixel, and uses a machine-learning method to classify the pixel as a corner. These methods were mainly designed for high speed. The latter category does not operate on pixels directly but uses image derivatives to locate a corner point. In [17], the second derivative of the image is computed and the corner point is located by measuring the determinant of the Hessian, which is also rotationally invariant. In [18], the curvature of planar curves is used to detect a corner point. The corner detectors in [2], [3], [19], and [20] are based on computing the auto-correlation function

which averages the derivatives of the image in a small window. Harris [2] and KLT [3] locate corner regions where both eigenvalues of the auto-correlation function are significantly large. An improved version of the Harris corner detector was proposed by replacing the template used for computing derivatives with derivatives of a Gaussian [1]. Among all these corner detectors, Harris [2] and KLT [3] are the two most important, and were shown to have good performance due to their invariance to rotation, illumination variation, and image noise as compared to other existing corner detectors [1], [21].

III. Harris and KLT Corner Detectors

Both the Harris [2] and KLT [3] corner detectors are based on computing the cornerness response of each pixel as in (1), which measures the change in intensities due to shifts of a local integration window in all directions, giving peaks in the cornerness response at corner pixels, as follows:

C(p) = Σ_{x∈W} { [ gx²(x)        gx(x)gy(x)
                   gx(x)gy(x)    gy²(x)     ] × v(x) }

     = [ Gxx   Gxy
         Gyx   Gyy ].                                  (1)

Here

g_i = ∂_i(g ⊗ I) = (∂_i g) ⊗ I,   i ∈ {x, y},   g = G(x, y; σ)

and v(x) is a weighting function, usually Gaussian or uniform, p is the center pixel, W is an integration window centered at p, I is the image, g is a 2-D Gaussian function, and gx and gy are the image gradients obtained by convolving the Gaussian first-order partial derivatives in the x and y directions with I, respectively.

The Harris corner detector evaluates the cornerness of each pixel without explicit eigenvalue decomposition, as in (2), as follows:

R = |C| − k × (trace(C))2. (2)

Since |C| = λ1 × λ2 and trace(C) = λ1 + λ2, where λ1 and λ2 are the eigenvalues of C, the Harris corner detector does not need to perform eigenvalue decomposition explicitly. The value of k is chosen in [0.04, 0.06]. An image pixel is a corner if both eigenvalues are large, resulting in a peak response in R.

The KLT explicitly computes the eigenvalues of C(p). It selects those points for which the minimum eigenvalue, computed as in (3), is larger than a given threshold [3], as follows:

R = λmin = min(λ1, λ2)                                 (3)

λmin = (1/2) ( Gxx + Gyy − √( (Gxx − Gyy)² + 4 × Gxy² ) ).
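Both responses in (2) and (3) are cheap to evaluate once the entries of C are known. A minimal sketch (an illustration, not the paper's code) computing them from Gxx, Gxy, and Gyy:

```python
import math

def harris_response(gxx, gxy, gyy, k=0.04):
    # R = |C| - k * trace(C)^2, as in (2); k is typically in [0.04, 0.06]
    det = gxx * gyy - gxy * gxy
    trace = gxx + gyy
    return det - k * trace * trace

def klt_response(gxx, gxy, gyy):
    # R = lambda_min, the smaller eigenvalue of C, in closed form as in (3)
    return 0.5 * (gxx + gyy - math.sqrt((gxx - gyy) ** 2 + 4.0 * gxy * gxy))
```

For C = diag(2, 1), the eigenvalues are 2 and 1, so `klt_response(2, 0, 1)` gives 1.0 and `harris_response(2, 0, 1)` gives 2 − 0.04·9 = 1.64.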

In fact, both the Harris and KLT corner detectors follow a similar approach to detect corner points and differ only in the way the cornerness functions are evaluated, as in (2) and (3), respectively. Thus, both the Harris and KLT algorithms can be divided into three main steps [2], [3]: 1) the Gaussian derivatives of the image I, gx and gy, are computed by convolution with the Gaussian derivative kernel; 2) the matrix


TABLE I
Execution Time of Each Stage of the KLT Algorithm for an Image of Size 640×480 on a Commodity 2.53 GHz CPU

                       Gaussian Derivative   Cornerness   Sorting
Execution Time (ms)            40               122          46

Algorithm 1 Robust LOCOCO detector
1: Compute an integral image, ii, of the original image I.
2: Compute image gradients, gx and gy, by using a Box kernel and the integral image ii.
3: Compute the integral images, iixx, iiyy, and iixy of gx², gy², and gxgy, respectively.
4: Compute Gxx, Gyy, and Gxy by using the integral images in step 3. Evaluate the cornerness response R as in (2) and (3) for SLC-Harris and SLC-KLT, respectively.
5: Perform efficient NMS on the cornerness response image R to extract a maximum point for each corner region.
6: Perform interpolation of R about the feature locations to find the sub-pixel accurate corner point locations.
7: For the SLC-KLT, sort the feature points according to their cornerness response R. (F is the number of feature points.)

C(p) and cornerness measure R are evaluated individually for each pixel; and 3) Quick-sort and NMS are used to suppress the locally weaker points. Our approach reduces the complexity in two respects. First, the integral image is used to reduce the complexity of both the convolution and the evaluation of the cornerness response. Second, efficient NMS is adopted, thus avoiding the highly complex sorting. Table I shows the execution time of each stage of the algorithm, where the cornerness response is computationally the most expensive.

IV. Robust LOCOCO Detector

The S-LOCOCO algorithm comprises the sub-pixel accurate low complexity Harris (SLC-Harris) and the sub-pixel accurate low complexity KLT (SLC-KLT). Since both the Harris and KLT algorithms follow a similar approach to detect corner pixels in the image, our method is applicable to both. To speed up the algorithm, the first order Gaussian derivative kernel is approximated by a Box kernel, whose convolution can be calculated at low computational cost by using the integral image, as described in Section IV-B. We create the integral images of gx², gy², and gxgy to speed up the computation of the cornerness response in (1), required by (2) and (3), as described in Section IV-C. The highly complex sorting operation is replaced by the efficient NMS as described in Section IV-D. Finally, for feature point stability, we perform interpolation to localize the feature points at sub-pixel accuracy, as described in Section IV-E. The steps of the S-LOCOCO algorithm are summarized in Algorithm 1.

A. Integral Image

In this section, we first describe the integral image briefly to make this article self-contained. The integral image ii(x, y) at point (x, y), as shown in Fig. 1, is given by the summation of all the pixels contained within the rectangle formed by the origin and the location (x, y) [22], as follows:

ii(x, y) = Σ_{x′≤x, y′≤y} I(x′, y′).                  (4)

Fig. 1. Using the integral image takes three operations and four memory accesses to sum pixels within a rectangular window.

With a recursive approach, two operations per pixel are required to compute the integral image, as follows:

s(x, y) = s(x, y − 1) + I(x, y)
ii(x, y) = ii(x − 1, y) + s(x, y).                    (5)

Once the integral image is computed, the sum S of all the pixel values within a rectangular window can be calculated in three operations and four memory accesses, as shown in Fig. 1.
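The recursion in (5) and the four-access window sum can be sketched as follows; this is a minimal NumPy illustration (the cumulative sums play the role of the recursion, and the zero-padded border avoids bounds checks):

```python
import numpy as np

def integral_image(img):
    # ii(x, y) = sum of all pixels above and to the left, inclusive, as in (4).
    # A zero row/column is prepended so window sums need no boundary checks.
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def box_sum(ii, y0, x0, y1, x1):
    # Sum of img[y0:y1, x0:x1]: 3 additions/subtractions, 4 memory accesses.
    return ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]

img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
total = box_sum(ii, 0, 0, 4, 4)    # equals img.sum()
inner = box_sum(ii, 1, 1, 3, 3)    # equals img[1:3, 1:3].sum()
```

Any rectangular sum then costs the same three operations regardless of the window size, which is the property the following stages exploit.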

B. Gaussian Derivative

The first stage in detecting corner feature points requires computing the partial image derivatives, gx and gy, of an image I in the x and y directions, respectively. The Gaussian kernels need to be discretized, and the kernel is often approximated with a finite impulse response filter, hence requiring a filter length of at least 4σ. The performance of the Harris corner detector was improved by using the Gaussian derivative kernel [1], as indicated by the improved repeatability. For faster computation of the convolutions, a recursive implementation of the Gaussian derivative kernel was used. In the recursive filter approach, the Gaussian derivative filter is approximated with an infinite impulse response filter [23], which also allows fixing the length of the filter. In this approach, the Gaussian derivative kernel calculations for different scales are performed in constant time. However, this approach is still computationally more expensive than our method.

Inspired by SURF [6], we also approximate the first order Gaussian derivative kernel by a Box kernel. Fig. 2(a) and (b) shows the first order Gaussian derivative kernel in the x-direction and its approximation by the Box kernel, respectively. The gray regions are set to zero. The black and white regions are approximated by −1 and +1, respectively. With the integral image, the gradients are calculated at a low computational cost and in constant time. Computing the summation within the left black region and the right white region requires three operations each, as described in Section IV-A. Hence, in total only seven operations and eight memory accesses are required to compute the gradient. The filter response is further normalized by the filter size. Convolution with the Gaussian derivative kernel combines two steps: low-pass filtering with a Gaussian and differentiation. The Gaussian function is well approximated by a triangle function after the subsequent integration with the integral imaging technique.

Fig. 2. (a) Discrete Gaussian first order partial x-derivative kernel. (b) Box kernel for the first order partial x-derivative, σ = 1.2. The kernel size is 9×9 (4σ ≈ 9).

TABLE II
Computational Complexity of Convolution with the First Order Gaussian Derivative Kernel, Recursive Filter, and the Box Kernel

                   Additions   Multiplications   Total
Gaussian Filter       2N             2N            4N
Recursive Filter      14             16            30
Box Filter          2+3×2             1             9

Fig. 3. Computational cost reduction factor in computing the image derivative by using the Box kernel and integral image as compared to the original separable Gaussian derivative kernel for different kernel lengths (N).
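The Box-kernel gradient can be sketched as below. The exact partition of the 9×9 window into −1, 0, and +1 regions follows the paper's Fig. 2; here we assume, purely for illustration, that the left third of the window is weighted −1 and the right third +1:

```python
import numpy as np

def integral_image(img):
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def _rect(ii, y0, x0, y1, x1):
    # sum of img[y0:y1, x0:x1] with 3 operations and 4 memory accesses
    return ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]

def grad_x(img, y, x, n=9):
    """Box-kernel approximation of the Gaussian x-derivative at (y, x).

    Illustrative partition: the left n//3 columns of the n-by-n window are
    weighted -1, the right n//3 columns +1, and the middle is zero.
    """
    ii = integral_image(img)     # in practice, computed once per image
    h = n // 2
    w = n // 3                   # width of each weighted region (assumption)
    neg = _rect(ii, y - h, x - h, y + h + 1, x - h + w)
    pos = _rect(ii, y - h, x + h + 1 - w, y + h + 1, x + h + 1)
    return (pos - neg) / (n * w)  # normalize by the filter area
```

On an intensity ramp I(y, x) = x, the response is a positive constant everywhere in the interior, as expected of an x-derivative; the two rectangle sums cost seven operations in total, matching the count in the text.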

Table II shows the complexity of the convolution with the original separable Gaussian derivative kernel, with the recursive filter (fourth-order approximation), and with the Box kernel using our method, respectively. Here, N is the length of the Gaussian kernel. For the example in Fig. 2, using our method, two operations are required to create the integral image and six operations to calculate the filter response. Thus, for a 9×9 kernel corresponding to σ = 1.2, the speedup factor is at least 4×9/9 = 4. Moreover, the number of operations required by the Box kernel is independent of the kernel size, so the gain increases linearly with the kernel size, as shown in Fig. 3. The number of multiplication operations is reduced to one (for the normalization), and all other multiplications are replaced by addition and subtraction operations. As shown in Fig. 3, our method starts performing better for filters of size N > 2. In general, filter lengths greater than this are required.

TABLE III
Computational Complexity Comparison in Computing the Cornerness for Each Pixel

              C(p) Additions   C(p) Multiplications    R    Total
Harris            3×W²                3×W²             7    6×W² + 7
KLT               3×W²                3×W²             9    6×W² + 9
SLC-Harris      3×(2+3)                 3              7    25
SLC-KLT         3×(2+3)                 3              9    27

C. Cornerness Response

The cornerness stage is computationally the most intensive and time-consuming stage of the Harris and KLT algorithms. The cornerness C(p) of a pixel is evaluated by summing the squares and products of gradients within an integration window W as in (1), and this is done at each pixel. Consequently, the computations of pixels lying within the same integration window W overlap. Hence, this stage can be accelerated drastically by using the integral image. Therefore, to accelerate the computation of the cornerness response, we create the integral images of the gradient products in (1), gx², gy², and gxgy, as follows:

iixx(x, y) = Σ_{x′≤x, y′≤y} gx²(x′, y′)               (6)

iiyy(x, y) = Σ_{x′≤x, y′≤y} gy²(x′, y′)               (7)

iixy(x, y) = Σ_{x′≤x, y′≤y} gx(x′, y′) gy(x′, y′).    (8)

We assume a rectangular window for both algorithms, as is commonly done [3], [8]; thus v(x) = 1, x ∈ W. Once the integral images are created as in (6)–(8), the summations Gxx, Gyy, and Gxy in (1) can be evaluated at a low computational cost, with simply three operations and four memory accesses each. As a result, the repeated multiplication and summation operations within the integration window W for each pixel across the image are replaced by the one-time creation of the integral images and simple addition and subtraction operations afterward. Moreover, there is no loss of detector efficiency with this modification, while at the same time it contributes a huge speedup of the algorithm. Table III shows the per-pixel complexity of the original algorithms and the S-LOCOCO algorithm, where W is the integration window size.

For a 9×9 window, the upper limit of the speedup factor is 15.4 and 14.5 for Harris and KLT, respectively. In practice, this speedup is slightly penalized by the extra memory accesses for iixy, which are not present in the original algorithm. In the original algorithm, once the gradients are read from memory, gxgy can be calculated at the same time, as both elements are already present in the internal registers of the CPU; this is not the case with the integral image, which has been pre-computed and resides in memory. In addition, the operations required to evaluate the cornerness are independent of the integration window size W, so the speedup factor increases parabolically with the window size, as shown in Fig. 4.
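Steps (6)–(8) and the window sums Gxx, Gyy, Gxy can be sketched as follows; this is a minimal vectorized illustration (the window radius r, giving a (2r+1)×(2r+1) integration window, is an assumed parameter):

```python
import numpy as np

def integral(a):
    # zero-padded integral image of a 2-D array, as in (6)-(8)
    ii = np.zeros((a.shape[0] + 1, a.shape[1] + 1))
    ii[1:, 1:] = np.cumsum(np.cumsum(a, axis=0), axis=1)
    return ii

def window_sums(ii, r):
    # Sum over the (2r+1)x(2r+1) window centered at every valid pixel:
    # 3 operations and 4 lookups per pixel, independent of the window size.
    k = 2 * r + 1
    return ii[k:, k:] - ii[:-k, k:] - ii[k:, :-k] + ii[:-k, :-k]

def structure_sums(gx, gy, r=4):
    # Integral images of gx^2, gy^2, gx*gy, then Gxx, Gyy, Gxy of (1).
    Gxx = window_sums(integral(gx * gx), r)
    Gyy = window_sums(integral(gy * gy), r)
    Gxy = window_sums(integral(gx * gy), r)
    return Gxx, Gyy, Gxy
```

With unit gradients on a 9×9 patch and r = 4, each entry of Gxx is simply the window area, 81, which makes the replacement of the per-pixel 3×W² multiply-accumulates easy to verify.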


Fig. 4. Computational cost reduction factor in computing the cornerness response by using the integral image as compared to the original approach, for different integration window sizes.

Hence, we can expect an even larger speedup for feature point detection at larger scales.

D. NMS

NMS is performed over the cornerness response image to preserve a single location for each corner feature point. NMS for the KLT detector was performed in two stages [3]. First, Quick-sort [24] was used to arrange the cornerness responses R over the image in descending order [3], [8], [10], which is a computationally expensive operation because it sorts a response point list the size of the image. Afterward, non-maximum points were suppressed by picking the strong response points from the sorted response list and removing the weaker response points appearing later in the list within the distance d around each feature. As a result, a minimum distance between the feature points was enforced. The naive implementation of NMS in the local neighborhood of (2d+1)×(2d+1) around each cornerness response also leads to a high complexity for the Harris [2]. To find a maximum point, for each pixel, a comparison was performed with all the pixels lying in a (2d+1)×(2d+1) window. A maximum point was selected if it was higher than all those pixels and above the threshold. This process was repeated for all the pixels in the cornerness response image. Once a maximum point is found, we can skip all the neighboring pixels up to distance d in all directions, as they are smaller by construction. Such information is exploited effectively in the efficient NMS [12].

For the S-LOCOCO algorithm, we adopt the efficient non-maximum suppression (E-NMS) method proposed in [12] to efficiently extract a unique feature location for each corner region. The E-NMS performs the NMS on image blocks instead of pixel by pixel, thus reducing the computational complexity. Intuitively, both the Quick-sort and the minimum distance enforcement stages are mainly aimed at retaining [3] a single location for each corner region, which is equivalent to NMS. However, we can switch the order by first performing efficient NMS, which is computationally less complex, and only then sorting the feature points according to their cornerness response. Since sorting is then performed on a small number of points, the complexity is reduced drastically. The E-NMS optimally reuses the comparison results across the NMS window to perform NMS at minimal cost. The E-NMS algorithm, as shown in Fig. 5, works as follows. First, it partitions the image into blocks of size (d+1)×(d+1). Then, the maximum element is searched within each block individually. Finally, each block maximum is tested against its full neighborhood, as shown in Fig. 5 with the dotted window. The feature point location is retained if it passes the local maximum test and is larger than the threshold. Finally, for the KLT, we sort the feature points according to their associated cornerness response.

Fig. 5. Efficient NMS algorithm. d is the minimum distance between feature points. Black dots are the maxima within each block; these points are tested for being local maxima, as shown by the dotted window for points a, b, c, and so on.

Fig. 6. Computational cost reduction factor in performing NMS by using E-NMS as compared to Quick-sort for different image sizes (the complexity of sorting the feature points is small and hence not shown).

TABLE IV
Computational Complexity of Quick-Sort and E-NMS

Quick-Sort:                              1.39 × b × h × log2(b × h)
Efficient NMS + Feature Point Sorting:   {2 + 2.5/(d+1) + 0.5/(d+1)²} × b × h + 1.39 × F × log2(F)
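The block-partitioning scheme described above can be sketched as follows. This is a readable illustration in the spirit of the E-NMS the paper adopts, not the optimized comparison-sharing version of [12]; the threshold and response image are assumed inputs:

```python
import numpy as np

def block_nms(R, d, threshold):
    """Block-wise NMS: partition R into (d+1)x(d+1) blocks, take each block
    maximum, and keep it only if it also dominates its full (2d+1)x(2d+1)
    neighborhood and exceeds the threshold. Survivors are then sorted by
    response, as done for the KLT variant."""
    h, w = R.shape
    b = d + 1
    feats = []
    for by in range(0, h, b):
        for bx in range(0, w, b):
            block = R[by:by + b, bx:bx + b]
            iy, ix = np.unravel_index(np.argmax(block), block.shape)
            y, x = by + iy, bx + ix
            v = R[y, x]
            if v <= threshold:
                continue
            # full local-maximum test over the dotted window of Fig. 5
            y0, y1 = max(0, y - d), min(h, y + d + 1)
            x0, x1 = max(0, x - d), min(w, x + d + 1)
            if v == R[y0:y1, x0:x1].max():
                feats.append((int(y), int(x), float(v)))
    feats.sort(key=lambda t: -t[2])   # sort only the few surviving points
    return feats
```

Because only one candidate per block reaches the expensive neighborhood test, the per-pixel cost stays near the {2 + 2.5/(d+1) + 0.5/(d+1)²} comparisons of Table IV, and the final sort touches only F points rather than b×h.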

The average theoretical computational complexity for an image of size b×h for Quick-sort and E-NMS is shown in Table IV, where d is the distance between features. For an image of size 1000×700 and d = 10, the theoretical speedup factor is 12, and the speedup factor increases logarithmically with the image size, as shown in Fig. 6.
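The block-maximum procedure described above can be sketched as follows (a minimal Python sketch, not the paper's C implementation; the function name and the list-of-tuples output are our own choices):

```python
import numpy as np

def efficient_nms(response, d, threshold):
    """Block-based NMS sketch in the spirit of [12].

    Partitions the cornerness response into (d+1)x(d+1) blocks,
    finds the maximum of each block, then verifies it against its
    full (2d+1)x(2d+1) neighborhood and the threshold.
    """
    h, w = response.shape
    step = d + 1
    points = []
    for by in range(0, h, step):
        for bx in range(0, w, step):
            block = response[by:by + step, bx:bx + step]
            ly, lx = np.unravel_index(np.argmax(block), block.shape)
            y, x = by + ly, bx + lx
            v = response[y, x]
            if v <= threshold:
                continue
            # full local-maximum test over the dotted window of Fig. 5
            y0, y1 = max(0, y - d), min(h, y + d + 1)
            x0, x1 = max(0, x - d), min(w, x + d + 1)
            if v >= response[y0:y1, x0:x1].max():
                points.append((y, x, v))
    # for the KLT: sort the few surviving points by cornerness response
    points.sort(key=lambda p: -p[2])
    return points
```

Sorting only the survivors is what makes the reordered pipeline cheap: F is orders of magnitude smaller than b×h.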

E. Sub-Pixel Accurate Feature Point Localization

The corner feature points located by the above three stages are at extrema of the cornerness response at integer pixel locations. Sub-pixel accuracy minimizes the projection error. Feature point localization at sub-pixel accuracy is needed for accurate estimation of the correspondence between the frames, while estimating the correspondence parameters



[25], because the projection error is used as an objective function for optimization. For example, after correspondences between feature points are obtained, RANSAC [26] uses the projection error of the feature points to find the inlier subset of feature points, and Levenberg–Marquardt (LM) [27] further utilizes the projection error as an optimization function to estimate the correspondence parameters accurately. Thus, accurately localizing the feature point at sub-pixel accuracy is required to estimate the correspondence parameters precisely. The repeatability of feature point detectors was improved by localizing the interest point at sub-pixel accuracy in scale-space [5], [28]. To localize the corner feature point at sub-pixel accuracy, the cornerness response image R computed in Section IV-C is interpolated up to the quadratic term by Taylor series expansion. For each feature point, the cornerness response image R is interpolated around the feature point location to compute the sub-pixel location. Since the cornerness response R is a nonlinear function of intensity, the Taylor series expansion of R up to the second-order term is performed about the feature point location (x, y) as follows:

R(x) = R + (∂R/∂x)^T x + (1/2) x^T (∂²R/∂x²) x    (9)

where R and its derivatives are evaluated at the corner feature point and x = (x, y)^T. The derivatives of the cornerness response are calculated by taking the difference of neighboring samples. The location of the extremum x̂ of R is obtained by taking its derivative with respect to x and setting it to zero. After solving the equation, the sub-pixel location is calculated as follows:

x̂ = −(∂²R/∂x²)^(−1) (∂R/∂x).    (10)

The sub-pixel location x̂ is added to the feature location to obtain the sub-pixel accurate feature point location.
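Equations (9) and (10) amount to one 2×2 linear solve per feature. A minimal sketch with central-difference derivatives, as in the text (the helper name and (y, x) conventions are ours; a singular Hessian is not guarded here):

```python
import numpy as np

def subpixel_offset(R, y, x):
    """Quadratic (Taylor) refinement of a feature location, per (9)-(10).

    Derivatives of the cornerness response R are approximated by
    differences of neighboring samples around the integer location
    (y, x). Returns the sub-pixel offset (dy, dx) to add to it.
    """
    # first derivatives (gradient of R)
    dx = (R[y, x + 1] - R[y, x - 1]) / 2.0
    dy = (R[y + 1, x] - R[y - 1, x]) / 2.0
    # second derivatives (Hessian of R)
    dxx = R[y, x + 1] - 2.0 * R[y, x] + R[y, x - 1]
    dyy = R[y + 1, x] - 2.0 * R[y, x] + R[y - 1, x]
    dxy = (R[y + 1, x + 1] - R[y + 1, x - 1]
           - R[y - 1, x + 1] + R[y - 1, x - 1]) / 4.0
    H = np.array([[dxx, dxy], [dxy, dyy]])
    g = np.array([dx, dy])
    # x_hat = -H^{-1} g, i.e., eq. (10)
    offset = -np.linalg.solve(H, g)
    return offset[1], offset[0]  # (dy, dx)
```

For a response that is exactly quadratic around the peak, the refinement recovers the true extremum exactly.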

V. Memory Complexity

Table V shows the memory requirement of the original algorithms and S-LOCOCO. The memory required by the feature points is negligible compared to the image size. We successfully removed the memory required by Quick-sort and NMS for the KLT algorithm, where Quick-sort requires three times the size of the image to store the cornerness and image coordinates (x and y) for sorting. We also removed the extra buffer required by convolution (T) for both Harris and KLT. For S-LOCOCO, the gradients gx and gy are not used in a later execution phase of the algorithm, hence they are reused to store their integral images. Since the computation of an integral image requires summation of the previous integral value with the current data, the same memory location can be overwritten, thus saving two image-sized blocks of memory. Thus, S-LOCOCO requires one extra image-sized buffer compared to Harris, whereas compared to KLT we save three image-sized buffers.
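The in-place overwrite argument can be demonstrated directly: each integral-image entry depends only on previously written entries plus the current input value, so a gradient buffer such as gx can be overwritten with its own integral image. A sketch (naive loop, function name ours):

```python
import numpy as np

def integral_image_inplace(a):
    """Compute the integral image of `a` in place (float 2-D array).

    ii(y, x) = sum of a over the rectangle [0..y, 0..x]. The value at
    (y, x) needs only already-overwritten entries (the row above and
    the running row sum) plus the current input value, so no extra
    buffer is required, as exploited for gx and gy in S-LOCOCO.
    """
    h, w = a.shape
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += a[y, x]  # read the original value before overwriting
            a[y, x] = row_sum + (a[y - 1, x] if y > 0 else 0.0)
    return a
```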

VI. Repeatability

We used the repeatability criteria, as introduced in [1], to compare the quality of feature point detection of S-LOCOCO

TABLE V

Memory Complexity Comparison

Algorithm    Memory Blocks                                   Total Memory
KLT [8]      I + gx + gy + T^a + C + 3×S^b + M^c + 2F        9×W×H + 2F
Harris       I + gx + gy + T^a + C + 2F                      5×W×H + 2F
S-LOCOCO     I + ii + iixx + iiyy + iixy + C + 3F^d + 2F     6×W×H + 5F

^a Buffer required by convolution. ^b Point list required by Quick-sort to sort the cornerness response. ^c Mask table for each pixel to perform NMS. ^d Quick-sort feature points (F is the number of feature points). W = image width, H = image height.

with the original approach. The repeatability is an important property of a feature point detection algorithm. The repeatability measures the usefulness of the corner detection algorithm by checking whether it can accurately detect the same physical region in images taken under various viewing conditions. The repeatability (r) is defined as follows:

r = (No. of features repeated / No. of useful features detected) · 100.

We used the data-sets and software provided by Mikolajczyk [29] to evaluate the repeatability. The evaluation data-sets provide ground-truth homographies and include images with decreasing illumination variation, increasing blur, viewpoint angle changes, and scaling. To compute the repeatability, the feature points in each image are detected individually. Then, the repeated feature points within the overlapped region of the images are counted by projecting the feature point locations between the images using the ground-truth homography of the data-sets. After projection, the feature points are searched in the other image within a distance ε ≤ 1.5 pixels. The repeatability is measured as a percentage, expressing the number of features detected around the same physical location within a distance ε, out of the total useful features detected in the overlapped region of the images.
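The counting procedure above can be sketched as follows (the overlap-region bookkeeping of [1] is omitted for brevity; the function name and array conventions are ours):

```python
import numpy as np

def repeatability(pts1, pts2, H, eps=1.5):
    """Repeatability sketch: percentage of points in image 1 that,
    once projected into image 2 by the ground-truth homography H,
    land within `eps` pixels of a point detected in image 2.

    pts1, pts2: (N, 2) arrays of (x, y) locations; H: 3x3 homography.
    """
    ones = np.ones((len(pts1), 1))
    proj = (H @ np.hstack([pts1, ones]).T).T
    proj = proj[:, :2] / proj[:, 2:3]  # dehomogenize
    # distance from each projected point to every detected point
    d = np.linalg.norm(proj[:, None, :] - pts2[None, :, :], axis=2)
    repeated = (d.min(axis=1) <= eps).sum()
    return 100.0 * repeated / len(pts1)
```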

VII. Experimental Results and Discussion

The S-LOCOCO algorithm comprises SLC-Harris and SLC-KLT. The LOCOCO algorithm comprises LC-Harris and LC-KLT [11]. We compared our SLC-Harris and SLC-KLT corner detection algorithms with the original Harris and KLT corner detectors in terms of execution speed and the quality of the feature point detection. We also evaluated S-LOCOCO against the LOCOCO algorithm to measure the performance improvement from the sub-pixel accurate feature point localization. We measured the quality of the feature point detectors based on the repeatability as described in Section VI. We used the KLT code provided in [8] and our implementation of Harris, done similarly as in the KLT code. We also performed the sub-pixel interpolation of feature point locations for the original Harris and KLT algorithms to compare the performance of the algorithms without the influence of the sub-pixel accurate localization.

Fig. 7 shows the comparison of the execution time and speedup factor for each stage between the original and our



Fig. 7. Comparison of execution time (Intel i5, 2.53 GHz, 3 GB RAM, and N = 9, W = 9, d = 10, feature count ∼1000).

TABLE VI

Comparison with Other Methods

                      Execution Time (ms)    Platform
Image Size = 768×288
SLC-Harris            13.9                   2.53 GHz i5 CPU
FAST [30]             1.34                   2.6 GHz Opteron CPU
SUSAN [30]            7.58
Image Size = 800×640
SLC-Harris            32.1                   2.53 GHz i5 CPU
SIFT^a [6]            400                    3 GHz Pentium 4 CPU
SURF^a [6]            70

^a Only feature point detection. Note that SIFT and SURF are scale-invariant.

robust LOCOCO algorithm (SLC-Harris and SLC-KLT). We achieved a speedup by a factor of 9.8±0.6 for the Harris and 8.6±0.6 for the KLT with the code implemented in C. The speedup factor for KLT is lower than for Harris due to the square root operation needed by the eigenvalue decomposition in computing the KLT cornerness response R, which requires a higher cycle count. The extracted feature points are sorted according to their cornerness response for the KLT, which takes a very limited amount of time. For images of size 640×480, we achieve ∼50 f/s for the SLC-Harris corner detector on an i5 CPU with 2.53 GHz and 3 GB random access memory (RAM), whereas the original Harris algorithm achieves only ∼5 f/s, thus enabling many real-time applications, such as video stabilization, video mosaicing, and others, to detect corner feature points at an acceptable frame rate. Table VI shows the execution time comparison with other feature point detectors.

Next, we evaluate the repeatability, to measure quality, of S-LOCOCO, LOCOCO [11], and the original Harris and KLT algorithms on the data-sets provided by Mikolajczyk [29]. We used the blur (Bike) and illumination variation (Leuven) data-sets for evaluation. We also evaluated the repeatability on a planar rotated image and by adding increasing Gaussian noise to the illumination variation (Leuven) data-set. As shown in Fig. 8(a)–(d), the repeatability for S-LOCOCO is either comparable or slightly better compared to the Harris and KLT algorithms for rotation, increasing Gaussian noise, increasing blur, and decreasing illumination variation settings, respectively, for ε ≤ 1.5. With the sub-pixel interpolation, the repeatability of S-LOCOCO is improved by 5% on average compared to the LOCOCO algorithm. For image rotation, as shown in Fig. 8(a), the repeatability is lowest at a rotation angle of π/4 due to the square shape of the filter. For image rotations, the repeatability of S-LOCOCO is also improved compared to LOCOCO due to the sub-pixel interpolation of the feature point location. The repeatability stays above 70%, which shows that the invariance to rotation is still preserved. For the images with increasing blur, as shown in Fig. 8(c), the repeatability drops drastically for the final blur setting because none of the detectors is scale invariant. With increasing blur, the feature point regions previously detected are blurred, changing the scale of the corresponding region, which ultimately adds error to the feature point localization for all the algorithms because feature point detection is performed at a single scale, thus impacting the repeatability results. For illumination variation, as shown in Fig. 8(d), the repeatability of S-LOCOCO is slightly better compared to the Harris and KLT. The repeatability remains above 55% for all the algorithms, which shows strong invariance to illumination variation, because the corner detectors operate only on gradients instead of operating on the image pixels directly.

There is no loss of quality with our approach in evaluating the corner response by an integral image technique and efficient NMS, while at the same time enabling a major speedup in these stages of the corner detector. The major change that would likely impact the quality of the detector is the replacement of the Gaussian derivative kernel by the Box kernel in computing the image gradients. But the experimental results show good repeatability, indicating a good quality in feature point detection. The convolution of the image with the Gaussian derivative kernel combines two steps: first, the image is filtered with a low-pass Gaussian filter, and second, the image is differentiated. Thus, computing the image derivative with the Box kernel, which approximates the Gaussian derivative kernel as shown in Fig. 2(b), is equivalent to smoothing the image with a triangle function, followed by differentiation. Moreover, the Gaussian kernel is discretized, which is similar to the discretization of the triangle function, at least for smaller sigma. Thus, our algorithm gives similar detection results as the original algorithm.
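To illustrate the Box-kernel idea, here is a sketch of a horizontal gradient computed as the difference of two box sums, each evaluated in O(1) from an integral image. The kernel geometry used here (two (2r+1)×r boxes on either side of the pixel) is an assumption for illustration; the exact sizes and weights of the paper's kernel of Fig. 2(b) may differ:

```python
import numpy as np

def box_sum(ii, y0, x0, y1, x1):
    """Sum of image values over rectangle [y0..y1) x [x0..x1),
    in O(1) from the zero-padded integral image ii."""
    return ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]

def box_gradient_x(img, r):
    """Horizontal gradient via a box-approximated derivative kernel:
    (sum of right box) - (sum of left box) at every interior pixel,
    each box sum costing four lookups in the integral image."""
    h, w = img.shape
    ii = np.zeros((h + 1, w + 1))
    ii[1:, 1:] = img.cumsum(0).cumsum(1)  # zero-padded integral image
    gx = np.zeros_like(img, dtype=float)
    for y in range(r, h - r):
        for x in range(r, w - r):
            right = box_sum(ii, y - r, x + 1, y + r + 1, x + r + 1)
            left = box_sum(ii, y - r, x - r, y + r + 1, x)
            gx[y, x] = right - left
    return gx
```

On a linear intensity ramp, this responds with a constant proportional to the slope, as a derivative filter should.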

The repeatability only measures the feature points detected within a certain distance, but does not indicate the actual performance of the algorithms in terms of projection error. The average projection error is computed to measure the exact influence of the sub-pixel interpolation of the feature point location. Consequently, to evaluate the actual performance improvement between S-LOCOCO and LOCOCO due to the sub-pixel interpolation, we compute the average projection error. To compute the average projection error, feature points are detected in each image individually. Then, the feature point correspondences within a distance ε ≤ 1.5 are computed by using the ground-truth homography of the data-sets. By using the ground-truth homography, the effect of tracking error on estimating correspondences is avoided, which allows us to consider only the errors from the feature point localization. Then, RANSAC [26] and LM [27] are applied on the correspondence set to estimate the homography parameters. Finally, the average projection error is computed by using this homography to project feature points between the images. Table VII shows the average projection error of the
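Given an estimated homography, the final step reduces to a mean reprojection distance over the correspondence set. A minimal helper for that step only (illustrative, not the full RANSAC+LM pipeline; the function name is ours):

```python
import numpy as np

def avg_projection_error(pts1, pts2, H):
    """Average projection error of matched points under homography H:
    the mean Euclidean distance between H*p1 and p2.

    pts1, pts2: (N, 2) arrays of matched (x, y) locations.
    """
    ones = np.ones((len(pts1), 1))
    proj = (H @ np.hstack([pts1, ones]).T).T
    proj = proj[:, :2] / proj[:, 2:3]  # dehomogenize
    return float(np.mean(np.linalg.norm(proj - pts2, axis=1)))
```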



Fig. 8. Repeatability evaluated on (a) planar rotation of an image, (b) increasing Gaussian noise added to the second image of the Leuven data-set, (c) increasing blur settings, and (d) decreasing illumination variation settings.

TABLE VII

Comparison of the Average Projection Error on Registering the Images with the Blur, Illumination, and Rotation, With and Without Sub-Pixel Interpolation^a

                      SLC-Harris   LC-Harris   SLC-KLT   LC-KLT
Illumination   2^b    0.274        0.547       0.309     0.601
               3^b    0.341        0.635       0.401     0.607
Blur           2^c    0.269        0.553       0.372     0.637
               3^c    0.430        0.674       0.520     0.765
Rotation       10     0.161        0.477       0.253     0.525
(degree)       20     0.260        0.544       0.450     0.652
               30     0.361        0.614       0.549     0.775
               40     0.462        0.733       0.572     0.768
               50     0.526        0.797       0.538     0.775
               60     0.465        0.741       0.517     0.744
               70     0.274        0.580       0.443     0.692
               80     0.135        0.448       0.203     0.517
               90     0.003        0.008       0.004     0.008

^a Interpolation time Ti = 0.20 ms for 1000 feature points. ^b On registering Leuven data-set images 2 and 3 with image 1. ^c On registering Bike data-set images 2 and 3 with image 1.

feature point correspondences, with and without the sub-pixel accurate feature point localization, on the illumination variation (Leuven) and blur (Bike) data-sets by registering images 2 and 3 with image 1. The average projection error obtained

TABLE VIII

Average Projection Error on Registering the Successive Images Taken from the Video

                        Average Projection Error
                        SLC-Harris + KLT Tracking
Laparoscopy video 1^a   0.96±0.28
Laparoscopy video 2^a   1.08±0.25
Laparoscopy video 3^a   1.12±0.30
Endoscopy colon^b       1.15±0.30
Leuven Library^c        0.29±0.15
Arenberg Castle^c       0.28±0.14
Aerial view^c           0.32±0.15

^a Laparoscopy video in digital video disc (DVD) [31]. ^b Endoscopic video of the colon with narrow band light [32]. ^c Video captured by a hand-held camera by sweeping over the scene.

with the sub-pixel interpolation is two to three times smaller than without sub-pixel interpolation. Meanwhile, the execution time overhead added by the interpolation is also very small.

VIII. Application to Image Registration

We used the SLC-Harris to detect corner feature points while registering the video frames, which is the basic step for applications such as video stabilization and video frames mosaicing [33]. To perform image registration, we used our



Fig. 9. Image registration of the successive images of a laparoscopic surgery video. (a)–(d) Feature points detected by SLC-Harris. (e)–(h) KLT tracking of the feature points in the consecutive frame and RANSAC to estimate the homography (black inliers and red outliers). (i)–(l) Registration of the second image with the first image (second frame transparent and at the back).

SLC-Harris to detect corner feature points. Then, the corner feature points are tracked by using the KLT tracking algorithm [8] on the successive frames. The feature point correspondences obtained from tracking are refined using the RANSAC scheme [25] to estimate the homography between the frames. Then, LM optimization [25], [27] is applied to optimize the homography parameters using the inlier subset from RANSAC. We measured the average projection error by registering the second frame with the first frame and averaging the resulting projection error. Experiments were performed on various laparoscopy [31], endoscopy [32], and outdoor scene videos. For medical videos, the video is captured at high frame rates, hence rigidity between the frames can be assumed [34]. On average, ∼20 random image pairs were considered from each medical video and ∼50 random image pairs from each outdoor video. The outdoor videos were captured manually by holding a camera and sweeping over the scene, so that the camera motion includes translation as well as rotation. The medical video is taken from a video captured during surgery and provided on the DVD in [31]. Table VIII shows the average projection error and its variance in registering the frames using SLC-Harris feature points in various pairs of frames of the laparoscopy and outdoor videos. Figs. 9 and 10 show feature point detection, KLT tracking, and image registration using SLC-Harris on a few images from the laparoscopy and outdoor videos used in the evaluation in Table VIII.



Fig. 10. Image registration of successive video frames of the University Library of Leuven, Arenberg Castle Leuven, and an aerial view of parking, respectively. (a)–(c) Feature points detected by SLC-Harris. (d)–(f) KLT tracking of the feature points in the consecutive frame and RANSAC to estimate the homography (black inliers and red outliers). (g)–(i) Registration of the second image with the first image (second frame transparent and at the back).

IX. Conclusion

In this paper, we developed a method, named S-LOCOCO, to speed up the Harris and KLT corner detectors. We cropped the Gaussian derivative kernel and represented it with the Box kernel to speed up the convolution by using the integral image. We further used an integral image representation to speed up the computation of the cornerness response. We adopted E-NMS for the non-maximum suppression of the cornerness response points, thus avoiding the highly complex sorting operation. The feature point locations are interpolated for sub-pixel accuracy. We evaluated the quality of the detectors based on repeatability. The repeatability results indicated that our algorithm and the original algorithms have a similar quality of feature point detection. Thus, we achieved both a high-speed and a good-quality feature point detection algorithm. Future work will focus on extending our method for high-speed scale-invariant corner detection.

Acknowledgment

The authors would like to thank B. Geelen for his help while preparing this manuscript.

References

[1] C. Schmid, R. Mohr, and C. Bauckhage, "Evaluation of interest point detectors," Int. J. Comput. Vision, vol. 37, no. 2, pp. 151–172, 2000.

[2] C. Harris and M. Stephens, "A combined corner and edge detector," in Proc. 4th Alvey Vision Conf., 1988, pp. 147–151.

[3] C. Tomasi and T. Kanade, "Detection and tracking of point features," Dept. Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, Tech. Rep. CMU-CS-91-132, Apr. 1991.

[4] E. Rosten, R. Porter, and T. Drummond, "Faster and better: A machine learning approach to corner detection," IEEE Trans. Patt. Anal. Mach. Intell., vol. 32, no. 1, pp. 105–119, Jan. 2010.

[5] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," Int. J. Comput. Vision, vol. 60, no. 2, pp. 91–110, 2004.

[6] H. Bay, A. Ess, T. Tuytelaars, and L. Van Gool, "Speeded-up robust features (SURF)," Comput. Vision Image Understand., vol. 110, no. 3, pp. 346–359, 2008.

[7] K. Mikolajczyk and C. Schmid, "Scale and affine invariant interest point detectors," Int. J. Comput. Vision, vol. 60, no. 1, pp. 63–86, 2004.

[8] S. Birchfield. (2007, Aug.). KLT: An Implementation of the Kanade-Lucas-Tomasi Feature Tracker [Online]. Available: http://www.ces.clemson.edu/∼stb/klt

[9] L. Teixeira, W. Celes, and M. Gattass, "Accelerated corner-detector algorithms," in Proc. British Mach. Vision Conf., Sep. 2008.

[10] S. N. Sinha, J. M. Frahm, M. Pollefeys, and Y. Genc, "GPU-based video feature tracking and matching," in Proc. Workshop Edge Comput. Using New Commodity Architect., 2006.

[11] P. Mainali, Q. Yang, G. Lafruit, R. Lauwereins, and L. V. Gool, "LOCOCO: Low complexity corner detector," in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., Mar. 2010, pp. 810–813.

[12] A. Neubeck and L. Van Gool, "Efficient non-maximum suppression," in Proc. IEEE Int. Conf. Patt. Recog., vol. 3. Sep. 2006, pp. 850–855.

[13] R. P. Horaud, T. Skordas, and F. Veillon, "Finding geometric and relational structures in an image," in Proc. 1st Eur. Conf. Comput. Vision, vol. 427. Apr. 1990, pp. 374–384.

[14] F. Shilat, M. Werman, and Y. Gdalyahn, "Ridge's corner detection and correspondence," in Proc. IEEE Conf. Comput. Vision Patt. Recog., Jun. 1997, pp. 976–981.



[15] F. Mokhtarian and R. Suomela, "Robust image corner detection through curvature scale space," IEEE Trans. Patt. Anal. Mach. Intell., vol. 20, no. 12, pp. 1376–1381, Dec. 1998.

[16] S. M. Smith and J. M. Brady, "SUSAN: A new approach to low level image processing," Int. J. Comput. Vision, vol. 23, no. 1, pp. 45–78, 1997.

[17] P. Beaudet, "Rotational invariant image operators," in Proc. Int. Conf. Patt. Recog., 1978, pp. 579–583.

[18] L. Kitchen and A. Rosenfeld, "Gray-level corner detection," Patt. Recog. Lett., 1982, pp. 95–102.

[19] H. P. Morevec, "Toward automatic visual obstacle avoidance," in Proc. 5th Int. Joint Conf. Artif. Intell., 1977, p. 584.

[20] W. Forstner, "A framework for low level feature extraction," in Proc. 3rd Eur. Conf. Comput. Vision, vol. II, 1994, pp. 383–394.

[21] L.-H. Zou, J. Chen, J. Zhang, and L.-H. Dou, "The comparison of two typical corner detection algorithms," in Proc. 2nd Int. Symp. Intell. Inform. Tech. Applicat., 2008, pp. 211–215.

[22] P. A. Viola and M. J. Jones, "Rapid object detection using a boosted cascade of simple features," in Proc. IEEE Comput. Vision Patt. Recog., Apr. 2001, pp. 511–518.

[23] R. Deriche, "Recursively implementing the Gaussian and its derivatives," INRIA, Unité de Recherche Sophia-Antipolis, Sophia-Antipolis, France, Tech. Rep. 1893, 1993.

[24] C. A. R. Hoare, "Quicksort," Comput. J., vol. 5, no. 1, pp. 10–16, 1962.

[25] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision. Cambridge, U.K.: Cambridge Univ. Press, 2000, pp. 87–127.

[26] M. A. Fischler and R. C. Bolles, "Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography," Commun. ACM, vol. 24, no. 6, pp. 381–395, 1981.

[27] D. W. Marquardt, "An algorithm for least-squares estimation of nonlinear parameters," J. Soc. Indust. Appl. Math., vol. 11, no. 2, pp. 431–441, 1963.

[28] M. Brown and D. Lowe, "Invariant features from interest point groups," in Proc. British Mach. Vision Conf., 2002, pp. 656–665.

[29] K. Mikolajczyk. (2007, Jun.). Affine Covariant Features [Online]. Available: http://www.robots.ox.ac.uk/∼vgg/research/affine

[30] E. Rosten and T. Drummond, "Machine learning for high-speed corner detection," in Proc. Eur. Conf. Comput. Vision, vol. 1. May 2006, pp. 430–443.

[31] A. L. Covens and R. Kupets, Laparoscopic Surgery for Gynecologic Oncology. New York: McGraw Hill, 2009.

[32] J. East, N. Suzuki, and B. Saunders, "Narrow-band imaging-NBI in the colon," Wolfson Unit Endoscopy, St. Marks Hospital, London, U.K., 2006. Available: http://www.wolfsonendoscopy.org.uk/st-mark-multimedia-nbi-in-the-colon.html

[33] P. Suchit, S. Ryusuke, E. Tomio, and Y. Yagi, "Deformable registration for generating dissection image of an intestine from annular image sequence," in Proc. Comput. Vision Biomedical Image Applicat., 2005, pp. 271–280.

[34] R. Miranda-Luna, C. Daul, W. Blondel, Y. Hernandez-Mier, D. Wolf, and F. Guillemin, "Mosaicing of bladder endoscopic image sequences: Distortion calibration and registration algorithm," IEEE Trans. Biomedical Eng., vol. 55, no. 2, pp. 541–553, Feb. 2008.

Pradip Mainali (S'10) received the B.E. degree from the National Institute of Technology, Surat, India, in 2002, and the Master of Technology Design in Embedded Systems degree, jointly awarded by the National University of Singapore, Singapore, and the Technical University of Eindhoven, Eindhoven, The Netherlands, in 2006. He is currently pursuing the Ph.D. degree at the Katholieke Universiteit Leuven, Leuven, Belgium, in collaboration with Interuniversitair Micro-Electronica Centrum VZW, Leuven.

His current research interests include computer vision and machine learning.

Qiong Yang (M'03) received the Ph.D. degree from Tsinghua University, Beijing, China, in 2004.

From 2004 to 2007, she was an Associate Researcher with Microsoft Research Asia, Beijing. In September 2007, she joined VISICS Corporation, Leuven, Belgium. She has been with Interuniversitair Micro-Electronica Centrum VZW, Leuven, as a Senior Research Scientist since January 2009. Her current research interests include pattern recognition and machine learning, feature detection/extraction and matching, visual tracking and detection, object segmentation and cutout, stereo and multiview vision, and multichannel fusion and visualization.

Gauthier Lafruit (M'99) is currently a Principal Scientist Multimedia with the Department of Nomadic Embedded Systems, Interuniversitair Micro-Electronica Centrum VZW (IMEC), Leuven, Belgium. He has acquired image processing expertise in various applications (video coding, multicamera acquisition, multiview rendering, stereo matching, image analysis, and others) applying the "Triple-A" philosophy, i.e., finding the appropriate tradeoff between "Application" specifications, "Algorithm" complexity, and "Architecture" (platform) support. From 1989 to 1994, he was a Research Scientist with the Belgian National Foundation for Scientific Research, Brussels, Belgium, mainly active in the area of wavelet image compression. Subsequently, he was a Research Assistant with the Vrije Universiteit Brussel (Free University of Brussels), Brussels. In 1996, he joined IMEC, where he was a Senior Scientist for the design of low-power very large-scale integration for combined JPEG/wavelet compression engines. In this role, he has made decisive contributions to the standardization of 3-D-implementation complexity management in MPEG-4. Since 2006, he has been actively contributing to the 3-DTV and multiview video research domain, following the aforementioned "Triple-A" philosophy with "Application-Algorithm-Architecture" tradeoffs. His current research interests include progressive transmission in still image, video, and 3-D object coding, as well as scalability and resource monitoring for video stereoscopic applications and advanced 3-D graphics.

Luc Van Gool (M'85) received the Masters degree in electromechanical engineering and the Ph.D. degree from the Katholieke Universiteit Leuven, Leuven, Belgium, in 1981 and 1991, respectively.

Currently, he is a Full Professor with the Katholieke Universiteit Leuven and the Eidgenossische Technische Hochschule, Zurich, Switzerland. He leads computer vision research at both places, where he also teaches computer vision. He is the Co-Founder of five spin-off companies. He has authored over 200 papers in this field. His current research interests include 3-D reconstruction and modeling, object recognition, tracking, and gesture analysis.

Prof. Van Gool is the recipient of several Best Paper Awards. He has been a program committee member of several major computer vision conferences.

Rudy Lauwereins (SM'97) received the Ph.D. degree in electrical engineering in 1989.

He is currently the Vice President of Interuniversitair Micro-Electronica Centrum VZW (IMEC), Leuven, Belgium, which performs world-leading research and delivers industry-relevant technology solutions through global partnerships in nano-electronics, information and communication technologies, healthcare, and energy. He is responsible for IMEC's Smart Systems Technology Office, covering energy-efficient green radios, vision systems, (bio)medical and lifestyle electronics, as well as wireless autonomous transducer systems and large-area organic electronics. He is also a part-time Full Professor with the Department of Electrical Engineering, Katholieke Universiteit Leuven, Leuven, where he teaches computer architectures in the M.S. in Electrotechnical Engineering Program. Before joining IMEC in 2001, he had held a tenured professorship with the Faculty of Engineering, Katholieke Universiteit Leuven, since 1993. He has authored and co-authored more than 350 publications in international journals, books, and conference proceedings.

Dr. Lauwereins has served on numerous international program committees and organizational committees, and has given many invited and keynote speeches. He was the General Chair of the Design, Automation and Test in Europe Conference in 2007.