

A SURVEY OF EFFICIENT

HOUGH TRANSFORM METHODS

J. Illingworth and J. Kittler
Department of Electronics and Electrical Engineering

University of Surrey, Guildford, GU2 5XH. U.K.

Abstract

The Hough Transform, HT, has long been recognised as a technique of almost unique promise for shape and motion analysis in images containing noisy, missing and extraneous data. However its widespread adoption in practical systems has been slow due to its computational complexity, the need for large storage arrays and the lack of a detailed understanding of its properties. The aims of Alvey project MMI/IP078 are

• the investigation of the properties of HTs

• the study of efficient implementations of HTs

• the production of a HT hardware device for use in real time industrial inspection systems.

• the development of general high level image interpretation strategies which use information derived from the HT.

The first three items listed above are being studied at Surrey University in collaboration with Computer Recognition Systems while the final item is under investigation at Heriot Watt University. There has recently been much activity concerned with space and time efficient implementations of HT ideas and this paper presents a survey of this area. Topics covered include the development of new algorithms, the use of optical techniques and the implementation of HTs on novel hardware architectures.

1. Introduction

Line drawings and silhouettes strongly suggest that in human vision the shape of objects provides a major part of the information needed for the interpretation of 2D images as 3D world scenes. The development of methods which will enable computers to reproduce these abilities is one of the major problems in computer vision. A good shape recognition system must be able to handle complex scenes which contain several objects that may partially overlap one another. In the case of computer vision systems the recognition algorithm must also be able to cope with difficulties caused by poor characteristics of image sensors and the limited ability of current segmentation algorithms. The Hough Transform [1] is an algorithm which has the potential to address these difficult problems. It is a likelihood based parameter extraction technique in which pieces of local evidence independently vote for many possible instances of a sought after model. In shape detection the model captures the relative position and orientation of the constituent points of the shape and distinguishes between particular instances of the shape using a set of parameters. The HT identifies specific values for these parameters and thereby allows image points of the shape to be found by comparison with predictions of the model instantiated at the identified parameter values.

Image points on a parametrically defined shape or curve can be defined by an equation of the form

f(X, P) = 0    (1.1)

where P = (p1, ..., pn) are specific values of the n parameters of the shape and X = (x1, x2) are the coordinates of any image point of the shape. Equation (1.1) defines a one to many mapping from a parameter space P to an image space X. One of the basic ideas behind the HT is that the form of equation (1.1) can be used to express a one to many mapping from image space to a space of parameters i.e. there is a transform

g(X, P) = 0    (1.2)

which maps each specific image point, (x1, x2), into a set of parameter values and the form of g is the same as that of f. The mapping g generates the parameters of all possible shape instances which could include the image point (x1, x2). For example in the case of a circular shape the relationship between image and parameter space is given by

(x1 - p1)^2 + (x2 - p2)^2 - p3^2 = 0    (1.3)

where p1 and p2 are the parameters describing the position of the circle centre and p3 denotes the circle radius. Mapping f is obtained by taking values for p1, p2, and p3 and (1.3) then defines the set of image points which belong to a specific circle instance. Mapping g is obtained from (1.3) when specific values are defined for (x1, x2) and it then yields the set of (p1, p2, p3) values which define circles that the image point could lie on.

AVC 1987 doi:10.5244/C.1.43


The parameter surface which g generates for each image point is the surface of an upturned right circular cone with its apex at parameter values (x1, x2, 0). The HT involves taking each image point in turn and mapping it into multidimensional parameter space P using g. If several points belong to the same shape then the parameter space surfaces which they generate via g will all intersect at the parameter point which characterises this shape. The HT is a method of identifying these points of intersection.

In the standard digital computer implementation of the HT method both image and parameter space are represented by arrays of numbers which are spatially indexed to appropriately small, regularly shaped regions of space. The image array is derived from real world measurements and is a binary valued array in which 1s represent the likely presence of the shape within the corresponding region of space. Non zero elements of the image array are mapped into parameter space so that each element of the parameter array accumulates a value which is equal to the number of parameter hypersurfaces that pass through it. This discrete parameter space is usually called the accumulator array and cells where many hypersurfaces intersect have a large count and form a peak structure. In using the HT the difficult problem of identifying sets of spatially extended points is turned into the much easier task of finding compact local peaks in the accumulator array.
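As a concrete illustration of this accumulation scheme, the following is a minimal sketch of the standard HT for straight lines in the normal (ρ, θ) parameterisation. The bin counts, image scale and test data are arbitrary choices for illustration, not values from the paper.

```python
import math

def hough_lines(points, n_theta=180, n_rho=100, rho_max=200.0):
    """Accumulate votes for lines rho = x*cos(theta) + y*sin(theta).

    Each feature point votes for every (theta, rho) cell that its
    parameter-space curve passes through.
    """
    acc = [[0] * n_rho for _ in range(n_theta)]
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            r = int((rho + rho_max) / (2 * rho_max) * n_rho)
            if 0 <= r < n_rho:
                acc[t][r] += 1
    return acc

# Collinear points vote coherently: the peak count equals the
# number of points lying on the line.
pts = [(i, 2 * i + 1) for i in range(20)]   # 20 points on y = 2x + 1
acc = hough_lines(pts)
peak = max(max(row) for row in acc)
```

The compact peak in `acc` is exactly the "easier task" the text refers to: its cell index recovers the line's (θ, ρ) parameters.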

The above discussion applies to simple parametric shapes but the HT method can be extended to detect general shapes [2,3]. In these cases a shape is described by a list of template vectors which record the distance and orientation of each point from an arbitrary localisation point. A suitable localisation point is the centroid of the template shape. Translated instances of the shape may lie anywhere in the image plane. Each can be characterised by the position of its centroid i.e. the coordinates of the centroid are two parameters which can be used to identify translated instances of a template shape. In the generalised HT, GHT, each image point is compared to every entry of the template list and from this a candidate centroid is calculated and the corresponding cell in an accumulator array is incremented. As in other HTs, shape parameters are identified by locating peaks in an accumulator array. The method can be used to find scaled and rotated versions of a shape using two further parameters to describe these degrees of freedom.
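A translation-only sketch of this idea, assuming integer accumulator cells and omitting the scale and rotation parameters mentioned above, might look as follows (the function and variable names are illustrative, not from the paper):

```python
def build_template(shape_points):
    """Record each template point as an offset vector from the centroid."""
    n = len(shape_points)
    cx = sum(x for x, y in shape_points) / n
    cy = sum(y for x, y in shape_points) / n
    return [(x - cx, y - cy) for x, y in shape_points]

def ght_translation(image_points, template):
    """Each image point votes for every centroid position that would
    place some template point on top of it."""
    acc = {}
    for x, y in image_points:
        for dx, dy in template:
            c = (round(x - dx), round(y - dy))
            acc[c] = acc.get(c, 0) + 1
    return acc

square = [(0, 0), (0, 2), (2, 0), (2, 2)]       # toy "shape"
template = build_template(square)
image = [(x + 10, y + 5) for x, y in square]    # same square, translated
acc = ght_translation(image, template)
centre, votes = max(acc.items(), key=lambda kv: kv[1])
```

Only the true centroid position receives a vote from every image point, so the peak height equals the number of matched shape points.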

A good way to view the HT is as an evidence gathering or voting procedure. Each image point generates or votes for all parameter instances that could have produced it. The votes are counted in a multidimensional accumulator array and the final totals indicate the relative likelihood of shapes described by parameters which are within the accumulator cell. Only sets of image points which belong to a shape will vote coherently and produce peaks in the accumulator array. Using this viewpoint it is easy to see that the HT can cope with many of the difficulties present in real world imagery. The method is robust in the presence of random extraneous data as this maps into a distributed background of votes which is unlikely to hinder detection of large compact peaks. The HT is also tolerant of missing data produced by poor segmentation or occlusion. These effects lead to a decrease in the height of parameter peaks but do not necessarily affect their detectability catastrophically. Finally the HT is inherently parallel as it combines information from each image point independently. This means that it can simultaneously accumulate evidence for several examples of a shape. Each instance of the shape produces a distinct peak in the accumulator array.

One of the principal disadvantages of the HT in its standard implementation is the large storage and computational requirements which arise from the use of a large accumulator array. For the determination of q parameters we need to represent a q dimensional space. If each parameter is to be resolved into α intervals then the accumulator array will have α^q elements. If either q or α is large then this can be a prohibitively large amount of storage. The size of the accumulator also has a direct bearing on the computational cost of the HT as most of the algorithm involves determining, for each image point, which cells of the accumulator array have to be incremented. A large accumulator means that a large number of parameter hypersurface-accumulator cell intersection tests have to be evaluated. As these considerations are of great importance to the practical adoption of the HT method we have begun to study methods which can utilise the HT ideas but overcome the inefficiencies of the most naive standard implementation. In this survey paper we report published work which relates to ideas for the efficient implementation of the HT. Section 2 discusses how efficiency problems can be tackled by consideration of task specific ideas. A very useful and widely applied idea is that instead of searching for many parameters using a single stage HT it is better to decompose large dimensional problems into multistage algorithms involving a sequence of low dimensional HTs. Section 3 considers algorithms which use novel data structures or focusing approaches to the HT. It includes our own work on a method which we have called the Adaptive Hough Transform, AHT. Section 4 describes the relationship between the HT and other transforms with a view to use of these transforms to compute the HT more easily. It can be shown that the HT is equivalent to calculating projections and these results lead to both optical and digital implementations of the HT. In Section 5 we consider how the HT has been mapped onto existing parallel architectures. Section 6 is a brief summary of the paper.



2. Task specific principles.

In the standard implementation of the HT storage increases exponentially as the number of parameters, q, increases or the parameter resolution, α, increases. This curse of dimensionality also applies to the amount of computation required, as the parameter hypersurface which each image point generates is usually a (q-1) dimensional hypersurface. The most obvious way to improve efficiency is therefore to ensure that both q and α remain as small as practicable. An obvious strategy is to restrict the range of parameters which are to be found by using a priori knowledge of the specific problem to be solved. A smaller parameter range can be covered to a specified high resolution using fewer accumulator cells i.e. a smaller value of α. For example in many problems which involve circle finding there may be limits on the likely radius of circles or the centre of the circle may be constrained to lie within the interior of the image. In a limiting case it may be known that circles are of a single known radius and in that case the 3 parameter circle finding problem collapses to a 2 parameter problem of finding circle centres. Alternatively the shape may be known to have certain rotational symmetries and these can be exploited to reduce the number of intervals needed to determine orientation. Davies [3] has considered regular polygonal shapes and concludes that the orientation axis need only be divided into a number of intervals which is equal to the number of polygon sides.
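The known-radius limiting case can be sketched in a few lines: with the radius fixed, each image point votes only for the 2D set of centres lying at that radius from it. The angular sampling and test data below are arbitrary illustrative choices.

```python
import math

def circle_centres(points, radius, n_angles=90):
    """Known-radius circle HT: each image point votes for candidate
    centres lying on a circle of the known radius around it."""
    acc = {}
    for x, y in points:
        for k in range(n_angles):
            a = 2 * math.pi * k / n_angles
            c = (round(x + radius * math.cos(a)),
                 round(y + radius * math.sin(a)))
            acc[c] = acc.get(c, 0) + 1
    return acc

# Points sampled from a circle of radius 5 centred at (20, 30).
pts = [(20 + 5 * math.cos(t), 30 + 5 * math.sin(t))
       for t in (2 * math.pi * i / 36 for i in range(36))]
acc = circle_centres(pts, radius=5)
centre = max(acc, key=acc.get)
```

Every sample point's candidate circle passes through the true centre, so the votes pile up there while coincidental intersections stay diffuse.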

A less obvious way of improving the computational efficiency of the HT is to use extra information to restrict the mapping of image to parameter values from a "one to many" mapping to a "one to few" mapping i.e. use extra data to constrain the dimensionality or range of the hypersurface that an image point generates. Edge direction information [4] is the most commonly used piece of extra information although curvature and gray level constraints have also been used in this way [5,6]. For example in the case of straight line detection the edge direction is directly related to the line gradient and therefore line finding can be reduced to a 1 parameter histogramming task to find the best intercept value. In the case of circle finding edge angle information restricts each image point mapping to a single line of parameter values on the surface of a cone. The idea of using edge information has been used most notably in Ballard's [7] efficient HT for general shapes. This relies for its efficiency on the indexing of possible localisation point vectors by values of edge direction. Candidate shape points are then only matched to template points which possess the same value of edge direction.
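The line-finding reduction mentioned above is simple enough to sketch directly: once the gradient m is known from edge direction, each point determines a single intercept c = y - m*x, so a 1D histogram replaces the 2D accumulator. The data below are illustrative.

```python
def intercept_histogram(edge_points, gradient):
    """With the line gradient known from edge direction, each point
    (x, y) votes for exactly one intercept value c = y - m*x."""
    hist = {}
    for x, y in edge_points:
        c = round(y - gradient * x)
        hist[c] = hist.get(c, 0) + 1
    return hist

# 15 points on y = 3x + 7, plus two clutter points.
pts = [(x, 3 * x + 7) for x in range(15)] + [(4, 0), (9, 2)]
hist = intercept_histogram(pts, gradient=3)
best_c = max(hist, key=hist.get)
```

This is the "one to few" idea in its extreme form: the extra edge datum collapses each point's vote from a whole curve to a single cell.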

A sophisticated use of prior information involves decomposition of high dimensional problems into a series of lower dimensional problems. One of the simplest examples of this type is the use of edge direction information, θ, as a constraint in circle finding. All the edge vectors on a circle must project to a common intersection at the centre of the circle. In [8] it is shown how this can be used to define a 2 stage HT. Edge vectors can be estimated using standard operators such as the Sobel operator. The first stage of the modified HT involves mapping image features described by (x1, x2, θ) into a 2 parameter space (p1, p2) of circle centres. Peaks identified in the 2 parameter space yield candidate (p1, p2) values which are then used with image point coordinates in a second stage to calculate a histogram of possible p3 or radius values. This radius histogram can be viewed as a trivial 1 parameter HT which maps 4-tuples of (x1, x2, p1, p2) into p3 values. The maximum dimensionality of any of the substages is only 2 and the overall efficiency of the modified HT is dominated by the demands of this stage. The penalty paid for this increased efficiency is the cost of calculating, and the reliance on, good edge information. Edge detection is a computationally intensive task but in most cases this will not outweigh the benefit gained from its use in the two stage HT. Indeed there are many situations where edge information may have to be computed for purposes other than the HT and in these cases no extra computation is incurred. However decomposing the HT into a multistage process may introduce systematic errors as inaccuracies in parameters determined at an early stage propagate into later stages. Tsukune and Goto [9] have shown that similar edge based constraints can be derived and used to decompose the 5 parameter ellipse finding problem into a 3 stage problem involving determination of at most two parameters at any single step.

Ballard and Sabbah [10] have shown that parameter decomposition can be a powerful technique in the matching of 2D views to 3D models. Descriptions of 3D objects are usually stored as object centred representations and recognition involves determining the parameters of a transformation which maps a 2D view into these 3D descriptions. The viewing transformation generally involves 7 parameters which cater for changes in scale (1 parameter), orientation (3 parameters) and translation (3 parameters). However for orthographic projection there is a natural ordering or dominance of parameters such that they can be sequentially determined in subgroups i.e. scale can be determined without knowledge of orientation or translation and orientation can be found without knowing the value of translation parameters. Ballard and Sabbah give specific algorithms for the case where objects are described by oriented edges and planar surface patches. The maximum dimensionality of any subproblem is only 3.

A further possibility for parameter reduction is to consider how any shape maps into a low dimensional space such as that used in line finding. Several authors [11,12,13,14,15] have shown that the distribution of votes in this parameter space has a characteristic shape. Finite length lines produce a butterfly shaped peak of counts rather than a symmetric peak. Leavers and Boyce [14] have designed filters to enhance this feature in order to increase the reliability of line detection. They have also considered the mapping of circles [15] into this parameter space. Circles produce a band of votes with a characteristic falloff at the band edges. The edges of the band can be enhanced by a filter and parameters of the circle can be estimated from their positions. Casasent and Krishnapuram [16] use the same idea of mapping any shape into a two parameter space but they suggest manipulating the resulting HT parameter space so that upon applying the inverse HT the votes from a shape map to a compact peak. They indicate a general technique for deriving the manipulations which must be performed in parameter space. The position of the peak in the inverse transform space indicates the translation parameters associated with the shape.

3. Efficient accumulation methods

In this section we consider several methods which utilise novel data structures or use techniques that employ nonuniform or multiple resolutions in order to achieve storage and computation savings. Many schemes are based on the observation that it is necessary to have high accumulator resolution only in places where a high density of votes accumulates. O'Rourke and Sloan [17,18,19] were among the first to use this principle. They developed two data structures. The first structure, Dynamically Quantised Spaces (DQS) [17,19], consists of a binary tree in which each node encloses a rectangular region of space. As data is accumulated rules are used to split and merge cells so that each cell contains, irrespective of its size, approximately the same number of counts evenly distributed throughout its volume. The total number of cells in the tree is an input parameter of the method but in the final tree there will be a concentration of small cells around peaks and this allows peaks to be located more accurately than using the same number of cells in a standard uniform quantisation scheme. The second data structure, Dynamically Quantised Pyramids (DQP) [18,19], is based on a multiresolution multidimensional quadtree in which the number and connectivity of cells is small and fixed. A disadvantage of the quadtree structure is that its boundaries, and hence its spatial resolutions, are fixed. This limitation is overcome in the DQP data structure using a dynamic technique, called hierarchical warping, to modify the boundaries of cells by tracking the mean position of data points in their regions of space. The net effect of the boundary adjustment is to produce cells containing approximately equal numbers of votes. However the warping process introduces some errors in the final accumulator totals and means that the final division of space depends on the order in which the data is accumulated i.e. recently accumulated points are most influential on the final result. O'Rourke and Sloan use both data structures in a series of experiments where a 3 parameter space is divided into about 64 cells. They demonstrate the relative abilities of each algorithm to focus on peaks, to dynamically change attention and to detect multiple peaks. Their general conclusion is that DQSs are difficult to implement but perform somewhat better than DQPs.

A useful idea suggested by several authors is the iterative focusing of the HT to identify peaks with high counts. Initially the HT can be accumulated in a coarse but uniform resolution parameter space. Regions or cells with high counts at this resolution can then be investigated at higher resolution. Only a few iterations of this procedure can lead to very high parameter resolution. Consider searching for a peak in a q dimensional space resolved into α accumulator bins along each parameter axis. The iterative use of a much smaller accumulator of β bins per axis requires O(log_β(α)) iterations to focus down to the same resolution. The computational saving of the method is proportional to the ratio of the number of cells in the two accumulators, i.e. (α/β)^q, divided by the number of iterations, i.e. O((α/β)^q / log_β(α)). This can be very large. This strategy provides enormous efficiency benefits in terms of both amount of computation and amount of storage space required. It has been used by Adiv [20] as an efficient way to search for motion parameters in high dimensional spaces and by Silberberg [21] to identify the viewing parameters relating stored models to observed shapes.

Recently Li et al [22,23] have detailed a focusing algorithm, the Fast Hough Transform, FHT, which uses a multidimensional quadtree in conjunction with HTs which map image points into hyperplanes. This use of hyperplanes is advantageous as the approximate intersection between planes and parameter cells can be efficiently computed using incremental tests which depend on the known spatial relationship between a quadrant and its four sons. Only additions and shifts are required to implement the test and the computational cost scales only linearly with the dimensionality of the parameter space. Computational gains of O(10^3) for 2D lines and O(10^6) for 3D plane detection can be achieved. The algorithm is regular in structure and is therefore easily extended to higher dimensional spaces and easily implemented in VLSI hardware. However the FHT currently identifies areas for focusing by requiring that quadrants have a vote count which exceeds a fixed threshold. In complex images this threshold may be difficult to determine and if it is set too low the efficiency of the method will deteriorate as the algorithm will explore too many quadrants. Another problem is related to the incremental intersection testing method. This requires that each quadrant must store the distance of every image feature from its centre. If there are many image features this can represent a large overhead. It is also not clear that all shapes can be naturally mapped into hyperplanes. Finally the quadtree data structure is a static structure and cannot optimally adapt to situations which may require different parameter resolutions along different parameter axes. Li et al [24] have recognised the last problem and have suggested using a data structure, called the bintree, in which space is partitioned into rectangular regions.

Illingworth and Kittler [8,25] have developed an iterative coarse to fine search strategy for detecting lines and circles in 2D and 3D parameter spaces. Their implementation is called the Adaptive Hough Transform, AHT. It uses a small accumulator array (typically β = 9) which is thresholded and then analysed by a connected components algorithm. The shape and extent of connected components determine the new parameter limits which are used in the next iteration. Limits can be decreased, expanded, rotated or merely translated depending on the distribution of counts in the accumulator. Parameter limits are defined independently for each parameter dimension and therefore the method selects appropriate precision for each parameter. The method works well for single object recognition and we are currently investigating strategies for using it to detect multiple objects. The storage gain using the AHT is of the order expected for focusing methods, i.e. (α/β)^q, while experiments with searching for circles in a 3 parameter space demonstrated that it was several hundred times faster than a standard method.

Brown [26,27] attempted to overcome the storage problems associated with the standard HT by replacing the accumulator array by a small fixed size content addressable store i.e. a hash table in software or a cache if implemented in hardware. Counts are accumulated in the store until it is full and then a flushing or garbage collection method is invoked to release some storage for further use. The simplest flushing strategy is to remove entries with fewest votes. However it is possible to devise schemes which will favour the keeping of recent information or which will incorporate geometric locality information. Brown studied caches of size 32, 64 and 128 and compared their performance against standard accumulators as he varied flushing strategy and the order of accumulation of the data. He concluded that finite length caches were as practical as accumulators and he was able to predict qualitatively the degradation of their performance as noise increased and cache length decreased.
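The simplest flushing strategy described above (evict the entry with fewest votes) can be sketched with a plain dictionary standing in for the content addressable store; the capacity and vote stream are toy values, far smaller than Brown's caches of 32 to 128 entries.

```python
def cache_accumulate(votes, capacity=4):
    """Fixed-size accumulator sketch: when the store is full and a new
    cell arrives, flush the entry holding the fewest votes."""
    store = {}
    for cell in votes:
        if cell in store:
            store[cell] += 1
        else:
            if len(store) >= capacity:
                victim = min(store, key=store.get)   # fewest votes
                del store[victim]
            store[cell] = 1
    return store

# A strong peak "a" survives repeated flushing of low-count clutter.
votes = ["a", "a", "b", "a", "c", "d", "e", "a", "b"]
store = cache_accumulate(votes, capacity=3)
```

As the text notes, the result depends on accumulation order: clutter cells evict one another while well-supported peaks keep their counts.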

4. Optical and projection based implementations.

In this section we consider the relation of the HT to other transforms. It was pointed out by Deans [28] that the HT for linear features was a special case of a well known transform much studied in the mathematical literature, the Radon Transform. The Radon transform of a function f(x, y) on a two dimensional Euclidean plane is defined as

R(p, θ) = ∫∫ f(x, y) δ(p - x cos(θ) - y sin(θ)) dx dy    (4.1)

where the integrals extend over the whole plane and δ is the Dirac delta function. The delta function term forces integration of f(x, y) along the line

p = x cos(θ) + y sin(θ)    (4.2)

The Radon Transform yields the projections of the function f(x, y) across the image for various values of line parameters p and θ. It is a transform which is much used in computer tomography and is equivalent to the HT when considering the transform for binary images. The Radon and Hough transforms for shapes other than lines are calculated by integrating or projecting image data along paths which are identical to the curve to be detected.

An important property of the Radon transform is that it can be computed via the Fourier transform. The Fourier Slice theorem states that if F(u, v) is the Fourier transform of a function f(x, y), R_θ(p) is a projection at an angle θ across f(x, y) and S_θ(w) is the Fourier transform of R_θ(p), then

S_θ(w) = F(w cos(θ), w sin(θ))    (4.3)

This allows efficient calculation of the Radon transform using either optical Fourier techniques [29,30,32] or the Fast Fourier Transform [31].

Several authors have used optical methods to compute the HT. Eichmann and Dong [29] suggested a setup which used coherent optics. Later Gindi and Gmitro [30] discussed a hybrid optical-digital hardware system for the calculation of the Radon transform at video rates on a 512x512 image. In their system a single projection is acquired by imaging a scene onto a special 1D sensor array whose elements span the vertical length of the image. This is achieved using an anamorphic optical system, one with differing magnification in two orthogonal directions. Data for different angles of projection is collected by rotating the image using a prism which spins at 450 rpm. 512 angular slices each consisting of 512 samples could be collected at standard video rates. This projection data can be processed to yield the HT for lines. Steier and Shori [32] have built a system based on similar ideas and show results relating to the location of bridges and roadways in outdoor scenes and the detection of tool parts in an industrial inspection task.

The idea of the HT as a projection has been used by Mostafavi and Shah [33] in conjunction with hardware features of commercially available image processing systems to produce a rapid algorithm for line detection. The same idea has been used and extended in the recent work of Sanz and Dinstein [34,35]. Lines at a single orientation, θ, can be detected by convolving the binary image of the line with a template image. In the template image each pixel lying on a line a normal distance p from the origin has an intensity of p. The resultant histogram of gray levels should have large peaks only at values of intensity equal to the p value of the line. By generating and evaluating a series of histograms for each value of θ the full HT can be built up. Features of specialised image processing and general pipelined architectures can be used in the template image generation, the image convolution and the gray level histogram generation steps. The method can be used for shapes other than lines by using different template images.

5. Parallel architectures.

One of the main characteristics of the HT is that it consists of a series of fairly simple calculations carried out independently on every feature in an image. In this section we will review ideas and attempts to capture this inherent parallelism on specialised parallel architectures.

Several authors have investigated the implementation of the HT on currently available SIMD architectures. These usually consist of square arrays of simple processing elements, P.E.s, connected so that each can communicate with its 4 or 8 neighbours. All processors concurrently execute the same instructions on different items of data. Most often each P.E. has only a single bit arithmetic logic unit, several registers and a few kilobits of on-chip RAM. The mapping of the HT onto these architectures has been considered by both Silberberg [36] and Li [37]. Silberberg considered distributing both the image and the accumulator arrays among all P.E.s i.e. each P.E. is assigned both an image feature and a cell of parameter space. However this necessitated a great deal of data transfer between P.E.s in order to pass votes to the correct P.E. for accumulation. Over 98% of the time of the algorithm was taken in shifting data around the processing array and the end result was that the use of 8000 simple P.E.s led to a speedup factor of only 7 relative to a standard implementation on a VAX 11/785. This poor result illustrates the care required in mapping algorithms onto parallel machines.

Li [37] considered two schemes for running his Fast Hough Transform, FHT, on a SIMD architecture. In the first scheme each P.E. is assigned an image feature and the coordinates of a parameter cell are broadcast simultaneously to every P.E. by a central controller. Each P.E. decides whether the hypersurface generated by its image feature intersects the cell and if it does it sends

a vote back to the controller. The votes from each P.E. can be summed by the central controller and stored for later analysis. A second scheme is to assign each P.E. a volume of parameter space and broadcast the image features. The choice of method depends on the number of available P.E.s, the number of image features and the number of parameter cells. For the standard HT the number of parameter cells increases exponentially with the dimensionality of the problem and therefore the first alternative is likely to be the most feasible.
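Li's first scheme can be emulated serially, with each loop iteration playing the role of one P.E. The sketch below uses a slope-intercept line parameterisation for simplicity (the FHT itself uses a normalised parameterisation); the intersection test is the interval-overlap check that each P.E. would perform.

```python
def cell_votes(features, m_lo, m_hi, c_lo, c_hi):
    """Emulate the broadcast scheme: the 'controller' announces one
    parameter cell and every 'P.E.' (loop iteration) tests whether its
    feature's hypersurface crosses it.

    An image point (x, y) maps to the hyperplane c = y - m*x in (m, c)
    parameter space; it crosses the cell iff its c-range over
    [m_lo, m_hi] overlaps [c_lo, c_hi].
    """
    votes = 0
    for x, y in features:
        c0, c1 = y - m_lo * x, y - m_hi * x
        lo, hi = min(c0, c1), max(c0, c1)
        if lo <= c_hi and hi >= c_lo:   # interval overlap test
            votes += 1                  # this P.E. reports a hit
    return votes

# Five points on y = 2x + 1 all vote for a cell containing (m, c) = (2, 1).
pts = [(x, 2 * x + 1) for x in range(5)]
print(cell_votes(pts, 1.5, 2.5, 0.5, 1.5))   # 5
```

On real SIMD hardware all the per-feature tests run in one step, so the cost per cell is a broadcast plus a global sum.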

Simplistically one might argue that the use of n P.E.s, one for each of n image features, should speed up the HT by a factor of O(n). However votes have to be routed to their correct places in the accumulator and on a simple square 2D mesh with only local shifting operations this requires O(√n) steps. As Silberberg found out this can be a significant overhead and even the dominant time factor. However if the array is augmented by a tree based voting network this can be reduced to O(log(n)) and the expected gains in efficiency should again approach O(n).
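The tree-based combining step can be illustrated with a toy reduction; the function below is purely a sketch of the idea, not a model of any particular voting network.

```python
def tree_sum(votes):
    """Sum P.E. votes with a binary-tree combining network.

    Each pass merges pairs of partial sums, so n votes are reduced in
    O(log n) combining steps rather than the O(sqrt(n)) shifting steps
    a plain 2D mesh needs.
    """
    steps = 0
    while len(votes) > 1:
        votes = [votes[i] + (votes[i + 1] if i + 1 < len(votes) else 0)
                 for i in range(0, len(votes), 2)]
        steps += 1
    return votes[0], steps

total, steps = tree_sum([1] * 1024)
print(total, steps)   # 1024 votes combined in 10 steps
```

For 1024 P.E.s the mesh would need on the order of 32 shifting steps where the tree needs 10, and the gap widens with n.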

Little et al [38] describe a possible implementation of the HT on an architecture called the Connection Machine. This is similar to the previously mentioned SIMD architectures but as well as P.E.s communicating with near neighbours there is a hardware router which implements rapid communication between any pair of processors. The architecture is based on a 12 dimensional hypercube i.e. every processor can be reached from any other by traversing at most 12 edges of the cube. Unfortunately this paper concentrates on aspects of programming and addressing and does not give any figures on the efficiency gains in using this parallel implementation.

Olson et al [39] discuss the operation of the HT on a MIMD computer architecture. They use a commercially available system called the BBN Butterfly Parallel Processor™. This consists of 256 processing nodes connected by a switching network. Each processor is a microprocessor from the Motorola 68000 series and each node has a substantial amount of local memory i.e. 1 megabyte. Memory is shared in the sense that a switching network allows any processor to address the local memory of any other processor. Olson et al implemented two different line finding HTs using different parameterisations for lines. The versions differed in whether the image was partitioned among the processors with the parameter cells accessed from common memory or vice versa. They consider whether the algorithm is computation or communication resource bound and their results indicate that it is possible to achieve processor utilisation factors in the region of 80% i.e. if 100 processors are used they will give a speedup of 80 times relative to using a single processor.
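The structure of the image-partitioning variant can be sketched as below. This is an illustrative toy, not the Butterfly implementation: Python threads do not deliver real CPU parallelism, and all names are invented. What it shows is the decomposition itself, with each worker voting into a private accumulator that is merged at the end so no locking is needed during voting.

```python
from concurrent.futures import ThreadPoolExecutor
import math

def partial_ht(points, n_theta=8, n_rho=16, r_max=4.0):
    """Accumulate line-HT votes for one partition of the image features."""
    acc = [[0] * n_rho for _ in range(n_theta)]
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            j = int((rho + r_max) / (2 * r_max) * n_rho)
            if 0 <= j < n_rho:
                acc[t][j] += 1
    return acc

def parallel_ht(points, n_workers=4):
    """Partition the features among workers and merge their private
    accumulators afterwards -- the image-partitioned scheme in outline."""
    chunks = [points[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_ht, chunks))
    merged = partials[0]
    for p in partials[1:]:
        for t, row in enumerate(p):
            for j, v in enumerate(row):
                merged[t][j] += v
    return merged

acc = parallel_ht([(1.0, 1.0)] * 40)
print(sum(map(sum, acc)))   # 320 votes: 40 features x 8 angles
```

The alternative scheme inverts the partition: each worker owns a slab of the accumulator and scans every feature, trading vote-routing cost for repeated reads of the feature list.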

The HT has attracted attention from researchers interested in human vision as it is a prime example of



the ideas of the connectionist school of artificial intelligence [40,41]. The unifying principle of this approach to intelligence is that low and medium level vision tasks are done by massively parallel, cooperative computations on large networks of simple neuron like units. Low level pixel based properties, like edge or gray level estimates, can be represented by nodes in a feature network which is spatially indexed to the image while parameter values associated with higher level image entities, such as straight lines, can be represented by nodes in a separate parameter network. Each node records a measure of confidence of the occurrence of its feature or parameter value and direct connections between the two networks define possible ways in which nodes can influence these confidence values. Different connection patterns can be used to impose different image space to parameter space mappings i.e. connections can be established so that when many low level units which lie on a straight line have high confidence then the higher level unit describing the parameters of this line will acquire a large confidence value. The major characteristic of this type of implementation is the huge number of feature and parameter units needed and the very large number of connections which are required between them.
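A toy version of such a fixed connection pattern can be built explicitly. The sketch below (invented names, tiny network sizes) wires each pixel unit to every line-parameter unit consistent with its coordinates; summing the confidences arriving at a parameter unit is the simplest possible influence rule.

```python
import math
from collections import defaultdict

def build_connections(pixels, n_theta=4, n_rho=8, r_max=8.0):
    """Connect each low-level pixel unit to every line-parameter unit
    whose (theta, rho) cell its coordinates are consistent with --
    a miniature version of the fixed connection pattern described above."""
    links = defaultdict(list)              # parameter unit -> pixel units
    for px, (x, y) in enumerate(pixels):
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            j = int((rho + r_max) / (2 * r_max) * n_rho)
            if 0 <= j < n_rho:
                links[(t, j)].append(px)
    return links

# Collinear pixel units with high confidence drive one parameter unit.
pixels = [(x, 3.0) for x in range(6)]      # points on the line y = 3
confidence = [1.0] * len(pixels)           # every pixel unit fully active
links = build_connections(pixels)
support = {unit: sum(confidence[p] for p in srcs)
           for unit, srcs in links.items()}
best = max(support, key=support.get)
print(best, support[best])   # the theta = pi/2 unit wins with support 6
```

Even this toy makes the scaling problem plain: every pixel unit needs a connection to every parameter unit it could support, so the wiring grows with the product of the two network sizes.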

6. Summary

We have discussed some of the most important ideas which have been proposed for increasing the efficiency of the HT. These include principles which can be exploited to decrease the storage and computational demands of specific tasks as well as new, general ways of efficiently representing and accumulating the parameter space. The consideration of optical techniques has led to the development of real-time implementations of the HT while experience with programming the HT on general purpose parallel architectures has illustrated that certain global communication facilities are needed if this is to be successful. In future work we will investigate and develop many of these ideas as well as considering other aspects of the shape representation and recognition problem.

References

[1] Hough P.V.C., "A method and means for recognising complex patterns." US Patent 3,069,654 (1962)

[2] Merlin P.M. and Farber D.J., "A parallel mechanism for detecting curves in pictures." IEEE Trans. on Computers. 24 (1975) 96-98.

[3] Davies E.R., "Reduced parameter spaces for polygon detection using the generalised Hough Transform." In Proc. 8th IJCPR. Paris. (1986) 495-497.

[4] Kimme C., Ballard D.H. and Sklansky J., "Finding circles by an array of accumulators." Communications of the ACM. 18 (1975) 120-122.

[5] Hakalahti H., Harwood D. and Davis L.S., "Two dimensional object recognition by matching local properties of contour points." Pattern Recognition Letters. 2 (1984) 227-234.

[6] Moring I. and Hakalahti H., "Scale independent method for object recognition." In Proc. Scandinavian Conf. on Image Processing. Trondheim. (1985) 881-889.

[7] Ballard D.H., "Generalising the Hough Transform to detect arbitrary shapes." Pattern Recognition. 13 (1981) 111-122.

[8] Illingworth J. and Kittler J., "The Adaptive Hough Transform." to appear IEEE Trans. on Pattern Analysis and Machine Intelligence. September (1987).

[9] Tsukune H. and Goto K., "Extracting elliptical figures from an edge vector field." In Proc. IEEE Conf. Computer Vision and Pattern Recognition. Washington. (1983) 138-141.

[10] Ballard D.H. and Sabbah D., "Viewer independent shape recognition." IEEE Trans. on Pattern Analysis and Machine Intelligence. 5 (1983) 653-660.

[11] Van Veen T.M. and Groen F.C.A., "Discretisation errors in the Hough transform." Pattern Recognition. 14 (1981) 137-145.

[12] Brown C.M., "Inherent bias and noise in the Hough Transform." IEEE Trans. on Pattern Analysis and Machine Intelligence. 5 (1983) 493-505.

[13] Maitre H., "Contribution to the prediction of performances of the Hough Transform." IEEE Trans. on Pattern Analysis and Machine Intelligence. 8 (1986) 669-674.

[14] Leavers V.F. and Boyce J.F., "The Radon Transform and its application to shape parameterisation in machine vision." Image and Vision Computing. 5 (1987) 161-166.

[15] Boyce J.F., Jones G.A. and Leavers V.F., "An implementation of the Hough Transform for line and circle detection." In Proc. SPIE Conf. The Hague. (1987).

[16] Casasent D. and Krishnapuram R., "Curved object location by Hough transformations and inversions." Pattern Recognition. 20 (1987) 181-188.

[17] O'Rourke J., "Dynamically quantised spaces for focusing the Hough transform." In Proc. 7th IJCAI. Vancouver. (1981) 737-739.

[18] Sloan K.R., "Dynamically quantised pyramids." In Proc. 7th IJCAI. Vancouver. (1981) 734-736.



[19] O'Rourke J. and Sloan K.R., "Dynamic quantisation: two adaptive data structures for multidimensional spaces." IEEE Trans. on Pattern Analysis and Machine Intelligence. 6 (1984) 266-279.

[20] Adiv G., "Recovering motion parameters in scenes containing multiple moving objects." In Proc. IEEE Conf. on Computer Vision and Pattern Recognition. Washington. (1983) 399-400.

[21] Silberberg T.M., Davis L. and Harwood D., "An iterative Hough procedure for 3D object recognition." Pattern Recognition. 17 (1984) 621-629.

[22] Li H., Lavin M.A. and LeMaster R.J., "Fast Hough Transform." In Proc. of 3rd Workshop on Computer Vision: Representation and Control. Bellair. Michigan. (1985) 75-83.

[23] Li H., Lavin M.A. and LeMaster R.J., "Fast Hough Transform: a hierarchical approach." Computer Vision Graphics and Image Processing. 36 (1986) 139-161.

[24] Li H. and Lavin M.A., "Fast Hough Transform based on the bintree data structure." In Proc. IEEE Conf. Computer Vision and Pattern Recognition. Miami Beach. (1986) 640-642.

[25] Illingworth J. and Kittler J., "Shape detection using the Adaptive Hough Transform." In Proc. of Nato Advanced Research Workshop. Italy. (1987) to be published by Springer-Verlag.

[26] Brown C.M. and Sher D.B., "Modelling the sequential behaviour of Hough Transform schemes." In Proc. Image Understanding Workshop, ed. by L.S. Baumann. Palo Alto. (1982) 115-123.

[27] Brown C.M., Curtiss M.B. and Sher D.B., "Advanced Hough Transform implementations." In Proc. 8th IJCAI. Karlsruhe. (1983) 1081-1085. or Internal report TR113. Computer Science Department. University of Rochester.

[28] Deans S.R., "Hough transform from the Radon Transform." IEEE Trans. on Pattern Analysis and Machine Intelligence. 3 (1981) 185-188.

[29] Eichmann G. and Dong B.Z., "Coherent optical production of the Hough Transform." Applied Optics. 22:6 (1983) 830-834.

[30] Gindi G.R. and Gmitro A.F., "Optical feature extraction via the Radon Transform." Optical Engineering. 23:5 (1984) 499-506.

[31] Murphy L.M., "Linear feature detection and enhancement in noisy images via Radon Transform." Pattern Recognition Letters. 4 (1986) 279-284.

[32] Steier W.H. and Shori R.K., "Optical Hough Transform." Applied Optics. 25:16 (1986) 2734-2738.

[33] Mostafavi H. and Shah M., "High speed digital implementation of linear feature extraction algorithms." In Proc. SPIE Conf. 432 (1983) 182-188.

[34] Sanz J.L.C., Hinkle E.B. and Dinstein I., "Computing geometric features of digital objects in general purpose image processing pipeline architectures." In Proc. IEEE Conf. Computer Vision and Pattern Recognition. San Francisco. (1985) 265-270.

[35] Sanz J.L.C. and Dinstein I., "Projection based geometrical feature extraction for computer vision: Algorithms in pipeline architectures." IEEE Trans. on Pattern Analysis and Machine Intelligence. 9 (1987) 160-168.

[36] Silberberg T.M., "The Hough transform on the Geometric Arithmetic Parallel Processor." In Proc. IEEE Conf. on Computer Architecture for Pattern Analysis and Image Database Management. (1985) 387-393.

[37] Li H., "Fast Hough Transform for multidimensional signal processing." IBM Research report. RC 11562. Yorktown Heights.

[38] Little J.J., Blelloch G. and Cass T., "Parallel algorithms for computer vision on the connection machine." In Proc. 1st IEEE International Conf. on Computer Vision. London. (1987) 587-591.

[39] Olson T.J., Bukys L. and Brown C.M., "Low level image analysis on an MIMD architecture." In Proc. 1st IEEE International Conf. on Computer Vision. London. (1987) 468-475.

[40] Ballard D.H., "Parameter Nets." Artificial Intelligence. 22 (1984) 235-267.

[41] Feldman J.A. and Ballard D.H., "Connectionist models and their properties." Cognitive Science. 6 (1982) 205-254.
