handprinted character/digit recognition using a multiple feature/resolution phitosophy ·...

10
HandprintedCharacter/Digit Recognition using a Multiple Feature/Resolution Phitosophy J.T. FavatA G Srikantan, and S. N. Srihari CEDAR StateUniversity of New York at Buffdo, USA Abstract tb prpcr outli4es the philosophy, desrgn and implementation of the Gradient, Structural, bvily (GSC) recognition algorithm, which has been used successfullyin several aGG"d rcading applications at CEDAR. The GSC algorithm takes a quasi multi- dio approach to feature generation.This philosophy coupled with the appropriaie *tf,aatim function results in a recognizer which has both high accuracy and good -'et-cc behavior. This allows it to be used in higher level digit string and word ggliti@ algorithms which search for digit/character boundaries.Tests of the GSC Ner oDstandard digit, character and non-character databases arereported. L htroduction lfuy different approaches have beenused by researchers to solve the problem of machine qn endcharacter recognition [Suen92].These approaches haveincluded investigations of: t&re seb [Srik93] Man86], classifier algorithms, multiple combinations of classifrers tltogSl and novel statistical methods [Klein93]. There can be much overlap between Cftreot methodsand a precise taxonomy can prove difficutt. It would be safe to say that thc pecise classification of an algorithm has much to do with the perspective of the Lrwstigator during the designof the algorithm. lvtany different algorithms have been explored by the researchers at CEDAR tlee93l. Thesealgorithms have encompassed a wide range of feature and classifiert1pes. Brcry algorithm has characteristics, such as high speed, high accuracy, good thresholding rbility, andgeneralization, which areuseful for specificapplications. Examplas of classifiers &veloped at CEDAR are listed in Table 1. This paper will outline the philosophical and Fcticsl deta'ls of one these classifiers: the Gradient, structural, concavity (GSC) cl'sifier. 2" Philosophy Tbe approach usedin designing the GSC was based on the observation that feature setscan be designed to extact certain tlpes of information from the image.Feature detectorscan be built to detect the local, intermediate and global feaares of an image.The basicunit of an image is the pixel and we are interested in both is location (x,y coordinate) and the relationship of the pixel to its neighbors at different rangesfrom locally and globally. This 57 Favata, Srikantan and Srihari, Proc. IWFHR 1994, pp. 57-66.

Upload: others

Post on 20-Jun-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Handprinted Character/Digit Recognition using a Multiple Feature/Resolution Phitosophy · 2013-01-31 · Handprinted Character/Digit Recognition using a Multiple Feature/Resolution

Handprinted Character/Digit Recognition usinga Multiple Feature/Resolution Phitosophy

J.T. FavatA G Srikantan, and S. N. SrihariCEDAR

State University of New York at Buffdo, USA

Abstract

tb prpcr outli4es the philosophy, desrgn and implementation of the Gradient, Structural,bvily (GSC) recognition algorithm, which has been used successfully in severalaGG"d rcading applications at CEDAR. The GSC algorithm takes a quasi multi-dio approach to feature generation. This philosophy coupled with the appropriaie*tf,aatim function results in a recognizer which has both high accuracy and good-'et-cc behavior. This allows it to be used in higher level digit string and wordggliti@ algorithms which search for digit/character boundaries. Tests of the GSCNer oD standard digit, character and non-character databases are reported.

L htroduction

lfuy different approaches have been used by researchers to solve the problem of machineqn end character recognition [Suen92]. These approaches have included investigations of:t&re seb [Srik93] Man86], classifier algorithms, multiple combinations of classifrerstltogSl and novel statistical methods [Klein93]. There can be much overlap betweenCftreot methods and a precise taxonomy can prove difficutt. It would be safe to say thatthc pecise classification of an algorithm has much to do with the perspective of theLrwstigator during the design of the algorithm.

lvtany different algorithms have been explored by the researchers at CEDARtlee93l. These algorithms have encompassed a wide range of feature and classifier t1pes.Brcry algorithm has characteristics, such as high speed, high accuracy, good thresholdingrbility, and generalization, which are useful for specific applications. Examplas of classifiers&veloped at CEDAR are listed in Table 1. This paper will outline the philosophical andFcticsl deta'ls of one these classifiers: the Gradient, structural, concavity (GSC)cl'sifier.

2" Philosophy

Tbe approach used in designing the GSC was based on the observation that feature sets canbe designed to extact certain tlpes of information from the image. Feature detectors can bebuilt to detect the local, intermediate and global feaares of an image. The basic unit of animage is the pixel and we are interested in both is location (x,y coordinate) and therelationship of the pixel to its neighbors at different ranges from locally and globally. This

57Favata, Srikantan and Srihari, Proc. IWFHR 1994, pp. 57-66.

Page 2: Handprinted Character/Digit Recognition using a Multiple Feature/Resolution Phitosophy · 2013-01-31 · Handprinted Character/Digit Recognition using a Multiple Feature/Resolution

can be expressed by saying we want to determine the relationship of each pixel to evtryother pixel at increasing distances. In a sense, we are taking a multi-resolution approacb rofeature generation. The GSC features approximate a multi-resolution approach by beiqgenerated at three ranges: local, intermediate and global.

The gradient feaares d€tect local features of the image and provide a great desl dinformation about stoke shape at a short distsnce. The structural features extend tbcgradient features to longer distances and grve certain useful information about strokctrajectories. The concavity features are used to detect certain stroke relationships at lmgdistances which can span across the image. In practice, there are computationally impccdlimits to how a particular philosophy can be implemented- In the GSC algorithm, certaindecisions were made in the exact delection and representation ofthe features to result io apractical algorithm. The exact implementation should not distract from the underlyingphilosophy. It should be emphasized that we are presenting one particular implementationof our philosophy and that others are possible. The total feature vector lengttr is 5 12 bits. Itis important to note that the feature vector is binary. The GSC feature vector is verycompact, other algorithms may use a smaller number of multi-valued (or real) features butthe effective number of bits to represent such feature vectors can actually be quite large.

3. Feature Description

The GSC algorithm was designed to work with binadzed images, so it is presumed that theimage has been thresholded using a suitable algorithm [Otsu79]. The image is slantnormalized using a moment based algorittrm to reduce the effecs of skew. A bounding boxis placed around the image and the features are computed (see below). The feature maps aresampled by placing a 4x4 gird on the maps (see Figure 1). The features thembelves arecomputed independently of this sampling grid.

Gradient Features

The gradient features are computed by convolving two Sobel operators on the binaryimage. These operators approximate the x and y derivatives in the image. The vectoraddition of the operators' output is used to compule the gradient of the image. Since thegradient is a vector with magnitude and direction, only the direction is used in thecomputation of the feature vector. The direction of the gradient can range from 0 to 359degrees. This range is split ino 12 non-overlapping regions of 360112 degrees. In eachsampling region (4x4 grd), a histogram is taken of each gradient direction at each pixelwhich lies in the region. A tbreshold is applied to the histogram and the feature bit is set foreach feature count that exceeds a threshold. This subset of the GSC features produces12*4*4=l!2 bits of the total featue vector.

58

Favata, Srikantan and Srihari, Proc. IWFHR 1994, pp. 57-66.

Page 3: Handprinted Character/Digit Recognition using a Multiple Feature/Resolution Phitosophy · 2013-01-31 · Handprinted Character/Digit Recognition using a Multiple Feature/Resolution

grid siz€ of 5x5. Other stoke features detected are four B?es of corners which consist dperpendicular co-occurrences of strokes. These features contribute 4x4xL2=192 bis to tictotal featue vector .

Concaviry Feanres

These features, which are the coarsest of the GSC set" can be broken down into tbreesubclasses of feahres. The total contribution of these features are 4x4x8=128 bits.

Subclass A Coarse Pixel Density Features

These feafires capture the general groupings of pixels in the image. They are computed byplacing the 4x4 sampling grid on the image and counting the number of image pixels thatfatl into each grid. Thresholding converts these area counts into a single bit for each regioo.This feature conEibutes 4x4=t6 bits of the feature vector.

Subclass B. l,arge Stroke Features

These features attempt to capture large horizontal and vertical strokes in the image. Runlengths of horizontal and vertical black pixels across the image are first computed. From thisinformation, the presence of strokes are determined by testing for sffoke lengths above athreshold. This feature contributes 4x4*2=J) bits ofthe feature vector.

Subclass C. U lDlLlRltl Concavity Features

These features are computed by convolving the image with a star-Iike operator. Thisoperator shoots rays in eight directions and determines what each ray hits. A ray can hit animage pixel or the edge of the image. A able is built for the termination status of the raysemitted from each white pixel of the image. A computationally efficient algorithm simihr torunJength encoding is actually used to compute the star operator. The class of the eachpixel is determined by applying rules to the termination status pattems of the pixel.Currently upward/downward, left/right pointrng concavities are detected along with holes.The rules are relaxed to dlow nearly enclosing holes (broken holes) to be detected as holes.This gives a bit more robustness to noisy imqges. These features can overlap, in that, incertain cases more than one feature can be detected at a oixel location. These featuresconfibute 4x4x5=80 bits.

4. Classification

The classification problem can be stated as finding functions which map featue vectors toclasses. Ideally these functions should map all valid regions of the feature space to the classspace. With high dimensional feature spaces, this can be a very difficult problem. There maynever be enough faining data to adequately estimate these functions in certain regions of

60

Favata, Srikantan and Srihari, Proc. IWFHR 1994, pp. 57-66.

Page 4: Handprinted Character/Digit Recognition using a Multiple Feature/Resolution Phitosophy · 2013-01-31 · Handprinted Character/Digit Recognition using a Multiple Feature/Resolution

fuucwai Feanres

tbc structural features capture certain patterns embedded in the gradient map. TbeseP@rDs are "microstrokes"

of the image. several 3x3 operaors are passed over the

Teble 1. CEDAR Classifiers

lihmc Performaace (digts) Speed (digits/sec)

Image 96.02Vo 66.7

Biryoly 96.43Vo 25.0

Cbaincode 98.33Vo 67.0

Gabor* 97.70Vo 6.7

Gradient 98.46Vo 5.3

Histogram* 97.47Vo 62.5

Morphology* 97.92Vo 0.5

GSC 98.87Vo 10.0Notes: 1. (*) indicates classifiers which are no lotrg"r ""tiuAy Ueiogffi@

2. Performance and speed figures are approximate and used for relative comparison

gradient map to locate small strokes pointing up/down ard dirgonally. These strokes arecmbined into a larger features using a rule table. The largest rp* ofthe feature covers.-a

EilJ

t lt l

llt lLIHJ

t-trbt_ltfZl "1 lL/

D h M rMEe

$+P('t+".rS"'#'::

ebX

qg/"w FMl1-fflnrn

.) 512 Cr r51?Crfu

trigure 1. GSC feafures

59

Favata, Srikantan and Srihari, Proc. IWFHR 1994, pp. 57-66.

Page 5: Handprinted Character/Digit Recognition using a Multiple Feature/Resolution Phitosophy · 2013-01-31 · Handprinted Character/Digit Recognition using a Multiple Feature/Resolution

t ft&rc space. In addition, we can also assume that our features may not have enoughdynrg power to distinguish among certain csses of images or there may be an aliasing

FUco where two different image patterns may map into tbe (nearly) same feature [email protected] cra sometimes be due to practical implementation compromises involved in the

{ri6n design such as the need for speed or machine memory size limitations. Shce we* dc-ling with handwritten images, it is also possible Orat certsin cases of dig,b or&rars may be so borderline or poorly written that even humans could misrecognizeb-

Io computing the classification functions, we would like to generate functions thatearcly reflect the training set and generalize well to the testing sets. This means that thefirctioos should accurately recognize members of the taining set and smoothly rolloff astb ftaune vectors move away from the labeled vectors of the raining set. That is, as thedcrtying image is smoothly taasformed into another image of a different class, we would& e soooth transition of the classification function. This property is useful to preventrpnrious responses in those regions ofthe feature space which are inadequately representedh 6c raining set. In addition, we would like good behavior in invalid regions of the space.Tbb last property is important if we use the recognizer to distinguish between valid andinrlid images such as using the system to find valid digts from a segmented. This work hasfoqscd on several different classifiers each with various tradeoffs:

f-j!}I

Thc k-nearest neighbor ft-nn) approach attempts to compute a classification ftnction byx;ng the labeled taining points as nodes or anchor points in the n (n=512) dimensisnalrprcc. In a sense, this is the most detailed description of the space that is possible from theraining samples. Rather than using a l-nearest neighbor classifier, we chose a k-nnclr"sifier to reduce the effect of mislabeled training data and to get a better estimate of tbEprdot)?e density st any particular point in the space. By choosing an appropriate distancemetric a smooth rolloffin response can be obtained as a feature vector is moved away frome cluster center. Since the feature vector is binary, the comparisoo between unknown andhbcled vectors involves bit operations which can be done in parallel at the machines wordbogth. The main disadvantages of k-nn is the memory required to hold the raining datarnd the speed in classification. A clustering technique has been developed to greatly reducethe number of comparisons decessary to implement this classifier and it is now comparableio speed to other classi.fiers.

Neural Nets

Classification functions based on feed-forward neural network architecture have beenexplored at CEDAR with some suc@ss. It is our experience to date that these functions donot perform well with the GSC feature set, especidly with characters. The problems thatoccur are gpically poor accuracy and ill-beheved response to invalid (non-character/digit)

61

Favata, Srikantan and Srihari, Proc. IWFHR 1994, pp. 57-66.

Page 6: Handprinted Character/Digit Recognition using a Multiple Feature/Resolution Phitosophy · 2013-01-31 · Handprinted Character/Digit Recognition using a Multiple Feature/Resolution

lest images' we are cunenfl investigating the reasons for this unsatisfactory behavic.some of the problems may have to d; wi6 the limited raining date available for certaincharacter classes' ft I

oryt*l ru" ror r""ogoition of cursive characters in which thereare no standard taining and testing daabaseslvailable yet. w" ,," *p.rir"enting withtzzr .rarnls techniques. This alproach us"s tne *nna*.o *,pi, aoln anothcrrecognizer algorithm as the target vaiues of the neural *t auriog r"ir*r!. Earry tests haveshown performance improvements in the neural net.

Polynomial Functions

This set of classification are computed as low order polynomiar functions of thefeafure coefficients (bits; gsint " rnti*u;;;-squares error criterion. In generar, theseclassifiers perform poorly compared to k-nn, however, they can be reasonably well behavedwith respect to invatid images, gott _" riryrr ani luaoratic nrnction classifier have been usedwith success in a word recognition algorithm ff"ig:f.

5. Refinements

A number of refinemenb of both leature generation and classification have been fied withvarious results' The most successful i,nprlu"tn"nt to the feature generation was to use avariable 4x4 sampling grid on the image. A iro.iro"t r and vgrticar histogram of the image iscomputed and sampling lines are ptaceo on the equi-mass divisi;;i-the-histogram. Thisresults in higher sampling of regions with the most mass. A significant improvement inH:T.T* Tft

Ael recognition has been obrained with this scheme. A number ofornerent matching metrics trubbggl have been tried with the k_nn arrrirr"r'g"d, mekic, ormore appropriately, matching fi,rnction, blends the number of features that co_occurred withthe n'mb€r of features ttrat aiA oot -"tct o. *oJ oo, pr"."or.

6. Applications

Tbc Gsc crassifier * y:-":* 1u"3ril{ in severar different apprications. The firstapplicatiou is the segmenration of zrp "J"r- from digitiz€d p.ii"r?""ss images.T5i:IIy' a handwritte]r zTp c-odecontains I or g irol"t"d digi6 which usua'y can be readby r a;git recognizer. In some'cases, however, two or more of ttre adjacent digie can betcEting in unpredictable ways. one approach for accurate recognition is to force asegmcotation of the dieit string and r""ognir" eacl Qonenrriri;;il"d ;*, individuary.Tbc Gsc clqssifier is a-r<e1 ele,irent

."f ,td ";;.;;h. Another successtul apprication of theGSc dgai*n is in word recognition. a i"iiJ*irr"n word is " ,"qu"o"" of characrersrtce ibotities must be determined. rrtir c"o u" """o.prished by segmenting the word anddc."mdning character boundaries and identities. i", ry1"a words, segmentatron is usuallycrsy, hnfc crnsively written words it is *u"n ioi" am"Ut.

62

Favata, Srikantan and Srihari, Proc. IWFHR 1994, pp. 57-66.

Page 7: Handprinted Character/Digit Recognition using a Multiple Feature/Resolution Phitosophy · 2013-01-31 · Handprinted Character/Digit Recognition using a Multiple Feature/Resolution

erycrineob with the GSC classifier have shown that it excels in these applicationsbe tbc confidences are reliable enough to use in detecting the best character and digitddd?s after segmentation. Figure 2 shows the polynomial and GSC algorithms testedlrsics of images containing noise, special marks and digit sequences. It can be seen thatl Gsc alsqithsr offers reliable confidences and that nearly wery one of tbese images* be rcjected by picking one threshold value.

br6n Oigir Sbing Segmentation Algorithm

Oita tbc inage of a string of digits, the goal of the segmentation algorithm is to partitiont ioags into regions, each containing an isolated digtt. A recognition aided iterativeGod is used. Adjacent digits can be touching and some of the digits might be broken intosi rhrn one componeft (e.g. S-hats). Therefore the number of digis in a digit sting isu rinply e count of the number of components in the field. components which are

as touching digits must be segmenled appropriately. The module which performsl cgmentation and subsequent recognition of the segmented digrts does the foUowing:Grco e connected component with 2 to 4 digits, the module estimates the number of digil- $c component, performs the appropriate segmentations and recognizes the individualdgis. The module currently performs ataTlyo correct ra!e.

Tbc segmenter can be invoked in two different modes:

(i) estimate the number of digits and(ii) force a glven digit string length.

fls lrrmls1 of digits is initially estimated from the aspect ratio o1 gre .{igt string andcccssive estimates are obtained by a linear regression model. Digits that are recognized?ith high confidence are removed after each iteraiion. The effective contribution of thereoorrcd digib to the density o1 gts digrt sfing is recorded. This information is fed to abast squares linear model [Fen91]. By setting the density to zaro (all digrts are removed),6c least squaxes equation can estimate the number of digib in the sfing. Connected digitcoBpooents are split into required number of digits. The segmenter has a correct*gmentation rate of 92.9Vo when the string length is specifie4 and, is 87 .5Vo correct when6e number of digits has to be estimated.

General llandwrilten lilord Recognitian

The GSC has been applied to general handwritten word recognition in which a wordcan contain any mixture of discrete, cursive or touching discrete characters. A recognitiontcchnique called the hypothesis generation and reduction algorithm (HGR) IFavgal nrstsegme_n6 the word image using a number of candidate segmentation points. The Gsccles5ffis1 is then used to pinpoint the most likely charaster boundaries by extacting the

63

Favata, Srikantan and Srihari, Proc. IWFHR 1994, pp. 57-66.

Page 8: Handprinted Character/Digit Recognition using a Multiple Feature/Resolution Phitosophy · 2013-01-31 · Handprinted Character/Digit Recognition using a Multiple Feature/Resolution

strokes from among pairs of segmentation points (see Figure 3). Later, a lexicon &iveapost-processor finds the most likely word using the GSC segment pair corfideuces mdother available contexl

byrcmh

I O.75

- I o - a t

a I O.7a

I O . V

, I O.AO

7 ? o.z1

7 7 d a

7 . o '52

V zo.E€

a e o . v

f f 7o.€l

Yt ! o.55

G S C

I O . 1 7

o o . s

t o.70

1 0.*

o o.43

7 1 . Q O

7 7 . O O

7 0 . e 5

7 o - a o

o o.34

o o.43

4 0_65

tulFffil.l GSC

4 F . o 4 3 € O . . 3

JrE e o.aa 1 0.31

E , | 9 4 o 4 s a o €

a < - 7 g . r c o o . 3 9

t 7 0 . 4 1 0 . 6 9

: t 4 o . @ 3 0 . 2 6

( 7 4 o 8 O 0 . 7 6

* 2 4 o . 5 o 4 o . *

&, 9 0.65 a o.33

? ? 7 o . 4 a e o - z a

, A ? 7 a . 7 1 a o 3 €

Figure 2 - GSC tbresholding lest - comparison of confidences retumed by polynomial andGSC classifier on non-characterc. Output is arranged as ( input image, poly decision, polyconf, GSC decision, GSC conf ).

Pa, Cuaalvc acgment

4ut*ryL_,_

d E <traqlcd scgmcnls rnd GSC cfirractcr rcspongc

qL-n-

{rL*I

II t l "

Figure 3 - Finding characrcr boundaries

64

Favata, Srikantan and Srihari, Proc. IWFHR 1994, pp. 57-66.

Page 9: Handprinted Character/Digit Recognition using a Multiple Feature/Resolution Phitosophy · 2013-01-31 · Handprinted Character/Digit Recognition using a Multiple Feature/Resolution

?. Erperimentat Results

Tbc GSc classifiq has been wel characterized by extensive testing. Figure 4 compares-rral variations of this classifier on handwritten digits. The r";iing; consisted of24'm0 digi6 taken from various databases, the test set consisted of 2l-aadigits. Figure 56rys the GSC classifier performance tained separately on 60,000 upper and lowerc!'rr&ters chosen from the MST handprinted database. The test ,"t "o*irt"o of 20,000sFpcr and lower case characters.

GSC R.lrct va E.ror

I F

F

l0

to

- F

a $

$

a

t 0

I

figure 4 - GSC a;git results f

NIST Chr r rc tc r Res ! l t .

i

a 5

a c

r 3 0

4 2 5I z o

t 5

1 0

0

- N I S T U p p . r - o / C- - . - - - . . . N t S T U p p . r . v . r O S C- - - . - . N I S T ! o * . r . n .

Figure 5 - NIST upper/lower ct ar"cGierutts-

65

Favata, Srikantan and Srihari, Proc. IWFHR 1994, pp. 57-66.

Page 10: Handprinted Character/Digit Recognition using a Multiple Feature/Resolution Phitosophy · 2013-01-31 · Handprinted Character/Digit Recognition using a Multiple Feature/Resolution

Acknowledgment

The authors would like to thank Dr. venu Govindaraju, Evie Kleinberg Ajay Shekhawar,Phit Kilinskas and D. s. L,ee fon helping in various ways during the development of the Gscclassifier.

8. References

[Fav93] J. Favata, "Recognition of cursive, Discrele, and Mixed Handwritten words usingCharacter, I€xical and Spatial Qsns6aints", Tech Report 93-32, Dept of ComputerScience, SUNY Butrafo, 1993

[Fen91] R. Fenrich "Segmentation of automatically located handwritten words", Proc. Int.workshop in handwriting recognitiog Bonas, FRANCE, pp.33-44, L99l.

tHo93] T.K. Ho, J. Hull, and s. srihari, "Decision combination in Multiple classifierSystems", IEEE P AM I, Y ol I 6, No. L, pp 66-7 5, I an 199 4

[Klein93] E. Kleinberg and r.K. Ho, "Pattern Recognition by Stochastic Modeling" inProceedings of Third International Workshop on Frontiers in Handwriting Recognition(IWFHR III), Buffalo, NY 1993

tl*eg3l D. Ire and S. Srihari "Handprinted Digit Recognition: A comparison ofAlgorithms" in Proceedings of Third International Workshop on Frontiers in HandwritingRecognition (IWFHR m), Buffalo, NY 1993

Man86] J. Mantas, "An overview of character Recognition Methodologies", pattentRecognition, Vol 19, No. 6, pp 425-430,1986

[otsu79] N. otsu, "A theshold selection mettrod from grey-level hi$tograms", IEEE Transon SMC, Vol9, pp. 62-66,Ian. L979

lsrik93l G. Srikantan, "Gradient representation for handwritten character recognition" inProceedings of Third International Workshop on Frontiers in Handvriting RecognitionGWFHR ltr), Buffalo, NY 1993

[Suen92] C.Y. Suen et al, "Computer Recognition of Unconstained HandwrittenNumerals", IEEE hoceedings, Vol 80, No 7, pp 1162-1180, JuIy 1992

[Tubb89] J.D. Tubbs, "A Note on Binary Template Matching", pattern Recognition, yol22, No.4, pp.359-365, 1989

66

Favata, Srikantan and Srihari, Proc. IWFHR 1994, pp. 57-66.