

Prototyping Structural Description Using an Inductive Learning Program

Adnan Amin*
School of Computer Science, University of New South Wales, Sydney, 2052, Australia

Character recognition systems can contribute tremendously to the advancement of the automation process and can improve the interaction between man and machine in many applications, including office automation, cheque verification and a large variety of banking, business and data entry applications. The main theme of this paper is the automatic recognition of hand-printed Arabic characters using machine learning. Conventional methods have relied on hand-constructed dictionaries which are tedious to construct and difficult to make tolerant to variation in writing styles. The advantages of machine learning are that it can generalize over the large degree of variation between writing styles and that recognition rules can be constructed by example. The system was tested on a sample of handwritten characters from several individuals whose writing ranged from acceptable to poor in quality, and the correct average recognition rate obtained using cross-validation was 89.65%. © 2000 John Wiley & Sons, Inc.

1. INTRODUCTION

Character recognition is commonly known as Optical Character Recognition (OCR), which deals with the recognition of optical characters. The origin of character recognition can be found as early as 1870 [1], while it became a reality in the 1950s when the age of the computer arrived [2]. Commercial OCR machines and packages have been available since the mid 1950s. OCR has wide applications in modern society: document reading and sorting, postal address reading, bank cheque recognition, form recognition, signature verification, digital bar code reading, map interpretation, engineering drawing recognition, and various other industrial and commercial applications [3-11].

The products that are currently commercially available for character recognition are limited to the recognition of typed text within a restricted number of fonts, or on-line recognition of hand-written characters. Products to perform off-line hand-printed text recognition are not available, although many approaches have been proposed. In fact, there has recently been a high level of interest in applying machine learning to solve this problem [12-14].

* Author to whom correspondence should be addressed; e-mail: amin@cse.unsw.edu.au

INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, VOL. 15, 1103-1123 (2000). © 2000 John Wiley & Sons, Inc.


Much more difficult, and hence more interesting to researchers, is the ability to automatically recognize handwritten characters [15]. The complexity of the problem is greatly increased by noise and by the wide variability of handwriting as a result of the mood of the writer and the nature of the writing. Analysis of cursive scripts requires the segmentation of characters within the word and the detection of individual features. This is not a problem unique to computers; even human beings, who possess the most efficient optical reading device (eyes), have difficulty in recognizing some cursive scripts and have an error rate of about 4% on reading tasks in the absence of context [16].

Different approaches covered under the general term 'character recognition' fall into either the on-line or the off-line category, each having its own hardware and recognition algorithms. In on-line character recognition systems, the computer recognizes the symbols as they are drawn [17-23]. The most common writing surface is the digitizing tablet, which operates through a special pen in contact with the surface of the tablet and emits the coordinates of the plotted points at a constant frequency. Breaking contact prompts the transmission of a special character. Thus, recording on the tablet produces strings of coordinates separated by signs indicating when the pen has ceased to touch the tablet surface.

On-line recognition has several interesting characteristics. First, recognition is performed on one-dimensional data rather than two-dimensional images as in the case of off-line recognition. The writing line is represented by a sequence of dots whose location is a function of time. This has several important consequences:

- The writing order is available and can be used by the recognition process.
- The writing line has no width.
- Temporal information, such as velocity, can also be taken into consideration.
- Additionally, pen lifts can be useful in the recognition process.
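The tablet output described above can be sketched as a small data structure: a stream of (x, y, t) samples, with pen lifts splitting the stream into separate strokes. The names and the pen-lift convention (a None marker standing in for the "special character") are illustrative assumptions, not taken from the paper:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Point:
    x: float
    y: float
    t: float  # timestamp, so temporal information like velocity is recoverable

@dataclass
class Stroke:
    points: List[Point]

def split_strokes(samples: List[Optional[Tuple[float, float, float]]]) -> List[Stroke]:
    """Split a sample stream into strokes at pen-lift markers (None)."""
    strokes, current = [], []
    for s in samples:
        if s is None:              # pen left the tablet surface
            if current:
                strokes.append(Stroke(current))
                current = []
        else:
            current.append(Point(*s))
    if current:
        strokes.append(Stroke(current))
    return strokes
```

This keeps the writing order and the per-point timestamps, the two properties the list above singles out.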

Off-line recognition is performed after writing or printing is completed. Optical character recognition (OCR) [24-27] deals with the recognition of optically processed characters rather than magnetically processed ones. In a typical OCR system, input characters are read and digitized by an optical scanner. Each character is then located and segmented, and the resulting matrix is fed into a preprocessor for smoothing, noise reduction, and size normalization. Off-line recognition can be considered the most general case: no special device is required for writing, and signal interpretation is independent of signal generation, as in human recognition.

Many papers have been concerned with Latin, Chinese and Japanese characters. However, although almost a third of a billion people worldwide, in several different languages, use Arabic characters for writing, little research progress, in both on-line and off-line recognition, has been achieved towards the automatic recognition of Arabic characters. This is a result of the lack of adequate support in terms of funding and other utilities such as Arabic text databases, dictionaries, etc.


There are two strategies which have been applied to printed and handwritten Arabic character recognition [28]. These can be categorized as follows:

(i) Holistic strategies, in which recognition is performed globally on the whole representation of words and there is no attempt to identify characters individually. These strategies were originally introduced for speech recognition and fall into two categories:
    (a) Methods based on distance measurements using Dynamic Programming [29,30].
    (b) Methods based on a probabilistic framework (Hidden Markov Models) [31-37].

(ii) Analytical strategies, in which words are not considered as a whole, but as sequences of small-size units, and recognition is not performed directly at the word level but at an intermediate level dealing with these units, which can be graphemes, segments, pseudo-letters, etc. [38-43]

Rule-based systems are commonly used in character recognition software [44,45]. Unfortunately, many rules must be constructed by hand to achieve good accuracy. For example, 400 rules were used in recognizing the ten Arabic digits (0-9), with an average recognition rate of 91.4% [44]. To increase the recognition rate, more rules have to be added to the rule base in order to have a wider coverage of the variability of writing styles. Obviously, more rules are required for a larger character set, and the number of rules is not linearly proportional to the size of the character set. The problem becomes even worse in the case of Arabic character recognition.

This paper proposes a new technique using machine learning for recognizing hand-printed characters. Manually building a dictionary to cover all characters is both time-consuming and error-prone. Inductive learning is our attempt to automate this process. In this method, the stroke types and the relationships between the strokes are extracted from the characters in the training set. From this data, the inductive learning program can generate by induction the first-order Horn clauses representing the characters. These Horn clauses are then used for classification of unseen data. Figure 1 shows the block diagram of this system.

The paper is organized as follows: Section 2 outlines the procedure for digitization and preprocessing of the image; Section 3 introduces the features of the image and the algorithm to trace it; Section 4 outlines the algorithms to extract the features from the image; Section 5 describes the character classification technique using Inductive Logic Programming; Section 6 presents the results of the experiments performed; and finally, some concluding remarks are made in Section 7.

2. DIGITIZATION AND PREPROCESSING

2.1. Digitization

The first phase in our character recognition system is digitization. Documents to be processed are first scanned and digitized. A 300 dpi scanner is used to digitize the image. This generates a TIFF file which is then converted to a 1-bit plane PBM file. The PBM format contains a small header which incorporates a


file stamp followed by the dimensions of the image in pixels. The remainder of the file contains the image data.

Figure 1. System diagram of OCR.
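As a rough illustration of the header layout just described, the following sketch parses a binary (P4) PBM header: the magic number, optional comments, then the image width and height in pixels. It is a minimal reading of the PBM format, not code from the paper:

```python
def read_pbm_header(data: bytes):
    """Parse the header of a binary (P4) PBM file.
    Returns (width, height, offset), where offset is the start of the
    packed pixel rows that follow the header."""
    tokens, pos = [], 0
    while len(tokens) < 3:
        if pos >= len(data):
            raise ValueError("truncated PBM header")
        c = data[pos:pos + 1]
        if c == b"#":                       # a comment runs to the end of the line
            while data[pos:pos + 1] not in (b"\n", b""):
                pos += 1
        elif c.isspace():
            pos += 1
        else:                               # a whitespace-delimited token
            start = pos
            while pos < len(data) and not data[pos:pos + 1].isspace():
                pos += 1
            tokens.append(data[start:pos])
    if tokens[0] != b"P4":
        raise ValueError("not a binary (P4) PBM file")
    # a single whitespace byte separates the header from the pixel data
    return int(tokens[1]), int(tokens[2]), pos + 1
```

The returned offset lets the caller slice the packed 1-bit rows directly out of the file contents.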

2.2. Pre-thinning and Thinning

This step aims to reduce the noise due to the binarization process. The pre-thinning algorithm used in this paper is as follows:

Input: A digitized image I in PBM format.
Output: A pre-thinned image I', also in PBM format.
begin
  1. For each pixel P in image I, let P0 to P7 be its eight neighbors, starting from the east neighbor and counted in an anti-clockwise fashion (see Fig. 2).
  2. Let B(P) = P0 + P2 + P4 + P6. Let P' be the corresponding pixel of P in the pre-thinned image I'.
  3. If B(P) < 2 then set P' to white
     Else if B(P) > 2 then set P' to black
     Else set P' to the value of P;
end
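Read literally, the pre-thinning step can be sketched as below (1 = black, 0 = white; neighbours outside the image are treated as white). The exact comparison thresholds are an assumption, since only the operands of the comparisons survive clearly in the text:

```python
def pre_thin(img):
    """Pre-thinning sketch: B(P) counts the black pixels among the four
    neighbours P0, P2, P4, P6 (east, north, west, south).  Pixels with
    few black neighbours become white (specks removed); pixels with many
    become black (small holes filled); otherwise the value is kept."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]           # decisions use original values only
    for y in range(h):
        for x in range(w):
            e = img[y][x + 1] if x + 1 < w else 0
            n = img[y - 1][x] if y - 1 >= 0 else 0
            wv = img[y][x - 1] if x - 1 >= 0 else 0
            s = img[y + 1][x] if y + 1 < h else 0
            b = e + n + wv + s
            if b < 2:
                out[y][x] = 0               # isolated speck -> white
            elif b > 2:
                out[y][x] = 1               # hole inside a stroke -> black
    return out
```

Because the output grid is computed from the original image rather than in place, the result does not depend on the scan order.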


Figure 2. P and the labeling of its eight neighbors.

2.3. Thinning

The thinning of elongated objects is a fundamental preprocessing operation in image analysis, defined as the process of reducing the width of a line-like object from several pixels to a single pixel. The resultant image is called "the skeleton."

There is no general agreement in the literature on the exact definition of thinness. A study reported in Ref. 46, however, examines connectedness criteria to arrive at a definition. This study concludes that a thinning algorithm should satisfy the following connectedness conditions:

(i) Connectedness is preserved, for both objects (the black pixels) and their complements (the white pixels).
(ii) Curves, arcs, and isolated points remain unchanged.
(iii) Upright rectangles, whose length and width are both greater than 1, can be changed.

Iterative thinning algorithms delete border points (pixels) if their removal does not affect the connectivity of the original object. The original image cannot be recovered from the skeleton in most of these algorithms; however, some thinning algorithms permit the reconstruction of the original image from the skeleton [47,48]. These algorithms, called reconstructible thinning algorithms, search for the set of centers and radii of blocks contained in the image while preserving all details of the original objects within the resulting skeleton.

Iterative thinning algorithms can be implemented using either parallel or sequential strategies. In the parallel thinning strategy, the new value given to a pixel in the image during the nth iteration depends on its own value as well as those of its eight neighbors at the (n-1)th iteration. Hence, all pixels can be processed simultaneously [49-51]. Sequential thinning algorithms, on the other hand, assign a new value to a pixel in the binary image during the nth iteration depending on its own value and the values of its eight neighbors at the nth and (n-1)th cycles, which in turn depend on whether these pixels have already been scanned or not. This pixel processing is done in a sequential manner [52-54].

For the work reported in this paper, a parallel thinning algorithm was used. The parallel thinning algorithm operates by repeatedly removing border points which satisfy certain removal conditions. This removal process is an iterative process performed in steps (or cycles). Assume that the object to be thinned is a 2 x 2 square of ones surrounded by zeros. According to condition (iii) above, the object is to be thinned; hence the removal conditions


Figure 3. An example of Arabic characters before and after thinning.

are applied to all four 1s simultaneously. Since the result of the removal step would be the disappearance of the square, the removal step is divided into smaller steps (called subcycles). If four subcycles are used, then every subcycle will remove one type of border point (North, South, East, West).

The thinning algorithm adopted in this paper is Jang and Chin's one-pass parallel thinning algorithm [55], because it gives skeletons with fewer spurious branches. Figure 3 illustrates an original scanned image and the resulting skeleton after applying the thinning algorithm.

3. FEATURE PREPROCESSING

3.1. Feature Point Extraction

A feature point is a black pixel in the thinned image which has a number of black neighbors not equal to 0 or 2. The types of feature points are End points, Branch points and Cross points. An End point, as its name suggests, is the end/start of a line segment; a Branch point connects three branches, while a Cross point connects four branches. Figure 4 depicts these three types.

The number of branches connected to a feature point is used to determine the connection of nodes in skeleton tracing. The traditional method of detecting Branch points is by counting the number of black neighbors in a 3 x 3 pixel window.

Figure 4. The types of feature point.

Instead of using this method, the algorithm uses a new method to detect and classify a feature point. The algorithm counts the number of transitions from black to white pixels in the surrounding 3 x 3 window in order to determine the type of feature point. This method does not have the problem of multiply defined Branch or Cross points. To illustrate this technique, Figure 5 shows three points: a, b and c. If the sum-of-black-neighbors method is used, then points a, b and c will each have a value of 3. Therefore, all these points will be defined as Branch points. On the other hand, if the number-of-transitions method is used, then only point b will be defined as a feature point and classified as a Branch point.

The feature point detection algorithm scans the input image horizontally, pixel by pixel, and gets the corresponding 3 x 3 window for each pixel. The algorithm scans the window boundary for pixel P, starting from the east pixel (P0) and following a clockwise direction; the number of transitions is then calculated. If the number of crossings is one, the point is classified as an End point and its coordinates are recorded. Similarly, the number of crossings is three for a Branch point and four for a Cross point. Figure 6 depicts this algorithm.
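The number-of-transitions test can be sketched directly. Here a 3 x 3 window of 0/1 values (centre pixel assumed black) is classified by counting black-to-white transitions around the ring of eight neighbours; the function name and the exact starting point of the walk are our choices, and the transition count is the same for any consistent cyclic order:

```python
def feature_point_type(window):
    """Classify the centre pixel of a 3x3 binary window by the number of
    black-to-white transitions around its eight neighbours.
    Returns 'end', 'branch', 'cross', or None for ordinary pixels."""
    # neighbour offsets (dy, dx) walked clockwise from the east pixel
    offs = [(0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1)]
    ring = [window[1 + dy][1 + dx] for dy, dx in offs]
    # count 1 -> 0 transitions around the closed ring
    crossings = sum(1 for i in range(8) if ring[i] == 1 and ring[(i + 1) % 8] == 0)
    return {1: "end", 3: "branch", 4: "cross"}.get(crossings)
```

Unlike the sum-of-neighbors test, three black neighbours that touch each other produce only one transition, so they are not misreported as a Branch point.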

3.2. Skeleton Tracing

The next stage of the feature extraction process is to build a graph representation of the input image. The data structure used is an adjacency matrix. The maximum number of a particular type of feature point that can exist in an Arabic character is five. Therefore, the first five indices of the matrix correspond to End points, the next five indices correspond to Branch points, and the last five indices correspond to Cross points. An extra index is included in the matrix data structure and serves as a flag to identify an input image that has no feature point.

Figure 5. Difference between using the sum-of-neighbors and number-of-transitions methods.

Figure 6. Number-of-transitions algorithm for detecting the feature point (x represents a boundary crossing). The position attribute of a feature point represents its relative position within the image; its possible values include Upper and Lower.

The algorithm traces the thinned character and creates a 16 x 16 matrix to represent the input character. Each element [i][j] of the matrix contains four fields of data:

(a) The number of connection lines from feature point i to feature point j. Hence, this field contains the adjacency information of the graph. This field is the same for element [i][j] and element [j][i]. This field will be 1 in element [i][i] in the case of a loop.
(b) A Freeman code string [56] describing the line connecting point i to point j, and its length. The values of this field in elements [i][j] and [j][i] will be complements; however, the lengths will be the same.
(c) The maximum and minimum values of the x and y coordinates of the line connecting point i to point j. This field is the same for element [i][j] and element [j][i]. These values are used for determining the position and area information that will be used in later stages of the algorithm.
(d) An optional pointer that points to another element in order to describe another line connecting point i to point j. If the number of connections between the two points is 1, this pointer will be null.
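One possible shape for the four-field matrix element, sketched as a small data structure; all names here are ours, since the paper only describes the fields:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Element:
    """One cell of the 16 x 16 adjacency matrix."""
    connections: int = 0                  # (a) number of connecting lines
    freeman: str = ""                     # (b) Freeman chain code of the line...
    length: int = 0                       #     ...and its length
    extent: Optional[Tuple[int, int, int, int]] = None  # (c) min/max x and y
    more: Optional["Element"] = None      # (d) another line between the same points

def new_matrix(n: int = 16):
    """Indices 0-4: End points, 5-9: Branch points, 10-14: Cross points,
    15: flag index for images with no feature point."""
    return [[Element() for _ in range(n)] for _ in range(n)]
```

The `more` pointer forms a linked list, matching field (d): extra lines between the same pair of feature points hang off the first element.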

It is worth noting here that this image tracing algorithm is basically a depth-first search algorithm. The algorithm is described in Figure 7.

The trace-and-delete step in the above algorithm will erase the traced branch so that it will not be traversed again. Special care must be taken to implement such a function when going through a turning point or near another feature point. These cases are explained in the following paragraphs.

The Trace and Delete function utilizes a 3 x 3 window to trace from a starting point until it reaches a feature point or the original starting point.


Figure 7. Image tracing algorithm.

During tracing, it generates a linked list of directional codes. In addition, it records the maximum and minimum coordinates of the traced segment. Traced pixels are deleted to prevent back-tracing. One special case is reaching a turning point, from which multiple directions are reachable, as depicted in Figure 8. In order to solve this problem, eight different window priority templates are used so that the trace will follow the previous direction of motion, as shown in Figure 9. In these priority templates the highest priority is given to the trace's previous direction. Thereafter, higher priority is given to the ones that

Figure 8. Turning point illustration.


Figure 9. Priority templates. The arrow represents the previous direction. Priority 1 (High) to priority 5 (Low).

are closer to the previous direction. For example, in Figure 8 the algorithm traces from a to b and has to decide whether to go to c or d next. According to the priority template, it will choose to go to c, because it has the same trace direction as the one from point a to point b.

Another special case that the Trace and Delete function has to handle is the pixel just before a feature point is reached. This case is demonstrated in Figure 8, points a, b and d. Since the number of black neighbors at the pixel window will be greater than one, a decision has to be made in order to reach the destination feature point. The solution is one-pixel-lookahead checking: for each reachable black pixel in the 3 x 3 window, the algorithm checks whether the pixel is a feature point or not. If it is, the trace goes in its direction. The lookahead checking also follows the priority templates depicted in Figure 9, so that the tracing always follows the same direction of motion.
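The priority-template idea (the previous direction first, then directions increasingly far from it) can be sketched with Freeman direction codes 0-7. The tie-breaking rule between the two equally close directions is an assumption, since the paper gives the templates only graphically:

```python
def prioritised_directions(prev: int):
    """Return the eight Freeman directions ordered by priority relative to
    the previous direction of motion: prev itself first, then neighbours
    at angular distance 1, 2, ... around the ring.  Ties are broken by the
    lower direction code (our assumption)."""
    def angular_distance(d: int) -> int:
        diff = abs(d - prev) % 8
        return min(diff, 8 - diff)       # distance around the 8-direction ring
    return sorted(range(8), key=lambda d: (angular_distance(d), d))
```

A tracer would try the candidate black pixels in this order, which is exactly how point c wins over point d in the Figure 8 example: c continues the a-to-b direction.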

For multiply segmented characters, we can repeat the skeleton tracing until all segments in the character image are deleted.

4. FEATURE EXTRACTION

4.1. Complementary Characters Extraction

Some Arabic characters are accompanied by complementary characters (a portion of a character that is needed to complement an Arabic character). These complementary characters are dots and hamza (a zig-zag shape). They may appear on, above, or below the baseline, and can be positioned differently, for instance, above, below or within the confines of the character.

The number of dots and their location relative to the main skeleton of the character have to be identified. The number of dots can be one, two or three. The dots can be below or above the main skeleton of the character.

The complementary characters can be identified easily from the 16 x 16 matrix formed earlier. These characters are small in size relative to the overall image. Furthermore, there are few feature points within them.


4.2. Loop Extraction

4.2.1. Loop Positioning

Three types of loops are defined: Large, Small Upper and Small Lower. A Large loop occupies nearly all of the input image. For a Small loop, its position relative to the input image determines whether it is an Upper or Lower loop.

The algorithm determines the loop type by first separating the input image into three regions (upper, middle and lower). By definition, a Large loop must have its minimum y-coordinate located in the upper region and its maximum y-coordinate located in the lower region (note that y increases downward). If the loop is not a Large loop, another detection method is applied. In this case, the input image region is divided into an upper half and a lower half. The mid-point of the maximum and minimum y-coordinates of the loop is calculated, and the loop is classified according to the mid-point location.
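The positioning rules above can be sketched as a small classifier. The equal-thirds region split is an assumption, since the paper does not give the region boundaries (y grows downward, as noted):

```python
def classify_loop(min_y: float, max_y: float, img_top: float, img_bottom: float) -> str:
    """Classify a loop from its vertical extent: Large if it spans from the
    upper region to the lower region of the image, otherwise Small Upper or
    Small Lower by the mid-point of its y extent."""
    third = (img_bottom - img_top) / 3.0
    if min_y < img_top + third and max_y > img_bottom - third:
        return "Large"
    mid = (min_y + max_y) / 2.0              # mid-point of the loop
    half = (img_top + img_bottom) / 2.0      # mid-point of the image
    return "Small Upper" if mid < half else "Small Lower"
```

Only the y extents are needed, which is why the tracing stage records minimum and maximum coordinates for every connection.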

4.2.2. Direct Loop Feature

We have to detect a loop before positioning it. The simplest form of loop is one directly connected by only one feature point, and it is defined as a direct loop. In the 16 x 16 adjacency matrix, it can be detected directly by checking whether there are any connections from node i to node i. Once a direct loop is detected, its connection in the adjacency matrix is deleted to prevent it from being detected again.

4.2.3. Indirect Loop Feature

Another form of loop is one that is connected by more than one node. In other words, the loop is composed of multiple segments. Two main problems are involved here:

(i) How to traverse the whole image to search for an indirect loop?
(ii) Which paths should be counted if multiple connections exist as parts of the loop?

The first problem can be solved by using a modified depth-first search algorithm. The algorithm is similar to that of skeleton tracing, but with the stack storing the matrix column indices which have more than one connection. Every time an element is popped from the stack, the search process starts. The process terminates when the end of a branch is reached or a loop is detected. When starting from a node, every node reached is stored in a path list and the traced connection is removed. The node just reached is compared with the previous nodes in the path list. If that node is already in the previous path list, an indirect loop has been detected. The path for that loop is then transferred to another linked-list data structure for further processing. When the search process


reaches the end of a branch or a loop is detected, the top of the stack (another branch) is then popped to begin another search process.

Figure 10. Example for calculating loop coordinates.

After the paths of loops are recorded, the next step is to get the maximum and minimum coordinate values of the loops. For instance, the loop 2-3-2 shown in Figure 10 can be represented by the walks (a, b), (a, c) or (b, c). It is preferable to identify the loop that is not part of another loop, especially in digit recognition. Therefore, the desired walks for the above loop are (a, b) and (b, c). How to choose the loop walks in a multiple-connection situation still remains to be answered.

To ensure that a loop will not contain another loop, the set of walks must be chosen such that its size is minimized. The proposed solution is to pick the next connection with the smallest length. In the example of Figure 10, the algorithm will choose "b" (the smallest of the three connections); thereafter, it will choose "a" (assuming the length of a is less than the length of c) for the first indirect loop. To identify the second indirect loop, it first goes back to node 3 through b and then goes through the only remaining connection, "c".

After the algorithm determines the walks for every loop, it calculates the maximum and minimum coordinates to determine the type of each loop, as in the case of a direct loop. Since the maximum and minimum coordinate values for all connections were already recorded during skeleton tracing, the maximum and minimum coordinates of the loop can be obtained easily by comparing the sizes of the connections in the loop walks.

4.3. Curve and Line Extraction

The primitives that are left in the 16 x 16 adjacency matrix after loop extraction are straight and curved line segments. These line segments vary considerably in shape and length. A line segment may be composed of curves and straight lines. Moreover, the degree of straightness of lines and the curvature of curves also vary with different inputs.

The method applied in this curve detection stage is an algorithm of cumulative angle change in feature detection, reported in Ref. 57. Furthermore, the corner-finding algorithm reported in Ref. 58 is used to calculate the angle. The basic concept is the calculation of curvature by the cumulative angle-change.

The angle-change is defined as the increment or decrement in the angle that adjacent straight lines make with the x-axis. In other words, it is the


difference in angles between adjacent linear approximations. For a curve feature, the angle-change will always have the same sign. The sign of an angle-change is defined by whether it is a clockwise (positive) or anti-clockwise (negative) change.

Figure 11. The four directions of a line.

As for the other features, the direction and position information of lines and curves is required for recognition. For a straight line feature, four directions are defined, as illustrated in Figure 11.

The directional type of a straight line is classified by the angle it makes with the x-axis. During the curve and line detection step, the start and end points are recorded at the same time. The slope of the straight line joining the two points is used for the angle calculation.
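One way to realise this classification is to bucket the chord angle into four 45-degree sectors, matching the four direction types of Figure 11. The 22.5-degree sector boundaries are an assumption, since the paper does not list them, and mathematical y-up coordinates are used here:

```python
import math

def line_direction(start, end) -> str:
    """Classify a straight stroke by the angle its chord makes with the
    x-axis, folded into [0, 180) so that direction is orientation-free."""
    angle = math.degrees(math.atan2(end[1] - start[1],
                                    end[0] - start[0])) % 180.0
    if angle < 22.5 or angle >= 157.5:
        return "horizontal"
    if angle < 67.5:
        return "slash"        # rising diagonal
    if angle < 112.5:
        return "vertical"
    return "backslash"        # falling diagonal
```

Folding the angle modulo 180 means a stroke traced in either direction gets the same label, which is what a direction *type* (as opposed to a heading) requires.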

Four types of curve direction are also defined, according to the curve's opening, as shown in Figure 12.

Figure 12. The four directions of a curve.

After all the strokes have been extracted from the image, the final stage of feature extraction is to determine the relationships between the primitives. These are shown in Figure 13.

Figure 13. The relationships between two strokes.

5. THE INDUCTIVE LEARNING PROGRAM—FOIL5

The method we have used to learn the descriptions of the classes is based on Quinlan's FOIL algorithm [59,60] for Inductive Logic Programming (ILP). ILP is a supervised concept learning approach which can induce classification rules from information about entities whose class membership is known. FOIL is an advanced machine learning system. It generates classification rules as extended Horn clauses (a restricted first-order logic) using the coverage algorithm, which has proved successful in propositional attribute-value inductive learning systems; the relations are therefore output in the tuple format accepted by FOIL. An ILP system is given a logic program B, which encodes the eventual background knowledge about the problem, and a set T of examples, represented

by logic facts, partitioned into a positive and a negative example subset (T+ and T-), which in our case correspond to characters of the class being defined and of the other classes, respectively. The system attempts to find a new logic program P (a class description expressed as a set of rules) such that:

$$\forall t \in T^{+}:\ B \wedge P \models t \qquad \text{and} \qquad \forall t \in T^{-}:\ B \wedge P \not\models t \qquad (1)$$

If the samples in T are sufficiently representative of the concept being learned, an ILP algorithm is expected to find a concept definition which is more general than the supplied examples.

The algorithm we used is able to learn only one predicate at a time, requiring different learning phases to learn a logic program P composed of more than one predicate; this is not a severe restriction for our application.


Hence, the program being learned is formed by one or more clauses. FOIL5 expresses the output as extended function-free Horn clause logic of the form:

$$P \leftarrow L_1, L_2, \ldots, L_n$$

which is interpreted as "if L_1 and L_2 and ... and L_n then P". To increase the expressive power of the output language, FOIL5 adopted an extension to normal Horn clauses by allowing negated literals on the right-hand side of the expression.

The relations we used as input to FOIL5 are basically those described in the previous section. Each pair of strokes is taken in turn and checked against the set of possible relations. If there is a match, a symbolic representation of the relation is output. These relations can be written in first-order logic as shown below.

Domains (types): stroke, character

Relations

part_of(stroke, character): stroke is one of the strokes in character.
horizontal(stroke): stroke is a horizontal (type 1 primitive).
vertical(stroke): stroke is a vertical (type 2 primitive).
slash(stroke): stroke is a slash (type 3 primitive).
backslash(stroke): stroke is a backslash (type 4 primitive).
east(stroke): stroke is an east curve (type 5 primitive).
south(stroke): stroke is a south curve (type 6 primitive).
west(stroke): stroke is a west curve (type 7 primitive).
north(stroke): stroke is a north curve (type 8 primitive).
loop(stroke): stroke is a closed curve (type 9 primitive).
one_dot(stroke): stroke is a group of one dot (type 10 primitive).
two_dots(stroke): stroke is a group of two dots (type 11 primitive).
three_dots(stroke): stroke is a group of three dots (type 12 primitive).
hamza(stroke): stroke is a hamza (type 13 primitive).
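As a small illustration of how such unary facts might be generated from the numbered primitive types, consider the sketch below. The table and function names are ours, not the paper's code.

```python
# Map primitive type numbers 1-13 to the unary predicate names listed above,
# and emit one FOIL-style ground fact per classified stroke.

PRIMITIVE_PREDICATES = {
    1: "horizontal", 2: "vertical", 3: "slash", 4: "backslash",
    5: "east", 6: "south", 7: "west", 8: "north", 9: "loop",
    10: "one_dot", 11: "two_dots", 12: "three_dots", 13: "hamza",
}

def emit_primitive_facts(strokes):
    """strokes: list of (stroke_id, primitive_type) pairs -> list of facts."""
    return [f"{PRIMITIVE_PREDICATES[ptype]}({sid})" for sid, ptype in strokes]

print(emit_primitive_facts([("s1", 10), ("s2", 5)]))
# ['one_dot(s1)', 'east(s2)']
```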

start_start(stroke1, stroke2): the relationship between stroke1 and stroke2 is a start-start relation (A).

end_start(stroke1, stroke2): the relationship between stroke1 and stroke2 is an end-start relation (B).
start_end(stroke1, stroke2): the relationship between stroke1 and stroke2 is a start-end relation (C).
end_middle(stroke1, stroke2): the relationship between stroke1 and stroke2 is an end-middle relation (D).

up_down(stroke1, stroke2): the relationship between stroke1 and stroke2 is an up-down relation (E).


down_up(stroke1, stroke2): the relationship between stroke1 and stroke2 is a down-up relation (F).

left_right(stroke1, stroke2): the relationship between stroke1 and stroke2 is a left-right relation (G).

intersection(stroke1, stroke2): the relationship between stroke1 and stroke2 is an intersection relation (H).

An example of the Arabic character 'Ghane' (Fig. 14) is shown below in this format. The expression of the character 'Ghane' in propositional logic form will be:

(stroke_1 = dot & relation_1_2 = up_down & stroke_2 = east & relation_2_3 = end_start & stroke_3 = east)

(stroke_1 = dot & relation_1_2 = up_down & stroke_2 = east & relation_2_3 = start_start & stroke_3 = east)

FOIL5 has a much more expressive input and output language. In FOIL5, inputs are represented as a collection of relations. A relation is associated with a k-ary predicate and consists of the set of k-tuples of constants that satisfy the predicate. In the above example, the input may look like this:

Inputs

character = {Ghane_1, Ghane_2}
stroke = {Ghane_1_s1, Ghane_1_s2, Ghane_1_s3, Ghane_2_s1, Ghane_2_s2, Ghane_2_s3}
part_of = {⟨Ghane_1_s1, Ghane_1⟩, ⟨Ghane_1_s2, Ghane_1⟩, ⟨Ghane_1_s3, Ghane_1⟩, ⟨Ghane_2_s1, Ghane_2⟩, ⟨Ghane_2_s2, Ghane_2⟩, ⟨Ghane_2_s3, Ghane_2⟩}
one_dot = {Ghane_1_s1, Ghane_2_s1}
east = {Ghane_1_s2, Ghane_1_s3, Ghane_2_s2, Ghane_2_s3}
up_down = {⟨Ghane_1_s1, Ghane_1_s2⟩, ⟨Ghane_2_s1, Ghane_2_s2⟩}

Figure 14. An Arabic character ‘Ghane’.


start_start = {⟨Ghane_1_s2, Ghane_1_s3⟩}
end_start = {⟨Ghane_2_s2, Ghane_2_s3⟩}

Ghane = {Ghane_1, Ghane_2}
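This extensional style — a k-ary predicate given as the set of k-tuples that satisfy it — can be mimicked directly with Python sets of tuples. The identifiers below are copied from the example; the `satisfies` helper is our own illustration, not part of FOIL.

```python
# A k-ary predicate represented extensionally as the set of k-tuples that
# satisfy it; membership testing then plays the role of predicate evaluation.

east = {("Ghane_1_s2",), ("Ghane_1_s3",), ("Ghane_2_s2",), ("Ghane_2_s3",)}
up_down = {("Ghane_1_s1", "Ghane_1_s2"), ("Ghane_2_s1", "Ghane_2_s2")}

def satisfies(relation, *args):
    """True if the constants bound to args form a tuple of the relation."""
    return tuple(args) in relation

print(satisfies(east, "Ghane_1_s2"))                   # unary predicate: True
print(satisfies(up_down, "Ghane_1_s1", "Ghane_1_s3"))  # binary predicate: False
```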

The output from FOIL is a Horn clause representation of the induced concept. The Horn clause descriptions of 'Ghane' are:

Outputs

Ghane(A) :- part_of(B, A), dot(B), up_down(B, C), east(C), end_start(C, D), east(D).

Ghane(A) :- part_of(B, A), dot(B), up_down(B, C), east(C), start_start(C, D), east(D).
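To make the induced clauses concrete, here is a hedged Python transcription of the two rules applied to the tuple data of the example. We read dot(B) as the one_dot relation from the input; this is our own illustration of how the clauses classify, not FOIL's execution.

```python
# The example's relations as Python sets of tuples.
part_of = {("Ghane_1_s1", "Ghane_1"), ("Ghane_1_s2", "Ghane_1"),
           ("Ghane_1_s3", "Ghane_1"), ("Ghane_2_s1", "Ghane_2"),
           ("Ghane_2_s2", "Ghane_2"), ("Ghane_2_s3", "Ghane_2")}
one_dot = {"Ghane_1_s1", "Ghane_2_s1"}
east = {"Ghane_1_s2", "Ghane_1_s3", "Ghane_2_s2", "Ghane_2_s3"}
up_down = {("Ghane_1_s1", "Ghane_1_s2"), ("Ghane_2_s1", "Ghane_2_s2")}
start_start = {("Ghane_1_s2", "Ghane_1_s3")}
end_start = {("Ghane_2_s2", "Ghane_2_s3")}

def ghane(a):
    """True if either induced clause fires for character a."""
    for (b, ch) in part_of:                 # part_of(B, A)
        if ch != a or b not in one_dot:     # dot(B)
            continue
        for (b2, c) in up_down:             # up_down(B, C)
            if b2 != b or c not in east:    # east(C)
                continue
            # first clause uses end_start(C, D); second uses start_start(C, D)
            for rel in (end_start, start_start):
                if any(c2 == c and d in east for (c2, d) in rel):  # east(D)
                    return True
    return False

print(ghane("Ghane_1"), ghane("Ghane_2"), ghane("unknown"))
# True True False
```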

6. EXPERIMENTAL RESULTS

Our induction experiments are part of a study for a subset of handprinted Arabic characters to test the feasibility of automatically building a recognition system. Forty samples of one hundred and twenty different characters were written by different writers, ranging from acceptable to poor in quality. The samples were preprocessed and then the primitives were extracted using a structural approach, as described earlier. For most of the experiments, thirty samples from each character were randomly selected and used as the training set. The remaining ten samples of each character were used as test data.

FOIL has a number of parameters that can be varied by the user to control the accuracy and complexity of the induced clauses. We have used the default settings in all of our experiments so far. These settings are: 89.65% for the minimum clause accuracy (the minimum percentage of the training set that must be covered by a clause before that clause is accepted), and 4 for the maximum variable depth (the maximum number of unbound variables allowed during the generation of clauses).

In the experiments below, the recognition rate refers to the percentage of positive test examples that the learned rule can correctly recognize. The rejection rate refers to the percentage of negative test examples that the learned rule can correctly reject.
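These two definitions amount to the following sketch; the rule and the test data are invented for illustration.

```python
def recognition_rate(rule, positive_tests):
    """Percentage of positive test examples the learned rule accepts."""
    return 100.0 * sum(rule(x) for x in positive_tests) / len(positive_tests)

def rejection_rate(rule, negative_tests):
    """Percentage of negative test examples the learned rule rejects."""
    return 100.0 * sum(not rule(x) for x in negative_tests) / len(negative_tests)

# Toy rule: accept exactly the label "target".
is_target = lambda x: x == "target"

print(recognition_rate(is_target, ["target", "target", "other", "target"]))  # 75.0
print(rejection_rate(is_target, ["other", "other"]))                         # 100.0
```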

We had 3600 samples in the training set. These samples were divided into one positive training set and one negative training set. The positive training set consists of samples of the target character (i.e., the character for which we wanted to find the classification rule). The negative training set consists of samples other than the target character.
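The split described above can be sketched as follows; the character labels and feature placeholders are invented.

```python
def split_training(samples, target):
    """samples: list of (character_label, features) pairs.

    Returns (positives, negatives): samples of the target character versus
    samples of every other character.
    """
    positives = [s for s in samples if s[0] == target]
    negatives = [s for s in samples if s[0] != target]
    return positives, negatives

samples = [("alef", 1), ("beh", 2), ("alef", 3), ("ghane", 4)]
pos, neg = split_training(samples, "alef")
print(len(pos), len(neg))  # 2 2
```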


Table I. Effect of varying positive and negative training set sizes on the recognition and rejection rates. The percentages apply to the test set.

No. of Positive Training      No. of Negative Training Samples per Character
Samples per Character        595      1190     1785     2380     2975     3540

 5   recognition rate      65.65%   61.77%   57.83%   52.50%   50.74%   37.85%
     rejection rate        98.29%   98.68%   98.40%   98.35%   98.63%   98.57%

10   recognition rate      78.88%   77.76%   70.47%   69.73%   66.98%   63.84%
     rejection rate        97.92%   98.28%   98.35%   98.57%   98.50%   98.62%

15   recognition rate      80.78%   78.54%   75.76%   73.55%   71.35%   71.90%
     rejection rate        97.77%   98.25%   98.32%   98.43%   98.53%   98.68%

20   recognition rate      84.33%   80.45%   77.75%   75.98%   74.70%   73.88%
     rejection rate        97.47%   97.90%   98.25%   98.39%   98.33%   98.75%

25   recognition rate      87.15%   86.27%   85.95%   83.54%   82.85%   81.89%
     rejection rate        97.79%   97.89%   98.21%   98.25%   98.27%   98.58%

30   recognition rate      89.65%   88.95%   88.72%   87.62%   86.35%   85.78%
     rejection rate        97.95%   97.88%   97.83%   97.65%   97.35%   98.95%

The results are summarized in Table I. The size of the positive training set was varied from five to thirty, and the size of the negative training set was varied from 595 (119 × 5, five samples from each character other than the target character) to 3540 (119 × 30, thirty samples from each character other than the target character). The recognition rates and the rejection rates are the averages over the 80 characters. Instances were selected at random.

The best average recognition rate achieved in these experiments was 89.65%, with a rejection rate of 97.95%. Moreover, we found that the general trend was that the recognition rate increased and the rejection rate decreased as the size of the positive training set increased. The best combination in our experiment is a positive training size of 30 and a negative training size of 595 (5 samples from each non-target character).

7. CONCLUSION

This paper presented a new technique for recognizing hand-printed Arabic characters using inductive learning and, as indicated by the experiments performed, the best recognition rate was found to be 89.65% with a rejection rate of 97.95%.

Moreover, the system used a structural approach for feature extraction based on structural primitives such as curves, straight lines and loops (in a similar manner to the way human beings describe characters geometrically). This approach is more efficient for feature extraction and recognition.

In conclusion, the automatic construction of character recognition systems by induction is still in its infancy and has a long way to go before this technique is mature enough for practical use. However, in comparison with other methods, induction has the advantage that most of the construction of recognition rules is done by computer rather than manually. This is a very attractive feature, and therefore further exploration of this application of Inductive Logic Programming (ILP) is well worthwhile.

In the area of recognition, a structural approach has been previously used. This approach is sufficient to deal with ambiguity without using contextual information. This area remains underdeveloped due to the immaturity of vital computational principles for Arabic character recognition.

References

1. Govindan V, Shivaprasad A. Character recognition – a review. Pattern Recognition 1990;23(7):671–683.

2. Mori S, Suen CY, Yamamoto K. Historical review of OCR research and development. Proceedings of the IEEE 1992;80:1029–1058.

3. Harmon LD. Automatic recognition of printed and script. Proc IEEE 1972;60(10):1165–1177.

4. Spanjersberg AA. Experiments with automatic input of handwritten numerical data into a large administrative system. IEEE Trans Man Cybern 1978;8(4):286–288.

5. Focht LR, Burger A. A numeric script recognition processor for postal zip code application. Int Conf Cybernetics and Society 1975. p 486–492.

6. Plamondon R, Baron A. On-line recognition of hand-print schematic pseudocode for automatic Fortran code generator. 8th Int. Conf. on Pattern Recognition 1986; Paris. p 741–745.

7. Guillevic D, Suen CY. Cursive script recognition: A fast reader scheme. 2nd Int. Conf. on Document Analysis and Recognition 1993; Japan. p 311–314.

8. El-Yacoubi A, Bertille J-M, Gilloux M. Conjoined location and recognition of street names within a postal address delivery line. 3rd International Conference on Document Analysis and Recognition, ICDAR'95 1995; Montreal, Canada. p 1124–1127.

9. Dedel J-P, Shinghal R. Symbolic–neural recognition of cursive amounts on bank cheques. 3rd International Conference on Document Analysis and Recognition, ICDAR'95 1995; Montreal, Canada. p 15–18.

10. Vaxiviere P, Tombre K. Celesstin: CAD conversion of mechanical drawings. IEEE Computer Magazine 1992;25(7):46–54.

11. Dori D, Wenyin L. Vector-based segmentation of text connected to graphics in engineering drawings. Advances in Structural and Syntactical Pattern Recognition, 6th International Workshop SSPR'96 1996; Leipzig, Germany. p 322–331.

12. Amin A, Chen CP, Sammut C, Sum KC. Learning to recognize hand-printed Chinese characters using inductive logic programming. Fourth International Workshop on Inductive Logic Programming 1994; Bad Honnef/Bonn, Germany. p 263–271.

13. Amin A, Compton P. Hand printed character recognition using machine learning. In: Impedovo S, editor. Progress in handwriting recognition. World Scientific; 1997.

14. Hsu J, Hwang S. A machine learning approach for acquiring descriptive classification rules of shape contours. Pattern Recognition 1997;30(2):245–252.

15. Lecolinet E, Baret O. Cursive word recognition: Methods and strategies. In: Impedovo S, editor. Fundamentals in handwriting recognition; 1994. p 235–263.

16. Suen CY, Shingal R, Kwan CC. Dispersion factor: A quantitative measurement of the quality of handprinted characters. Int. Conference of Cybernetics and Society 1977. p 681–685.

17. Burr DJ. Designing a handwritten reader. 5th Int. Conference on Pattern Recognition 1980. p 715–722.

18. Amin A, Shoukry A. Topological and statistical analysis of line drawing. Pattern Recognition Letters 1983;1:365–374.

19. Amin A. Machine recognition of handwritten Arabic word by the IRAC II system. 6th Int. Conference on Pattern Recognition 1982. p 34–36.


20. Kim J, Tappert CC. Handwriting recognition accuracy versus tablet resolution and sampling rate. 7th Int. Conference on Pattern Recognition 1984. p 917–918.

21. Ward JR, Kuklinski T. A model for variability effects in handprinting with implications for the design of handwritten character recognition systems. IEEE Trans Man Cybernetics 1988;18:438–451.

22. Nouboud F, Plamondon R. On-line recognition of handprinted characters: Survey and beta tests. Pattern Recogn 1990;23(9):1031–1044.

23. Guyon I, Schenkel M, Denker J. Overview and synthesis of on-line cursive handwriting recognition techniques. In: Bunke H, Wang PSP, editors. Handbook of character recognition and document image analysis. World Scientific; 1997. p 183–225.

24. Govindan VK, Shivaprasad AP. Character recognition: A review. Pattern Recogn 1990;23(7):671–683.

25. Impedovo S, Ottaviano L, Occhinegro S. Optical character recognition – A survey. Int J Pattern Recogn Artif Intell 1991;5(1–2):1–24.

26. Bokser M. Omnidocument technologies. Proc IEEE 1992;80(7):1066–1078.

27. Fujisawa H, Nakano Y, Kurino K. Segmentation methods for character recognition: From segmentation to document structure analysis. Proc IEEE 1992;80(7):1079–1091.

28. Amin A. Off-line Arabic characters – A survey. Proceedings of ICDAR'97, Ulm, Germany 1997. p 596–599.

29. Khemakhem M. Reconnaissance de caractères imprimés par comparaison dynamique. Thèse de Doctorat de 3ème cycle, University of Paris XI, 1987.

30. Khemakhem M, Fehri MC. Recognition of printed Arabic characters by comparaison dynamique. Proc. First Kuwait Computer Conference 1989. p 448–462.

31. Abdelazim HY, Hashish MA. Interactive font learning for Arabic OCR. Proc. First Kuwait Computer Conference 1989. p 464–482.

32. Abdelazim HY, Hashish MA. Automatic recognition of handwritten Hindi numerals. Proc. of the 11th National Computer Conference, Dhahran 1989. p 287–299.

33. Emam Z, Hashish MA. Application of Hidden Markov Model to the recognition of isolated Arabic word. Proc. of the 11th National Computer Conference, Dhahran 1989. p 761–774.

34. Schwartz R, LaPre C, Makhoul J, Raphael C, Zhao Y. Language independent OCR using a continuous speech recognition system. 13th International Conference on Pattern Recognition, vol. C, 1996; Vienna, Austria. p 99–103.

35. Mahjoub MA. Choix des paramètres liés à l'apprentissage dans la reconnaissance en ligne des caractères arabes par les chaînes de Markov cachées. Forum de la Recherche en Informatique, Tunis, July 1996.

36. BenAmara N, Belaid A. Printed PAW recognition based on planar hidden Markov models. 13th International Conference on Pattern Recognition, vol. B, 1996; Vienna, Austria.

37. Miled H, Cheriet M, Olivier C. Multi-level Arabic handwritten words recognition. In: Amin A, Dori D, Pavel P, Freeman H, editors. Advances in pattern recognition. Lecture Notes in Computer Science, vol. 1451. Springer-Verlag; 1998. p 944–951.

38. Almuallim H, Yamaguchi S. A method of recognition of Arabic cursive handwriting. IEEE Trans Pattern Anal Machine Intell 1987;PAMI-9:715–722.

39. Amin A, Mari JF. Machine recognition and correction of printed Arabic text. IEEE Trans Man Cybernetics 1989;9(1):1300–1306.

40. Al-Yousefi H, Udpa SS. Recognition of Arabic characters. IEEE Trans Pattern Anal Machine Intell 1992;PAMI-14:853–857.

41. Al-Badr B, Haralick R. Segmentation-free word recognition with application to Arabic. 3rd Int. Conf. on Document Analysis and Recognition, Montreal 1995. p 355–359.

42. Al-Sadoun HB, Amin A. A new structural technique for recognizing printed Arabic text. Int J Pattern Recogn Artif Intell 1995;9(1):101–125.

43. Amin A. Arabic character recognition. In: Bunke H, Wang PSP, editors. Handbook of character recognition and document image analysis. World Scientific; 1997. p 397–420.


44. Preece AD, Suen CY, Yu CL. Performance assessment of a character recognition expert system. International Expert System Application, EXPERSYS 90, 1990. p 295–300.

45. Likforman-Sulem L, Maitre H, Sirat C. An expert and vision system for analysis of Hebrew characters and authentication of manuscripts. Pattern Recogn 1991;24:121–137.

46. Rosenfeld A. A characterization of parallel thinning algorithms. Inform Control 1975;29:286–291.

47. Arcelli C, Cordella LP, Levialdi S. From local maxima to connected skeletons. IEEE Trans Pattern Anal Mach Intell 1981;PAMI-3:134–143.

48. Pavlidis T. A flexible parallel thinning algorithm. Proc. Pattern Recog. and Image Processing Conference 1981. p 162–167.

49. Guo Z, Hall RW. Parallel thinning with two-subiteration algorithms. Commun ACM 1989;32(3):359–373.

50. Hall RW. Fast parallel thinning algorithms: Parallel speed and connectivity preservation. Commun ACM 1989;32(3):124–131.

51. Holt CM, Stewart A, Clint M, Perrott RH. An improved parallel thinning algorithm. Commun ACM 1987;29(3):239–242.

52. Chu YK, Suen CY. An alternate smoothing and stripping algorithm for thinning digital binary patterns. Signal Processing 1986;11:207–222.

53. Naccache NJ, Shinghal R. SPTA: A proposed algorithm for thinning binary patterns. IEEE Trans Syst Man Cybern 1984;14:409–419.

54. Xia Y. Skeletonization via the realization of the fire front's propagation and extinction in digital binary shapes. IEEE Trans Pattern Anal Machine Intell 1989;11:1076–1086.

55. Jang BK, Chin RT. One-pass parallel thinning: Analysis, properties, and quantitative evaluation. IEEE Trans Pattern Anal Mach Intell 1992;14:1129–1140.

56. Freeman H. On the encoding of arbitrary geometric configurations. IRE Trans Electronic Computers 1961;EC-10:260–268.

57. Nadal C, Legault R, Suen CY. Complementary algorithms for the recognition of totally unconstrained handwritten numerals. 10th International Conference on Pattern Recognition 1990. p 443–449.

58. Freeman H, Davis LS. A corner-finding algorithm for chain-coded curves. IEEE Trans on Computers 1977:297–303.

59. Quinlan JR. Learning logical definitions from relations. Machine Learning 1990;5:239–266.

60. Quinlan JR. Determinate literals in inductive logic programming. Proc. 12th Int. Joint Conf. on Artificial Intelligence 1991. p 746–750.