radial sector coding at scis & isis 08
This is the presentation on Translation, Rotation, and Scaling Invariant Character Recognition at SCIS & ISIS 08, Japan
A Novel Algorithm for Translation, Rotation and Scale Invariant Character Recognition
Asif Iqbal, A. B. M. Musa, Anindya Tahsin, Md. Abdus Sattar, Md. Monirul Islam, and K. Murase
SCIS & ISIS 08
Overview
• Introduction
• Advantages over existing methods
• Radial Sector Coding
  – Center of Mass
  – Axis of Reference
  – Line of Reference
  – Feature vector generation
• Classifier
• Experimental Results
• Analysis
• Conclusion
Introduction
• Invariant Character Recognition (ICR) is recognition of characters independent of translation, rotation and scaling
• It is still a hard problem in computer vision
• Most of the existing algorithms are computationally too expensive or cannot perform well under all three transformations
• Here we propose a simple and inexpensive algorithm for ICR which performs well under all three transformations
Advantages over existing methods
Our Radial Sector Coding (RSC) is simple and inexpensive
• Moment Based Methods:
  – Invariant moments are used
  – Computationally too expensive
  – Examples: Cartesian moment, Zernike moment, Pseudo-Zernike, Orthogonal Fourier-Mellin moments
Advantages over existing methods
RSC does not sample the whole character for feature extraction
• Projection Methods:
  – Projection is taken for the whole character
  – Data redundancy exists
Advantages over existing methods (contd.)
RSC considers the whole area of the character for feature extraction
• Boundary Methods:
  – Sample the boundary of the character only
  – Good for solid object recognition
  – Not good for character recognition, as characters have much topological information inside
Advantages over existing methods (contd.)
RSC uses only a single circle and its radii for feature extraction
• Radial Coding and SAFER:
  – Use multiple concentric circles
  – Small circles create erroneous features
Radial Sector Coding
Center of Mass (CoM)
CoM locates the character independent of its location in the image
$$m_{pq} = \sum_{x} \sum_{y} x^{p} y^{q} f(x, y)$$
If $(C_x, C_y)$ is the CoM then
$$C_x = \frac{m_{10}}{m_{00}} \quad \text{and} \quad C_y = \frac{m_{01}}{m_{00}}$$
Here $m_{pq}$ is the Cartesian Moment of order (p+q)
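The moment and CoM formulas can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' Matlab code: the function names `cartesian_moment` and `center_of_mass` are hypothetical, and the image is assumed to be a 2-D array in which the column index is x and the row index is y.

```python
import numpy as np

def cartesian_moment(image, p, q):
    """Cartesian moment m_pq = sum_x sum_y x^p y^q f(x, y)."""
    f = np.asarray(image, dtype=float)
    ys, xs = np.indices(f.shape)  # assumption: rows are y, columns are x
    return float(np.sum((xs ** p) * (ys ** q) * f))

def center_of_mass(image):
    """Return (C_x, C_y) = (m10 / m00, m01 / m00)."""
    m00 = cartesian_moment(image, 0, 0)
    if m00 == 0:
        raise ValueError("image has no foreground pixels")
    return (cartesian_moment(image, 1, 0) / m00,
            cartesian_moment(image, 0, 1) / m00)

# A 3x3 block of foreground pixels, then the same block translated.
img = np.zeros((10, 10))
img[2:5, 3:6] = 1
shifted = np.roll(img, shift=(4, 2), axis=(0, 1))  # 4 rows down, 2 columns right

print(center_of_mass(img))      # (4.0, 3.0)
print(center_of_mass(shifted))  # (6.0, 7.0)
```

The CoM moves by exactly the translation applied to the character, which is why measuring everything relative to the CoM removes the dependence on position.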
Axis of Reference (Symmetric Characters)
Axis of Reference is the Axis of Symmetry of symmetric characters
[Figure: character A with labels: Enclosing Circle; Radii dividing the circle into n sectors; Cutpoints; Cutpoints with maximum distance]
Is it Axis of Reference (AoR)?
[Figure: pairwise max cutpoint distances labelled Not Equal, Almost Equal, Not Equal]
It is not AoR, as pairwise max cutpoint distances are not equal in most of the cases
Is it Axis of Reference (AoR)?
[Figure: pairwise max cutpoint distances labelled Almost Equal]
It is AoR, as pairwise max cutpoint distances are equal in most of the cases
Hence the name Radial Sector Coding
Cutpoint is the pixel of intensity change
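One way to read "cutpoint" in code is to walk outward from the CoM along a radius and record every distance at which the pixel value changes. The sketch below is an assumption-laden illustration: the helper names, the nearest-pixel sampling, the 0.5-pixel step, and taking the enclosing radius as the distance to the farthest foreground pixel are all choices made here, not details given in the slides.

```python
import math
import numpy as np

def enclosing_radius(binary, cx, cy):
    """Largest distance from (cx, cy) to any foreground pixel (assumed definition)."""
    ys, xs = np.nonzero(binary)
    return float(np.max(np.hypot(xs - cx, ys - cy)))

def cutpoint_distances(binary, cx, cy, angle_deg, radius, step=0.5):
    """Distances from (cx, cy) at which the pixel value changes along one radius."""
    h, w = binary.shape
    theta = math.radians(angle_deg)
    distances, prev = [], None
    for k in range(int(radius / step) + 3):  # walk slightly past the circle
        t = k * step
        x = math.floor(cx + t * math.cos(theta) + 0.5)  # nearest pixel column
        y = math.floor(cy + t * math.sin(theta) + 0.5)  # nearest pixel row
        value = int(binary[y, x]) if 0 <= x < w and 0 <= y < h else 0
        if prev is not None and value != prev:
            distances.append(t)  # intensity changed: record a cutpoint
        prev = value
    return distances

# Hollow square centred at (10, 10): filled from 4 to 16, hole from 8 to 12.
ring = np.zeros((21, 21), dtype=int)
ring[4:17, 4:17] = 1
ring[8:13, 8:13] = 0

r = enclosing_radius(ring, 10, 10)
cuts = cutpoint_distances(ring, 10, 10, 0, r)
max_cut = max(cuts)               # maximum cutpoint distance on this radius
avg_cut = sum(cuts) / len(cuts)   # average cutpoint distance on this radius
print(cuts, max_cut, avg_cut)     # [2.5, 6.5] 6.5 4.5
```

Repeating this for n equally spaced angles gives the per-radius maximum and average cutpoint distances that the later steps use.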
Axis of Reference (Symmetric Characters) [contd.]
• For symmetric characters, the sum of absolute differences of the maximum cutpoint distances, taken over each pair of lines at the same angular distance from the Axis of Symmetry/Axis of Reference, will be very small
• We can exploit this fact to find the AoR/AoS
• As we do not know the actual AoR/AoS, we can consider each axis as a potential AoR/AoS; the one having the minimum summation is the actual AoR/AoS
Axis of Reference (Symmetric Characters) [contd.]
Let $d_i$ denote the maximum cut-point distance along each radius, and let the initial sampling starting at 0 degree be
$$d_1, d_2, \ldots, d_n \quad (n \text{ is odd})$$
Now there exists an ordering
$$d_i, d_{i+1}, \ldots, d_n, d_1, \ldots, d_{i-1}$$
where
$$\sum_{k=1}^{\frac{n-1}{2}} \left| d_{i+k} - d_{i-k} \right|$$
(indices wrap around cyclically, modulo n) is minimum with respect to all other orderings
Now let the points for $d_i$ and $d_{i+\frac{n-1}{2}}$ be $(x_i, y_i)$ and $(x_{i+\frac{n-1}{2}}, y_{i+\frac{n-1}{2}})$
The line connecting $(x_i, y_i)$ and $(x_{i+\frac{n-1}{2}}, y_{i+\frac{n-1}{2}})$ is the AoR
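The minimum-summation search for the AoR can be sketched as follows. This is a hedged illustration: `axis_of_reference_index` is a hypothetical name, indices are 0-based, and the pairing of radii i+k and i-k (wrapping modulo n) follows the description of "pairs of lines at the same angular distance from the axis" rather than a formula confirmed character for character from the slide.

```python
def axis_of_reference_index(d):
    """Return the 0-based index i that minimises sum_k |d[i+k] - d[i-k]|."""
    n = len(d)
    if n % 2 == 0:
        raise ValueError("this sketch assumes an odd number of radii")
    half = (n - 1) // 2

    def pair_sum(i):
        # Compare radii at equal angular distance on either side of radius i.
        return sum(abs(d[(i + k) % n] - d[(i - k) % n])
                   for k in range(1, half + 1))

    return min(range(n), key=pair_sum)

# Maximum cutpoint distances that are mirror-symmetric about radius 2.
d = [5, 3, 9, 3, 5, 1, 1]
print(axis_of_reference_index(d))        # 2

# Rotating the character by three sectors shifts the AoR by three sectors.
rotated = d[-3:] + d[:-3]
print(axis_of_reference_index(rotated))  # 5
```

Because the selected radius moves together with the character, measurements taken relative to it do not depend on the rotation angle.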
Axis of Reference (Non-symmetric Characters)
• The axis found with the minimum summation criterion is a rotation invariant feature for non-symmetric characters also
• So with the minimum sum criterion we get an Axis of Reference which is the Axis of Symmetry for symmetric characters and a rotation invariant feature for all characters
Axis of Reference Examples
AoR of Symmetric Character A
AoR of Non-symmetric Character F
Line of Reference
• Line of Reference is the one of the two radii on the Axis of Reference that has the larger cutpoint distance
If $(C_x, C_y)$ is the CoM and $(x_i, y_i)$, $(x_{i+\frac{n-1}{2}}, y_{i+\frac{n-1}{2}})$ are the end points of the AoR, then the line connecting $(C_x, C_y)$ and $(x_i, y_i)$ is the LoR if it has the greater cutpoint distance
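Choosing the LoR then reduces to comparing the two radii that lie on the AoR. In this small sketch the function name is hypothetical, the far end of the axis is assumed to be the radius (n-1)/2 steps away from the near end, and ties are resolved in favour of the near end; the slides do not specify a tie rule.

```python
def line_of_reference_index(d, aor_index):
    """Pick the AoR radius with the larger maximum cutpoint distance."""
    n = len(d)
    opposite = (aor_index + (n - 1) // 2) % n  # assumed far end of the AoR
    return aor_index if d[aor_index] >= d[opposite] else opposite

# Radius 2 is the AoR in both examples; its assumed far end is radius 5.
print(line_of_reference_index([5, 3, 9, 3, 5, 1, 1], 2))  # 2
print(line_of_reference_index([5, 3, 2, 3, 5, 8, 8], 2))  # 5
```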
Line of Reference Examples
LoR of Symmetric Character A
LoR of Non-symmetric Character F
Feature vector generation
• Feature vector size 18 is used
• Line of Reference is considered as 0° line
• Average distances of cutpoints on 18 radii are calculated starting with LoR
• Feature vector consists of these 18 values
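A sketch of how the 18 values might be read off once the LoR is known. The slides do not say which 18 radii are used, so the even spacing around the full circle (every second radius when n = 36) is purely an assumption here, as is the name `feature_vector`.

```python
def feature_vector(avg, lor, size=18):
    """Read `size` average cutpoint distances, starting at the LoR."""
    n = len(avg)
    step = max(1, n // size)  # assumption: evenly spaced radii
    return [avg[(lor + k * step) % n] for k in range(size)]

# Hypothetical per-radius averages for n = 36 radii, with the LoR at radius 6.
base = [float((3 * j) % 11) for j in range(36)]
vec = feature_vector(base, lor=6)

# Rotating the character by four sectors moves the LoR to radius 10,
# but the feature vector read from the LoR is unchanged.
shifted = base[-4:] + base[:-4]
vec_rotated = feature_vector(shifted, lor=10)
print(vec == vec_rotated)  # True
```

Starting the readout at the LoR is what turns a rotation of the character into the same sequence of values.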
Radial Sector Coding in Brief
• Step 1: Find Center of Mass (CoM)
• Step 2: Find radius r of enclosing circle
• Step 3: Draw n radii at equal angular distance to divide the circle into n sectors
• Step 4: Find cutpoints on each radius
• Step 5: Calculate maximum and average cutpoint distances
• Step 6: Find Axis of Reference (AoR)
• Step 7: Find Line of Reference (LoR)
• Step 8: Consider LoR as 0° line and generate feature vector of size n/2
Classifier
• Multilayer feed-forward ANN is used as classifier
• ANN has good noise tolerance
• ANN has good generalization ability
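For orientation, a forward pass through a three-layer feed-forward network of the kind named in the experimental setup might look like the sketch below. Only the 18 inputs and 26 output classes come from the slides; the hidden size of 30, the sigmoid activations, and the random (untrained) weights are placeholders chosen for illustration.

```python
import numpy as np

def forward(x, w1, b1, w2, b2):
    """One forward pass: input -> sigmoid hidden layer -> sigmoid output layer."""
    hidden = 1.0 / (1.0 + np.exp(-(w1 @ x + b1)))
    return 1.0 / (1.0 + np.exp(-(w2 @ hidden + b2)))

rng = np.random.default_rng(0)
n_inputs, n_hidden, n_classes = 18, 30, 26  # n_hidden is a hypothetical choice

# Untrained placeholder weights; a real classifier would learn these.
w1 = rng.normal(scale=0.1, size=(n_hidden, n_inputs))
b1 = np.zeros(n_hidden)
w2 = rng.normal(scale=0.1, size=(n_classes, n_hidden))
b2 = np.zeros(n_classes)

features = rng.random(n_inputs)  # stand-in for an RSC feature vector
scores = forward(features, w1, b1, w2, b2)
predicted = chr(ord("A") + int(np.argmax(scores)))  # highest-scoring letter
print(scores.shape, predicted)
```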
Experimental Results
• Experimental Setup
  – Matlab is used for feature generation and experimental evaluation
  – Three layer feed-forward ANN is used for experimentation
  – Two widely used fonts, Arial and Tahoma, are used
  – A large sample of the 26 uppercase English characters from both fonts is used
Experimental Results (contd.)
Character Accuracy (%) Character Accuracy (%)
A 100 N 100
B 100 O 100
C 100 P 100
D 100 Q 100
E 100 R 100
F 94.44444 S 100
G 91.66667 T 100
H 100 U 100
I 100 V 100
J 100 W 100
K 97.22222 X 100
L 100 Y 100
M 100 Z 100
Average 99.35897
Recognition rate for Arial font. 40x40 pixel 0° to 90° rotated characters at 10° gap are used for training. 40x40 pixel 0° to 350° rotated characters at 10° gap are used for testing. Total number of training characters is 26x10 = 260. Total number of test characters is 26x36 = 936
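The counts in this caption, and the granularity of the accuracy figures, follow from simple arithmetic. The short check below uses only numbers stated in the caption and table; the variable names are illustrative.

```python
CLASSES = 26  # uppercase English letters

train_angles = list(range(0, 91, 10))   # 0°, 10°, ..., 90°
test_angles = list(range(0, 351, 10))   # 0°, 10°, ..., 350°

n_train = CLASSES * len(train_angles)
n_test = CLASSES * len(test_angles)
print(n_train, n_test)  # 260 936

def accuracy(correct, total):
    """Recognition rate in percent."""
    return 100.0 * correct / total

# Each character has 36 test images, so every error costs 100/36 points.
print(round(accuracy(35, 36), 5))  # 97.22222 (K: one error)
print(round(accuracy(34, 36), 5))  # 94.44444 (F: two errors)
print(round(accuracy(33, 36), 5))  # 91.66667 (G: three errors)
```

So the non-100 entries in this table correspond to one, two, or three misclassified rotations of a character.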
Experimental Results (contd.)
Character Accuracy (%) Character Accuracy (%)
A 98.611111 N 100
B 100 O 100
C 100 P 100
D 100 Q 97.222222
E 100 R 100
F 88.888889 S 100
G 98.611111 T 100
H 100 U 98.611111
I 100 V 100
J 100 W 97.222222
K 94.444444 X 97.222222
L 100 Y 97.222222
M 100 Z 100
Average 98.77137
Recognition rate for Arial font. 40x40 pixel 0° to 135° rotated characters at 15° gap are used for training. 40x40 pixel 0° to 355° rotated characters at 5° gap are used for testing. Total number of training characters is 26x10 = 260. Total number of test characters is 26x72 = 1872
Experimental Results (contd.)
Character Accuracy (%) Character Accuracy (%)
A 100 N 100
B 91.666667 O 100
C 100 P 100
D 100 Q 97.222222
E 100 R 100
F 97.222222 S 97.222222
G 100 T 100
H 100 U 100
I 100 V 100
J 100 W 100
K 100 X 100
L 100 Y 100
M 100 Z 83.333333
Average 98.71795
Recognition rate for Arial font. 50x50 pixel 0° to 90° rotated characters at 10° gap are used for training. 50x50 pixel 0° to 350° rotated characters at 10° gap are used for testing. Total number of training characters is 26x10 = 260. Total number of test characters is 26x36 = 936
Experimental Results (contd.)
Character Accuracy (%) Character Accuracy (%)
A 100 N 100
B 100 O 100
C 94.444444 P 100
D 100 Q 97.222222
E 100 R 100
F 100 S 83.333333
G 88.888889 T 100
H 97.222222 U 100
I 97.222222 V 100
J 100 W 100
K 97.222222 X 97.222222
L 94.444444 Y 100
M 100 Z 100
Average 97.97009
Recognition rate for Arial font. 30x30 pixel 0° to 90° rotated characters at 10° gap are used for training. 30x30 pixel 0° to 350° rotated characters at 10° gap are used for testing. Total number of training characters is 26x10 = 260. Total number of test characters is 26x36 = 936
Experimental Results (contd.)
Character Accuracy (%) Character Accuracy (%)
A 100 N 100
B 100 O 100
C 100 P 100
D 100 Q 100
E 100 R 100
F 100 S 100
G 100 T 100
H 97.222222 U 100
I 97.222222 V 100
J 100 W 100
K 100 X 100
L 100 Y 100
M 100 Z 100
Average 99.78632
Recognition rate for Arial font. 40x40 pixel 0° to 90° rotated characters at 10° gap are used for training. 50x50 pixel 0° to 350° rotated characters at 10° gap are used for testing. Total number of training characters is 26x10 = 260. Total number of test characters is 26x36 = 936
Experimental Results (contd.)
Character Accuracy (%) Character Accuracy (%)
A 100 N 100
B 55.555556 O 100
C 100 P 100
D 100 Q 100
E 100 R 91.666667
F 91.666667 S 100
G 100 T 100
H 100 U 97.222222
I 100 V 100
J 100 W 88.888889
K 100 X 97.222222
L 100 Y 100
M 100 Z 100
Average 97.00855
Recognition rate for Tahoma font. 40x40 pixel 0° to 90° rotated characters at 10° gap are used for training. 50x50 pixel 0° to 350° rotated characters at 10° gap are used for testing. Total number of training characters is 26x10 = 260. Total number of test characters is 26x36 = 936
Experimental Results (contd.)
Character Accuracy (%) Character Accuracy (%)
A 99.7685185 N 100
B 91.20370383 O 100
C 99.074074 P 100
D 100 Q 98.611111
E 100 R 98.61111117
F 95.37037033 S 96.75925917
G 96.52777783 T 100
H 99.074074 U 99.3055555
I 99.074074 V 100
J 100 W 97.68518517
K 98.148148 X 98.611111
L 99.074074 Y 99.537037
M 100 Z 97.22222217
Average 98.60221
Average recognition rate for all characters considering previous tables
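The overall figure can be reproduced from the six per-table averages reported earlier. The check below is a plain arithmetic sketch; the dictionary keys are descriptive labels added here, not names from the slides.

```python
table_averages = {
    "Arial, 40x40, 10° test gap": 99.35897,
    "Arial, 40x40, 15° train gap, 5° test gap": 98.77137,
    "Arial, 50x50": 98.71795,
    "Arial, 30x30": 97.97009,
    "Arial, 40x40 train, 50x50 test": 99.78632,
    "Tahoma, 40x40 train, 50x50 test": 97.00855,
}

overall = sum(table_averages.values()) / len(table_averages)
print(round(overall, 5))  # 98.60221

# The per-character entries are formed the same way, e.g. for A:
a_rates = [100, 98.611111, 100, 100, 100, 100]
print(round(sum(a_rates) / len(a_rates), 4))  # 99.7685
```

Note that this is an unweighted mean over tables, so the 1872-image test set counts the same as each 936-image test set.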
Analysis
• Correlation of Features
RSC generates highly correlated features under different rotation
Analysis (contd.)
• Discrimination capability for similar characters
RSC generates enough distinctive features for similar characters
Analysis (contd.)
• Double Mirror Symmetry
  – Characters like H, I, O have two Axes of Symmetry
Double Mirror Symmetry can be exploited in future
Analysis (contd.)
• Double Reverse Mirror Symmetry
  – Characters like N, S, Z are symmetric if we reverse the mirror reflected part
Double Reverse Mirror Symmetry can be exploited in future
[Figure: Horizontal Reverse Mirror Symmetry; Vertical Reverse Mirror Symmetry]
Analysis (contd.)
• Inherent Difficulties
  – Finite Resolution
    • Sampling is limited by finite resolution of image
  – Round Up Error
    • Any measure required to be mapped to image requires rounding up
  – Boundary Distortion
    • Rotation introduces unavoidable boundary distortion
Conclusion
• RSC is simple and inexpensive
• Experimental results demonstrate its effectiveness (98.6% average recognition rate over all tests)
• Use of a more sophisticated classifier in future may improve its performance
• Double mirror and reverse mirror symmetry can be exploited in future
Thanks !