An Efficient Classification Approach Based on Grid Code Transformation and Mask-Matching Method
1
An Efficient Classification Approach Based on Grid Code Transformation and
Mask-Matching Method
Presenter: Yo-Ping Huang
Tatung University
2
Outline
1. Introduction
2. The proposed classification approach
3. The coarse classification scheme
4. The fine classification scheme
5. Experimental results
6. Conclusion
3
1. Introduction
Paper documents -> computer codes
OCR (Optical Character Recognition)
The design of a classification system consists of two subproblems:
  Feature extraction
  Classification
4
Feature extraction
Features are functions of the measurements that enable a class to be distinguished from other classes.
No general solution to feature extraction has been found for most applications.
Our purpose is to design a general classification scheme that is less dependent on domain-specific knowledge.
5
Discrete Cosine Transform (DCT)
The DCT helps separate an image into parts of differing importance with respect to the image's visual quality.
Due to the energy-compacting property of the DCT, much of the signal energy tends to lie at low frequencies.
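The energy-compaction property can be checked on a toy example. The sketch below is only an illustration: the 8x8 ramp image and the 2x2 "low-frequency corner" are assumptions chosen for the demo, not data or settings from the talk. It uses a plain orthonormal 2-D DCT-II written with the standard library.

```python
import math

def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def scale(k):
        # Normalisation factors that make the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs

# Hypothetical smooth test image: a diagonal intensity ramp.
image = [[float(x + y) for y in range(8)] for x in range(8)]
F = dct_2d(image)

# Share of the total energy held by the four lowest-frequency coefficients.
total_energy = sum(c * c for row in F for c in row)
low_energy = sum(F[u][v] ** 2 for u in range(2) for v in range(2))
low_freq_share = low_energy / total_energy
print(f"low-frequency share of energy: {low_freq_share:.4f}")
```

For a smooth image like this ramp, nearly all of the energy ends up in a handful of low-frequency coefficients, which is why a short code built from those coefficients can still separate many classes.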
6
Two stages of classification
Coarse classification
  DCT
  Grid code transformation (GCT)
Fine classification
  Statistical mask-matching
7
Figure 1. The framework of our classification approach.
[Block diagram with three stages: Training, Coarse Classification, Fine Classification. Blocks shown: training pattern, test pattern, preprocessing, feature extraction via DCT, quantization, grid code transformation, sorting codes, elimination of duplicated codes, searching candidates, candidates, calculate mask probability, statistical mask matching, final decision.]
8
In the training mode:
  GCT
  Positive mask
  Negative mask
  Mask probability
In the classification mode:
  GCT (coarse classification)
  Statistical mask matching (fine classification)
9
Grid code transformation (GCT)
Quantization
The 2-D DCT coefficient F(u,v) is quantized to F'(u,v) according to the following equation:
[equation not preserved in the transcript]
The D most significant DCT coefficients of image O_i are quantized and transformed into a code, called the grid code (GC), of the form [q_i1, q_i2, ..., q_iD].
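A grid code can be sketched as follows. The quantization equation itself is not available in this transcript, so the uniform quantizer below (range `lo` to `hi` split into `levels` equal bins) is purely an assumption, as is the idea that the input list is already ordered from most to least significant coefficient. The function name `grid_code` and all parameter values are hypothetical.

```python
def grid_code(coeffs, d, lo, hi, levels):
    """Quantize the first d coefficients into integers in [0, levels - 1].

    Assumption: a uniform quantizer over [lo, hi]; the quantizer used in
    the talk is not known from the transcript.
    """
    step = (hi - lo) / levels
    code = []
    for value in coeffs[:d]:
        q = int((value - lo) // step)
        # Clip values that fall outside the assumed range.
        code.append(min(max(q, 0), levels - 1))
    return tuple(code)

# Hypothetical coefficients, ordered from most to least significant.
coeffs = [56.0, -18.2, -18.2, 0.0, 0.0, 3.1, 9.9]
gc = grid_code(coeffs, d=6, lo=-64.0, hi=64.0, levels=8)
print(gc)
```

Returning a tuple makes the code hashable and directly comparable, which is convenient for sorting and lookup in the coarse classification stage.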
10
Grid code sorting and elimination
The list of training grid codes has to be sorted in ascending order of GC.
Redundancy occurs when training samples belonging to the same class have the same GC; such duplicated entries are eliminated.
In the test phase, when classifying a test sample, a reduced set of candidate classes can be retrieved from the lookup table according to the GC of the test sample.
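The sorted, duplicate-free lookup table and the candidate search can be sketched like this. It assumes that candidates are retrieved by exact match on the grid code, which the slides do not state explicitly; the function names and the sample data are hypothetical.

```python
from bisect import bisect_left, bisect_right

def build_lookup(samples):
    """Sorted, duplicate-free list of (grid_code, class_label) pairs."""
    # set() drops samples of the same class that share a grid code;
    # sorted() orders the entries by grid code, then by label.
    return sorted(set(samples))

def candidates(table, code):
    """Class labels whose training samples have exactly this grid code."""
    keys = [gc for gc, _ in table]
    lo, hi = bisect_left(keys, code), bisect_right(keys, code)
    return [label for _, label in table[lo:hi]]

# Hypothetical training data: (grid code, class label).
training = [
    ((3, 1, 2), "A"),
    ((3, 1, 2), "A"),  # duplicate within class A, removed by build_lookup
    ((3, 1, 2), "B"),
    ((0, 4, 4), "C"),
]
table = build_lookup(training)
print(candidates(table, (3, 1, 2)))
```

Because the table is sorted, the search is a binary search rather than a scan over all classes; only the returned candidates need to go through fine classification. In a real system the `keys` list would be built once, not on every query.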
11
4. The fine classification scheme
Mask generation
  A kind of template-matching method.
  The border bits are unreliable.
  Find the bits that are reliably black (or white).
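One way to find the reliable bits is to superimpose the training bitmaps of a class and keep the positions that are black (or white) in nearly all of them. The sketch below assumes this frequency-threshold rule; the slides do not say how reliability is decided. The bitmap encoding (1 = black, 0 = white), the mask encoding, and the `threshold` parameter are all assumptions.

```python
def make_masks(bitmaps, threshold=1.0):
    """Build (positive, negative) masks from equally sized 0/1 bitmaps.

    Assumed encoding: in the positive mask, 1 marks a reliably black bit;
    in the negative mask, 0 marks a reliably white bit.
    """
    n = len(bitmaps)
    rows, cols = len(bitmaps[0]), len(bitmaps[0][0])
    positive = [[0] * cols for _ in range(rows)]
    negative = [[1] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            black = sum(b[r][c] for b in bitmaps)
            if black / n >= threshold:        # black in (almost) every sample
                positive[r][c] = 1
            if (n - black) / n >= threshold:  # white in (almost) every sample
                negative[r][c] = 0
    return positive, negative

# Three hypothetical 3x3 samples of a vertical stroke with noisy borders.
samples = [
    [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    [[0, 1, 0], [0, 1, 1], [0, 1, 0]],
    [[0, 1, 0], [1, 1, 0], [0, 1, 0]],
]
pos_mask, neg_mask = make_masks(samples)
print(pos_mask)
print(neg_mask)
```

The two noisy border bits in the middle row end up in neither mask, which matches the idea that border bits are unreliable.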
12
Figure 3. Mask generation: (a) superimposed characters of “佛”, (b) the positive mask of “佛”, and (c) the negative mask of “佛”.
13
Bayes' classification
P(c_i | x): the probability that x belongs to class i when x is observed.
P(x | c_i): the probability of the feature being observed when the class is present.
P(c_i): the probability of that class being present.
P(x): the probability of feature x.
$$P(c_i \mid x) = \frac{P(x \mid c_i)\,P(c_i)}{P(x)}$$
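As a small worked example of Bayes' rule, the sketch below computes the posterior of each class from its likelihood and prior, obtaining P(x) by summing over classes (which assumes the classes are mutually exclusive and exhaustive). The numbers are made up for illustration.

```python
def posteriors(likelihoods, priors):
    """P(c_i | x) for every class, given P(x | c_i) and P(c_i)."""
    # P(x) by the law of total probability over all classes.
    evidence = sum(likelihoods[c] * priors[c] for c in priors)
    return {c: likelihoods[c] * priors[c] / evidence for c in priors}

# Hypothetical two-class problem.
priors = {"A": 0.25, "B": 0.75}
likelihoods = {"A": 0.6, "B": 0.2}
post = posteriors(likelihoods, priors)
print(post)
```

Here the rarer class A has the higher likelihood, and the two effects cancel: both posteriors come out equal.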
14
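Measures for mask matching
The degree of matching between an unknown character x and the positive mask of class i, $m_i^{+}$, can be defined by:
$$d(x, m_i^{+}) = \frac{M_b(x, m_i^{+})}{N_b(m_i^{+})}$$
Similarly, for the negative mask of class i, $m_i^{-}$:
$$d(x, m_i^{-}) = \frac{M_w(x, m_i^{-})}{N_w(m_i^{-})}$$
N_b(f): the number of black bits in bitmap f.
M_b(f, g): the number of black bits at the same positions in both f and g.
N_w and M_w are the corresponding counts for white bits.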
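The two matching degrees are ratios of bit counts, so they are easy to compute directly. The sketch below assumes bitmaps stored as lists of rows with 1 = black and 0 = white, a positive mask whose black bits are the reliable ones, and a negative mask whose white bits are the reliable ones; these encodings and the function names are assumptions.

```python
BLACK, WHITE = 1, 0

def count_bits(f, value):
    """Number of bits in bitmap f equal to value (N_b or N_w)."""
    return sum(1 for row in f for bit in row if bit == value)

def count_common(f, g, value):
    """Number of positions where both f and g equal value (M_b or M_w)."""
    return sum(1 for rf, rg in zip(f, g)
               for a, b in zip(rf, rg) if a == value and b == value)

def degree_positive(x, pos_mask):
    """Share of the positive mask's black bits that are also black in x."""
    return count_common(x, pos_mask, BLACK) / count_bits(pos_mask, BLACK)

def degree_negative(x, neg_mask):
    """Share of the negative mask's white bits that are also white in x."""
    return count_common(x, neg_mask, WHITE) / count_bits(neg_mask, WHITE)

# Hypothetical 3x3 example.
pos_mask = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
neg_mask = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
x = [[0, 1, 0], [0, 1, 1], [0, 0, 0]]
d_pos = degree_positive(x, pos_mask)
d_neg = degree_negative(x, neg_mask)
print(d_pos, d_neg)
```

Both functions divide by the number of reliable bits in the mask, so a mask with no reliable bits would need special handling before use.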
15
Def. 1. If x matches the positive mask $m_i^{+}$ of class i at the degree of α, i.e., $d(x, m_i^{+}) = \alpha$, we say that x α-matches the positive mask of class i, denoted by $x_i^{\alpha}$.
Def. 2. If x matches the negative mask $m_i^{-}$ of class i at the degree of β, i.e., $d(x, m_i^{-}) = \beta$, we say that x β-matches the negative mask of class i, denoted by $x_i^{\beta}$.
16
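Statistical mask-matching
The probability that x belongs to class i when $x_i^{\alpha}$ (x α-matching the positive mask of class i) is observed can be described by
$$P(c_i \mid x_i^{\alpha}) = \frac{P(x_i^{\alpha} \mid c_i)\,P(c_i)}{P(x_i^{\alpha})}$$
Similarly, for $x_i^{\beta}$ (x β-matching the negative mask of class i), we get
$$P(c_i \mid x_i^{\beta}) = \frac{P(x_i^{\beta} \mid c_i)\,P(c_i)}{P(x_i^{\beta})}$$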
17
Statistical decision rule
Rule AMP (Average Matching Probability)
$$E(x) = \arg\max_{1 \le i \le N} \left\{ \frac{P(c_i \mid x_i^{\alpha}) + P(c_i \mid x_i^{\beta})}{2} \right\}$$
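The AMP rule averages, for each candidate class, the posterior obtained from the positive-mask match and the posterior obtained from the negative-mask match, then picks the class with the largest average. A minimal sketch, assuming the two posteriors have already been computed and are stored in dictionaries keyed by class label (the values below are invented):

```python
def amp_decision(post_pos, post_neg):
    """Class with the highest average of the two matching posteriors."""
    return max(post_pos, key=lambda c: (post_pos[c] + post_neg[c]) / 2)

# Hypothetical posteriors for three candidate classes.
post_pos = {"A": 0.6, "B": 0.7, "C": 0.1}  # from the positive-mask match
post_neg = {"A": 0.9, "B": 0.5, "C": 0.2}  # from the negative-mask match
decision = amp_decision(post_pos, post_neg)
print(decision)
```

Class B wins on the positive mask alone, but averaging in the negative-mask evidence favours class A, which shows why the rule uses both masks.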
18
5. Experimental Results
A famous handwritten rare book, the Kin-Guan bible (金剛經).
  18,600 samples.
  640 classes.
19
Figure 4. Reduction and accuracy rate using our coarse classification scheme.
The best value of D is 6.
20
Figure 5. Accuracy rate using both coarse and fine classification.
A good reduction rate does not sacrifice the performance of fine classification.
21
Figure 6. Accuracy rate using both coarse and fine classification under different values of AMP.
22
6. Conclusions
The experimental results show that:
  The statistical mask-matching method is effective in recognizing handwritten Chinese characters.
  The good reduction rate provided by coarse classification does not sacrifice the performance of fine classification.
  The more confident the decision, the better the accuracy rate.
  By selecting features of strong confidence, classification accuracy could be further improved.