Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Alican Bozkurt Pınar Duygulu Şahin GRC 2013 Bilkent University

Upload: alican-bozkurt

Post on 26-Jun-2015


DESCRIPTION

Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform, by Alican Bozkurt

TRANSCRIPT

Page 1: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Alican Bozkurt, Pınar Duygulu Şahin, A. Enis Çetin

GRC 2013, Bilkent University

Page 2: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

OFR (optical font recognition) as a means: Optical Character Recognition (OCR)

• As of August 2010, there are 129,864,880 books in the world¹.
• Only 20 million of them have been digitized.
• Digitization ≠ Scanning
– Image vs. Context
– Additional processing
• Optical Character Recognition

¹ http://booksearch.blogspot.com/2010/08/books-of-world-stand-up-and-be-counted.html

Page 3: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

OFR as a means: Optical Character Recognition (OCR)

• Inter-typeface variability
– Vast number of typefaces (>50,000)
• OCR is like finding a needle in a haystack
• Knowing the font significantly reduces the size of the haystack

Page 4: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

OFR as an end: Dead Sea Scrolls

• Digitized by Google
• Currently 5 scrolls are available
• Classification of new scripts

Page 5: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

OFR as an end: Identifont

• Font search service
• Fonts are expensive! ($25-$1000)
• Finding cheaper alternatives:
– Museo (free), Adelle ($599)

Page 6: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

How to Recognize Fonts?

Local
• Information from individual letters
• Higher resolution (decision per word/letter)
• Needs OCR as preprocessing

Global
• Information from blocks of words
• Faster
• Lower resolution (decision per block)

Page 7: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Dual Tree Complex Wavelet Transform (DT-CWT)

Page 8: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Dual Tree Complex Wavelet Transform (DT-CWT)

• Why CWT?
– Directional selectivity

[Table: DWT vs. CWT, row "Directionally selective"; the cell marks were not preserved. Surviving cell text: "90, 45(?), 0 (deg)" and "Real".]

Page 9: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Dual Tree Complex Wavelet Transform (DT-CWT)

• Why CWT?
– Directional selectivity
– Shift invariance

[Table: DWT vs. CWT, rows "Directionally selective" and "Shift invariant"; the cell marks were not preserved.]
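The shift-invariance point can be illustrated with a small sketch. It does not implement the DT-CWT; it only shows the problem on the real, decimated side, using one level of a Haar DWT (chosen here for brevity, not taken from the slides). The signals and the helper name `haar_detail` are illustrative assumptions.

```python
import numpy as np

def haar_detail(x):
    """One level of the decimated Haar DWT: detail (high-pass) coefficients."""
    x = np.asarray(x, dtype=float)
    return (x[0::2] - x[1::2]) / np.sqrt(2.0)

# A step edge at an even index, and the same edge shifted by one sample.
step_even = np.array([0, 0, 0, 0, 1, 1, 1, 1])
step_odd = np.array([0, 0, 0, 0, 0, 1, 1, 1])

# Energy of the detail coefficients for each version of the edge.
energy_even = float(np.sum(haar_detail(step_even) ** 2))
energy_odd = float(np.sum(haar_detail(step_odd) ** 2))

print(energy_even, energy_odd)
```

The edge is invisible to the detail band in the first case and carries energy of about 0.5 in the second, although the input moved by only one sample. Statistics computed from such coefficients would change with the position of the text, which is the behaviour a shift-invariant transform is meant to avoid.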

Page 10: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Demonstration

• Train images
– Printscreens
– No noise
– White background
– ~1900x750 px image size
– 168x480 px sample size
– One paragraph per font

• Test image
– Random image for “typewriter”
– Real noise
– Colored background
– 1169x1142 px image size
– 96x96 px sample size

Page 11: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Demonstration

• Smaller subsample size
– Different height/width ratio
• Noise
• Different background
• Not exact font
• 96% success rate (125/130)
– Blue: Courier New Regular
– Red: Bookman Regular

Page 12: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Demonstration

Test image

Train image for “Courier New regular”

Train image for “Bookman regular”

Page 13: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Feature extraction

Step 0

• Input Image

Page 14: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Feature extraction

Step 0

• Input Image

Step 1

• Convert Image to binary using Otsu’s method
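Step 1 can be sketched as below. This is a minimal NumPy version of Otsu's threshold selection (maximising the between-class variance of the gray-level histogram), assuming an 8-bit grayscale input; the function names `otsu_threshold` and `binarize` are illustrative and not taken from the slides.

```python
import numpy as np

def otsu_threshold(gray):
    """Return an Otsu threshold for an 8-bit grayscale image (values 0..255)."""
    pixels = np.asarray(gray, dtype=np.uint8).ravel()
    hist = np.bincount(pixels, minlength=256).astype(float)
    prob = hist / hist.sum()
    levels = np.arange(256)

    omega = np.cumsum(prob)           # probability of the "dark" class for each threshold
    mu = np.cumsum(prob * levels)     # cumulative first moment
    mu_total = mu[-1]

    # Between-class variance; thresholds that leave one class empty get 0.
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = np.where(denom > 0, (mu_total * omega - mu) ** 2 / denom, 0.0)
    return int(np.argmax(sigma_b))

def binarize(gray):
    """Boolean mask that is True where a pixel is brighter than the threshold."""
    return np.asarray(gray) > otsu_threshold(gray)

# Toy image: left half dark (50), right half bright (200).
toy = np.array([[50] * 4 + [200] * 4] * 4, dtype=np.uint8)
mask = binarize(toy)
```

For the toy image any threshold between the two gray levels separates the halves, so `mask` is True exactly on the bright half. Whether text ends up as True or False depends on the polarity of the scan, which the slides do not specify.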

Page 15: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Feature extraction

Step 0

• Input Image

Step 1

• Convert Image to binary using Otsu’s method

Step 2

• Divide the image into subsamples
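Step 2 might look like the sketch below. The slides give sample sizes (for example 96x96 px) but not how the windows are placed, so non-overlapping tiles with incomplete borders dropped are an assumption here, as is the function name `split_into_subsamples`.

```python
import numpy as np

def split_into_subsamples(image, tile_h, tile_w):
    """Cut a 2-D image into non-overlapping tile_h x tile_w blocks.

    Rows and columns that do not fill a whole tile are dropped (assumed policy).
    """
    image = np.asarray(image)
    n_rows = image.shape[0] // tile_h
    n_cols = image.shape[1] // tile_w
    tiles = []
    for r in range(n_rows):
        for c in range(n_cols):
            tiles.append(image[r * tile_h:(r + 1) * tile_h,
                               c * tile_w:(c + 1) * tile_w])
    return tiles

# Illustrative 200x300 image cut into 96x96 subsamples.
demo = np.arange(200 * 300).reshape(200, 300)
tiles = split_into_subsamples(demo, 96, 96)
print(len(tiles))  # 6 (2 rows x 3 columns of tiles)
```

Each tile is then classified on its own, which is what gives the global approach one decision per block.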

Page 16: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Feature extraction

[Figure: a subsample and its DT-CWT subbands at Level 1, Level 2, and Level 3; panels are labeled by level (1 to 3) and angle (75, 45, 15)]

Step 0

• Input Image

Step 1

• Convert Image to binary using Otsu’s method

Step 2

• Divide the image into subsamples

For each subsample

• 3-level DT-CWT

Page 17: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Feature Extraction

[Figure: Level 1, Level 2, and Level 3 DT-CWT outputs of a subsample, annotated with their statistics (μ1, σ1), (μ2, σ2), (μ3, σ3)]

Example values shown on the slide (six values per row; the row labels were lost in extraction):

0.082091 0.084891 0.060045 0.080689 0.085836 0.060873
0.14791 0.15201 0.11201 0.14617 0.15402 0.11424
0.22597 0.24064 0.11976 0.23731 0.24072 0.12753
0.36203 0.35692 0.17401 0.37765 0.34842 0.19024
0.49943 0.54883 0.35954 0.55623 0.56736 0.30949
0.6949 0.65361 0.46078 0.72141 0.68851 0.39779

Φ = [μ1, μ2, μ3, σ1, σ2, σ3] (1x36 feature vector)

Step 0

• Input Image

Step 1

• Convert Image to binary using Otsu’s method

Step 2

• Divide the image into subsamples

For each subsample

• 3-level DT-CWT

Step 4

• Mean and std

Step 5

• Concatenate
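Steps 4 and 5 can be sketched as follows. The sketch assumes that each of the 3 levels yields 6 complex subbands stored as an array of shape (rows, cols, 6), and that the statistics are taken over coefficient magnitudes; both are assumptions, as are the function name `dtcwt_features` and the random stand-in coefficients used in place of a real transform.

```python
import numpy as np

def dtcwt_features(highpasses):
    """Build Φ = [μ1, μ2, μ3, σ1, σ2, σ3] from three levels of subbands.

    highpasses: sequence of 3 complex arrays, each of shape (rows, cols, 6),
    one slice per orientation (assumed layout).
    """
    # Per-level mean and standard deviation of the magnitudes, one per subband.
    means = [np.abs(h).mean(axis=(0, 1)) for h in highpasses]
    stds = [np.abs(h).std(axis=(0, 1)) for h in highpasses]
    # Concatenate all means first, then all standard deviations.
    return np.concatenate(means + stds)

# Stand-in coefficients (random, NOT a real DT-CWT) with halving resolution.
rng = np.random.default_rng(0)
fake_levels = [
    rng.standard_normal((s, s, 6)) + 1j * rng.standard_normal((s, s, 6))
    for s in (48, 24, 12)
]

phi = dtcwt_features(fake_levels)
print(phi.shape)  # (36,)
```

Three levels times six subbands times two statistics gives the 36 entries stated on the slide. One such vector per subsample is what the classifier receives.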

Page 18: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Results: English Font Recognition

• Dataset
– Printscreen, Small natural noise, Artificial noise, Large natural noise
– 1 paragraph per font/emphasis pair
– 8 fonts: Arial, Bookman, Century Gothic, Comic Sans, Courier, Computer Modern, Impact, Times New Roman

Page 19: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Results: English Font Recognition

• Competing methods

Algorithm | Preprocessing | Subsampling | Feature | Classifier
Proposed | Otsu’s method | Variable | Mean, std of CWT | SVM (one against one)
Aviles-Cruz | Text line detection, normalization, texture formation | 100 random 64x64 subsamples | Skewness & kurtosis | EM-trained Bayes classifier
Ramanathan | Normalization, Otsu’s method | 3x3 grid | Mean, std, max of Gabor responses | SVM (one against all)
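The two multiclass SVM schemes named in the table differ in how many binary classifiers they train. A small sketch for the 8 English fonts of the dataset (the list literal and variable names are illustrative):

```python
from itertools import combinations

fonts = ["Arial", "Bookman", "Century Gothic", "Comic Sans",
         "Courier", "Computer Modern", "Impact", "Times New Roman"]

# One against one: a binary SVM for every unordered pair of fonts.
one_vs_one = list(combinations(fonts, 2))

# One against all: a binary SVM per font, separating it from the rest.
one_vs_all = [(font, "rest") for font in fonts]

print(len(one_vs_one), len(one_vs_all))  # 28 8
```

With k classes the counts are k(k-1)/2 and k respectively. Each pairwise classifier sees only two fonts' samples, while each one-against-all classifier sees the whole training set.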

Page 20: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Results: English Font Recognition

Low Natural Noise (recognition rate, %)

Font | Proposed | Aviles-Cruz | Ramanathan
A | 96.88 | 81.75 | 100
B | 100 | 87 | 100
CG | 98.45 | 69.75 | 97.22
CS | 100 | 75.5 | 100
C | 100 | 96.25 | 100
I | 100 | 99 | 100
M | 100 | 97 | 100
T | 100 | 91 | 100
Mean | 99.41625 | 87.15625 | 99.6525
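The Mean row is consistent with an unweighted average over the eight fonts. A quick check for the Proposed column, with the values copied from the table above (the dictionary name is illustrative):

```python
# Per-font recognition rates (%) of the proposed method, low natural noise.
proposed = {"A": 96.88, "B": 100, "CG": 98.45, "CS": 100,
            "C": 100, "I": 100, "M": 100, "T": 100}

# Unweighted mean over fonts, to compare with the Mean row (99.41625).
mean_rate = sum(proposed.values()) / len(proposed)
print(round(mean_rate, 5))
```

The same calculation applied to the other two columns gives 87.15625 and 99.6525.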

[Bar chart: recognition rate (%) per font (A, B, CG, CS, C, I, M, T) and mean, Low Natural Noise; series: Proposed, Aviles-Cruz, Ramanathan]

Page 21: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Results: English Font Recognition

Low Natural Noise + Artificial Noise (recognition rate, %)

Font | Proposed | Aviles-Cruz | Ramanathan
A | 95.31 | 78.25 | 97.22
B | 100 | 83 | 100
CG | 98.44 | 67.5 | 97.22
CS | 100 | 73 | 100
C | 100 | 91.5 | 97.22
I | 98.44 | 98.5 | 100
M | 100 | 91.25 | 100
T | 98.44 | 79.25 | 97.22
Mean | 98.82875 | 82.78125 | 98.61

[Bar chart: recognition rate (%) per font (A, B, CG, CS, C, I, M, T) and mean, Low Natural Noise + Artificial Noise; series: Proposed, Aviles-Cruz, Ramanathan]

Page 22: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Results: English Font Recognition

High Natural Noise (recognition rate, %)

Font | Proposed | Aviles-Cruz | Ramanathan
A | 98.44 | - | 91.67
B | 98.44 | - | 88.89
CG | 92.19 | - | 94.44
CS | 100 | - | 97.22
C | 100 | - | 94.44
I | 100 | - | 94.44
M | 98.44 | - | 88.88
T | 98.44 | - | 100
Mean | 98.24375 | - | 93.7475

[Bar chart: recognition rate (%) per font (A, B, CG, CS, C, I, M, T) and mean, High Natural Noise; series: Proposed, Aviles-Cruz, Ramanathan]

Page 23: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Results: English Font Recognition

Recognition Means (%)

Dataset | Proposed | Aviles-Cruz | Ramanathan
Printscreen | 100 | 100 | 100
Low Natural Noise | 99.41625 | 87.15625 | 99.6525
Low Natural Noise + artificial noise | 98.82875 | 82.78125 | 98.61
High Natural Noise | 98.24375 | - | 93.7475

Page 24: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Results: Farsi Font Recognition

• Dataset
– Small natural noise
– 1 paragraph per font/emphasis pair
– 10 fonts: Homa, Lotus, Mitra, Nazanin, Tahoma, Times New Roman, Titr, Traffic, Yaghut, and Zar

[Figure: sample texts [a][b][c]. a: Lotus italic, b: Homa bold italic, c: Times New Roman bold]

Page 25: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Results: Farsi Font Recognition

• Competing methods

Algorithm | Preprocessing | Subsampling | Feature | Classifier
Proposed | Otsu’s method | Variable | Mean, std of CWT | SVM (one against one)
Khosravi and Kabir | Text line detection, normalization, texture formation | 4x4 grid | Mean, std of Sobel-Roberts | AdaBoost
Senobari and Khosravi | Yes, but not explained | 128x128 size subsamples | PCA of Sobel, Roberts, Symlet wavelets | MLP classifier

Page 26: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Results: Farsi Font Recognition

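Low Natural Noise (recognition rate, %)

Font | Proposed | Khosravi | Senobari
L | 92.2 | 92.2 | 90.7
M | 95.3 | 93.4 | 93.7
N | 90.6 | 85.2 | 92
TR | 98.4 | 97.6 | 95.9
Y | 96.9 | 97.6 | 98.5
Z | 92.2 | 87.4 | 90.9
H | 100 | 99.2 | 99.8
TI | 100 | 95.2 | 97
T | 100 | 96.6 | 98.3
TN | 98.4 | 97.2 | 98.8
Mean | 96.41 | 94.16 | 95.56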

[Bar chart: recognition rate (%) per font (L, M, N, TR, Y, Z, H, TI, T, TN) and mean, Low Natural Noise; series: Proposed, Khosravi, Senobari]

Page 27: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Results: Arabic Font Recognition

• Dataset
– ALPH-REGIM database
– 749 samples of different sizes/lengths
– 10 fonts: Ahsa, Andalus, Arabic_transparant, Badr, Buryidah, Dammam, Hada, Kharj, Koufi, Naskh

[Figure: sample texts [a][b][c][d]. a: Ahsa, b: Badr, c: Naskh, d: Dammam]

Page 28: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Results: Arabic Font Recognition

• Competing methods

Algorithm | Preprocessing | Subsampling | Feature | Classifier
Proposed | Otsu’s method | Variable | Mean, std of CWT | SVM (one against all)
Ben Moussa | No | No | Fractal based | NN

Page 29: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Results: Arabic Font Recognition

ALPH-REGIM Database (recognition rate, %)

Font | Proposed | Ben Moussa
AH | 99.633 | 94
AN | 98.1595 | 94
AT | 99.734 | 92
B | 99.5968 | 100
BU | 98.2955 | 100
D | 99.8592 | 100
H | 90.4424 | 100
K | 90.4037 | 88
KO | 99.3478 | 98
N | 98.2418 | 98
Mean | 97.3714 | 96.4

[Bar chart: recognition rate (%) per font (AH, AN, AT, B, BU, D, H, K, KO, N) and mean, ALPH-REGIM Database; series: Proposed, Ben Moussa]

Page 30: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Results: Speed Test

Page 31: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Results: Ottoman Style Recognition

• Dataset
– Ottoman Archives
– 6 pages per style
– Different backgrounds
– 5 styles: Divani, Nesih, Matbu, Talik, Rika

[Figure: sample pages [a][b][c][d][e]. a: Divani, b: Matbu, c: Nesih, d: Rika, e: Talik]

Page 32: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Results: Ottoman Font Recognition

Page 33: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Conclusion

• New feature for font recognition:
– Mean and std of 3-level CWT
– Higher accuracy than the state of the art on English, Farsi, and Arabic fonts
– Faster than the state of the art
– Robust to noise
– Performs well on Ottoman texts

Page 34: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

References

[1] Abuhaiba, I., 2004. Arabic font recognition using decision trees built from common words. Journal of Computing and Information Technology 13 (3), 211–224.
[2] Amin, A., 1998. Off-line Arabic character recognition: the state of the art. Pattern Recognition 31 (5), 517–530.
[3] Aviles-Cruz, C., Rangel-Kuoppa, R., Reyes-Ayala, M., Andrade-Gonzalez, A., Escarela-Perez, R., 2005. High-order statistical texture analysis-font recognition applied. Pattern Recognition Letters 26 (2), 135–145.
[4] Ben Moussa, S., Zahour, A., Benabdelhafid, A., Alimi, A., 2008. Fractal-based system for Arabic/Latin, printed/handwritten script identification. In: Pattern Recognition, 2008. ICPR 2008. 19th International Conference on. IEEE, pp. 1–4.
[5] Borji, A., Hamidi, M., 2007. Support vector machine for Persian font recognition. International Journal of Intelligent Systems and Technologies, 184–187.
[6] Boser, B., Guyon, I., Vapnik, V., 1992. A training algorithm for optimal margin classifiers. In: Proceedings of the Fifth Annual Workshop on Computational Learning Theory. ACM, pp. 144–152.
[7] Cai, S., Li, K., Selesnick, I., n.d. Matlab implementation of wavelet transforms. Tech. rep., Polytechnic University.
[8] Chang, C., Lin, C., 2011. LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST) 2 (3), 27.
[9] Chaudhuri, B., Garain, U., 1998. Automatic detection of italic, bold and all-capital words in document images. In: Pattern Recognition, 1998. Proceedings. Fourteenth International Conference on. Vol. 1. IEEE, pp. 610–612.
[10] Cortes, C., Vapnik, V., Sep. 1995. Support-vector networks. Mach. Learn. 20 (3), 273–297.
[11] Duan, K., Keerthi, S., 2005. Which is the best multiclass SVM method? An empirical study. Multiple Classifier Systems, 732–760.
[12] Hsu, C., Chang, C., Lin, C., et al., 2003. A practical guide to support vector classification.
[13] Jung, M., Shin, Y., Srihari, S., 1999. Multifont classification using typographical attributes. In: Document Analysis and Recognition, 1999. ICDAR’99. Proceedings of the Fifth International Conference on. IEEE, pp. 353–356.
[14] Khosravi, H., Kabir, E., 2010. Farsi font recognition based on Sobel-Roberts features. Pattern Recognition Letters 31 (1), 75–82.
[15] Kingsbury, N., 1997. Image processing with complex wavelets. Phil. Trans. Royal Society London A 357, 2543–2560.
[16] Kingsbury, N., 1998. The dual-tree complex wavelet transform: a new efficient tool for image restoration and enhancement. In: Proc. EUSIPCO. Vol. 98. pp. 319–322.
[17] Kingsbury, N., 2000. A dual-tree complex wavelet transform with improved orthogonality and symmetry properties. In: Image Processing, 2000. Proceedings. 2000 International Conference on. Vol. 2. IEEE, pp. 375–378.
[18] Ma, H., Doermann, D., 2003. Gabor filter based multi-class classifier for scanned document images. In: 7th International Conference on Document Analysis and Recognition (ICDAR). pp. 968–972.
[19] Otsu, N., 1979. A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man and Cybernetics 9 (1), 62–66.
[20] Petkov, N., Wieling, M., 2008. Gabor filter for image processing and computer vision. Tech. rep., University of Groningen.
[21] Ramanathan, R., Soman, K., Thaneshwaran, L., Viknesh, V., Arunkumar, T., Yuvaraj, P., Oct. 2009. A novel technique for English font recognition using support vector machines. In: Advances in Recent Technologies in Communication and Computing, 2009. ARTCom ’09. International Conference on. pp. 766–769.
[22] Rashedi, E., Nezamabadi-pour, H., Saryzadi, S., 2007. Farsi font recognition using correlation coefficients (in Farsi). In: 4th Conf. on Machine Vision and Image Processing, Ferdosi Mashhad.
[23] Selesnick, I., Baraniuk, R., Kingsbury, N., 2005. The dual-tree complex wavelet transform. Signal Processing Magazine, IEEE 22 (6), 123–151.
[24] Villegas-Cortez, J., Aviles-Cruz, C., 2005. Font recognition by invariant moments of global textures. In: Proceedings of International Workshop VLBV05 (Very Low Bit-Rate Video-Coding 2005). pp. 15–16.
[25] Zhu, Y., Tan, T., Wang, Y., Oct. 2001. Font recognition based on global texture analysis. IEEE Trans. Pattern Anal. Mach. Intell. 23 (10), 1192–1200.
[26] Zramdini, A., Ingold, R., 1998. Optical font recognition using typographical features. IEEE Transactions on Pattern Analysis and Machine Intelligence 20, 877–882.

Page 35: Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform