
    Region Analysis of Business Card Images Acquired in PDA Using DCT and Information Pixel Density

    Ick Hoon Jang¹, Chong Heun Kim², and Nam Chul Kim³

    ¹ Kyungwoon University, Dep. of Electronic Engineering, Gumi 730-850, [email protected]

    ² LG Electronics Co., Ltd., Products Research Institute, Gumi 730-820, [email protected]

    ³ Kyungpook Nat'l University, Dep. of Electronic Engineering, Daegu 702-701, Korea, [email protected]

    Abstract. In this paper, we present a method of region analysis for business card images acquired in a PDA (personal digital assistant) using DCT and information pixel (IP) density. The proposed method consists of three parts: region segmentation, information region (IR) classification, and character region (CR) classification. In the region segmentation, an input business card image is partitioned into 8 × 8 blocks and the blocks are classified into information blocks (IBs) and background blocks (BBs) by a normalized DCT energy. The input image is then segmented into IRs and background regions (BRs) by region labeling on the classified blocks. In the IR classification, each IR is classified into CR or picture region (PR) by using a ratio of DCT energy of edges in horizontal and vertical directions to DCT energy of low frequency components and a density of IPs. In the CR classification, each CR is classified into large CR (LCR) or small CR (SCR) by using the density of IPs and an averaged run-length of IPs. Experimental results show that the proposed region analysis yields good performance for test images of several types of business cards acquired in a PDA under various surrounding conditions. In addition, error rates of the proposed method are shown to be 2.2–10.1% lower in region segmentation and 7.7% lower in IR classification than those of the conventional methods.

    1 Introduction

    Business cards have long been used by professionals as a means of advertisement. Recently, as personal publicity has come to be regarded as important, the use of business cards has extended to the general public. Accordingly, people receive more business cards from others and need an efficient way to manage them instead of carrying all of them. Until now, people have usually managed business cards by putting them directly in a business card book or by noting their information in a memo pad. A hand-held PDA, widely used in recent days, can easily obtain an image of a business card by digitizing it with its built-in camera. It can also recognize characters in an image and store the recognized characters. So managing the information of a business card using a PDA may be more efficient.

    J. Blanc-Talon et al. (Eds.): ACIVS 2005, LNCS 3708, pp. 243–251, 2005. © Springer-Verlag Berlin Heidelberg 2005

    Business cards are generally composed of characters such as logotype, name, affiliation, address, phone number, and e-mail address; pictures such as photograph, symbol mark, and line; and background. So if a region analysis that divides a business card image into CRs, PRs, and BRs is performed, then any subsequent processing for business card management may be much more efficient. Many region analysis methods have been proposed so far, most of them for document images [1]–[10]. In [4]–[6], a document image is first partitioned into blocks, the blocks are then classified into IBs containing characters or pictures and BBs using a block activity, and the image is finally divided into IRs and BRs. As a block activity, variance of a block [4], edge information in a block [5], or DCT energy in a block [6] is used. Alternatively, a document image is first binarized into IPs and background pixels (BPs), and then divided into IRs and BRs using run-length smoothing [1] or projection profiles of the binarized image [7]. Besides, an IR is classified into CRs and PRs by using adjacency of character strings [8], repetition of character strings [9], or distribution of IPs in its binarized region [10]. In [11], an extraction of text lines for business card images acquired by scanners has been proposed.

    Since document images are usually acquired by high resolution scanners, they usually have regular illumination, and the intensity distributions in their local regions are nearly uniform. They also have many character strings at regular positions and pictures somewhat isolated from their adjacent characters. On the other hand, business card images acquired in a PDA with its built-in camera usually have lower resolution. In addition, they may often have irregular illumination and shadows due to acquisition under unstable hand-held conditions. So their intensity distributions in local regions may not be uniform but may vary severely. Moreover, they have a low average density of characters, the sizes of their characters often vary across a few lines at irregular positions, and their pictures often lie close to adjacent characters. Thus the performance of conventional region analysis methods for document images may deteriorate on business card images acquired in a PDA. In this paper, we present a method of region analysis for business card images acquired in a PDA that considers these characteristics of business card images.

    2 Proposed Region Analysis

    2.1 Region Segmentation Using DCT

    In the region segmentation, an input image is first partitioned into blocks and the blocks are classified into IBs and BBs based on a block activity using DCT. We determine the block size as 8 × 8 by considering the averaged density and size of characters in business card images and define the block activity as the block energy given by the absolute sum of low frequency DCT coefficients in the block. We also normalize the block energy by the RMS (root mean square) of the block signal, which compensates for the severe intensity variation in local regions. Thus the block activity of the $k$th block can be written as

    $$E_N^k = \frac{1}{\sqrt{\frac{1}{64}\sum_{i=0}^{7}\sum_{j=0}^{7}\left(x_{ij}^k\right)^2}} \sum_{\substack{0 \le u,v \le 7 \\ u+v \le 3 \\ (u,v) \ne (0,0)}} \left|D_{uv}^k\right| \qquad (1)$$

    where $x_{ij}^k$ and $D_{uv}^k$ denote the intensity value of pixel $(i, j)$ and the DCT coefficient of frequency $(u, v)$ at the $k$th block, respectively. So the classification of the $k$th block using (1) can be represented as

    $$\text{Decide IB if } E_N^k \ge Th_E; \text{ otherwise decide BB} \qquad (2)$$

    where $Th_E$ denotes a threshold. In this paper, $Th_E$ is determined as the average of $E_N^k$ over the entire image. After the block classification, the input image is then segmented into IRs and BRs by region labeling on the classified blocks.
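The block activity of (1) and the decision rule of (2) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: it assumes an orthonormal DCT-II, reads the low-frequency constraint as u + v ≤ 3 and the decision as "greater than or equal to the threshold", and the function names (`dct2`, `block_activity`, `classify_blocks`) and the zero-RMS guard are hypothetical choices made for this example.

```python
import math

N = 8  # block size used in the paper


def dct2(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of rows."""
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for i in range(N):
                for j in range(N):
                    total += (block[i][j]
                              * math.cos((2 * i + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * j + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = alpha(u) * alpha(v) * total
    return coeffs


def block_activity(block):
    """Normalized DCT energy of Eq. (1) for one 8 x 8 block."""
    rms = math.sqrt(sum(x * x for row in block for x in row) / (N * N))
    if rms == 0.0:  # assumption: an all-zero block has no activity
        return 0.0
    d = dct2(block)
    low = sum(abs(d[u][v]) for u in range(N) for v in range(N)
              if u + v <= 3 and (u, v) != (0, 0))
    return low / rms


def classify_blocks(blocks):
    """Rule (2): 'IB' if activity >= mean activity over all blocks, else 'BB'."""
    energies = [block_activity(b) for b in blocks]
    th_e = sum(energies) / len(energies)
    return ['IB' if e >= th_e else 'BB' for e in energies]


# Usage: a flat background block versus a block containing a sharp edge.
flat = [[128] * 8 for _ in range(8)]
edge = [[0] * 4 + [255] * 4 for _ in range(8)]
print(classify_blocks([flat, edge]))  # ['BB', 'IB']
```

Dividing by the block RMS makes the activity insensitive to a uniform brightening or darkening of the block, which is the stated purpose of the normalization.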

    Figure 1(a) shows an ordinary 640 × 480 business card image with complex background acquired in a PDA. Figure 1(b) shows the result of block classification for the image of Fig. 1(a), where gray parts represent IBs and black ones BBs. As shown in Fig. 1(b), most of the blocks are well classified. However, there are some isolated IB regions in the upper left part which are actually BB regions. The isolated IB regions are eliminated in the region labeling. Figure 1(c) shows the result of the elimination of isolated IB regions for the image of Fig. 1(b). From Fig. 1(c), one can see that almost all of the isolated IB regions are eliminated so that the image is well segmented into IRs and BRs.
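The region labeling step with elimination of isolated IB regions can be illustrated as follows. The paper does not state the connectivity or the criterion for calling a region "isolated", so this sketch assumes 8-connectivity and a hypothetical minimum region size `min_blocks`; the function names are also illustrative.

```python
from collections import deque


def label_regions(ib_map):
    """8-connected labeling of a 2-D block map (truthy = IB, falsy = BB).

    Returns (labels, sizes): labels[r][c] is 0 for BBs and a positive region
    label for IBs; sizes maps each label to its number of blocks.
    """
    rows, cols = len(ib_map), len(ib_map[0])
    labels = [[0] * cols for _ in range(rows)]
    sizes = {}
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if not ib_map[r][c] or labels[r][c]:
                continue
            next_label += 1
            labels[r][c] = next_label
            queue, count = deque([(r, c)]), 0
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                count += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and ib_map[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
            sizes[next_label] = count
    return labels, sizes


def remove_isolated(ib_map, min_blocks=2):
    """Relabel IB regions smaller than min_blocks (an assumed criterion) as BB."""
    labels, sizes = label_regions(ib_map)
    return [[bool(lab) and sizes[lab] >= min_blocks for lab in row]
            for row in labels]


# Usage: a three-block region is kept, a single stray IB is removed.
block_map = [[1, 1, 0, 0],
             [1, 0, 0, 0],
             [0, 0, 0, 1]]
cleaned = remove_isolated(block_map)
print(cleaned[2][3])  # False
```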

    2.2 Information Region Classification Using DCT and Information Pixel Density

    Among IRs, CRs usually have strong edges in horizontal and vertical directions. On the contrary, PRs do not have such strong edges. PRs also have higher energies in their low frequency bands and higher IP densities in their blocks compared to the CRs. Based on these characteristics, the IRs are classified into CRs and PRs. In the IR classification, the segmented IRs are first partitioned into blocks of 8 × 8 size for locally adaptive classification, which may be advantageous for discriminating CRs from PRs. The energy of horizontal edges in each block is computed only with DCT coefficients of horizontal frequency components. Similarly, the energy of vertical edges is also computed. The energy of low frequency components is computed with several low frequency DCT coefficients. Thus the energy of edges in horizontal and vertical directions at the $k$th block in the $m$th segmented IR, $E_E^{m,k}$, and the energy of low frequency components, $E_L^{m,k}$, can be represented as

    $$E_E^{m,k} = \sum_{u=1}^{7} \left|D_{u0}^{m,k}\right| + \sum_{v=1}^{7} \left|D_{0v}^{m,k}\right| \qquad (3)$$

    $$E_L^{m,k} = \sum_{\substack{0 \le u,v \le 7 \\ u+v \le 2 \\ (u,v) \ne (0,0)}} \left|D_{uv}^{m,k}\right| \qquad (4)$$

    where $D_{uv}^{m,k}$ denotes the DCT coefficient of frequency $(u, v)$ at the $k$th block in the $m$th segmented IR.

    Fig. 1. An ordinary business card image with complex background and the results of the proposed region analysis. (a) Original image, (b) block classification, (c) elimination of isolated IB regions, (d) IR classification, and (e) CR classification.

    Next, considering that CRs have high energy of edges in horizontal and vertical directions and PRs have high energy of low frequency components, the ratio of the energy of edges $E_E^{m,k}$ to the energy of low frequency components $E_L^{m,k}$ is computed at the $k$th block and the ratio is then averaged over the entire $m$th segmented IR as

    $$R_E^m = \left\langle \frac{E_E^{m,k}}{E_L^{m,k}} \right\rangle \qquad (5)$$

    where $\langle \cdot \rangle$ denotes the average of the quantity.

    In order to compute the density of IPs in a segmented IR, each IR is binarized with a threshold selected by Otsu's threshold selection method [12]. In a binarized IR, black pixels are IPs and white ones are BPs. Then the density of IPs in the $m$th segmented IR is given as

    $$D_{IP}^m = \frac{N_{IP}^m}{N_{IP}^m + N_{BP}^m} \qquad (6)$$

    where $N_{IP}^m$ and $N_{BP}^m$ denote the number of IPs and that of BPs in the $m$th segmented IR, respectively.
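The IP density of (6) can be sketched as below, assuming 8-bit gray levels and that pixels at or below the Otsu threshold count as black (information) pixels. The functions `otsu_threshold` and `ip_density` are illustrative names for this example, and the Otsu routine is a plain histogram-based version of the method cited as [12] rather than the authors' code.

```python
def otsu_threshold(pixels):
    """Otsu threshold for 8-bit gray levels: maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(g * h for g, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t in range(256):
        w0 += hist[t]            # number of pixels with level <= t
        sum0 += t * hist[t]      # gray-level sum of those pixels
        w1 = total - w0
        if w0 == 0 or w1 == 0:   # one class is empty: variance undefined
            continue
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def ip_density(region):
    """Eq. (6): fraction of information (dark) pixels in a gray-level region."""
    pixels = [p for row in region for p in row]
    t = otsu_threshold(pixels)
    n_ip = sum(1 for p in pixels if p <= t)  # assumption: dark pixels are IPs
    n_bp = len(pixels) - n_ip
    return n_ip / (n_ip + n_bp)


# Usage: one dark column of four pixels in a bright 4 x 4 region.
region = [[20, 200, 200, 200] for _ in range(4)]
print(ip_density(region))  # 0.25
```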


    Using the ratio of energy $R_E^m$ and the density of IPs $D_{IP}^m$, the $m$th segmented IR is classified into CR or PR as

    $$\text{Decide CR if } R_E^m \ge Th_R \text{ and } D_{IP}^m \le Th_D; \text{ otherwise decide PR} \qquad (7)$$

    where $Th_R$ and $Th_D$ denote thresholds. In this paper, $Th_R$ is determined as the average of $R_E^m$ over the entire image and $Th_D$ is experimentally determined. Hollows and holes in each PR are filled using the run-length smoothing method in [1]. Figure 1(d) shows the map image of the result of IR classification. Dark gray parts represent CRs, bright gray ones PRs, and black ones BRs. The result of IR classification is given as a region map image to discriminate the CRs from the PRs. As shown in Fig. 1(d), the IRs are well classified into CRs and PRs.
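The energy ratio of (3) to (5) and the decision rule of (7) can be sketched as follows. The sketch takes 8 × 8 arrays of DCT coefficients as input and assumes that a CR is declared when the ratio is at least its image-wide average and the IP density does not exceed its threshold, in line with the statement that PRs have higher low-frequency energy and higher IP density. The small constant `eps` guarding against division by zero and all function names are assumptions of this example, not part of the paper.

```python
def edge_energy(d):
    """Eq. (3): absolute sum of the first-column and first-row AC coefficients."""
    return (sum(abs(d[u][0]) for u in range(1, 8))
            + sum(abs(d[0][v]) for v in range(1, 8)))


def low_freq_energy(d):
    """Eq. (4): absolute sum over u + v <= 2, excluding the DC term."""
    return sum(abs(d[u][v]) for u in range(8) for v in range(8)
               if u + v <= 2 and (u, v) != (0, 0))


def energy_ratio(dct_blocks, eps=1e-12):
    """Eq. (5): block-wise ratio averaged over all blocks of one IR.

    eps is a hypothetical guard for blocks with zero low-frequency energy.
    """
    ratios = [edge_energy(d) / (low_freq_energy(d) + eps) for d in dct_blocks]
    return sum(ratios) / len(ratios)


def classify_irs(features, th_d):
    """Rule (7) on a list of (R_E, D_IP) pairs, one per IR of an image.

    Th_R is the mean of R_E over the image; th_d must be chosen experimentally.
    """
    th_r = sum(re for re, _ in features) / len(features)
    return ['CR' if re >= th_r and dip <= th_d else 'PR'
            for re, dip in features]


# Usage with two synthetic coefficient blocks (values are made up).
d1 = [[0.0] * 8 for _ in range(8)]
d1[0][1], d1[0][5], d1[1][0] = 10.0, 10.0, 5.0   # energy along the axes
d2 = [[0.0] * 8 for _ in range(8)]
d2[1][1], d2[0][1] = 10.0, 2.0                   # mostly low-frequency energy
features = [(energy_ratio([d1]), 0.1), (energy_ratio([d2]), 0.5)]
print(classify_irs(features, th_d=0.3))  # ['CR', 'PR']
```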

    2.3 Character Region Classification Using Information Pixel Density and Run-Length

    Among characters in business card images, logotypes are usually larger than the other characters such as name, affiliation, address, phone number, and e-mail address. Logotypes are sometimes modified, so they may give little information. Logotypes usually have higher densities of IPs and longer run-lengths of IPs compared to the other characters. Based on these characteristics, the CRs are classified into LCRs and SCRs. For the CR classification, we define the average run-length of IPs in the $n$th CR as

    $$R_L^n = \left\langle H_{L,i}^n \right\rangle + \left\langle V_{L,j}^n \right\rangle \qquad (8)$$

    where $H_{L,i}^n$ and $V_{L,j}^n$ denote the maximum run-length of IPs at the $i$th horizontal line and that at the $j$th vertical line in the $n$th CR, respectively. Using the density of IPs in (6) and the average run-length of IPs in (8), the $n$th CR is classified into LCR or SCR as

    $$\text{Decide LCR if } D_{IP}^n \ge Th_D \text{ and } R_L^n \ge Th_L; \text{ otherwise decide SCR} \qquad (9)$$

    where $Th_D$ and $Th_L$ denote thresholds. In this paper, $Th_D$ and $Th_L$ are experimentally determined. Hollows and holes in each LCR are also filled using the run-length smoothing method in [1]. Figure 1(e) shows the map image of the result of CR classification. White parts represent LCRs, dark gray ones SCRs, bright gray ones PRs, and black ones BRs. As shown in Fig. 1(e), the CRs are well classified into LCRs and SCRs.
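The run-length feature of (8) and the rule of (9) can be sketched as below. This example assumes that (8) is the sum of the maximum horizontal run-length averaged over rows and the maximum vertical run-length averaged over columns, and that an LCR requires both features to reach their thresholds. The input is a binarized CR with 1 for IPs and 0 for BPs; the function names and the threshold values in the usage lines are hypothetical.

```python
def max_run(line):
    """Length of the longest run of consecutive IPs (truthy values) in a line."""
    best = run = 0
    for p in line:
        run = run + 1 if p else 0
        best = max(best, run)
    return best


def average_run_length(binary):
    """Eq. (8) as read here: mean row-wise max run plus mean column-wise max run."""
    rows = binary
    cols = list(zip(*binary))  # transpose to iterate over vertical lines
    h = sum(max_run(r) for r in rows) / len(rows)
    v = sum(max_run(c) for c in cols) / len(cols)
    return h + v


def classify_cr(dip, rl, th_d, th_l):
    """Rule (9): large CR if both IP density and run-length reach their thresholds."""
    return 'LCR' if dip >= th_d and rl >= th_l else 'SCR'


# Usage on a tiny binarized region (1 = IP, 0 = BP).
binary = [[1, 1, 1, 0],
          [0, 1, 0, 0],
          [0, 1, 1, 0]]
rl = average_run_length(binary)
print(rl)                                # 3.25
print(classify_cr(0.5, rl, 0.4, 3.0))    # LCR
```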

    3 Experimental Results and Discussion

    To evaluate the performance of the proposed region analysis, test images of several types of business cards were acquired using a PDA, iPAQ 3950 by Compaq, with its built-in camera, Nexicam by Navicom, under various surrounding conditions. The business cards include ordinary business cards, special business cards with textured surfaces, and special business cards with patterns on their surfaces. The surrounding conditions can be divided into good conditions and ill conditions containing irregular illumination, shadow, and complex backgrounds.

    Fig. 2. An ordinary business card image having shadows in its left part and the results of the proposed region analysis. (a) Original image, (b) region segmentation, (c) IR classification, and (d) CR classification.

    Figures 2 and 3 show a 640 × 480 ordinary business card image having shadows in its left part and a special business card image with patterns on its surface, together with their results of the proposed region analysis. One can see that the proposed method yields good results of region segmentation, IR classification, and CR classification. Experimental results have shown that the proposed method yields similar results on other test business card images.

    Next, we evaluated error rates of region segmentation. To do this, we produced standard region segmented images for 100 test business card images. Each test image is manually segmented into IRs and BRs and the map image of the IRs and BRs is produced. The original image is partitioned into 8 × 8 blocks and the blocks are classified as IBs and BBs. A block is classified as IB if 10% or more of the pixels in the block belong to an IR of its region map image. Otherwise, the block is classified as BB. After the block classification, a standard region segmented image is produced by region labeling on the classified blocks. The error rate of region segmentation for the test image is evaluated by comparing its region segmented image with its standard region segmented one.
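The 10% rule used to derive the standard block labels from a manually drawn region map can be sketched as follows. The input format (a 2-D list that is truthy where a pixel belongs to an IR) and the function name `standard_block_map` are assumptions made for this example.

```python
def standard_block_map(region_map, block=8, min_fraction=0.10):
    """Label each block as IB (True) if at least min_fraction of its pixels
    belong to an IR in the manual region map, otherwise as BB (False)."""
    rows, cols = len(region_map), len(region_map[0])
    out = []
    for r0 in range(0, rows, block):
        out_row = []
        for c0 in range(0, cols, block):
            pixels = [region_map[r][c]
                      for r in range(r0, min(r0 + block, rows))
                      for c in range(c0, min(c0 + block, cols))]
            n_ir = sum(1 for p in pixels if p)
            out_row.append(n_ir >= min_fraction * len(pixels))
        out.append(out_row)
    return out


# Usage: two 8 x 8 blocks side by side, with 7 and 6 IR pixels respectively.
region_map = [[0] * 16 for _ in range(8)]
for c in range(7):
    region_map[0][c] = 1       # 7 of 64 pixels, above 10%
for c in range(8, 14):
    region_map[0][c] = 1       # 6 of 64 pixels, below 10%
print(standard_block_map(region_map))  # [[True, False]]
```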

    Fig. 3. A special business card image with patterns in its surface and the results of the proposed region analysis. (a) Original image, (b) region segmentation, (c) IR classification, and (d) CR classification.

    The error rate of region segmentation for a test image is defined as $\varepsilon_S = (\varepsilon_{IB} + \varepsilon_{BB})/2$, where $\varepsilon_{IB}$ and $\varepsilon_{BB}$ denote the error rate of IB and that of BB, respectively. They are defined as $\varepsilon_{IB} = N_{MIB}/N_{IB}$ and $\varepsilon_{BB} = N_{MBB}/N_{BB}$, where $N_{IB}$ and $N_{BB}$ denote the number of IBs and that of BBs in the standard region segmented image, respectively, and $N_{MIB}$ and $N_{MBB}$ denote the number of mis-segmented IBs and that of mis-segmented BBs in the region segmented image, respectively. Table 1 shows comparative error rates of region segmentation for the conventional region segmentation methods in [4]–[6] and the proposed region segmentation method. As shown in Table 1, the proposed method gives an average region segmentation error rate of 14.4%, which is 2.2–10.1% lower than those of the conventional methods on the test images. Besides, the advantage of our method is especially large on the special business card images under ill surrounding conditions.

    Table 1. Comparative error rates of region segmentation for the conventional region segmentation methods in [4]–[6] and the proposed region segmentation method

    Type of business card   Surrounding condition   Variance (%)   Edge information (%)   DCT energy (%)   Proposed (%)
    Ordinary                Good                    15.0           12.7                   11.5             11.4
    Ordinary                Ill                     24.9           16.7                   15.2             12.6
    Special                 Good                    22.8           22.6                   16.0             15.3
    Special                 Ill                     35.3           28.5                   23.7             18.3
    Average                                         24.5           20.1                   16.6             14.4
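The segmentation error rate defined above can be computed from two block maps as in the following sketch. It assumes that a mis-segmented IB is a block labeled IB in the standard image but BB in the result (and vice versa for BBs). The function name and the handling of an empty class are choices made for this example.

```python
def segmentation_error(standard, result):
    """Average of the IB and BB error rates between a standard block map and
    a segmentation result (2-D lists, truthy = IB, falsy = BB)."""
    n_ib = n_bb = n_mib = n_mbb = 0
    for std_row, res_row in zip(standard, result):
        for s, r in zip(std_row, res_row):
            if s:
                n_ib += 1
                n_mib += 0 if r else 1   # standard IB segmented as BB
            else:
                n_bb += 1
                n_mbb += 1 if r else 0   # standard BB segmented as IB
    # Assumption: a class absent from the standard map contributes no error.
    err_ib = n_mib / n_ib if n_ib else 0.0
    err_bb = n_mbb / n_bb if n_bb else 0.0
    return (err_ib + err_bb) / 2


# Usage: one of two IBs is missed and both BBs are correct.
standard = [[True, True, False, False]]
result = [[True, False, False, False]]
print(segmentation_error(standard, result))  # 0.25
```

Averaging the two class-wise rates, rather than pooling all blocks, keeps the measure from being dominated by background blocks, which are usually far more numerous than information blocks on a business card.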

    In addition, we evaluated error rates of IR classification. To do this, standard IR classified images for the test business card images were produced in the same way as the standard region segmented images. The error rate of IR classification is defined as $\varepsilon_C = N_{MIR}/N_{IR}$, where $N_{IR}$ and $N_{MIR}$ denote the number of IRs in the standard IR classified image and that of mis-classified IRs in the IR classified image, respectively. Table 2 shows the comparative error rates of IR classification for the conventional IR classification method in [10] and the proposed IR classification method. As shown in Table 2, the proposed method gives an average IR classification error rate of 6.1%, which is 7.7% lower than that of the conventional method on the test images.

    Table 2. Comparative error rates of IR classification for the conventional IR classification method in [10] and the proposed IR classification method

    Type of business card   Surrounding condition   Region   Method in [10] (%)   Proposed (%)
    Ordinary                Good                    CR       8.4                  2.8
    Ordinary                Good                    PR       11.8                 5.9
    Ordinary                Ill                     CR       11.4                 5.7
    Ordinary                Ill                     PR       18.7                 6.7
    Special                 Good                    CR       13.6                 4.9
    Special                 Good                    PR       16.7                 5.6
    Special                 Ill                     CR       13.7                 6.0
    Special                 Ill                     PR       16.3                 11.1
    Average                                                  13.8                 6.1

    Acknowledgement

    This work was supported in part by Samsung Electronics Co., Ltd.

    References

    1. Drivas, D., Amin, A.: Page segmentation and classification utilising bottom-up approach. Proc. IEEE ICDAR'95 (1995) 610–614

    2. Sauvola, J., Pietikainen, M.: Page segmentation and classification using fast feature extraction and connectivity analysis. Proc. IEEE ICDAR'95 (1995) 1127–1131

    3. Wang, H., Li, S.Z., Ragupathi, S.: Document segmentation and classification with top-down approach. Proc. IEEE 1st Int. Conf. Knowledge-Based Intelligent Electronic Systems 1 (1997) 243–247

    4. Chen, C.T.: Transform coding of digital image using variable block DCT with adaptive thresholding and quantization. SPIE 1349 (1990) 43–54

    5. Bones, P.J., Griffin, T.C., Carey-Smith, C.M.: Segmentation of document images. SPIE 1258 (1990) 66–78

    6. Chaddha, N., Sharma, R., Agrawal, A., Gupta, A.: Text segmentation in mixed-mode images. Proc. IEEE Twenty-Eighth Asilomar Conf. Signals, Systems and Computers 2 (1994) 1356–1361

    7. O'Gorman, L.: The document spectrum for page layout analysis. IEEE Trans. Pattern Anal. Machine Intell. 15 (1993) 1162–1173

    8. Li, X., Oh, W.G., Ji, S.Y., Moon, K.A., Kim, H.J.: An efficient method for page segmentation. Proc. IEEE ICIPS'97 2 (1997) 957–961

    9. Lee, S.W., Ryu, D.S.: Parameter-free geometric document layout analysis. IEEE Trans. Pattern Anal. Machine Intell. 23 (2001) 1240–1256

    10. Yip, S.K., Chi, Z.: Page segmentation and content classification for automatic document image processing. Proc. IEEE Int. Symp. Intelligent Multimedia, Video and Speech Processing (2001) 279–282

    11. Pan, W., Jin, J., Shi, G., Wang, Q.R.: A system for automatic Chinese business card recognition. Proc. IEEE ICDAR'01 (2001) 577–581

    12. Otsu, N.: A threshold selection method from gray-level histograms. IEEE Trans. Syst., Man, Cybern. SMC-9 (1979) 62–66