Download - MAV-ID card processing using camera images - · PDF fileMAV-ID card processing using camera images ... Optical character recognition ... morphological operations available in MATLAB

EE 5359 MULTIMEDIA PROCESSING

SPRING 2013

INTERIM REPORT

MAV-ID card processing using camera images

Under guidance of

DR K R RAO

DEPARTMENT OF ELECTRICAL ENGINEERING

UNIVERSITY OF TEXAS AT ARLINGTON

Presented by

Aditya Sharma

1000803349

[email protected]

List of acronyms in alphabetical order:

AHE: Adaptive histogram equalization

ASCII: American standard code for information interchange

CLAHE: Contrast limited adaptive histogram equalization

HT: Hough transform

MAC: Maverick activities center

OCR: Optical character recognition

ROI: Region of interest

Overview

There are a lot of places in the university campus where one has to swipe the magnetic strip

of the MAV- ID card for various reasons e.g. checking out a book from the library, take a print

out , check out the sports equipment at Maverick activities centre (MAC) etc… Hence at every

place there is a need of an external hardware to read out the information encoded on the

magnetic strip. With this proposed approach the dependency on the external hardware to read

the strip can easily be avoided and hence the purchase and installation cost can be reduced.

Proposal

Since most of the smart phones are packaged with an in-built camera, the proposal of this

project is to develop an application that can take images of MAV-ID card and extract student

ID and student name. With this application, there is no need for any external hardware

interface.

Methods and Goals [4]

1. Figure 1 shows the image of the MAV-ID card taken from a camera which will then be

used as the input image.

Fig 1. Sample Image

2. Importing image in MATLAB for processing.

3. Applying various image processing algorithms [7, 8, 10] to extract the MAV-ID and

student’s name.

Procedure:

Binarization to separate background from foreground information

(a) (b)

Fig. 2(a) - Original image, (b) - Binarized image [1]

To find the license plate region, firstly smearing algorithm [6] is used. Smearing is a method

for the extraction of text areas on a mixed image. With the smearing algorithm, the image is

processed along vertical and horizontal runs (scan-lines) [1]. The method will detect the spaces

in between each character on the MAV-ID card. Figure 2(a) and 2(b) shows an example of

original image and its binary version.

• Segmentation to identify different regions in the image

Fig. 3 - Segmented image [1]

Segmentation subdivides an image into its constituent regions or objects that have similar

features (intensity, histogram, mean, variance, energy, texture) according to a set of predefined

criteria. [2] Figure 3 shows an example of a segmented image of image shown in Figure 2(a).

Segmentation is required to determine the regions of interest, which are then processed

individually for text retrieval.

OCR or Morphological image processing to extract text and other features from the image

OCR (optical character recognition) [3] is the recognition of printed or written text characters

by a computer. This involves photo scanning of the text character-by-character, analysis of the

scanned-in image, and then translation of the character image into character codes, such as

ASCII, commonly used in data processing.

In OCR processing, the scanned-in image or bitmap is analyzed for light and dark areas in order

to identify each alphabetic letter or numeric digit. When a character is recognized, it is converted

into an ASCII code. Special circuit boards and computer chips designed exclusively for OCR are

used to speed up the recognition process [18].

OCR is being used by libraries to digitize and preserve their holdings. OCR is also used to

process checks and credit card slips and sort the mail [16]. Billions of magazines and letters are

sorted every day by OCR machines, considerably speeding up mail delivery [17].

Open source engine available OCR

Tesseract: http://tesseract-ocr.googlecode.com/svn/trunk/doc/tesseracticdar2007.pdf

Morphological Processing

After segmentation, there may be some noise in the image such as small holes in the candidate

regions. This can be resolved with morphological processing [14]. Mathematical morphology

operations are based on the shape in the image and not pixel intensities. The two basic

morphological operations available in MATLAB are dilation and erosion [7]. Dilation allows

objects to expand while erosion shrinks the objects by eroding the boundaries. These operations

can be modified by proper choice of the structuring element which determines how many objects

will be dilated or eroded [14]. Structuring element is simply a matrix of 0s and 1s that can be of

any arbitrary shape and size. In MATLAB one can define neighborhood of desired size for the

structuring element such as square, rectangle, diamond etc. Preferably rectangle is used as the

neighborhood for the structuring element of size 6x4 which is obtained by a trial and error

method. In the project, closing operation is used which is dilation followed by erosion. Removal

of small holes plays an important role in obtaining the info from MAV-ID card.

http://tesseract-ocr.googlecode.com/svn/trunk/doc/tesseracticdar2007.pdf

Some of the applications of this project are:

Library book/laptop checkout.

MAC sports equipment checkout.

Any student related process where manual entry of student information is required.

Fig 4- Different stages an image goes through before extraction of the required information

Input image Image after

histogram

equalization

Hough

transform

Dilation and

density

reduction

Edge

detection

output

Image after

binarization

Bounding

Box

Normalized

Image

Segmentation

and ROI

detection

Input to OCR Output

Fig. 5 - Original image

Figure .5 is the original image taken from a camera which is then processed using histogram

equalization.

Fig. 6 – Image after histogram equalization

Contrast limited adaptive histogram equalization [CLAHE] [8] is an improved version of

AHE, or adaptive histogram equalization. Both overcome the limitations of standard

histogram equalization.

CLAHE was originally developed for medical imaging and has proven to be successful for

enhancement of low-contrast images such as portal films [15].

The CLAHE algorithm partitions the images into contextual regions and applies the histogram

equalization to each one. This evens out the distribution of used grey values and thus makes

hidden features of the image more visible. The full grey spectrum is used to express the

image.

Since the input image being used has a low contrast, using CLAHE instead of other histogram

equalization techniques produces better results. Image shown in Figure 6 is contrast limited

adaptive histogram equalized image which is then converted in to a gray scale image.

Fig. 7 – Image converted to grayscale

A grayscale digital image is an image in which the value of each pixel is a single sample,

that is, it carries only intensity information. Images of this sort, also known as black-and-

white, are composed exclusively of shades of gray, varying from black at the weakest

intensity to white at the strongest.

Figure 7 shows the converted grayscale image which is then binarized as binarization is

required for edge detection.

http://en.wikipedia.org/wiki/Digital_image

http://en.wikipedia.org/wiki/Pixel

http://en.wikipedia.org/wiki/Sample_(signal)

http://en.wikipedia.org/wiki/Luminous_intensity

http://en.wikipedia.org/wiki/Black-and-white

http://en.wikipedia.org/wiki/Black-and-white

http://en.wikipedia.org/wiki/Gray

Fig. 8 - Image after binarization

Image binarization converts an image of up to 256 gray levels to a black and white

image. Frequently, binarization is used as a pre-processor before OCR. In fact, most OCR

packages on the market work only on bi-level (black and white) images.

The simplest way to use image binarization is to choose a threshold value, and classify all

pixels with values above this threshold as white, and all other pixels as black. The

problem then is how to select the correct threshold. In many cases, finding one threshold

compatible to the entire image is very difficult, and in many cases even impossible.

Therefore, adaptive image binarization is needed where an optimal threshold is chosen

for each image area. Figure. 8 shows the image after binarization which is then given as

an input to the edge detector algorithm in the next step.

Fig. 9 – Image with edge detection

Several edge detection algorithms (Sobel, Prewitt, Roberts, Canny, Laplacian of Gaussian

and Morphological operators) [13] were tested. After experimenting with various parameters

and methods, it was found that Sobel, Prewitt and Roberts [13] gave very similar results in

comparable runtime. Canny edge detector typically resulted in more number of edges, but at

the expense of longer computation time. Since the Canny edge detector [13] resulted in more

number of edges which is required as an input to the bounding box algorithm to produce

desired results, it was chosen to detect the edges in the binarized image. Figure 9 shows the

output of the Canny edge detector algorithm.

Steps to be followed

The Hough transform (HT) [11] was originally suggested as a technique for detecting

straight-line segments in an input image. It maps straight lines in the input image to points in

the Hough space. Thus the HT of an input object will have points at locations corresponding

to the parameters (e.g., normal distance from the origin and angle) of each line in the object.

Several variations of the HT exist, depending on the two parameters (coordinates of the

Hough space) used to describe a line.

The reason why Hough transform based line detection and geometric line processing has

been chosen to be used is to detect the bounding box [12] of the card in a translational,

rotational and scale invariant manner.In the next step, the image has to go through to find the

ROI [region of interest] of the image.

The output from the ROI identification step shows the selected ROIs and the final image is

used in OCR. The OCR engine is supposed to then give the desired results i.e. MAV-ID

number and name from the MAV-ID card.

References:

1. S. Ozbay and E. Ercelebi, “Automatic vehicle identification by plate,” Proceedings of World

Academy of Science Engineering and Technology, ISSN, Vol. 9, pp. 1307– 6884, November

2005.

2. http://web.imrc.kist.re.kr/~asc/course/ip_lecture05-segmentation.pdf [Tutorial on how to obtain a

segmented image]

3. https://code.google.com/p/tesseract-ocr/[definition of OCR and free OCR engine can be

downloaded ]

4. http://www.stanford.edu/class/ee368/Project_12/Proposals/Datta_Credit_card_processingusing_ce

ll_phone.pdf [ Proposal about how to extract and process information from the credit card, the

information will be used in the project proposed ]

5. D. G. Bailey, D. Irecki, B.K. Lim and L. Yang “Test bed for number plate recognition

applications”, Proceedings of First IEEE International Workshop on Electronic Design, Test and

Applications ( DELTA’02 ),IEEE Computer Society, pp. 501-503, July 2002.

6. L. Angeline, K.T.K. Teo and F.Wong “Smearing Algorithm for Vehicle Parking System”,

Proceedings of the 2nd Seminar on Engineering and Information Technology, Kota Kinabalu,

Sabah, Malaysia, pp. 331-337, July 2009.

7. K. Deb, S. Kang and K. Jo, “Statistical characteristics in HSI color model and position histogram

based vehicle license plate detection”, Vol. 2, Issue 3, pp. 173-186, June 2009.

8. S. M. Pizer, E. P. Amburn and J. D. Austin, “Adaptive Histogram Equalization and Its Variation”,

Computer Vision, Graphics and Image Processing, Volume 39, Issue 3, pp. 355–368, September

1987.

9. R. C. Gonzalez and R. E. Woods, Digital Image Processing, Prentice Hall, Upper Saddle River,

NJ, USA, 2nd edition, 2002.

10. K. Mohammad and S. Agaian, “Practical Recognition System for Text Printed on Clear Reflected

Material,” ISRN Machine Vision, Vol. 2012, Article ID 253863, 16 pages, 2012.

doi:10.5402/2012/253863.

11. R. O. Duda and P. E. Hart, Use of the Hough transform to detect lines and curves in pictures,

Communications of the ACM, Vol. 15, No. 1, pp. 11-15, January 1972.

12. J. Ha, I.T. Phillips and R.M. Haralick, “Document Image Decomposition using Bounding Boxes of

Connected Components,” Proceedings of the Third International Conference on Document Image

Analysis, vol.2, pp. 1119-1122, Aug 1995.

13. M. Heath, S. Sarkar, T. Sanocki, and K.W. Bowyer, "A Robust Visual Method for Assessing the

Relative Performance of Edge-Detection Algorithms" IEEE Transactions on Pattern Analysis and

Machine Intelligence, Vol. 19, No. 12, pp. 1338-1359, December 1997.

14. P. Maragos and R. W. Schafer, "Applications of morphological filtering to image processing and

analysis", Proc. 1986 IEEE International Conference Acoustics, Speech and Signal Processing,

pp.2067 -2070, April 1986 .

15. B. Kurt, V. V. Nabiyev and K. Turhan, "Medical images enhancement by using anisotropic filter

and CLAHE," International Symposium on Innovations in Intelligent Systems and Applications

(INISTA) 2012, pp. 1-4, July 2012.

16. L. D. Jackel, D. Sharman, C. E. Stenard, B. I. Strom and D. Zuckert , ''Optical Character

Recognition for Self-Service Banking" , AT&T Technical Journal, pp. 16-24, July/August 1995.

http://web.imrc.kist.re.kr/~asc/course/ip_lecture05-segmentation.pdf%20%5bTutorial

https://code.google.com/p/tesseract-ocr/

http://www.stanford.edu/class/ee368/Project_12/Proposals/Datta_Credit_card_processingusing_cell_phone.pdf

http://www.stanford.edu/class/ee368/Project_12/Proposals/Datta_Credit_card_processingusing_cell_phone.pdf

17. C. G. Leedham and P.E. Jones, "Automatic sorting of Australian handwritten letter mail using

OCR and address feature verification," TENCON '92. ''Technology Enabling Tomorrow:

Computers, Communications and Automation towards the 21st Century.' 1992 IEEE Region 10

International Conference, vol. 1, pp. 287-291, Nov 1992.

18. M.C. Rahier, P.G.A. Jespers, "Dedicated LSI for a Microprocessor-Controlled Hand-Carried OCR

System," IEEE Transactions on Computers, vol. 29, no. 2, pp. 79-88, Feb. 1980,

Download - MAV-ID card processing using camera images - · PDF fileMAV-ID card processing using camera images ... Optical character recognition ... morphological operations available in MATLAB

Top Related