text extraction from digital image

Post on 07-Nov-2014

Category:

Education

DESCRIPTION

Text extraction is the process of converting a printed document, scanned page, or image that contains text into ASCII characters that a computer can recognize.

TRANSCRIPT

Prepared By:
Amit Bhoraniya (7022)
Kaushik Godhani (7009)
Mayur Halai (7016)
Vikram Ghunsar (7039)

Text Extraction From Image

Guided By:
Mr. Udesang Jaliya
Mr. Kirti Sharma

What is Text Extraction?

Text extraction is the process of converting a printed document, scanned page, or image that contains text into ASCII characters that a computer can recognize.

Goal Of Project

GENERAL APTITUDE
Computer Science
Electronics & Communication Engineering

How Will We Achieve That Goal?

1. Preprocessing
2. Segmentation
3. Recognition

1 Pre-Processing

Pre-Processing

1. Gray Scale
2. Noise Removal
3. Thresholding

Gray Scale
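The slide does not say how the gray-scale image is computed. A minimal sketch, assuming the common luminance weighting (0.299, 0.587, 0.114) and an image stored as rows of (R, G, B) tuples, might look like this. The function name `to_grayscale` and the sample data are illustrative, not taken from the project.

```python
def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples (0-255) into rows of gray levels.

    Assumption: standard luminance weights; the project may have used
    a different conversion.
    """
    gray = []
    for row in rgb_image:
        gray.append(
            [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
        )
    return gray


# Tiny 2 x 2 example image: white, black, pure red, pure blue.
sample = [[(255, 255, 255), (0, 0, 0)], [(255, 0, 0), (0, 0, 255)]]
gray_sample = to_grayscale(sample)
```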

Noise Removal

Noise removal is used to enhance the image. For enhancement we used a median filter.

FilteredImage = MedianFilter(OriginalImage, FilterSize)
We used FilterSize = [5, 5].
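The filtering step can be sketched in plain Python as follows. This is only an illustration of a 5 x 5 median filter on a 2-D list of gray levels. The border handling (using only the neighbours that fall inside the image) is an assumption, since the slides do not describe it, and `median_filter` is a hypothetical name.

```python
from statistics import median_low


def median_filter(image, size=5):
    """Apply a size x size median filter to a 2-D list of gray levels.

    Assumption: border pixels use only the neighbours inside the image.
    median_low keeps every output value equal to an actual pixel value.
    """
    height, width = len(image), len(image[0])
    half = size // 2
    filtered = []
    for y in range(height):
        out_row = []
        for x in range(width):
            window = [
                image[ny][nx]
                for ny in range(max(0, y - half), min(height, y + half + 1))
                for nx in range(max(0, x - half), min(width, x + half + 1))
            ]
            out_row.append(median_low(window))
        filtered.append(out_row)
    return filtered


# A flat 7 x 7 image with one bright "salt" pixel in the middle.
noisy = [[100] * 7 for _ in range(7)]
noisy[3][3] = 255
clean = median_filter(noisy, size=5)
```

In this example the isolated bright pixel is replaced by the surrounding gray level, which is the behaviour that makes the median filter useful against salt-and-pepper noise.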

Thresholding

• Edge Detection
• Dilate Image
• Detect Text Area Using Histogram
• Personal Thresholding to Text Area

Edge Detection using Canny

Dilate
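Dilation thickens the detected edges so that neighbouring strokes merge into connected text blobs. A minimal sketch of binary dilation, assuming a square 3 x 3 structuring element (the slides do not state the element actually used) and a hypothetical function name `dilate`:

```python
def dilate(binary, size=3):
    """Binary dilation with a size x size square structuring element.

    Assumption: the element shape and size are illustrative only.
    """
    height, width = len(binary), len(binary[0])
    half = size // 2
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if binary[y][x]:
                # Switch on every pixel covered by the element centred here.
                for ny in range(max(0, y - half), min(height, y + half + 1)):
                    for nx in range(max(0, x - half), min(width, x + half + 1)):
                        out[ny][nx] = 1
    return out


# A single edge pixel grows into a 3 x 3 block.
edges = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
dilated = dilate(edges)
```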

Text Area Using Histogram

Algorithm

• Row Histogram
• Separate Region by (no. of Pixels > 60)
• For Each Row
  – Separate Region by (no. of Pixels > (Height of Row)/4)
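One possible reading of the algorithm above, sketched in Python: count foreground pixels per image row, keep runs of rows whose count exceeds 60, then inside each such band count pixels per column and keep runs of columns whose count exceeds a quarter of the band height. Interpreting the second threshold as "band height divided by 4" is an assumption, and the helper names `runs_above` and `find_text_areas` are hypothetical.

```python
def runs_above(counts, threshold):
    """Return (start, end) index pairs, end exclusive, where counts > threshold."""
    runs, start = [], None
    for i, count in enumerate(counts):
        if count > threshold and start is None:
            start = i
        elif count <= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(counts)))
    return runs


def find_text_areas(binary, row_threshold=60):
    """Return (top, bottom, left, right) boxes; bottom and right are exclusive."""
    width = len(binary[0])
    row_hist = [sum(row) for row in binary]
    boxes = []
    for top, bottom in runs_above(row_hist, row_threshold):
        band_height = bottom - top
        col_hist = [
            sum(binary[y][x] for y in range(top, bottom)) for x in range(width)
        ]
        # Assumed reading of the slide: threshold is a quarter of the band height.
        for left, right in runs_above(col_hist, band_height / 4):
            boxes.append((top, bottom, left, right))
    return boxes


# Synthetic 20 x 200 page with two text blocks on rows 5..9.
page = [[0] * 200 for _ in range(20)]
for y in range(5, 10):
    for x in list(range(10, 80)) + list(range(100, 110)):
        page[y][x] = 1
boxes = find_text_areas(page)
```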

2 Segmentation

Segmentation

1. Line Segmentation
2. Word Segmentation
3. Character Segmentation

The image above is segmented into separate lines. Below is an example for a single line.

TEXT SEGMENTATION

Find all the words, then convert the text area into one image.

Segmentation

Characters are separated from each word.
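The slides do not describe how words and characters are cut apart. A common approach, shown here only as an assumed sketch, is a vertical projection: columns without any foreground pixel mark gaps, narrow gaps separate characters, and wider gaps separate words. The function name `split_columns` and the gap widths are illustrative.

```python
def split_columns(line, min_gap=1):
    """Split a binary text line into (left, right) spans, right exclusive.

    Spans are separated by runs of at least min_gap empty columns.
    Assumption: projection-based splitting; the project may differ.
    """
    width = len(line[0])
    col_has_ink = [any(row[x] for row in line) for x in range(width)]
    spans, start, gap = [], None, 0
    for x, ink in enumerate(col_has_ink):
        if ink:
            if start is None:
                start = x
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                spans.append((start, x - gap + 1))
                start, gap = None, 0
    if start is not None:
        spans.append((start, width - gap))
    return spans


# One-row "line": two characters, a wide gap, then one wider character.
line = [[1, 1, 0, 1, 0, 0, 0, 1, 1]]
characters = split_columns(line, min_gap=1)
words = split_columns(line, min_gap=3)
```

With a gap of one column every blob becomes its own character span, while a gap of three columns groups the first two blobs into one word.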

3 Recognition

Recognition

1. Feature Extraction
2. Classifier
3. Text Document

• Feature Extraction
  – Binary Code Method
  – Chain Code Method
  – PCA (Principal Component Analysis)
  – LDA (Linear Discriminant Analysis)

• Classifier
  – Artificial Neural Network
  – Support Vector Machine
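As an illustration of the Chain Code Method listed under feature extraction, the sketch below computes a Freeman 8-direction chain code from an ordered list of boundary points. The direction numbering (0 = east, counted counter-clockwise, with y growing downward as in image coordinates) is an assumed convention, and `chain_code` is a hypothetical name. Tracing the boundary itself is not shown.

```python
# Assumed convention: (dx, dy) step -> Freeman code, y axis pointing down.
DIRECTIONS = {
    (1, 0): 0,    # east
    (1, -1): 1,   # north-east
    (0, -1): 2,   # north
    (-1, -1): 3,  # north-west
    (-1, 0): 4,   # west
    (-1, 1): 5,   # south-west
    (0, 1): 6,    # south
    (1, 1): 7,    # south-east
}


def chain_code(points):
    """Return the Freeman chain code for consecutive 8-connected (x, y) points."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        codes.append(DIRECTIONS[(x1 - x0, y1 - y0)])
    return codes


# Boundary of a unit square, walked east, south, west, north.
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
codes = chain_code(square)
```

The resulting code sequence, or a histogram of it, could then be passed to a classifier as a feature vector.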

Recognition

Applications
• Banking (To read Credit Card)
• Libraries (To convert Scanned Page to Image)
• Govt. Sector (Form Processing)
• Used in Car Number Plate Recognition System
• Undesirable Text removal from images.

Thank You
