BrailleOCR: An Open Source Document to Braille Converter Application

An Open Source Tesseract based Tool for Extracting Text from Images with Application in Braille Translation Pijush Chakraborty [Roll: 32] Calcutta Institute of Engineering and Management CS681 Seminar

Upload: pijush15

Post on 05-Dec-2014


DESCRIPTION

This presentation describes BrailleOCR, an open source application that converts scanned documents to Braille for visually impaired readers. BrailleOCR is currently the only application that integrates optical character recognition (OCR) and Braille translation, so it can help convert many important documents to Braille.

IJCA Paper: http://www.ijcaonline.org/archives/volume68/number16/11664-7254
Project site: https://code.google.com/p/brailleocr/

The application uses a four-step process. The input is a scanned RGB image. The first step, pre-processing, converts the RGB image to grayscale. The second step performs character recognition using the Tesseract engine. Because recognition may introduce errors, the third step, post-processing, corrects them. The final and most important step is Braille conversion.

TRANSCRIPT

Page 1: BrailleOCR: An Open Source Document to Braille Converter Application

An Open Source Tesseract based Tool for Extracting Text from Images with Application in Braille Translation

Pijush Chakraborty [Roll: 32]
Calcutta Institute of Engineering and Management

CS681 Seminar

Page 2: Introduction

Page 3: Introduction: Use of our Application

Contribution of the application in real life:
o Our application integrates an OCR engine with Braille translation.
o BrailleOCR is currently the only application that supports conversion of image documents to Braille format.
o It will help convert large documents to Braille format and so help many visually impaired people.
o Project site: code.google.com/p/brailleocr
o IJCA paper DOI: 10.5120/11664-7254

Open source APIs used:
o Tesseract Engine [open-source OCR engine]
o Tess4J API [JNA wrapper for using Tesseract with Java]
o JOrtho API [Java open-source spell-checking API]
o Swing Graphics API

Page 4: Introduction: BrailleOCR GUI

Page 5: Methodology

Page 6: Methodology: Steps to be Followed

Conversion of an image document to Braille consists of the following steps:
◦ Pre-processing
◦ Text extraction (character recognition)
◦ Post-processing
◦ Braille translation

Fig. 1. Steps to be Followed


Page 10: Pre Processing Step

Page 11: Pre Processing: Image Type

Pre-processing steps:
◦ Conversion of the color image to grayscale
◦ Conversion of the grayscale image to binary
◦ The second sub-step is handled by Tesseract using adaptive thresholding.

Reason for grayscale conversion:
◦ It increases accuracy in the recognition step, as stated in Ref. [2].
◦ Table 1 gives the accuracy rate for certain input images.

Input Image        No. of Images    Accuracy
Color Image        10               89%
Grayscale Image    10               93%

Table 1: Accuracy of Tesseract
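The binarization sub-step is left to Tesseract, and its internal method is not reproduced here. As a rough illustration of what adaptive thresholding means in general, the Python sketch below compares each pixel with the mean of its local neighbourhood. The function name, the `window` and `offset` parameters, and the list-of-rows image format are assumptions made for this example only.

```python
def adaptive_threshold(gray, window=3, offset=0):
    """Binarize a grayscale image (rows of ints, 0-255) with a local-mean threshold.

    Returns rows of 0/1, where 1 marks a pixel darker than its neighbourhood mean.
    This is a generic illustration, not Tesseract's own thresholding code.
    """
    height = len(gray)
    width = len(gray[0]) if height else 0
    half = window // 2
    binary = []
    for y in range(height):
        row = []
        for x in range(width):
            # Neighbourhood around (x, y), clipped at the image borders.
            ys = range(max(0, y - half), min(height, y + half + 1))
            xs = range(max(0, x - half), min(width, x + half + 1))
            values = [gray[j][i] for j in ys for i in xs]
            local_mean = sum(values) / len(values)
            row.append(1 if gray[y][x] < local_mean - offset else 0)
        binary.append(row)
    return binary
```

Because the threshold is computed per pixel, a dark character on an unevenly lit page can still be separated from its background, which a single global threshold may fail to do.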

Page 12: Pre Processing: Grayscale Conversion

Different algorithms available:
◦ Averaging method
◦ Luminosity method

Luminosity method benefits:
◦ Human perception is more sensitive to green than to red, and more sensitive to red than to blue.
◦ The weight of the green color component is therefore the highest, followed by red and then blue, i.e. weight of color channel ∝ sensitivity.

Algorithm used:
The color image can be represented as a discrete function f(xi,yj), 0<=i<N, 0<=j<M, where N is the height of the image and M is the width of the image.

for i=0 to N-1
    for j=0 to M-1
        gr(xi,yj) = 0.299*r(xi,yj) + 0.587*g(xi,yj) + 0.114*b(xi,yj)

Here gr(xi,yj) is the grayscale image pixel, and r(xi,yj), g(xi,yj) and b(xi,yj) are the red, green and blue channels.
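The two grayscale methods named on this slide can be sketched in Python as follows. This is a minimal sketch, assuming the image is given as rows of (r, g, b) tuples with values from 0 to 255; the function names are illustrative and are not taken from the BrailleOCR source.

```python
def luminosity_gray(rgb_image):
    """Grayscale by the luminosity method: weighted sum of the channels."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


def average_gray(rgb_image):
    """Grayscale by the averaging method: plain mean of the channels."""
    return [
        [round((r + g + b) / 3) for (r, g, b) in row]
        for row in rgb_image
    ]
```

For a pure green pixel (0, 255, 0), the luminosity method gives 150 while averaging gives 85, which shows how the weights favour the channel the eye is most sensitive to.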

Page 13: Pre Processing: Implementing the Algorithm

Fig. 2. Scanned Image

Fig. 3. Grayscale Image

Page 14: Text Extraction Step

Page 15: Text Extraction: What is OCR?

What is Optical Character Recognition?
◦ Conversion of a scanned image document to machine-encoded text.
◦ Useful for keeping backups of important documents in text format.

Brief history:
◦ 1929-1975: OCR without electronic computers
◦ 1985-2000: Development of OCR for computers
◦ 2000-2013: Development of industrial-standard OCR

Fig. 4. OCR implementation

Page 16: Text Extraction: Tesseract History

◦ Tesseract is widely regarded as one of the most accurate open source OCR engines.
◦ Developed at HP between 1984 and 1994.
◦ Released as open source in 2005; since then Google has taken over the project.
◦ Google recently launched Tesseract v3.0.
◦ Used with Java applications through the JNA wrapper Tess4J.
◦ Project site: code.google.com/p/tesseractocr

Page 17: Text Extraction: Tesseract Architecture

◦ Outlines are obtained by connected component analysis.
◦ Outlines are organized into blobs.
◦ Blobs are organized into text lines.
◦ Characters are chopped and features are extracted.

Fig. 5. Architecture
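To make the first stage concrete, the Python sketch below labels connected components in a small binary image using a breadth-first flood fill with 8-connectivity. It only illustrates the general idea of connected component analysis; it is not Tesseract's implementation, and the function name and the 0/1 list-of-rows image format are assumptions for this example.

```python
from collections import deque


def connected_components(binary):
    """Label 8-connected groups of 1-pixels.

    Returns (count, labels), where labels has the same shape as the input
    and holds 0 for background or the component number (1, 2, ...).
    """
    height = len(binary)
    width = len(binary[0]) if height else 0
    labels = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or labels[y][x] != 0:
                continue
            # Unlabelled foreground pixel: start a new component here.
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        inside = 0 <= ny < height and 0 <= nx < width
                        if inside and binary[ny][nx] == 1 and labels[ny][nx] == 0:
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return count, labels
```

Each labelled component corresponds roughly to one outline; grouping nearby components gives blobs, which can then be arranged into text lines as listed above.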

Page 18: Text Extraction: Tesseract Character Recognition

◦ Features are extracted using polygonal approximation.
◦ The features are matched against prototypes to find matching patterns.
◦ Recognition runs in two passes: words recognized well in the first pass train the adaptive classifier, which gives better results in the second pass.

Fig. 6. Prototype Matching

Page 19: Post Processing Step

Page 20: Post Processing: Correcting the Text

Why post-processing?
◦ Corrects errors from the recognition step
◦ Gives text with fewer errors for Braille conversion
◦ Spell-checking systems provide the best results for the post-processing step.

JOrtho API
◦ JOrtho is an open source Java spell-checking API that gives suggestions for commonly misspelled words in the text.
◦ The key algorithms include phonetic matching algorithms such as Soundex.
◦ Project site: jortho.sourceforge.net

Page 21: Post Processing: Soundex Algorithm

Soundex code:
◦ The Soundex code of a word is a letter followed by three digits, computed using the algorithm below.

Algorithm:
◦ Retain the first letter of the word and drop all other occurrences of a, e, i, o, u, y, h, w.
◦ Replace consonants with digits as follows (after the first letter):
    b, f, p, v = 1
    c, g, j, k, q, s, x, z = 2
    d, t = 3
    l = 4
    m, n = 5
    r = 6
◦ Two adjacent letters with the same number are coded as a single number. Two letters with the same number separated by 'h' or 'w' are also coded as a single number.
◦ Keep only the first three digits, padding with zeros if there are fewer.

Example: "Metacalt" and "Metacalf" both return M324. Their full codes (M3243 and M3241) differ only in the fourth digit, which is dropped, so the two words are treated as phonetically the same.

Fig. 7. Spell Checking
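The Soundex rules above can be sketched in Python as follows. This is a minimal sketch of standard American Soundex and not JOrtho's actual code; the `suggest` helper and its dictionary argument are hypothetical and only show how phonetic codes could be used to propose corrections.

```python
# Map each coded consonant to its Soundex digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(word):
    """Return the four-character Soundex code of a word (e.g. 'M324')."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for letter in letters[1:]:
        if letter in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(letter, "")
        if code and code != previous:
            result += code
        previous = code  # a vowel resets this, so a repeated code is kept
    return (result + "000")[:4]


def suggest(word, dictionary):
    """Hypothetical helper: dictionary words that share the word's Soundex code."""
    target = soundex(word)
    return [candidate for candidate in dictionary if soundex(candidate) == target]
```

For example, `suggest("Metacalt", ["Metacalf", "Braille"])` keeps only "Metacalf", since both words map to M324 while "Braille" maps to a different code.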

Page 22: Braille Translation Step

Page 23: What is Braille?

History of Braille:
◦ Invented by Louis Braille in the 19th century
◦ Accepted throughout the world as a form of written communication for blind individuals
◦ There have been some modifications to the Braille system, such as the inclusion of contracted words.

Use of Braille:
◦ Braille is the primary reading and writing system used by the visually impaired.
◦ It helps increase literacy among the visually impaired.
◦ In the modern world, Braille is supported by various electronic devices.

Braille cell:
◦ Braille cells are 6-dot cells in which each dot is either raised or flat.
◦ 64 possible combinations (2^6).
◦ Used in refreshable Braille displays

Fig. 8. Refreshable Braille Display

Fig. 9. Six-dot Braille cell

Page 24: Braille: Braille Types

Braille details:
◦ Grade 1 and Grade 2 are the most commonly used.
◦ Grade 1 Braille includes single letters and numbers, while Grade 2 Braille adds contractions for words such as for, with, you, etc.
◦ Numbers (0, 1 to 9) are denoted by the cells for (j, a to i), preceded by the number sign cell.
◦ Letter groups and words (e.g. and, with, wh, the, th) have separate Braille representations.
◦ Uppercase letters are preceded by a Braille cell denoting a capital letter.

Fig. 10. Braille representations

Page 25: Braille Translation: Electronic Braille

Braille ASCII:
◦ Subset of the ASCII character set.
◦ Contains all 64 Braille representations (6-dot cell).
◦ Maps ASCII input one-to-one to Braille code.
◦ Widely supported by Braille embossers.
◦ Uses ASCII codes to send information to Braille displays.

Braille Patterns:
◦ Braille Patterns are Unicode characters that represent Braille cells.
◦ They cover all 256 combinations of the 8-dot Braille cell. We require only 64.
◦ Braille embossers and Braille displays have recently been upgraded to support Unicode Braille.
◦ The Unicode Braille set ranges from U+2800 to U+28FF, though we need only U+2800 to U+283F.
◦ In our application, we have focused on the Unicode Braille representation.

Braille code example:
String: "6 dot Braille Cells for 64 combinations"
Braille: (Braille rendering not preserved in this transcript)
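The Grade 1 rules from the previous slide and the Unicode range above can be combined in a short Python sketch. It assumes the standard Unicode layout, where the code point is U+2800 plus a bit mask with dot n stored in bit n-1. The function names are illustrative and are not taken from the BrailleOCR source; punctuation is passed through unchanged and contractions are not handled, which is a simplification.

```python
# Raised dots (numbered 1-6) for the letters a-j; other letters derive from these.
BASE_DOTS = {
    "a": (1,), "b": (1, 2), "c": (1, 4), "d": (1, 4, 5), "e": (1, 5),
    "f": (1, 2, 4), "g": (1, 2, 4, 5), "h": (1, 2, 5), "i": (2, 4), "j": (2, 4, 5),
}
NUMBER_SIGN = (3, 4, 5, 6)
CAPITAL_SIGN = (6,)


def cell(dots):
    """Unicode Braille Pattern for a set of raised dots: U+2800 plus a bit mask."""
    return chr(0x2800 + sum(1 << (dot - 1) for dot in dots))


def build_letters():
    """Build the dot table for a-z from the a-j patterns."""
    table = dict(BASE_DOTS)
    for top, mid in zip("abcdefghij", "klmnopqrst"):
        table[mid] = BASE_DOTS[top] + (3,)      # k-t add dot 3
    for top, low in zip("abcde", "uvxyz"):
        table[low] = BASE_DOTS[top] + (3, 6)    # u, v, x, y, z add dots 3 and 6
    table["w"] = (2, 4, 5, 6)                   # w does not follow the pattern
    return table


LETTERS = build_letters()
DIGIT_LETTERS = dict(zip("1234567890", "abcdefghij"))


def to_braille(text):
    """Translate text to uncontracted Unicode Braille, one character at a time."""
    out = []
    in_number = False
    for ch in text:
        if ch in DIGIT_LETTERS:
            if not in_number:
                out.append(cell(NUMBER_SIGN))   # one number sign per digit run
            out.append(cell(LETTERS[DIGIT_LETTERS[ch]]))
            in_number = True
            continue
        in_number = False
        if ch.lower() in LETTERS:
            if ch.isupper():
                out.append(cell(CAPITAL_SIGN))  # capital indicator before the letter
            out.append(cell(LETTERS[ch.lower()]))
        elif ch == " ":
            out.append(cell(()))                # blank cell, U+2800
        else:
            out.append(ch)                      # assumption: leave other symbols as-is
    return "".join(out)
```

Calling `to_braille` on the example string above yields only characters in the range U+2800 to U+283F, which matches the 64-cell subset the slide refers to.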

Page 26: Braille Translation: Algorithm

The flowchart below gives the entire translation algorithm.

Fig. 11. Flow Chart for Translation

Page 27: Implementation

Page 28: Implementation: BrailleOCR

Extracting text and correcting errors.

Fig. 12. Extracting Text and Correcting Errors

Page 29: Implementation: Braille Conversion

Translation to Braille

Fig. 13. Converting Text to Braille

Page 30: Conclusion

Page 31: Conclusion and Future Plans

◦ We have shown the process of integrating the Tesseract OCR engine with Braille translation.
◦ Our future plans are to make the application multilingual so that it also supports Bharati Braille, which covers Bengali, Hindi, Gujarati and other Indian languages.
◦ We will also provide better support for Grade 2 Braille, as Grade 2 Braille is common nowadays.
◦ Project site: code.google.com/p/brailleocr

Page 32: References

[1] Tesseract Project Site: code.google.com/p/tesseractocr
[2] Chirag Patel, Atul Patel, Dharmendra Patel, Optical Character Recognition using Tool Tesseract: A Case Study, IJCA, October 2012
[3] Pijush Chakraborty and Arnab Mallik, An Open Source Tesseract based Tool for Extracting Text from Images with Application in Braille Translation for the Visually Impaired, IJCA, April 2013
[4] R. Smith, An Overview of the Tesseract OCR Engine, Proc. Ninth Int. Conference on Document Analysis and Recognition, IEEE Computer Society (2007)
[5] Ray Smith, Tesseract OCR Engine, OSCON 2007
[6] Tess4J Project Site: http://tess4j.sourceforge.net/
[7] JOrtho Project Site: http://jortho.sourceforge.net/
[8] Soundex Reference: http://en.wikipedia.org/wiki/Soundex
[9] The Rules of Unified English Braille, International Council on English Braille (ICEB), June 2001
[10] Braille ASCII: http://en.wikipedia.org/wiki/Braille_ASCII
[11] BrailleOCR Project Site: code.google.com/p/brailleocr

Page 33: Questions?

Thank You!