OCR with Neural Networks
DESCRIPTION
OCR converts an image of text into machine-readable text, so that it can be edited, reused, or extended.
TRANSCRIPT
![Page 1: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/1.jpg)
OCR with Neural Network
Made By:
• Marwa Fadhel Jassim
• Karam Samir Khalid
![Page 2: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/2.jpg)
Introduction
Optical character recognition, usually abbreviated to OCR, is the mechanical or electronic translation of scanned images of handwritten, typewritten or printed text into machine-encoded text. It is widely used to convert books and documents into electronic files, to computerize a record-keeping system in an office, or to publish the text on a website.
![Page 3: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/3.jpg)
OCR makes it possible to edit the text, search for a word or phrase, store it more compactly, display or print a copy free of scanning artifacts, and apply techniques such as machine translation, text-to-speech and text mining to it. OCR is a field of research in pattern recognition, artificial intelligence and computer vision. OCR systems require calibration to read a specific font; early versions needed to be
![Page 4: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/4.jpg)
programmed with images of each character, and worked on one font at a time. "Intelligent" systems with a high degree of recognition accuracy for most fonts are now common. Some systems are capable of reproducing formatted output that closely approximates the original scanned page, including images, columns and other non-textual components.
![Page 5: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/5.jpg)
OCR: picture of text → text
And Bruno, conqueror of Carthage, strode up to me and said:
"Devil take you, Edith!"
"Finally, you scoundrel - are you going to confess your love for me?" I retorted.
The German warrior stood stoically. He surveyed the landscape before him; grinned; spoke:
"You are much better with an axe than Jane - I grant you that."
(Killing, I admit, was my favourite pastime. Long before I enlisted in the Order of the Knights of Malta, I liked playing with knives. No-one objected.)
"Overall, how would you rank/rate my performance in axing?"
"Performance evaluations are meaningless!"
(Quite true.)
Radically changing the topic, I asked:
"So, what are your thoughts on Empress Teresa?"
"Unshareable; irrelevant; bitter."
"Secret? Very unsurprising. We warrior/troubadours are quite reserved - nay - silent."
(Xenophobia played a role too. You knew that. So did my friend, Zoe.)
![Page 6: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/6.jpg)
OCR Step 1: Find letters
![Page 7: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/7.jpg)
OCR Step 2: Identify each letter
“P”
![Page 8: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/8.jpg)
Identifying letters is hard
● Letters can be:
  ● Blurry
  ● Rotated / squashed / skewed
  ● In different fonts
  ● Bold or in italics
● Background can have:
  ● Speckles, dirt
  ● Texture from paper
![Page 9: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/9.jpg)
Approaches
● Compare with reference images
● Find major lines, use heuristics, e.g. “vertical line on left, vertical line on right, horizontal line in the middle → H”
● Etc...
● How do humans do it? → neural networks
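The first approach in the list, comparing with reference images, can be sketched as a nearest-template search. This is only a minimal sketch: the tiny 3x3 bitmaps in `REFERENCES`, the pixel-mismatch distance, and the function names are all assumptions made for illustration, not something taken from the slides.

```python
# Hypothetical 3x3 reference bitmaps: '#' is ink, '.' is background.
REFERENCES = {
    "T": ["###", ".#.", ".#."],
    "L": ["#..", "#..", "###"],
    "I": [".#.", ".#.", ".#."],
}


def mismatches(bitmap_a, bitmap_b):
    """Count the pixels at which two equally sized bitmaps differ."""
    return sum(
        pixel_a != pixel_b
        for row_a, row_b in zip(bitmap_a, bitmap_b)
        for pixel_a, pixel_b in zip(row_a, row_b)
    )


def classify(bitmap):
    """Return the letter whose reference bitmap differs in the fewest pixels."""
    return min(REFERENCES, key=lambda letter: mismatches(bitmap, REFERENCES[letter]))


# A "T" with one extra speck of dirt in the bottom-right corner.
noisy_t = ["###", ".#.", ".##"]
print(classify(noisy_t))
```

A scheme like this is sensitive to the blur, rotation and font changes listed on the previous slide, which is one motivation for the neural network approach.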
![Page 10: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/10.jpg)
What is a Neural Network?
A neural network is a powerful data modeling tool that is able to capture and represent complex input/output relationships. The motivation for the development of neural network technology stemmed from the desire to develop an artificial system that could perform "intelligent" tasks similar to those performed by the human brain.
![Page 11: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/11.jpg)
Real Neurons
![Page 12: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/12.jpg)
Neuronal Connections
![Page 13: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/13.jpg)
Firing neurons excite others
Firing threshold
![Page 14: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/14.jpg)
![Page 15: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/15.jpg)
![Page 16: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/16.jpg)
...which in turn excite others
Firing threshold
![Page 17: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/17.jpg)
Inputs can be weighted
[Diagram: neuron with a firing threshold and input weights 0.7 and 0.4]
![Page 18: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/18.jpg)
Neurons can suppress others
[Diagram: neuron with a firing threshold, excitatory input weights 0.7 and 0.4, and a suppressing input weight -0.5]
![Page 19: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/19.jpg)
And they can have a starting bias
Firing threshold
Bias 0.3
![Page 20: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/20.jpg)
(So they're basically logic gates)
● AND: input weights 0.5 and 0.5
● OR: input weights 1 and 1
● NOT: input weight -1, bias 1
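The gate weights on this slide can be checked with a small threshold neuron. The sketch below assumes a firing threshold of 1 and the rule "fire when the weighted input sum plus bias reaches the threshold"; the slide does not state the threshold value, so both are assumptions, as is the helper name `neuron`.

```python
def neuron(inputs, weights, bias=0.0, threshold=1.0):
    """Return 1 (fire) when the weighted input sum plus bias reaches the threshold.

    The threshold of 1.0 is an assumption; the slides do not give its value.
    """
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total >= threshold else 0


def and_gate(a, b):
    return neuron([a, b], [0.5, 0.5])


def or_gate(a, b):
    return neuron([a, b], [1, 1])


def not_gate(a):
    return neuron([a], [-1], bias=1)


# Print the truth tables for the two-input gates and for NOT.
for a in (0, 1):
    for b in (0, 1):
        print(f"a={a} b={b} AND={and_gate(a, b)} OR={or_gate(a, b)}")
for a in (0, 1):
    print(f"a={a} NOT={not_gate(a)}")
```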
![Page 21: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/21.jpg)
Neurons arranged in layers
● Neurons in one layer excite/suppress neurons in the next one
● Excitation of neurons in the first layer is set according to the input
● “Hidden” layer(s) in between
● Final layer is output
[Diagram: Input → Hidden → Output]
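The layered arrangement can be sketched as a forward pass in which each layer's activations feed the next. This is an illustrative sketch only: the step activation with a threshold of 1, the function names, and the XOR example weights are assumptions chosen to show what a hidden layer adds.

```python
def fire(total, threshold=1.0):
    """Step activation: 1 when the total reaches the (assumed) threshold of 1."""
    return 1 if total >= threshold else 0


def layer_forward(inputs, weights, biases):
    """Compute one layer: each row of `weights` belongs to one neuron."""
    return [
        fire(sum(x * w for x, w in zip(inputs, row)) + bias)
        for row, bias in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Propagate the input through every (weights, biases) layer in order."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hypothetical example: XOR needs a hidden layer.
# Hidden neurons compute OR and AND; the output fires for "OR but not AND".
XOR_LAYERS = [
    ([[1, 1], [0.5, 0.5]], [0, 0]),  # hidden layer: OR, AND
    ([[1, -1]], [0]),                # output layer: OR excites, AND suppresses
]

for a in (0, 1):
    for b in (0, 1):
        print(a, b, forward([a, b], XOR_LAYERS)[0])
```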
![Page 22: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/22.jpg)
Simple letter identification network
● One input neuron per pixel in scaled picture of letter
● One output neuron per possible letter
● Train network to excite the output neuron that corresponds to the letter input
ABCDEFGHIJKLM...
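Reading the answer off such a network amounts to finding the most excited output neuron. A minimal sketch, assuming 26 outputs ordered A to Z; the function name `decode` and the sample activation values are made up for illustration.

```python
import string

LETTERS = string.ascii_uppercase  # one output neuron per letter, A..Z


def decode(outputs):
    """Return the letter whose output neuron has the highest activation."""
    if len(outputs) != len(LETTERS):
        raise ValueError("expected one output value per letter")
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return LETTERS[best]


# Hypothetical activations: every neuron is weakly excited except the one for "P".
outputs = [0.1] * 26
outputs[LETTERS.index("P")] = 0.9
print(decode(outputs))
```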
![Page 23: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/23.jpg)
Training Method
The most popular and simplest approach to the OCR problem is based on a feed-forward neural network with back-propagation learning. The main idea is that we first prepare a training set and then train a neural network to recognize patterns from that set. In the training step we teach the network to respond with the desired output for a specified input. For this purpose each training sample is represented by two components: a possible input and the network's desired output for that input. After the training step is done, we can give an arbitrary input to the network and the network will form an output, from which we can determine the type of pattern presented to the network.
![Page 24: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/24.jpg)
Let's assume that we want to train a network to recognize 26 capital letters represented as images of 5x6 pixels, something like this one:
One of the most obvious ways to convert an image into the input part of a training sample is to create a vector of size 30 (in our case), containing "1" in all positions corresponding to letter pixels and "0" in all positions corresponding to background pixels. In many neural network training tasks, however, it is preferable to represent training patterns in a so-called "bipolar" way, placing "0.5" into the input vector instead of "1" and "-0.5" instead of "0". This kind of pattern coding tends to improve learning performance.
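The bipolar encoding described above can be sketched as follows. The 5x6 bitmap for "K" and the names `LETTER_K` and `encode_bitmap` are hypothetical, drawn up only to illustrate the idea; the original slide image is not reproduced here.

```python
# Hypothetical 5x6 bitmap of the letter "K": '#' is a letter pixel, '.' is background.
LETTER_K = [
    "#...#",
    "#..#.",
    "###..",
    "#..#.",
    "#...#",
    "#...#",
]


def encode_bitmap(rows, ink="#", on=0.5, off=-0.5):
    """Flatten a bitmap row by row into a bipolar vector (0.5 = letter, -0.5 = background)."""
    return [on if pixel == ink else off for row in rows for pixel in row]


vector = encode_bitmap(LETTER_K)
print(len(vector))   # 5 columns x 6 rows
print(vector[:5])    # first row of the bitmap
```

Passing `on=1, off=0` gives the plain binary encoding mentioned first.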
![Page 25: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/25.jpg)
Our training sample should look something like this:
For each possible input we need to create the network's desired output to complete the training samples. For the OCR task it is very common to code each pattern as a vector of size 26 (because we have 26 different letters), placing "0.5" into the vector at the position corresponding to the pattern's type number and "-0.5" at all other positions.
![Page 26: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/26.jpg)
So, a desired output vector for letter "K" will look something like this:
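As a stand-in for the slide's figure, the desired output vector can be generated with a few lines of code. This is a sketch under the assumption that letters are numbered A = 0 through Z = 25; the function name `target_vector` is made up for illustration.

```python
import string


def target_vector(letter, on=0.5, off=-0.5):
    """Build the 26-element desired output: 0.5 at the letter's position, -0.5 elsewhere."""
    index = string.ascii_uppercase.index(letter)  # assumes A=0 ... Z=25
    return [on if i == index else off for i in range(26)]


k_target = target_vector("K")
print(k_target)
```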
Once we have such training samples for all letters, we can start training the network. The last open question is the network's structure. For the above task we can use a single-layer neural network with 30 inputs, corresponding to the size of the input vector, and 26 neurons in the layer, corresponding to the size of the output vector.
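A single-layer network of this shape can be trained with the delta rule, which is what back-propagation reduces to when there is no hidden layer. The sketch below is illustrative only: the three 5x6 bitmaps, the sigmoid shifted into the range (-0.5, 0.5), the learning rate and the epoch count are all assumptions, and only three of the 26 letters are given training samples to keep it short.

```python
import math
import string

LETTERS = string.ascii_uppercase

# Hypothetical 5x6 training bitmaps ('#' = letter pixel).
BITMAPS = {
    "A": [".###.", "#...#", "#...#", "#####", "#...#", "#...#"],
    "K": ["#...#", "#..#.", "###..", "#..#.", "#...#", "#...#"],
    "T": ["#####", "..#..", "..#..", "..#..", "..#..", "..#.."],
}


def encode(rows):
    """Bipolar input vector: 0.5 for letter pixels, -0.5 for background."""
    return [0.5 if pixel == "#" else -0.5 for row in rows for pixel in row]


def target(letter):
    """Desired output: 0.5 for the letter's own neuron, -0.5 for the rest."""
    return [0.5 if other == letter else -0.5 for other in LETTERS]


def activate(net):
    """Sigmoid shifted into (-0.5, 0.5), an assumed bipolar activation."""
    return 1.0 / (1.0 + math.exp(-net)) - 0.5


def train(samples, epochs=300, rate=0.5):
    """Train 26 output neurons on (input, desired output) pairs with the delta rule."""
    n_in, n_out = len(samples[0][0]), len(samples[0][1])
    weights = [[0.0] * n_in for _ in range(n_out)]
    biases = [0.0] * n_out
    for _ in range(epochs):
        for inputs, desired in samples:
            for k in range(n_out):
                net = sum(w * x for w, x in zip(weights[k], inputs)) + biases[k]
                out = activate(net)
                # Error times the activation's derivative, (0.5 + out) * (0.5 - out).
                delta = rate * (desired[k] - out) * (0.5 + out) * (0.5 - out)
                weights[k] = [w + delta * x for w, x in zip(weights[k], inputs)]
                biases[k] += delta
    return weights, biases


def recognize(weights, biases, inputs):
    """Return the letter of the most excited output neuron."""
    outputs = [
        activate(sum(w * x for w, x in zip(row, inputs)) + bias)
        for row, bias in zip(weights, biases)
    ]
    return LETTERS[outputs.index(max(outputs))]


samples = [(encode(rows), target(letter)) for letter, rows in BITMAPS.items()]
weights, biases = train(samples)
for letter, rows in BITMAPS.items():
    print(letter, "->", recognize(weights, biases, encode(rows)))
```

A real training set would contain samples for all 26 letters, ideally several variants of each.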
![Page 27: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/27.jpg)
Another Method
The OCR software breaks the image into sub-images, each containing a single character. The sub-images are then translated from an image format into a binary format, where each 0 and 1 represents an individual pixel of the sub-image. The binary data is then fed into a neural network that has been trained to make the association between the character image data and a numeric value that corresponds to the character. The output from the neural network is then translated into ASCII text and saved as a file.
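The pipeline on this slide can be sketched end to end as: split the image into sub-images, flatten each into 0/1 values, classify, and map the numeric result to an ASCII character. In this sketch the image is assumed to be already binarized, characters are assumed to be separated by blank columns, and `lookup_classifier` is a hypothetical stand-in for the trained neural network.

```python
def split_characters(image):
    """Split a binary image (rows of '0'/'1') into sub-images at blank columns."""
    width = len(image[0])
    has_ink = [any(row[c] == "1" for row in image) for c in range(width)]
    spans, start = [], None
    for c, ink in enumerate(has_ink + [False]):  # sentinel closes the last span
        if ink and start is None:
            start = c
        elif not ink and start is not None:
            spans.append((start, c))
            start = None
    return [[row[a:b] for row in image] for a, b in spans]


def to_bits(sub_image):
    """Flatten a sub-image into a list of 0/1 integers, one per pixel."""
    return [int(pixel) for row in sub_image for pixel in row]


def recognize_text(image, classify):
    """Classify every sub-image and translate the numeric results into ASCII text."""
    return "".join(
        chr(ord("A") + classify(to_bits(sub))) for sub in split_characters(image)
    )


# Two hypothetical 3x5 glyphs ("H" and "I") separated by one blank column.
IMAGE = [
    "1010111",
    "1010010",
    "1110010",
    "1010010",
    "1010111",
]

# Stand-in for a trained network: exact lookup of known bit patterns.
KNOWN = {
    (1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1): 7,  # "H" is letter number 7
    (1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1): 8,  # "I" is letter number 8
}


def lookup_classifier(bits):
    return KNOWN[tuple(bits)]


print(recognize_text(IMAGE, lookup_classifier))
```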
![Page 28: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/28.jpg)
![Page 29: ocr with N N](https://reader035.vdocuments.us/reader035/viewer/2022081414/54bbf11d4a7959c6758b46bb/html5/thumbnails/29.jpg)
Thanks for listening